Earthly
Fleeting

command failed: docker pull 127.0.0.1:41177/sess-mwpqe0vmziaytaa3w3j4nosne/pullping:img-3: exit status 1: Error response from daemon: Get "http://127.0.0.1:41177/v2/sess-mwpqe0vmziaytaa3w3j4nosne/pullping/manifests/sha256:36be876ba6441e3d878cee7e3f816fc924d8c78d374a230ac57865672ce748c4": EOF: exit status 1
This kind of error happens from time to time when running earthly in a CI.
Reading the internals documentation helps:
pullping: Once the build function returns (passing a set of LLB references back to buildkit), the BuildKit server will execute the commands, and call the earthlyoutputs exporter, which will call back to the client (Earthly), which will be received by the pullping handler. This will cause earthly to perform a docker pull against the embedded registry.

dockertar: The legacy approach for exporting images from BuildKit to the host via a tar file; we try to use pullping instead, since it only pulls the needed layers.

— https://github.com/earthly/earthly/tree/main/docs-internals
When the earthly command finishes, it needs to pull the built images from the buildkit server. Pulling images is a well-known bottleneck in the container world, which is part of why projects like stargz or kaniko were created. This is also why the tilt live update mechanism is so powerful.
In earthly 0.8, a mechanism called pullping was implemented to download only the layers that changed. Before that, a tar.gz was downloaded each time.
To implement the pullping algorithm,
- buildkitd was forked to implement the client part of the algorithm
- a local registry is used in the earthly client
- in order to get the image into the local docker storage, it runs "docker pull localhost:4xxxx/sess-xxxxx/pullping:img-0"
We can see this by running earthly --verbose on a target that saves an image.
output | [----------] 100% exporting outputs
frontend | Running command: docker pull 127.0.0.1:34859/sess-c79nfu3a1205h7nop2odoz8qv/pullping:img-0
frontend | Running command: docker image inspect 127.0.0.1:34859/sess-c79nfu3a1205h7nop2odoz8qv/pullping:img-0
frontend | Running command: docker tag 127.0.0.1:34859/sess-c79nfu3a1205h7nop2odoz8qv/pullping:img-0 foobar
frontend | Running command: docker image rm -f 127.0.0.1:34859/sess-c79nfu3a1205h7nop2odoz8qv/pullping:img-0
output | --> exporting outputs
— earthly --verbose +sometarget, where +sometarget saves the image as foobar
It seems that, when we try to pull images from this local registry, the registry sometimes simply fails to answer.
This local registry appears to be linked to a few issues, which lead to using the feature flag --no-use-registry-for-with-docker or, in our case, the command-line option --disable-remote-registry-proxy, which falls back to the dockertar mechanism.
When I run a target that saves several images, all the docker pull commands appear at the beginning; therefore, the fact that the error is about img-3 (and not img-0) does not mean that img-0, img-1 and img-2 were processed before img-3.
By following the onPull method down to the call to docker pull in commandContextOutput, we can see the log line being written:
return output, errors.Wrapf(err, "command failed: %s %s: %s: %s", sf.binaryName, strings.Join(args, " "), err.Error(), output.string())
The "Error response from daemon:" part comes from the docker client, indicating that the docker daemon complained about
Get "http://127.0.0.1:41177/v2/sess-mwpqe0vmziaytaa3w3j4nosne/pullping/manifests/sha256:36be876ba6441e3d878cee7e3f816fc924d8c78d374a230ac57865672ce748c4": EOF: exit status 1
That means that the local registry most likely failed. Let’s dive into the code to find out where this may happen.
Getting into the code of startRegistryProxy, we can see that it creates an instance of regproxy.Controller that will eventually run the serving code. EOF seems to indicate that the server accepted the connection but closed it before sending a response.
The serving code is the following.
conn, err := r.ln.Accept()
if err != nil {
    // accept errors are only surfaced while the controller is still running
    if !r.done.Load() {
        r.errCh <- errors.Wrap(err, "failed to accept")
    }
    return
}
wg.Add(1)
go func() {
    defer wg.Done()
    // whatever handle returns (including nil) is pushed to the error channel
    r.errCh <- r.handle(ctx, conn)
}()
If the code panicked, we would see it in the log of the earthly client, so most likely nothing was done with the accepted connection. This may happen
- if the return branch of that code is reached,
- if handle did nothing with the connection.
The handle part appears to deal with all the errors and simply forwards the requests to the gRPC backend. But when an error occurs, it is captured by the controller and shown only in verbose mode:
go func() {
    for err := range p.err() {
        if err != nil && !errors.Is(err, context.Canceled) {
            c.cons.VerbosePrintf("Failed to serve registry proxy: %v", err)
        }
    }
    doneCh <- struct{}{}
}()
I suppose that raising this log level with Warnf would help understand what went wrong in similar future failures.
Building earthly itself can easily be done with earthly +for-linux.
some more details to understand what --disable-remote-registry-proxy does
Using --disable-remote-registry-proxy makes startRegistryProxy return false and prevents LocalRegistryAddr from being populated. Then, inside convertAndBuild, the gatewaycrafter gets populated with export-image instead of export-image-local-registry and eventually, in newSolveOptMulti, the Output entry will run onImage, which performs the dockertar export. This is the solve option that will be run by the buildkit client fork. At the same time, the OutputPullCallback will call onPull, which does nothing when LocalRegistryAddr is empty and runs the pullping algorithm otherwise.
beware that save artifact is not robust at all
it behaves like --if-exists sometimes
This won't fail:
test:
    FROM alpine
    SAVE ARTIFACT nothing
But it will correctly save the artifact if it exists
it does not understand absolute paths when using --if-exists together with WORKDIR
https://github.com/earthly/earthly/issues/2014
When using --if-exists, SAVE ARTIFACT interprets the path as relative: /a/b/c becomes ./a/b/c. This interacts badly with WORKDIR.
This does not save the file:
VERSION 0.8

test:
    FROM alpine
    WORKDIR /app
    RUN echo foo > /app/something
    SAVE ARTIFACT --if-exists /app/something AS LOCAL something
While this does
VERSION 0.8

test:
    FROM alpine
    RUN mkdir /app
    RUN echo foo > /app/something
    SAVE ARTIFACT --if-exists /app/something AS LOCAL something
And this as well
VERSION 0.8

test:
    FROM alpine
    WORKDIR /app
    RUN echo foo > /app/something
    SAVE ARTIFACT --if-exists /app/../something AS LOCAL something
To be sure, use the IF + SAVE ARTIFACT pattern.
#+BEGIN_SRC earthfile
VERSION 0.8

test:
    FROM alpine
    WORKDIR /app
    RUN echo foo > /app/something
    IF test -e "/app/something"
        SAVE ARTIFACT /app/something AS LOCAL something
    END
#+END_SRC
connecting to a docker registry from within
Not yet, but planned and discussed in https://github.com/earthly/earthly/issues/1722
try finally save artifact does not work well
If we want to save a directory, it won't work -> https://github.com/earthly/earthly/issues/2817
We have to use a workaround like https://github.com/earthly/earthly/issues/2817#issuecomment-1536766279 to actually save a file.
BUT…
- this is not to be copy-pasted as-is, since we also want to save the artifact in case of success,
- the file size is limited -> https://github.com/earthly/earthly/issues/2452
So far, the best I can think of is to:
- allow the failing test to pass,
- save its exit code as an artifact,
- add a check on that exit code in the command that runs earthly.
Here is an example of this workaround.
mytest:
    ...
    RUN --no-cache mycommand ; echo $? > res.txt
    IF test "$(cat res.txt)" != "0"
        # to understand this better -> https://konubinix.eu/braindump/posts/56c711e5-6d29-4b9d-b630-f75c92800c61/?title=earthly_try_finally_save_artifact_does_not_work_well
        RUN echo "$(tput setaf 1)BEWARE THAT THE TEST FAILED! DON'T BE FOOLED BY THE FACT EARTHLY WILL RETURN IN GREEN!"
    END
    SAVE ARTIFACT --if-exists somewantedartifact AS LOCAL somewantedartifact
    SAVE ARTIFACT res.txt AS LOCAL res.txt
runtest() {
    earthly +mytest && return "$(cat res.txt)"
}
runtest
caching
- External reference: https://docs.earthly.dev/docs/caching/caching-in-earthfiles
disable layer caching is to use the RUN --push flag. This flag is useful when you want to perform an operation with external effects (e.g. deploying to production). By default Earthly does not run --push commands unless the --push flag is also specified when invoking Earthly itself (earthly --push +my-target). RUN --push commands are never cached.
— https://docs.earthly.dev/docs/caching/caching-in-earthfiles
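For illustration, a minimal sketch of what such a target could look like (hypothetical target and command):

```earthfile
VERSION 0.8

deploy:
    FROM alpine
    # never cached, and only executed when earthly itself is run with --push
    RUN --push echo "deploying to production"
```

It only runs when invoked as earthly --push +deploy; without --push, the command is skipped.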
cache bust
- External reference: https://docs.earthly.dev/docs/caching/caching-in-earthfiles#cache-size
If the configured cache size is too small, then Earthly might garbage-collect cached layers more often than you might expect. This can manifest in builds randomly not using cache for certain layers. Usually it is the biggest layers that suffer from this (and oftentimes the biggest layers are the most expensive to recreate).
— https://docs.earthly.dev/docs/caching/caching-in-earthfiles#cache-size
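A possible mitigation is to raise the cache size in the earthly configuration; a sketch of ~/.earthly/config.yml (the value is an assumption, check the configuration reference for your setup):

```yaml
global:
  cache_size_mb: 30000  # roughly 30 GB instead of the default
```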
wait
- External reference: https://docs.earthly.dev/docs/earthfile#wait
The WAIT clause executes the encapsulated commands and waits for them to complete. This includes pushing and outputting local artifacts – a feature which can be used to control the order of interactions with the outside world.
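A sketch of how WAIT can order interactions with the outside world (hypothetical image name and deploy command): the push of +image is guaranteed to finish before the RUN --push below it executes.

```earthfile
VERSION 0.8
FROM alpine

image:
    SAVE IMAGE --push registry.example.com/myimage:latest

all:
    WAIT
        BUILD +image
    END
    # only starts once the image above has been fully pushed
    RUN --push echo "image is pushed, trigger deployment"
```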
Notes linking here
- a python runtime on android (blog)
- Access ECR with Earthly secrets (and no need for credential helper / aws CLI installed)
- clk k8s and earthly in a local dev env (blog)
- devops
- docker looses the dns configuration
- don’t create a CI that is hard to run on your machine anymore
- earthly buildkit docker server gave HTTP response to HTTPS client
- earthly buildkitd won’t work if started in network=host
- how I debug MTU issues in k3d in docker in earthly in docker in k8s (blog)
- how I debug my k8s tests not running
- my issues with literate programming
- setting up my raspberry pi fleet
- several flavors of testing one’s code
- with earthly