# deployment-ecs
a
Hello all! Has anyone encountered this issue on AWS ECS Fargate as well:
Cannotpullcontainererror: check schema1 manifest size has been retried 1 time(s): pulling from host registry-1.docker.io failed with status code [manifests 1.0]: 429 Too Many Requests
This happened when I activated a schedule that fired off 1500 jobs; some of the jobs classified as "failed to start" had the error message above as their stopped reason. The weird thing is that it happens on the ResolvConf_InitContainer, which is a sidecar constructed by docker-compose (correct me if I'm wrong), and not on the "normal" container the job runs in. Is there a way to fix this issue? Or a simple explanation of why this error occurs would already be very helpful! Thank you in advance 🙏
There was a different error message as well in some cases:
Cannotpullcontainererror: ref pull has been retried 1 time(s): failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/docker/ecs-searchdomain-sidecar/manifests/sha256:d7fb297faf83229eb460a595d9fa316899cb6c09564927ca2be827ec153f736c: 429 Too Many Requests
tldr: the Docker Hub registry (docker.io) rate-limits anonymous pulls and won't let you hammer it. You could push a copy of this image to a private ECR repo and pull it as much as you want from there, or you could pay for a Docker Hub account and use those credentials to avoid being rate limited.
this isn't really a Dagster issue per se
as for ECS, it's too bad there's no image caching or layer-torrenting type thing, so hammering the registry 1500 times all in one go wouldn't be necessary...
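The ECR mirroring suggestion above could be sketched roughly like this. This is only an illustration, not the documented fix: the repository name is just an example, and ACCOUNT_ID and REGION are placeholders you'd substitute with your own values. The image path docker/ecs-searchdomain-sidecar comes from the error message above.

```shell
# Mirror the compose-generated sidecar image into a private ECR repo
# so Fargate pulls hit ECR instead of Docker Hub's rate limiter.
# ACCOUNT_ID and REGION are placeholders -- fill in your own values.

# Create the repository once (name here is just an example):
aws ecr create-repository --repository-name ecs-searchdomain-sidecar --region REGION

# Log Docker in to your ECR registry:
aws ecr get-login-password --region REGION | \
  docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com

# Pull once from Docker Hub, retag, and push to ECR:
docker pull docker/ecs-searchdomain-sidecar:latest
docker tag docker/ecs-searchdomain-sidecar:latest \
  ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/ecs-searchdomain-sidecar:latest
docker push ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/ecs-searchdomain-sidecar:latest
```

You'd then point the sidecar's image field in the ECS task definition at the ECR URI, so the 1500 simultaneous tasks all pull from ECR.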
m
I ran into something vaguely similar with EKS / Fargate. I had hoped that Fargate would make auto-scaling easy, but each Fargate node pulls a fresh Docker image. In our case, the image is 1.6GB, and our Dagster steps are pretty quick, so the image pull substantially increased the per-step time. (We're just using a fixed-size EKS cluster for now.) https://github.com/aws/containers-roadmap/issues/649 and https://github.com/aws/containers-roadmap/issues/696#issuecomment-996917490 are related tickets.
a
Hi Mike, thanks for the fast reply! Which image exactly does Dagster grab from Docker Hub? As far as I know, we already have the daemon, Dagit, and gRPC images stored in ECR. Only the ResolvConf_InitContainer I cannot place, as this one is created automatically when deploying the Dagster infrastructure on AWS. Correct me if I am wrong! :)
@Mark Fickett yes, I've been waiting quite some time already for the image-caching functionality on Fargate; it would really speed up the process...
This link covers the issue pretty well I think. The sidecar image uses a Docker pull: https://github.com/docker/compose-cli/issues/2190
m
Whether each run should use the same sidecars as the task that launches it. Defaults to False.
Not sure if you actually need that ecs-searchdomain-sidecar, so maybe that would be all you needed.
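For reference, the include_sidecars setting quoted above lives on the ECS run launcher in dagster.yaml. A minimal sketch, assuming the dagster_aws EcsRunLauncher is what's deployed here:

```yaml
# dagster.yaml (sketch) -- run launcher config for ECS
run_launcher:
  module: dagster_aws.ecs
  class: EcsRunLauncher
  config:
    # When False (the default), launched runs do not inherit the
    # launching task's sidecars, such as ecs-searchdomain-sidecar.
    include_sidecars: false
```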
a
No! I do not specify include_sidecars. I tried to play with that parameter, but it did not solve the issue unfortunately.