
Leo Qin

05/15/2023, 6:37 PM
Hello - our main deployment is encountering errors:
FileNotFoundError: [Errno 2] No such file or directory: 'dagster/assets_dbt_python/repository.py'
We first observed this error in several jobs that started between 11:02 and 11:25am Pacific time. We then attempted a re-deploy, and now it is happening for the entire deployment. The deployment itself didn't change (we last deployed this morning around 9:30am Pacific time).
We recently did a branch deployment, and I notice that the ID associated with the agent is the same as the one in our main deployment...

daniel

05/15/2023, 6:46 PM
Hey Leo - I'm wondering if there might be a problem in your CI/CD causing the image tag to be applied incorrectly? I see that the tag being associated with your jobs and code locations is something like:
dagster/image:657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-<REDACTED>:None
the tag being None is definitely unexpected, it should be something that uniquely identifies a particular commit
Could you possibly share your github action yaml? Wondering if something is misconfigured there
The reason I think that could be related is that if the tag for everything is None, it would be very easy for, say, a branch deployment to mess up the prod deployment, since they both have the same incorrect tag
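
To make the collision risk concrete, here is a minimal sketch of a check a CI step could run on an image reference before deploying. `check_image_tag` is a hypothetical helper, not part of dagster-cloud, and it only looks at the text after the last `:` in the final path component (digest references are not handled).

```shell
# Hypothetical helper (not a dagster-cloud command): print the tag of an
# image reference, or fail if the tag is missing or the literal "None".
check_image_tag() {
  ref="$1"
  last="${ref##*/}"              # final path component, e.g. "repo:tag"
  case "$last" in
    *:*) tag="${last##*:}" ;;    # text after the last colon
    *)   tag="" ;;               # no tag present
  esac
  if [ -z "$tag" ] || [ "$tag" = "None" ]; then
    echo "unexpected image tag in '$ref'" >&2
    return 1
  fi
  echo "$tag"
}
```

With a check like this, a reference ending in `:None` would stop the pipeline instead of letting a branch deployment and prod share one tag.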

Leo Qin

05/15/2023, 6:47 PM
we're using `dagster-cloud serverless deploy` to build the image

daniel

05/15/2023, 6:49 PM
Got it - do you have a sample CLI command that you're running, and do you know what version of the `dagster-cloud` package you have installed in the environment where it's happening?

Leo Qin

05/15/2023, 6:49 PM
"dagster-cloud==1.3.3",
dagster-cloud serverless deploy \
  --location-name dagster \
  --working-directory dagster/ \
  --agent-timeout 600 \
  --python-file dagster/assets_dbt_python/repository.py \
  --deployment $BRANCH_DEPLOYMENT_NAME \
  --base-image $DOCKER_IMAGE:$GIT_SHA

daniel

05/15/2023, 6:50 PM
How about for that non-branch-deployment case?

Leo Qin

05/15/2023, 6:51 PM
same command for both, that's the version of the main deployment

daniel

05/15/2023, 6:51 PM
ah ok, I saw BRANCH_DEPLOYMENT_NAME
ok, will do a bit of digging and report back - i would think that redeploying in an environment where that file exists should fix it
(and that the event that caused this might have been a bad push to the None tag until we sort that out)

Leo Qin

05/15/2023, 6:55 PM
the branch deploy was also on 1.3.3

daniel

05/15/2023, 6:56 PM
I have a possible lead. I'd be curious if the tag goes back to what it was before if you take dagster-cloud back to 1.3.2?
(which is something I could check on my side once you do a deploy)

Leo Qin

05/15/2023, 6:58 PM
I'm re-deploying right now just to get back up and running, but I see that as of 6 days ago the tag was `prod-dagster-<tag>` and we were on `1.2.2`

daniel

05/15/2023, 7:00 PM
Or actually, here's something that won't require a version change - try adding:
--image $BRANCH_DEPLOYMENT_NAME:dagster:$GIT_SHA
(don't forget the `\` after the `--base-image` line)
if that fixes it, we'll get a real fix out shortly
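
Put together, the suggested workaround would look roughly like the sketch below: the command posted earlier in the thread with the `--image` line appended. The `deploy` wrapper function and the added quoting are illustrative assumptions, the `--image` value is copied verbatim from the suggestion above, and `BRANCH_DEPLOYMENT_NAME`, `DOCKER_IMAGE`, and `GIT_SHA` are assumed to be set by CI.

```shell
# Sketch of the workaround: the original deploy command plus an explicit
# --image value, so the resulting tag is no longer None.
# Assumes BRANCH_DEPLOYMENT_NAME, DOCKER_IMAGE and GIT_SHA are set by CI.
deploy() {
  dagster-cloud serverless deploy \
    --location-name dagster \
    --working-directory dagster/ \
    --agent-timeout 600 \
    --python-file dagster/assets_dbt_python/repository.py \
    --deployment "$BRANCH_DEPLOYMENT_NAME" \
    --base-image "$DOCKER_IMAGE:$GIT_SHA" \
    --image "$BRANCH_DEPLOYMENT_NAME:dagster:$GIT_SHA"
}
```

Because the value includes both the deployment name and the commit SHA, a branch deployment and the main deployment would no longer end up with the same tag.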

Leo Qin

05/15/2023, 7:01 PM
I'm doing a re-deploy to get back up and running, but will look into this later today
the `--image` argument is distinct from and compatible with the `--base-image` argument, right?

daniel

05/15/2023, 7:58 PM
that's right, --image-tag would be a better name
We just merged in a real fix for this that should go out in 1.3.5 this week - thanks for reporting
(and sorry for the trouble)