I'm getting ```dagster._core.errors.DagsterUserCod...
# dagster-plus
z
I'm getting
dagster._core.errors.DagsterUserCodeUnreachableError: Timed out waiting for call to user code GET_EXTERNAL_EXECUTION_PLAN [497304d0-36be-4b85-b91e-70e078eb1e00]
New or dormant Branch Deployments can take time to become ready, try again in a little bit.
on a branch deployment that I just ran a job on about an hour ago. Any suggestions on how to troubleshoot? It's unclear to me whether there's any way to figure out which ECS service / task is currently serving a specific branch deployment, as they're all named using UUIDs and we have 9 different branch deployments being served right now (it'd be nice if they were at least tagged with the branch or something like that; maybe I'll make a PR for that). I'm not seeing any new tasks trying to spin up in our branch deployment ECS cluster when I try to launch a job.
d
Hey Zach - the UUID for the branch deployment should be in the URL in the Dagster Cloud UI. That UUID should be in the name of the relevant ECS service (and is also in the dagster/deployment_name tag in the Tags tab in the ECS console for that service)
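For reference, the tag lookup described above could be scripted roughly like this. It is only a sketch: it assumes the `dagster/deployment_name` tag value equals the UUID from the URL, that boto3 is installed with credentials for the account, and the helper names `services_for_deployment` and `describe_cluster_services` are made up for illustration.

```python
DEPLOYMENT_TAG = "dagster/deployment_name"


def services_for_deployment(services, deployment_name):
    """Return the names of ECS services whose deployment tag matches.

    `services` is a list of service dicts shaped like the "services" entries
    in an ECS describe_services response requested with include=["TAGS"].
    """
    matches = []
    for svc in services:
        # ECS returns tags as a list of {"key": ..., "value": ...} dicts.
        tags = {t["key"]: t["value"] for t in svc.get("tags", [])}
        if tags.get(DEPLOYMENT_TAG) == deployment_name:
            matches.append(svc["serviceName"])
    return matches


def describe_cluster_services(cluster):
    """Fetch all services in a cluster, with tags (needs boto3 + AWS creds)."""
    import boto3  # imported here so the filter above works without boto3

    ecs = boto3.client("ecs")
    arns = []
    for page in ecs.get_paginator("list_services").paginate(cluster=cluster):
        arns.extend(page["serviceArns"])
    services = []
    # describe_services accepts at most 10 services per call.
    for i in range(0, len(arns), 10):
        resp = ecs.describe_services(
            cluster=cluster, services=arns[i : i + 10], include=["TAGS"]
        )
        services.extend(resp["services"])
    return services
```

Usage would be something like `services_for_deployment(describe_cluster_services("my-branch-cluster"), "<uuid from the URL>")`, where the cluster name is a placeholder.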
z
Ah okay that makes sense!
Hmm yeah it seems like no new tasks for runs are being spun up from any of our branch deployments
d
The services are up and running though?
z
Yes
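A quick way to double-check "up and running" outside the console is to compare each service's running and desired task counts. This is a rough sketch that assumes the input is the `services` list from an ECS `describe_services` response; `services_not_fully_running` is a hypothetical helper name.

```python
def services_not_fully_running(services):
    """Return names of services that are inactive or short of desired tasks.

    Each item is expected to carry the "status", "runningCount" and
    "desiredCount" fields that ECS describe_services returns per service.
    """
    flagged = []
    for svc in services:
        inactive = svc.get("status") != "ACTIVE"
        short = svc.get("runningCount", 0) < svc.get("desiredCount", 0)
        if inactive or short:
            flagged.append(svc["serviceName"])
    return flagged
```

An empty result only says the services themselves look healthy; it says nothing about whether the agent is still reporting.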
d
How's CPU/memory looking on the Health tab?
any chance they are overloaded?
z
I tried redeploying a code location and it seemed to go okay. <1% cpu, ~25% mem usage
d
and this was working fine until recently?
z
Yeah I ran one about an hour ago just fine
Hmm the agent stopped reporting
d
Any logs or errors from the agent?
z
Hmm interesting cpu utilization plummeted and memory usage slightly increased after being completely stable about an hour ago
No logs in the agent for the last 90 minutes
I think I'll try just redeploying the agent
Weird, the agent on our prod deployment also stopped reporting about an hour ago. I did some CloudFormation stuff around then, so I'm starting to suspect I messed something up there
d
Same agent serving both prod and branch deployments? or two different agents?
z
Two different agents
d
Two going down at once is certainly unusual
z
Yeah it must be some dependency that got removed when I deleted a CloudFormation stack. Just weird because all our deployments have their own stacks, but maybe some resources got changed outside of IaC... Still learning some discipline there
It's interesting that the agents aren't producing any logs though