# announcements
c
haha so many questions for you guys today! it's deploy day. i'm also seeing a bunch of "This pipeline run has been marked as failed from outside the execution context" errors
a
lots of questions are good! helps us fix a lot of stuff. that message should only occur when you are using dagit to do executions and it unexpectedly loses the subprocess it was using for the execution
you may have something in stdout/stderr wherever dagit is running - this should only happen if the process crashes
c
hm i'm not seeing anything in the logs
also having items just get stuck in the "starting" status
a
hmm what type of machine is dagit running on?
c
ECS Fargate w/ 8 GB memory, 1 vCPU
to be fair i am programmatically spawning about 50-100 pipeline runs at a time which i don't think is exactly a use case you guys are planning around
a
oo ok tell me more about how you're kicking these off
you may want to set `max_concurrent_runs` if they are executing via dagit https://docs.dagster.io/latest/deploying/instance/#dagit
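(for reference, a rough sketch of what that setting could look like in the instance's `dagster.yaml` - the nesting under a `dagit` / `execution_manager` section is an assumption based on the instance docs page linked above and may differ by Dagster version:)

```yaml
# dagster.yaml sketch only - key nesting is an assumption from the dagit
# section of the instance docs, not verified against a specific version
dagit:
  execution_manager:
    # cap how many run-coordinating subprocesses dagit keeps alive at once;
    # extra submissions wait in an in-memory queue until a slot frees up
    max_concurrent_runs: 10
```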
c
nope they are executing only in separate celery workers via redis
a
well there's where the “run” is happening and where the “steps” are happening
so i assume you are using the celery engine, which executes each step (based on the solid it came from) via celery
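(for context, a sketch of roughly that setup, assuming the dagster_celery pattern of this era: `celery_executor` is added to a mode's `executor_defs`, so each solid's step runs on a celery worker while the parent run process just coordinates. API names may differ in other versions.)

```python
# sketch of a pipeline whose steps execute on celery workers
from dagster import ModeDefinition, default_executors, pipeline, solid
from dagster_celery import celery_executor


@solid
def do_work(context):
    # this step gets pushed onto the celery queue and runs on a worker
    context.log.info("running on a celery worker")


@pipeline(
    mode_defs=[
        # keep the default executors and add celery; selecting celery in the
        # run config's execution section routes step execution to the workers
        ModeDefinition(executor_defs=default_executors + [celery_executor])
    ]
)
def my_pipeline():
    do_work()
```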
c
oh
hm
a
but there is also the overall run, which is coordinating and putting stuff into the queues
c
right
a
so how are you kicking off the runs?
c
i believe it is coordinated via dagit and each step is running on celery
i created a pipeline that spawns a bunch of new pipeline runs
using the RemoteDagitRunLauncher
a
ahhhh i seeeeeee
a
ok cool cool cool
well i'm guessing the issue is that even though actual execution isn’t happening in those processes - it's too much for that one vCPU and 8 GB of RAM to have hundreds of subprocesses going at the same time
some options are:
* set `max_concurrent_runs` on the dagit instance settings - this will use an in-memory queue so it isn't the safest, but should allow things to proceed without crashing
* give the dagit box more resources
* write your own run launcher that does whatever you can dream up for where to handle these processes
c
hm ok
i'll try max_concurrent_runs
what is the default for that?
a
when it's not set, no queue is used
so unbounded
c
ah ok
a
you could also try to introduce some delay between each run submission
since once the pipeline is running it's just sleeping and checking on celery - the majority of the contention will be at startup time
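(a minimal sketch of that throttling idea - `submit_run` here is a hypothetical stand-in for however the runs are actually being submitted, e.g. the RemoteDagitRunLauncher call:)

```python
import time


def submit_runs_throttled(run_configs, submit_run, delay_seconds=10):
    """Submit runs one at a time with a pause between submissions.

    `submit_run` is a hypothetical callable standing in for whatever kicks off
    a pipeline run; spacing submissions out keeps dagit from having to spawn
    dozens of run-coordinating subprocesses at the same instant.
    """
    for run_config in run_configs:
        submit_run(run_config)
        time.sleep(delay_seconds)
```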
c
that makes sense
i'm gonna give both of those a shot