Hello, I’ve a question regarding dagit bottleneck ...
# ask-community
k
Hello, I have a question about a possible dagit bottleneck on an OSS deployment. Could launching many executors (about 400-500) cause a performance bottleneck because of log streaming from the executors back to dagit?
d
Hi - I wouldn't expect dagit to be a bottleneck here. The way this works is:
• ops write to the dagster database
• dagit (when you happen to have a run loaded) streams data from the database for the particular run that you're looking at
So if anything is going to be the bottleneck, it will be the database handling all the writes from the 400-500 ops at once - but there are lots of quite scalable cloud database solutions out there like RDS, so I would be surprised if that was a bottleneck as long as you have a beefy enough database to handle that many connections
k
Thanks a lot! Then it might be because of our database 😱. The problem we faced is:
1. Running one or two jobs on separate Fargate tasks, each job is quite fast (1-2 minutes per job)
2. Running 400-500 jobs on separate Fargate tasks, each job is super slow (10-15 minutes per job)
3. From the logs, the bottleneck seems to be something unobservable at the beginning and the end of each op (not the logic inside the ops)
d
What kind of database are you using?
k
RDS with two CPU cores
d
I could imagine that needing some more CPUs for that many ops running at once, yeah
RDS has some pretty good performance monitoring tools I think - you could check if it's getting capped out on CPU
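If you'd rather check this from a script than from the console, here is a rough sketch of one way to do it. It assumes boto3 is installed and AWS credentials are configured; the helper names, the 90% threshold, and the 3-hour window are made up for illustration and are not anything dagster provides. It pulls the `CPUUtilization` metric from the `AWS/RDS` CloudWatch namespace and reports what share of samples were near the cap.

```python
from datetime import datetime, timedelta, timezone


def fetch_cpu_datapoints(db_instance_id, hours=3, period_seconds=300):
    """Fetch CPUUtilization datapoints for one RDS instance from CloudWatch."""
    # Imported lazily so the pure helper below can be used without boto3 installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period_seconds,
        Statistics=["Average", "Maximum"],
    )
    return response["Datapoints"]


def saturated_fraction(datapoints, threshold_percent=90.0):
    """Return the fraction of datapoints whose Maximum CPU is at or above the threshold."""
    if not datapoints:
        return 0.0
    hot = sum(1 for point in datapoints if point["Maximum"] >= threshold_percent)
    return hot / len(datapoints)
```

Calling `saturated_fraction(fetch_cpu_datapoints("my-dagster-db"))` (the instance identifier is a placeholder) during or right after a peak of 400-500 concurrent jobs should give a value close to 1.0 if the instance is CPU-bound.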
k
ohh you’re right! Thanks!
Yeah, definitely an RDS problem! The RDS instance hit 100% CPU constantly during peak load 😂. Thanks again!