What is the difference between a Dynamic Graphs an...
# ask-community
s
What is the difference between Dynamic Graphs and Partitions? I am exploring the best way to distribute a workload across our ECS-Fargate instances. We will chunk several ids from a database, make API calls based on these ids, then write the results to S3. From reading the documentation, both Partitions and Dynamic Graphs seem appropriate for the job. I'm fairly new to Dagster, so I'm wondering what the difference is and which I should choose.
z
One key difference is that a dynamic graph does all of its parallel operations in a single run (and if you're using the ECS run launcher, that means they all happen in a single ECS task), which requires a lot less overhead. Dynamic partitions execute a separate run for every partition, so you get additional metadata tracking for each individual run, but at a significant cost in overhead, since each run takes 1-3 minutes to spin up on Fargate. So if your tasks are small and you have a lot of them, I'd definitely recommend dynamic graphs. What you lose with dynamic graphs vs. partitions is the ability to easily backfill specific portions of the job or execute the job incrementally (although you can build some of that into your chunking logic).
s
Thank you @Zach. Following up on your point about running in a single ECS task: will the chunks still behave as if they were running asynchronously, or will each chunk run one after another? Since my waits are IO-bound (calling an API), I'd like to shorten my runs through threading-like behavior.
z
All the ops in a dynamic graph fan-out will run in separate processes in parallel within the ECS task when using the default multi-process executor (up to the `max_concurrent` config value you set for the executor).
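As a rough sketch, the concurrency limit can be supplied as run config. The nesting below assumes the job uses Dagster's default executor; the exact path can differ between Dagster versions, or if the job sets an executor explicitly, so check it against the version you are running. The value `4` is only an example.

```python
# Run config sketch (assumption: job uses the default executor).
# Caps the number of ops that run in parallel processes within one run.
run_config = {
    "execution": {
        "config": {
            "multiprocess": {
                "max_concurrent": 4,  # example value, not a recommendation
            }
        }
    }
}
```

Since each mapped op runs in its own process, a limit higher than the number of vCPUs on the Fargate task may still make sense for IO-bound work such as waiting on API calls, at the cost of more memory per task.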
s
Sounds exactly like what I need, thank you so much for your help.