Is there a way to scale Dagster across multiple servers? Like flask behind haproxy.
max
12/03/2021, 3:40 PM
hi george, which component of the system are you interested in scaling?
George Pearse
12/03/2021, 5:46 PM
The pipeline execution, but I guess that's the container that the dagster_pipelines container spins up when a pipeline starts running. Guess Kubernetes or vertical scaling are the only ways to improve the performance of that part?
max
12/03/2021, 6:01 PM
execution is handled by the RunLauncher and Executor abstractions -- these are completely pluggable, so you can pick from the available options in open source or write your own to their APIs
basically they let you control execution at two levels -- RunLaunchers govern where new runs are launched, and Executors govern how the individual steps in a new run are launched
there are options that launch in Docker, Kubernetes, managed container runtimes like ECS, as well as using OS processes, and also options that let you use Celery for a queue/worker model
which you pick will depend on exactly how you want to scale and what tradeoffs you want to make
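To make the run-launcher level concrete: the launcher is chosen in the instance's `dagster.yaml` under a `run_launcher` key that names a module and a class. Here is a minimal sketch in plain Python that renders that stanza for a few of the open-source launchers. The `RUN_LAUNCHERS` table and the `render_run_launcher` helper are hypothetical names made up for illustration; the module/class pairs are the ones I believe the Dagster docs list, so check them against your installed version. Launcher-specific `config:` keys (which some launchers require) are left out.

```python
# Hypothetical helper: renders the `run_launcher` stanza of dagster.yaml.
# The module/class pairs below are assumptions taken from the Dagster docs
# as I remember them; verify against your installed Dagster version.
RUN_LAUNCHERS = {
    "default": ("dagster.core.launcher", "DefaultRunLauncher"),  # OS processes
    "docker": ("dagster_docker", "DockerRunLauncher"),           # one container per run
    "k8s": ("dagster_k8s", "K8sRunLauncher"),                    # one Kubernetes Job per run
    "ecs": ("dagster_aws.ecs", "EcsRunLauncher"),                # one ECS task per run
}


def render_run_launcher(kind: str) -> str:
    """Return the YAML text that selects the given run launcher.

    Launcher-specific `config:` keys are intentionally omitted.
    """
    module, cls = RUN_LAUNCHERS[kind]
    return f"run_launcher:\n  module: {module}\n  class: {cls}\n"


if __name__ == "__main__":
    # Print the stanza for the Docker launcher.
    print(render_run_launcher("docker"))
```

Switching launchers is then a config change on the instance, not a change to pipeline code. The Executor is picked separately, per pipeline or run.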
@max I currently use DockerRunLauncher, but would I get better performance if I just used DefaultRunLauncher? I don't know if I get any benefits from that containerisation.
max
12/04/2021, 4:12 PM
ymmv: there is some overhead associated with using containers for anything, but the trade-off is that you get isolation and repeatability