Hi to all Dagsterians. I am doing a POC on Dagster. I would like to run, let's say, 50 I/O-bound operations as fast as I can, with minimal resources. I cannot use Kubernetes. It seems that there is no async executor so far. The multiprocessing executor would need a lot of CPU. I was not able to find any multithreading executor or ECS executor. Do you have any tips on how to tackle that? Maybe write a custom greenlet executor? Or use Celery with some autoscaling... Thanks a lot!!
02/21/2023, 7:35 PM
hi @Radek Tomšej! you're correct that there's no multithreading executor, and there's no ECS executor (although there is an ECS run launcher, which launches one ECS task per run rather than per op). Depending on what exactly those I/O-bound operations are, sometimes it can make sense to just group that computation into a single op (within the bounds of an op, you can use whatever multithreading constructs you want). Of course, you lose the observability benefits of separate ops, but it also gives you maximal flexibility
i'd definitely avoid trying to implement a custom executor as your first foray into dagster (that can be quite complex, and is not really something we provide a lot of support for in docs)
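to make the "group it into a single op" idea concrete, here's a rough sketch using only the standard library's `ThreadPoolExecutor`. The names `fetch` and `run_io_tasks` are hypothetical, and `fetch` just sleeps as a stand-in for whatever the real I/O call is. The dagster part is only shown as a comment, assuming you would call the helper from the body of one op:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(item_id: int) -> str:
    """Hypothetical stand-in for one I/O-bound call (HTTP request, DB query, ...)."""
    time.sleep(0.05)  # simulate waiting on I/O; threads release the GIL while blocked
    return f"result-{item_id}"


def run_io_tasks(item_ids, max_workers: int = 50) -> dict:
    """Run fetch() for every id on a thread pool and collect results by id."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the id it was created for.
        futures = {pool.submit(fetch, item_id): item_id for item_id in item_ids}
        for future in as_completed(futures):
            # future.result() re-raises any exception raised inside fetch().
            results[futures[future]] = future.result()
    return results


# Assumed usage inside dagster: call the helper from the body of a single op.
#
# from dagster import op
#
# @op
# def fetch_everything():
#     return run_io_tasks(range(50))

if __name__ == "__main__":
    print(len(run_io_tasks(range(50))))
```

the trade-off is the one mentioned above: dagster sees one op, so retries and logs are per batch rather than per call, and any per-item error handling has to live inside the helper.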
02/23/2023, 12:45 PM
Grouping the computation into a single op is actually a pretty good idea, even though implementing a custom thread executor seems kind of easy (I only briefly looked into the code). Thanks a lot!