Hi all. Is there a way to limit concurrency across...
# ask-community
a
Hi all. Is there a way to limit concurrency across ops based on a tag? I've seen that the QueuedRunCoordinator has this for runs. Is there a built-in way to achieve this for all ops being run? Ideally this would work with k8s as that is the orchestration we are planning to use with dagster.
p
Hi Amadou! Would it work to set up celery queues to set up concurrency limits using op tags? You might find this useful: https://docs.dagster.io/_apidocs/libraries/dagster-celery-k8s#dagster_celery_k8s.celery_k8s_job_executor
a
Thank you for linking to that. If we were to use the celery queues where would the concurrency limits be configured? I don't see mention of that functionality.
p
The concurrency limits would be set by the number of workers and the concurrency options set per worker
(where worker is celery worker)
a
Okay I think I'm following. Below are a few follow up questions. Some of this might be that I need a deeper understanding of celery+dagster and I'm happy to go read more if you have resources that might fill in the gaps in my understanding. 1. Is there a way to specify what queue a task gets put on based on tags or other op metadata? 2. Is the celery worker deployment something we have to manage or is this a dagster config? 3. When a celery worker pulls a task and launches a k8s job, does it not pull another task until that job is done? I.e. will a worker pull task A and then launch job A and then pull task B and launch job B before job A is complete? Or will it pull task A and launch job A and then wait until job A is done before pulling task B?
p
Forgive me, I’m not as familiar with the specifics of the celery/dagster specifics either. I do know that some users are using it with some success. (cc @johann) Also, I think that this functionality might come directly to the k8s executor soon. See https://github.com/dagster-io/dagster/issues/6580
a
No worries. Thanks for sharing that issue. It is slightly different from this use case as it is a limitation across all spawned k8s jobs but is also something I am interested in! The Per-Op limits in Kubernetes doc has good information on the above questions 1. You can specify the queue by using the
dagster-celery/queue
tag. doc 2. It looks like the celery workers can be configured with the rest of dagster via a helm chart 3. Based on the content in the deployment architecture I believe the celery task launches the k8s job and waits for it to complete but I am still uncertain on that. It would be helpful to get confirmation on this if anyone knows.
dagsir 1
j
I believe the celery task launches the k8s job and waits for it to complete
that’s correct
a
Fantastic. Thank you for the help Johann and Phil!