
Arun Kumar

08/25/2021, 6:47 PM
Hi team, we want to set tag concurrency limits to give each of the user code deployments running in our Dagster deployment a max quota, so that no single one of them takes up the entire allowed max limit. We also want to apply limits to certain use cases under each user code deployment. Does the team recommend any specific structure for this use case? We are currently trying something similar to the below:
queuedRunCoordinator:
  enabled: true
  config:
    max_concurrent_runs: 25
    tag_concurrency_limits:
      - key: "user-code-1"
        limit: 15
      - key: "user-code-1"
        value: "usecase-1"
        limit: 10
      - key: "user-code-1"
        value: "usecase-2"
        limit: 5

      - key: "user-code-2"
        limit: 15
      - key: "user-code-2"
        value: "usecase-1"
        limit: 10

      - key: "databricks"
        limit: 10

johann

08/26/2021, 7:40 PM
Hi @Arun Kumar, this structure seems reasonable to me. There may be some tricks for setting tags on every pipeline and sensor/schedule within a repo; cc @chris

Arun Kumar

08/26/2021, 7:50 PM
Thanks @johann for the response. Quick follow-ups:
1. When a particular job satisfies multiple conditions, does the coordinator enforce all the limits? In the above example, we have a global limit on the `databricks` tag, so if we get a `databricks` job from `user-code-1`, will it still be limited to 10 runs irrespective of the limit on `user-code-1`?
2. If I understand correctly, if any user code job fails to set the appropriate user-code tag, it will fall under the global runs and can possibly exhaust the global limit (25 runs in the above example). Is there any way to avoid this situation?

johann

08/26/2021, 8:04 PM
1. When a job falls under multiple limits, it will only launch once it won't exceed any of them. So, to your question, the `databricks` job from `user-code-1` will only launch if both of those limits can be respected. If, for example, the `databricks` limit was met but the `user-code-1` limit was not, then other jobs from `user-code-1` that don't have a `databricks` tag may still proceed.
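To restate those semantics, here is a plain-Python sketch of my understanding (not the actual coordinator code; the function and data shapes are made up for illustration): a run may launch only when every limit whose tag filter it matches still has headroom.

```python
def can_launch(run_tags, active_counts, limits):
    """Sketch of the queued run coordinator's limit check.

    run_tags:      dict of tag key -> value on the queued run
    active_counts: dict of (limit key, limit value) -> number of
                   in-progress runs counted against that limit
    limits:        list of dicts like {"key": ..., "value": ..., "limit": ...}
    """
    for lim in limits:
        key, value, cap = lim["key"], lim.get("value"), lim["limit"]
        # A limit applies if the run has the tag key, and either the limit
        # has no value filter or the run's value matches it.
        if key in run_tags and (value is None or run_tags[key] == value):
            if active_counts.get((key, value), 0) >= cap:
                return False  # this limit is already saturated
    return True  # no applicable limit would be exceeded
```

So with the config above, a run tagged both `databricks` and `user-code-1` waits if either the `databricks` count is at 10 or the `user-code-1` count is at 15.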
Hi @Arun Kumar - following up that this should become slightly easier after our release on Thursday. We added an `applyLimitPerUniqueValue` flag to the concurrency limits, so you could have a default limit for each of your use cases in the above example instead of needing to enumerate them all
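For illustration, a hedged sketch of how that flag might look in the Helm values (assuming a shared tag key like `user-code` across deployments; the exact syntax may differ by release):

```yaml
queuedRunCoordinator:
  enabled: true
  config:
    max_concurrent_runs: 25
    tag_concurrency_limits:
      # One default limit applied separately to each distinct value of the
      # "user-code" tag, instead of enumerating every deployment by name.
      - key: "user-code"
        value:
          applyLimitPerUniqueValue: true
        limit: 15
```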
I forgot to respond to your second point: yes, unfortunately all the pipelines need to be tagged. There are some patterns that can help with that, though they're not very clean at the moment. One example is https://dagster.slack.com/archives/C01U954MEER/p1626124210089600?thread_ts=1625551374.397900&cid=C01U954MEER. I also filed https://github.com/dagster-io/dagster/issues/4682 for us to make this easier in the future
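One such pattern, sketched in plain Python (the helper and tag key are made up for illustration, not an official Dagster API): route every job's tags through a small helper that always injects the deployment-identifying tag, so it can't be forgotten.

```python
# Assumed tag key identifying this user code deployment.
DEPLOYMENT_TAG = {"user-code-1": "true"}

def with_deployment_tags(job_tags=None):
    """Merge per-job tags over the mandatory deployment tag."""
    tags = dict(DEPLOYMENT_TAG)
    tags.update(job_tags or {})
    return tags

# Usage with Dagster (sketch):
#   @job(tags=with_deployment_tags({"databricks": "true"}))
#   def my_job(): ...
```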

Arun Kumar

09/01/2021, 11:24 PM
Thanks @johann. I was looking for some way to make sure that every user code deployment always has a valid tag. Currently we do not have any control, and anyone with a new user code deployment can just exhaust all the global limits, affecting the other user code deployments.