I have a partitioned asset. When running back fill...
# ask-ai
m
I have a partitioned asset. When running backfills, is it possible to regulate how many jobs run concurrently? My problem is that one database is a replica, meaning I can only make one query at a time.
s
Yes, you can regulate how many runs execute concurrently when backfilling a partitioned asset in Dagster. You do this by configuring a run queue and using `tag_concurrency_limits` in the run queue configuration. Here's how to set up concurrency limits for backfills:
1. Configure a run queue: Whether you are using Dagster Cloud or Dagster Open Source, you will need to set up a run queue. This run queue manages the execution of runs.
2. Use `tag_concurrency_limits`: Within the run queue configuration, you can specify `tag_concurrency_limits` to control how many runs can execute concurrently. You can set limits across all backfills or within each backfill. For example, to limit a specific backfill to one concurrent run, you could use a configuration like this:
```yaml
run_coordinator:
  module: dagster.core.run_coordinator
  class: QueuedRunCoordinator
  config:
    tag_concurrency_limits:
      - key: "my_limited_backfill_tag"
        limit: 1
```
3. Supply tags when launching a backfill: When you launch a backfill from the Dagster UI, you can supply a set of tags in the backfill dialog. Runs carrying those tags are matched against `tag_concurrency_limits`, which enforces the concurrency limits.
By setting the concurrency limit to 1 for the tag associated with your database backfill, you ensure that only one run executes at a time, which respects the one-query-at-a-time limitation of your replica database. For more detailed guidance and examples, refer to the documentation on limiting concurrency in data pipelines.
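To make the effect of a tag limit concrete, here is a minimal pure-Python sketch of how a queue might decide which runs to start. It is only an illustrative model, not Dagster's actual implementation: the function `runs_to_launch`, the run dictionaries, and the `LIMITS` structure are hypothetical names that simply mirror the key/limit pairs from the configuration above.

```python
from collections import Counter

# Hypothetical limits, mirroring the key/limit pairs in the YAML example.
LIMITS = [{"key": "my_limited_backfill_tag", "limit": 1}]


def runs_to_launch(queued, in_progress, limits=LIMITS):
    """Return the queued runs that can start without exceeding any tag limit.

    Each run is a dict with an "id" and a "tags" dict. This models the idea
    of tag-based limits only; it is not how Dagster is implemented.
    """
    limit_of = {rule["key"]: rule["limit"] for rule in limits}

    # Count runs that are already executing, per limited tag key.
    active = Counter()
    for run in in_progress:
        for key in limit_of:
            if key in run["tags"]:
                active[key] += 1

    launched = []
    for run in queued:
        # Only the limited tag keys present on this run constrain it.
        keys = [key for key in limit_of if key in run["tags"]]
        if all(active[key] < limit_of[key] for key in keys):
            launched.append(run)
            for key in keys:
                active[key] += 1
    return launched


queued = [
    {"id": "a", "tags": {"my_limited_backfill_tag": "true"}},
    {"id": "b", "tags": {"my_limited_backfill_tag": "true"}},
    {"id": "c", "tags": {}},
]
print([run["id"] for run in runs_to_launch(queued, in_progress=[])])
```

In this model, the second tagged run stays queued until the first one finishes, while the untagged run is unaffected by the limit. That is the behavior you want for a replica that can serve only one query at a time.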