Joram Sbrisny
03/07/2024, 8:44 PM

Scout
03/07/2024, 8:44 PM

You can set the `max_concurrent` option within the job's execution config. Here is an example from the documentation that limits a job to at most four concurrent subprocesses:

```python
from dagster import job


@job(
    config={
        "execution": {
            "config": {
                "multiprocess": {
                    "max_concurrent": 4,
                }
            }
        }
    }
)
def my_job(): ...
```
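Conceptually, `max_concurrent` acts like a semaphore of size 4 over the run's step subprocesses. This is a pure-Python analogy only (Dagster's real executor manages OS processes, not threads), with illustrative names:

```python
import threading
import time

# Analogy: max_concurrent=4 behaves like a size-4 semaphore gating
# how many steps run at once within a single run.
MAX_CONCURRENT = 4
gate = threading.BoundedSemaphore(MAX_CONCURRENT)
lock = threading.Lock()
running = 0
peak = 0


def step(i):
    global running, peak
    with gate:  # at most MAX_CONCURRENT steps inside at once
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # simulate step work
        with lock:
            running -= 1


threads = [threading.Thread(target=step, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)  # never exceeds 4
```

Even though ten "steps" are launched, the semaphore ensures the observed peak concurrency never exceeds four.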
In this example, the `max_concurrent` setting ensures that at most four assets execute at once within a single run of the job. This is useful for controlling resource utilization and preventing your system from being overloaded. Additionally, you can use `tag_concurrency_limits` to cap the number of assets with certain tags that can execute at once within a single run. Here's an example that sets a tag-based concurrency limit:

```python
from dagster import job


@job(
    config={
        "execution": {
            "config": {
                "multiprocess": {
                    "max_concurrent": 4,
                    "tag_concurrency_limits": [
                        {
                            "key": "database",
                            "value": "redshift",
                            "limit": 2,
                        }
                    ],
                }
            }
        }
    }
)
def tag_concurrency_job(): ...
```
In this case, the job will execute at most two assets at once whose `database` tag equals `redshift`, while also ensuring that at most four assets execute at once in total. The tags checked here are the ones attached to the underlying definitions (e.g. via the `tags` argument). Remember that these limits apply on a per-run basis. If you need to apply concurrency limits across multiple runs, you would need an executor that supports that, such as `celery_executor` or `celery_k8s_job_executor`.
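To see how the global cap and a tag limit interact, here is a small pure-Python sketch (not the real Dagster implementation) of the admission check the executor effectively performs: a step may start only if total active steps are below `max_concurrent` and, for each matching tag rule, active steps carrying that tag are below the rule's limit. All names here are illustrative:

```python
# Illustrative sketch of per-run concurrency admission, mirroring the
# tag_concurrency_job config above. Not Dagster internals.
MAX_CONCURRENT = 4
TAG_LIMITS = [{"key": "database", "value": "redshift", "limit": 2}]


def can_start(op_tags, active_ops):
    """op_tags: tag dict for the candidate step.
    active_ops: list of tag dicts for currently running steps."""
    # Global cap: never more than max_concurrent steps in flight.
    if len(active_ops) >= MAX_CONCURRENT:
        return False
    # Per-tag caps: only rules whose key/value match the candidate apply.
    for rule in TAG_LIMITS:
        if op_tags.get(rule["key"]) == rule["value"]:
            running = sum(
                1 for t in active_ops if t.get(rule["key"]) == rule["value"]
            )
            if running >= rule["limit"]:
                return False
    return True


redshift = {"database": "redshift"}
untagged = {}

# Two redshift steps running: a third redshift step must wait...
print(can_start(redshift, [redshift, redshift]))  # False
# ...but an untagged step can still start (3 < 4 total).
print(can_start(untagged, [redshift, redshift]))  # True
# Four steps running: the global cap blocks everything.
print(can_start(untagged, [redshift, redshift, untagged, untagged]))  # False
```

The two limits compose: the tag limit can only make things stricter than the global cap, never looser.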