Chris Anderson
02/16/2023, 8:26 PM
.collect call eventually, but I'd like the job to run only one of those pipelines at a time. For a visual: in the example photo below, I'd like the first mapping index of downstream ops to run and finish execution before the next mapping index is started. I tried per-op prioritization, but it seems (correct me if I'm wrong) that it prioritizes launching the op, not its complete execution. Is there an easy way to set concurrency limits on a per-mapping-index basis, or do you have ideas about possible ways to approach this problem?

Chris Anderson
02/16/2023, 8:34 PM

owen
02/16/2023, 9:53 PM
One option is to cap the multiprocess executor at a single concurrent op for the whole job:
@job(config={"execution": {"config": {"multiprocess": {"max_concurrent": 1}}}})
Another option would be to apply per-tag concurrency limits. If you add a tag of {"group": "downstream_one"} to downstream_one, and a similar tag to downstream_two, you should be able to set per-tag concurrency limits:
@job(
    config={
        "execution": {
            "config": {
                "tag_concurrency_limits": [
                    {"key": "group", "value": "downstream_one", "limit": 1},
                    {"key": "group", "value": "downstream_two", "limit": 1},
                ]
            }
        }
    }
)
...
(apologies if the formatting is somewhat off)
This will limit concurrency only for the tagged ops, meaning you can still run other things concurrently in cases where the order doesn't matter.