# ask-community
Hi, I'm working on an op that processes a dataframe and makes several API calls for each row. My initial implementation used DynamicOutput to manage the downstream ops. However, the APIs have rate limits, and managing a few hundred thousand parallel ops at a time seems quite expensive. Any other ideas on how I can approach this?
Hi YH. You can limit the number of parallel ops that are running at a time by specifying an op concurrency limit on the default multiprocess executor: https://docs.dagster.io/concepts/ops-jobs-graphs/job-execution#default-job-executor
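For reference, the op concurrency limit from the linked docs page is set in the job's run config for the default multiprocess executor. A minimal sketch (key names are as documented at the time of writing and may vary across Dagster versions):

```yaml
execution:
  config:
    multiprocess:
      # At most 4 ops execute in parallel, regardless of how many
      # dynamic outputs are mapped downstream.
      max_concurrent: 4
```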
Yes, thanks @claire, I found that after reading some other threads. On a side note, I might also redesign the way I chunk the data so that each parallel op handles more than one row. I can then use pyrate to rate-limit the subsequent calls. Since I would be using dagster-serverless, I think there would already be a natural limit based on CPU cores too.
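The chunk-then-rate-limit idea above can be sketched with the standard library alone. This is a hypothetical illustration of the pattern, not pyrate's actual API: `chunked` splits the rows so each mapped op gets a batch, and `SimpleRateLimiter` is a minimal sliding-window limiter that each op would call before every API request.

```python
import time
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of up to `size` rows (hypothetical helper)."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


class SimpleRateLimiter:
    """Minimal sliding-window limiter: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls, period=1.0):
        self.max_calls = max_calls
        self.period = period
        self._calls = []  # monotonic timestamps of recent calls

    def acquire(self):
        """Block until another call is allowed, then record it."""
        now = time.monotonic()
        self._calls = [t for t in self._calls if now - t < self.period]
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest call in the window expires.
            time.sleep(self.period - (now - self._calls[0]))
            now = time.monotonic()
            self._calls = [t for t in self._calls if now - t < self.period]
        self._calls.append(time.monotonic())
```

Inside each downstream op you would then loop over its chunk, calling `limiter.acquire()` before each API request. A library like pyrate-limiter provides the same behavior with more options (bursting, multiple rates, shared backends), which matters more once several processes share one API quota.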