Got a situation where I have a list of hundreds of...
# ask-community
Got a situation where I have a list of hundreds of databases distributed among dozens of instances. This list changes each day. I need to extract data from each database, but I can't run queries against databases in the same instance in parallel because the instance returns an error. However, I need to parallelize the entire process as much as possible.
Option 1a (not possible): Create a DynamicOutput solid that gives the list of databases for each instance, then loop over each database in the mapping step. But you can't do loops in `@pipelines`, and the extraction here requires a sequence of several solids.
Option 1b (not possible): Create a DynamicOutput solid that gives the list of databases for each instance, then another DynamicOutput solid that effectively loops over the databases within that instance. However, you get an error when you have a dynamic output downstream of another dynamic output.
Option 2 (not good): Set up a new pipeline for each instance. That means a whole lot of duplicate code, and it doesn't account for the dynamic list of instances (an instance could be added or removed on any given day).
Any thoughts on how to accomplish this?
Here’s a diagram of what I’m trying to accomplish
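To make the constraint concrete, here is a rough sketch in plain Python (not Dagster) of the scheduling shape I'm after: instances run in parallel, while the databases inside each instance run one after another. All names here (`group_by_instance`, `extract`, `process_instance`, `run_all`) are hypothetical placeholders, and `extract` stands in for the real multi-solid extraction sequence.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_instance(databases):
    """Group (instance, database) pairs into {instance: [database, ...]}."""
    grouped = defaultdict(list)
    for instance, database in databases:
        grouped[instance].append(database)
    return dict(grouped)


def extract(instance, database):
    # Hypothetical placeholder for the multi-step extraction of one database.
    return f"{instance}/{database}"


def process_instance(instance, databases):
    # Databases on the same instance are handled sequentially, never concurrently.
    return [extract(instance, db) for db in databases]


def run_all(databases, max_workers=8):
    # One task per instance, so parallelism only happens across instances.
    grouped = group_by_instance(databases)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            instance: pool.submit(process_instance, instance, dbs)
            for instance, dbs in grouped.items()
        }
        return {instance: future.result() for instance, future in futures.items()}
```

This only shows the ordering I need. It says nothing about how to express it with DynamicOutput, which is the part I'm stuck on.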