john eipe

05/16/2021, 8:00 AM
Hi team, is it possible to create a dynamic pipeline based on some metadata? For example:
task-group   seq
A             1
B             1
C             2

task   seq   task-group
a1      1    A
a2      1    A
b1      1    B
b2      2    B
c3      1    C

A task-group is only a logical wrapper that helps to group tasks and control the execution flow; composite solids may be a fit here.
Tasks are essentially solids, say:

def a1():
    ...  # task body goes here

Execution flow:
Task groups A and B start in parallel (seq=1). Within A, tasks a1 and a2 run in parallel (seq=1); within B, b2 runs only after b1 is complete. c3 starts only after all tasks in task groups A and B are complete.
Now, based on this metadata, which resides in a file or a DB, is it possible to create a dynamic pipeline?
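
To make the intended execution flow concrete, here is a minimal sketch in plain Python of how the two seq columns could be turned into a task-level dependency map. It does not use any Dagster API; the names `TASK_GROUPS`, `TASKS`, and `build_dependencies` are hypothetical, and the rule that a group waits for every task of the previous group level is my reading of the flow described above.

```python
from collections import defaultdict

# Metadata copied from the tables above; in practice it would come from a file or DB.
TASK_GROUPS = {"A": 1, "B": 1, "C": 2}   # task-group -> seq
TASKS = [                                 # (task, seq, task-group)
    ("a1", 1, "A"), ("a2", 1, "A"),
    ("b1", 1, "B"), ("b2", 2, "B"),
    ("c3", 1, "C"),
]


def build_dependencies(task_groups, tasks):
    """Return {task: set of upstream tasks} implied by the seq columns."""
    # Bucket tasks by group, then by seq level inside the group.
    by_group = defaultdict(lambda: defaultdict(list))
    for name, seq, group in tasks:
        by_group[group][seq].append(name)

    group_levels = sorted(set(task_groups.values()))
    deps = {name: set() for name, _, _ in tasks}
    for name, seq, group in tasks:
        earlier = [s for s in by_group[group] if s < seq]
        if earlier:
            # Wait for the closest earlier seq level inside the same group.
            deps[name].update(by_group[group][max(earlier)])
            continue
        # First level of its group: wait for every task in the groups
        # that sit on the previous group-level seq (assumption, see above).
        previous = [s for s in group_levels if s < task_groups[group]]
        if previous:
            for other, other_seq in task_groups.items():
                if other_seq == max(previous):
                    for level in by_group[other].values():
                        deps[name].update(level)
    return deps


DEPS = build_dependencies(TASK_GROUPS, TASKS)
for task in sorted(DEPS):
    print(task, "<-", sorted(DEPS[task]))
```

A mapping like `DEPS` is the kind of information a programmatically constructed pipeline would need: a1, a2, and b1 have no upstream tasks, b2 depends on b1, and c3 depends on everything in A and B.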


05/16/2021, 4:49 PM
Hey John, this is totally possible. We have an example in our documentation for constructing a pipeline from a YAML file. You could change this from reading a YAML file to querying a database for the equivalent information. Existing users have done this. Hope that helps!
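
As a rough illustration of the database variant, the sketch below reads the same kind of metadata from SQLite using only the standard library. The table names `task_group` and `task`, their columns, and the `load_metadata` helper are assumptions made up for this example, not something taken from the documentation; the in-memory database just stands in for a real one.

```python
import sqlite3


def load_metadata(conn):
    """Read task-group and task rows from two hypothetical tables."""
    # Each row is a (name, seq) pair, so dict() builds {group: seq} directly.
    groups = dict(conn.execute("SELECT name, seq FROM task_group"))
    tasks = conn.execute(
        "SELECT name, seq, task_group FROM task ORDER BY name"
    ).fetchall()
    return groups, tasks


# In-memory database seeded with the rows from the question above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task_group (name TEXT PRIMARY KEY, seq INTEGER);
    CREATE TABLE task (name TEXT PRIMARY KEY, seq INTEGER, task_group TEXT);
    INSERT INTO task_group VALUES ('A', 1), ('B', 1), ('C', 2);
    INSERT INTO task VALUES
        ('a1', 1, 'A'), ('a2', 1, 'A'),
        ('b1', 1, 'B'), ('b2', 2, 'B'),
        ('c3', 1, 'C');
    """
)

groups, tasks = load_metadata(conn)
print(groups)
print(tasks)
```

The returned rows carry the same information a YAML file would, so whatever code builds the pipeline definition can stay unchanged and only the loading step differs.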

john eipe

05/16/2021, 5:03 PM
thank you. I will keep this thread updated on the progress.
👍 1

Megan Beckett

10/06/2022, 11:43 AM
Hi there! I am busy trawling through the comments to try to find out how to have a more generic pipeline that is static but can have lots of permutations at run time, based on some config, or alternatively how to create a pipeline automatically from some config. Our use case is ML/data science focused: we are building out a core product, but the implementation of each pipeline will need to vary slightly depending on the data source and the steps required. The above thread sounded promising, but it points to links that don't exist anymore. Do you have any updated resources about programmatically generating pipelines from a config file or some other specification?


10/06/2022, 12:28 PM