# ask-community
Yehuda Ornstein:
I'm working on a pipeline that should take a list of URLs from an environment variable (I have this as a resource) and, for each URL, run multiple steps: 1. fetch several data sets for that URL (the data sets to fetch are also an environment variable/resource), then 2. for each data set, push the data to Kafka (to a topic with the same name as the data set). The process for each URL/data set can and should run in parallel, and the entire pipeline should run on a schedule. The process looks something like the image below. What would be the best way to tackle this? Where should I be using assets/graphs/jobs/ops? Thanks for the insights!
jamie:
Hi @Yehuda Ornstein, you'll likely want to use ops/graphs/jobs for this, specifically dynamic graphs: https://docs.dagster.io/concepts/ops-jobs-graphs/dynamic-graphs
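A minimal sketch of that dynamic-graph approach, under some assumptions: the env var names (`PIPELINE_URLS`, `PIPELINE_DATASETS`), op/job names, and the cron expression are all hypothetical, and the fetch/Kafka-produce logic is stubbed out. Since Dagster's dynamic outputs can't be nested, one op fans out over the URL x data-set cross product, which gives you parallelism across both dimensions in a single `.map()`:

```python
import os
from dagster import DynamicOut, DynamicOutput, ScheduleDefinition, job, op


@op(out=DynamicOut())
def url_dataset_pairs():
    # Hypothetical env vars holding comma-separated lists; a Dagster
    # resource could supply these instead, per the original setup.
    urls = os.environ.get("PIPELINE_URLS", "").split(",")
    datasets = os.environ.get("PIPELINE_DATASETS", "").split(",")
    # Fan out over the cross product so each URL/data-set pair becomes
    # its own mapped op invocation.
    for i, url in enumerate(urls):
        for j, dataset in enumerate(datasets):
            yield DynamicOutput((url, dataset), mapping_key=f"{i}_{j}")


@op
def fetch_and_push(context, pair):
    url, dataset = pair
    # Placeholder for the real fetch + Kafka-produce logic; the topic
    # name mirrors the data-set name, as described in the question.
    context.log.info(f"would fetch {dataset} from {url} and push to topic {dataset}")


@job
def url_pipeline():
    # .map() creates one fetch_and_push step per dynamic output; the
    # default multiprocess executor runs the mapped steps in parallel.
    url_dataset_pairs().map(fetch_and_push)


# Run the whole job on a schedule (hourly here, as an example).
hourly_schedule = ScheduleDefinition(job=url_pipeline, cron_schedule="0 * * * *")
```

Fanning out over the cross product in one op is a design choice, not the only option: you could instead map over URLs only and loop over data sets inside each mapped op, trading some parallelism for fewer steps.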
Yehuda Ornstein:
Thanks @jamie. I'll take a look