dagster #ask-community

Matt Herling

04/28/2023, 1:30 PM

Hi all, I'm trying to build a workflow where I combine base layers in a tree-like fashion. I also want to see my results every day. So imagine that I have a job get_data which is multi-partitioned by a static partition (A1, A2, A3, B1, B2, etc.) and a dailyPartition. The next step in the graph would be a combine_data asset which would also be multi-partitioned by a static partition (A, B, ...) and a dailyPartition. The goal of the combine_data job is to combine A1 with A2 and A3, B1 with B2, etc. Is there currently support for that type of partition mapping? I've been having trouble using a MultiPartitionMapping with a StaticPartitionMapping (for the static partitions) and an IdentityPartitionMapping (for the dates).
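To make the requested dependency concrete, here is a minimal plain-Python sketch of the rollup described above. It does not use Dagster's partition-mapping API; the `ROLLUP` table and the `upstream_partitions` helper are hypothetical names introduced only for illustration. The static dimension fans in (A1, A2, A3 feed A) while the date dimension passes through unchanged.

```python
from typing import Dict, List, Tuple

# Hypothetical rollup table: each combined group lists the base layers it needs.
ROLLUP: Dict[str, List[str]] = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
}


def upstream_partitions(group: str, date: str) -> List[Tuple[str, str]]:
    """Return the (base layer, date) partitions that a (group, date) partition needs.

    The static dimension is expanded through ROLLUP; the date is kept as-is,
    which is the identity mapping on the daily dimension.
    """
    return [(layer, date) for layer in ROLLUP[group]]


def downstream_partition(layer: str, date: str) -> Tuple[str, str]:
    """Inverse direction: the combined partition that a base-layer partition feeds."""
    for group, layers in ROLLUP.items():
        if layer in layers:
            return (group, date)
    raise KeyError(f"{layer!r} is not listed in ROLLUP")
```

For example, `upstream_partitions("A", "2023-04-28")` lists the three A-layer partitions for that day, and `downstream_partition("B2", "2023-04-28")` points back at the B partition for the same day.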

chris

04/28/2023, 5:01 PM

So it sounds like there’s really almost a 3-dimensional partition here with rollup into two dimensions… do you actually care about A1, A2, … A_n or is that just used to get parallelism when processing? Because if so, I’m wondering if you could just use a dynamic graph-backed asset to achieve said parallelism, and avoid the need for the somewhat nasty partition mapping between two sets of static partitions (which I’m reasonably sure we don’t support)

Matt Herling

04/28/2023, 5:29 PM

Thanks for the reply Chris. I have all of these dependencies sketched out in a JSON config file. I do care about the individual data in A1, A2, etc. I've implemented a factory function which loads the config and uses the `ins` field to explicitly define the tree. I actually find this to be a lot cleaner than the way I was trying with partitions, since now my only partition is date and I can bundle other data (like tags) in the config, which is more human readable.
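The factory itself was not shared in the thread, so the following is only a rough sketch of what such a config-driven approach might look like. The config schema (`inputs`, `tags`), the names `load_tree` and `build_assets`, and the placeholder asset bodies are all assumptions, not Matt's actual code. It relies on Dagster's `asset`, `AssetIn`, and `DailyPartitionsDefinition`, with date as the only partition.

```python
import json

# Hypothetical config; the schema ("inputs", "tags") is an assumption.
CONFIG_JSON = """
{
  "A": {"inputs": ["A1", "A2", "A3"], "tags": {"team": "alpha"}},
  "B": {"inputs": ["B1", "B2"], "tags": {"team": "beta"}}
}
"""


def load_tree(raw):
    """Parse the config into {combined_name: [upstream_names]}."""
    return {name: list(entry["inputs"]) for name, entry in json.loads(raw).items()}


def build_assets(raw, start_date="2023-01-01"):
    """Build one leaf asset per input and one combined asset per config entry."""
    # Imported lazily so load_tree can be used without Dagster installed.
    from dagster import AssetIn, DailyPartitionsDefinition, asset

    daily = DailyPartitionsDefinition(start_date=start_date)

    def make_leaf(name):
        @asset(name=name, partitions_def=daily)
        def leaf(context):
            # Placeholder for the real data fetch.
            return {"source": name, "date": context.partition_key}

        return leaf

    def make_combined(name, inputs, tags):
        @asset(
            name=name,
            partitions_def=daily,
            # The tree is defined explicitly: one AssetIn per upstream asset.
            ins={upstream: AssetIn(key=upstream) for upstream in inputs},
            metadata=tags,
        )
        def combined(**upstream_values):
            # Placeholder combine step: collect the upstream outputs.
            return list(upstream_values.values())

        return combined

    assets = []
    for name, entry in json.loads(raw).items():
        assets.extend(make_leaf(leaf_name) for leaf_name in entry["inputs"])
        assets.append(make_combined(name, entry["inputs"], entry.get("tags", {})))
    return assets
```

Because every asset shares the same daily partitions definition, each combined asset reads the same-day partition of its inputs without any custom partition mapping, which matches the simplification described in the message above.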

chris

04/28/2023, 8:54 PM

makes sense - glad it’s working for you
