how do you specify a range of partitions for an upstream dependency (e.g., i want my op to process the last 30 days of data)?
claire
08/08/2022, 2:33 PM
Hi Chris. One immediate way I can think of is to use the standard date-partitioned configuration, and within your op, use context.op_config to get the current date. Then, within the op, you can do computation on the last 30 days based on the current date value.
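As a rough sketch of the date arithmetic described above: given the current date as an ISO string (for example, the value read from context.op_config), the helper below lists the 30 daily keys ending on that date. The function name `last_n_days` is hypothetical, and treating the current date as part of the window is an assumption; shift the range if the window should end the day before.

```python
from datetime import date, timedelta
from typing import List


def last_n_days(current: str, n: int = 30) -> List[str]:
    """Return n daily keys ending at `current` (inclusive), oldest first."""
    end = date.fromisoformat(current)
    # Offsets run from n-1 days back down to 0 (the current date itself).
    return [(end - timedelta(days=offset)).isoformat() for offset in range(n - 1, -1, -1)]


window = last_n_days("2022-08-08")
print(window[0], window[-1], len(window))  # 2022-07-10 2022-08-08 30
```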
Chris Hansen
08/08/2022, 2:36 PM
how do i block that op from running if the past 30 days of data do not exist?
claire
08/08/2022, 3:42 PM
You could use two ops, the first to check for the existence of the past 30 days, and the second to run the computation you want.
You could output an optional output from the first op if the last 30 days exist. If the optional output isn't yielded, the second op won't run.
Chris Hansen
08/08/2022, 3:53 PM
doesn’t that break lineage if i’m using ops to check for dependencies?
claire
08/08/2022, 7:39 PM
Hi Chris. What do you mean by breaking lineage?
Chris Hansen
08/08/2022, 8:20 PM
i guess i would still define the AssetIn() as the upstream thing i want 30 days of. does the AssetIn definition presume a specific partition?