
sri raghavan

03/14/2023, 7:53 PM
hi y'all! I'm trying to think through the conceptual modeling for a partitioned asset we're producing, and how to make sure we materialize assets for the appropriate partitions on the appropriate interval. here's what we've got:
• a multi-partitioned asset, by account_Id (tenant) and day
• per-tenant configuration that lives in our production DB (RDS), and determines the timezone for each tenant, and how long to wait after midnight in that timezone before the run for a given day can be started
We're currently thinking to use a single sensor, that checks frequently (in a relatively tight evaluation loop) whether or not each tenant is ready to 'advance' to the next day. I don't think Dagster has support for "partitioned cursors", though, which is the amount of state I imagine we'd need to track this. Ideas?

claire

03/16/2023, 10:40 PM
does each tenant run at the same time each day? If that's the case, you could set up a schedule per tenant that executes at the desired time. If not, you could create your own cursor object that stores a dict mapping tenant id to the latest day it was run. Then, you can do something like context.update_cursor(json.dumps(cursor_dict)) to update the cursor after evaluation.
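A minimal sketch of the dict cursor described above, using only the standard library. The helper names (load_cursor, next_day, advance) and the ISO date string format are assumptions for illustration, not part of Dagster; only context.cursor and context.update_cursor, referenced in the comments, are sensor context members.

```python
import json
from datetime import date, timedelta
from typing import Dict, Optional


def load_cursor(raw: Optional[str]) -> Dict[str, str]:
    """Decode the sensor cursor; an unset cursor means no tenant has run yet."""
    return json.loads(raw) if raw else {}


def next_day(cursor: Dict[str, str], tenant_id: str, start: date) -> date:
    """Return the next day to run for a tenant, or `start` if it never ran."""
    last = cursor.get(tenant_id)
    return date.fromisoformat(last) + timedelta(days=1) if last else start


def advance(cursor: Dict[str, str], tenant_id: str, day: date) -> Dict[str, str]:
    """Return a copy of the cursor with the tenant's latest run set to `day`."""
    updated = dict(cursor)
    updated[tenant_id] = day.isoformat()
    return updated


# Inside a sensor body, the round trip would look roughly like:
#   cursor_dict = load_cursor(context.cursor)
#   ... decide which tenants advance and request runs for them ...
#   context.update_cursor(json.dumps(cursor_dict))
cursor = load_cursor(None)
cursor = advance(cursor, "tenant_a", date(2023, 3, 14))
serialized = json.dumps(cursor)  # the string that would be stored as the cursor
```

Because the whole dict is serialized into one cursor string, each tenant advances independently even though the sensor itself has a single cursor.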

sri raghavan

03/17/2023, 2:02 AM
hi @claire - thanks for the follow-up. we ended up going with the cursor solution, because there's a secondary consideration - we have a system external to dagster that's ingesting data into redshift, and we need to check that "all the data for a given tenant is ingested" before we run the report for that tenant and day
(a fixed schedule doesn't allow for lag in that system, so a sensor with a dict cursor made more sense for us)
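The two conditions mentioned in this thread (the per-tenant wait after local midnight, and the external ingestion check) could be combined into one readiness test that the sensor evaluates per tenant. This is only a sketch under stated assumptions: is_ready, utc_offset_hours, and ingestion_complete are hypothetical names, the timezone is simplified to a fixed UTC offset (real per-tenant config would more likely store an IANA zone name), and the ingestion check is a stand-in callable for whatever query is run against Redshift.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Callable


def is_ready(
    day: date,
    utc_offset_hours: float,
    wait_after_midnight: timedelta,
    now_utc: datetime,
    ingestion_complete: Callable[[date], bool],
) -> bool:
    """Decide whether a tenant's run for `day` may start.

    `now_utc` must be timezone-aware. `ingestion_complete` is a hypothetical
    callable standing in for the external "all data is ingested" check.
    """
    tenant_tz = timezone(timedelta(hours=utc_offset_hours))
    # The midnight that ends `day` in the tenant's timezone.
    day_end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tenant_tz)
    if now_utc < day_end + wait_after_midnight:
        return False  # still inside the configured wait window
    # Only query the external system once the time condition holds.
    return ingestion_complete(day)


# Example: a tenant at UTC-5 that waits 2 hours after local midnight.
day = date(2023, 3, 14)
wait = timedelta(hours=2)
too_early = datetime(2023, 3, 15, 6, 59, tzinfo=timezone.utc)
on_time = datetime(2023, 3, 15, 7, 0, tzinfo=timezone.utc)
early_result = is_ready(day, -5, wait, too_early, lambda d: True)
lagging_result = is_ready(day, -5, wait, on_time, lambda d: False)
ready_result = is_ready(day, -5, wait, on_time, lambda d: True)
```

When the ingestion check returns False the tenant's cursor entry is simply left unchanged, so the same day is retried on the next sensor evaluation; that is how this approach absorbs lag that a fixed schedule cannot.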

claire

03/17/2023, 4:48 PM
I see. makes sense, glad the cursor is working for you