https://dagster.io/ logo
#ask-community
Title
# ask-community
n

Nikolaj Galak

08/18/2023, 6:17 PM
Hi! I'm creating time-partitioned asset B with non-partitioned parent A ,which pulls data from the source system. Asset B it is partitioned by month. Data for asset A is constantly updated, updates are captured via CDC. Assets A and B have lazy AutoMaterialize policy assigned and asset B has Freshness policy to update it twice a day. I'd like AMP to trigger materialization A -> B based on B's Freshness policy. If I understand it correctly, AMP will trigger materialization if latest partition of B is missing or A gets fresh data. For the "pull" approach for A, it means that AMP will trigger materialization once a month because B is partitioned by month. How to make B updated based on it's Freshness policy (twice a day)?
o

owen

08/18/2023, 11:05 PM
hi @Nikolaj Galak! In this case, is A running essentially continuously? or is A also intended to run just twice a day (pulling in whatever upstream changes it can at each point)
n

Nikolaj Galak

08/19/2023, 11:50 AM
@owen
A
should pull data twice a day. The motivation is to do as little as possible but still satisfy freshness policy for B
o

owen

08/21/2023, 8:26 PM
got it -- that makes sense. I think in this particular case, I'd recommend just going with the non-freshness based approach (AutoMaterializePolicy.eager()), which will cause the latest partition of B to update as soon as A updates
n

Nikolaj Galak

08/21/2023, 10:08 PM
@owen thank you for the input. The challenge is that A is used in multiple models, not only in B. So non-freshness based approach will result in unnecessary runs of heavy B. I'm looking for a way to control materialization rule for (monthly) partitioned assets, maybe override/extend current logic It is possible to just remove partitioning from B, but that will remove backfills as well..
o

owen

08/24/2023, 12:26 AM
got it -- we're actively working on features to extend the current eager AMP system to add more flexibility, ideally with the goal of deprecating the (fairly inflexible) freshness-based system. In this world, you could imagine specifying a policy something like:
Copy code
AutoMaterializePolicy.eager().with_rules(
    AutoMaterializeRule.skip_on_cron("<twice daily>")
)
where B would only be materialized once per cron schedule tick (just an example). In that world, what would be the most ergonomic way for you to define the constraint you have on B, in your opinion?
2 Views