Marvin Rösch
08/22/2023, 10:47 AM
We have two assets, users and subscriptions, that are backed by dbt models to pump data from one data store to another with some transformations applied. subscriptions depends on users in addition to some source assets. Both of them have a freshness policy that requires them to be materialized by 02:15am with a 2 hour max lag. However, neither asset gets an auto-materialization with reason "Required to meet this asset's freshness policy" triggered. users happens to get triggered due to a downstream freshness policy at some later point in time, but subscriptions becomes and remains overdue. The auto-materialize history is entirely empty for that asset.
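To make the policy described above concrete, here is a minimal sketch in plain Python (no Dagster imports) of what "materialized by 02:15am with a 2 hour max lag" asks for. It assumes the usual reading of a cron-based freshness policy: at each 02:15 tick, the asset must reflect upstream data from no earlier than 00:15. The helper names and the example date are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_LAG = timedelta(hours=2)  # the "2 hour max lag" from the message above


def required_data_time(deadline: datetime) -> datetime:
    """Oldest upstream data time that still satisfies the policy at the deadline."""
    return deadline - MAX_LAG


def is_overdue(deadline: datetime, data_time: Optional[datetime]) -> bool:
    """True if, at the deadline, the asset's data is missing or older than allowed."""
    return data_time is None or data_time < required_data_time(deadline)


# Hypothetical date; only the 02:15 time of day comes from the thread.
deadline = datetime(2023, 8, 22, 2, 15)

print(required_data_time(deadline))                       # 2023-08-22 00:15:00
print(is_overdue(deadline, datetime(2023, 8, 22, 1, 0)))  # False
print(is_overdue(deadline, datetime(2023, 8, 21, 2, 0)))  # True
```

Under this reading, an asset whose latest data dates from the previous night's run is overdue at 02:15 unless something materializes it between 00:15 and 02:15.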
The expected behaviour would be that both assets get materialized at roughly the same time, around 02:15am. We have some other assets that are also affected by this, but this is the simplest case I could find. Unfortunately, I have not been able to reproduce this locally.
Marvin Rösch
08/22/2023, 10:47 AM
users has several more source assets it depends on):
owen
08/23/2023, 10:36 PM
users depend on any observable source assets, or are they all just regular source assets?
Marvin Rösch
08/24/2023, 5:25 AM
users depends on definitely do not get any observation events from dbt tests, however
Marvin Rösch
08/30/2023, 7:23 AM
daniel
08/30/2023, 3:50 PM
Marvin Rösch
08/31/2023, 5:59 AM
daniel
08/31/2023, 1:01 PM
Marvin Rösch
09/05/2023, 8:19 AM
Marvin Rösch
09/12/2023, 10:43 AM
owen
09/12/2023, 5:27 PM
definitely agree that having more in-depth logging here would be useful, I'm looking into adding some freshness-specific information to those logs. are the evaluations you're showing there for the users asset or the subscriptions asset?
but to get to the heart of the issue, I think you're correct that a freshness-based solution is potentially overkill for your specific situation. In terms of that specific issue, this is something we are investigating, and is certainly something that we want implemented, but ideally in a way that does not conflict too heavily with the existing scheduling system. with that in mind, one pattern that we've recommended for similar situations is a combination of traditional schedules for your root assets ("run these dbt models at x time every morning"), and then eager policies for the downstreams. would this work for your specific use case?
Marvin Rösch
09/13/2023, 5:27 AM
subscriptions asset in that it has a dependency on another asset, but in this case that dependency got materialized just fine. It does not depend on any additional source assets, unlike subscriptions.
We are already considering creating schedules for the root assets given these issues, but have mainly avoided it due to the overhead of adding schedules for the dbt-generated assets.
owen
09/14/2023, 8:54 PM
Marvin Rösch
09/28/2023, 11:14 AM