#integration-dbt

Shangwei Wang

08/11/2023, 3:40 AM
👋 question about auto-materializing `dbt_metrics_default_calendar`, which comes from using dbt "metrics" (i.e. it is not something we manage or define ourselves). This model practically never changes (it is a date spine through 2030), and many of our models depend on it. If I configure auto-materialization with a freshness policy for the models that depend on this date spine, how do I configure them without wasting time and resources on materializing the date spine itself (since the data won't change until 2030)? Thanks!

rex

08/11/2023, 3:58 PM
@owen is this possible with AMPs?

owen

08/11/2023, 8:56 PM
hi @Shangwei Wang -- with freshness-based scheduling this is not currently possible (you'd need to create an observable source asset upstream of `dbt_metrics_default_calendar`, which it sounds like you would not be able to do as this is a built-in model). However, we're looking into expanding options on the eager auto materialize policy (which would not be impacted by a root asset like your calendar, which never updates) to enable more things that people currently do via freshness-based scheduling. For example, putting limits on how many materializations a given asset can have in a given hour. Mind saying a bit more about how you'd like your AMPs to function?

Nikolaj Galak

08/12/2023, 7:17 PM
Could "one-time" assets be solved by a policy like this?
AutoMaterializePolicy(
    on_missing=True,
    on_new_parent_data=False,
    for_freshness=False,
)

owen

08/14/2023, 8:27 PM
That would work for making sure that particular asset executes just a single time, but assets downstream of it would end up being unable to fire, as Dagster would assume that this asset's data was out of date (and therefore assets downstream of it would not be able to get up to date by consuming data from it)

Shangwei Wang

08/14/2023, 8:42 PM
thanks for the response. I just started integrating Dagster into our pipeline, so I'm still learning. One use case we are trying out is to have the downstream dbt models configured as `lazy` (including all the models along the way). One of the downstream models has a pretty tight freshness policy, say 10 minutes. So I'm curious whether I can save some build time by skipping the rebuild of this built-in model, for example by tricking the downstream models into seeing it as "always fresh"?

owen

08/15/2023, 4:54 PM
the way to do that "tricking" would be by setting up an observable source asset upstream of your never-rematerializing asset, but it sounds like you might not be able to add an upstream dependency (even a fake one) to this built-in model

Shangwei Wang

08/15/2023, 4:59 PM
thanks! just for my learning, is the idea that by using an "observable source asset" I can leverage the "data version" feature?

owen

08/15/2023, 10:12 PM
that's right yep! the observable source asset could emit a never-changing data version (you'd just need to run it a single time), which would let dagster know that dbt_metrics_default_calendar had consumed the latest available version