#integration-dbt

Dennis Gera

05/22/2023, 8:46 PM
Hi team! We are trying to create a dbt asset selection using `DbtManifestAssetSelection` that selects only the models that have fresh data. In the dbt CLI we do this with the `source_status` method (`source_status:fresher+`). However, this method requires a previous `sources.json` artifact for comparison. We want to know the recommended way of creating, storing, and then retrieving this `sources.json` file, given that generating it only when the Docker image is built is not a viable option (our data would be stale pretty quickly). We thought about generating the `sources.json` file, storing it in the pod's temporary storage, and comparing it to our prod `sources.json` file in S3. Is this possible, or is there a better way to do this? @owen @Gabriel Montañola
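The comparison described here could be sketched roughly as follows. This is only an illustration, assuming both copies of `sources.json` have already been loaded as dictionaries and that each entry under `results` carries a `unique_id` and an ISO 8601 `max_loaded_at` timestamp; `fresher_sources` and `load_artifact` are hypothetical helper names, not dbt or Dagster APIs.

```python
from __future__ import annotations

import json
from datetime import datetime
from pathlib import Path


def _parse_ts(value: str) -> datetime:
    # Normalise a trailing "Z" so datetime.fromisoformat accepts the timestamp.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def load_artifact(path: str) -> dict:
    # Read a sources.json file from local disk (e.g. after downloading it from S3).
    return json.loads(Path(path).read_text())


def fresher_sources(previous: dict, current: dict) -> list[str]:
    """Return unique_ids whose max_loaded_at advanced, plus sources not seen before."""
    prev = {r["unique_id"]: r.get("max_loaded_at") for r in previous.get("results", [])}
    fresher = []
    for result in current.get("results", []):
        loaded = result.get("max_loaded_at")
        if loaded is None:
            continue  # skip entries without a timestamp (assumed: failed checks)
        before = prev.get(result["unique_id"])
        if before is None or _parse_ts(loaded) > _parse_ts(before):
            fresher.append(result["unique_id"])
    return sorted(fresher)
```

In practice dbt performs this comparison itself when `source_status:fresher+` is used, so a helper like this is mainly useful for inspecting or logging which sources changed.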

owen

05/23/2023, 12:25 AM
hi @Dennis Gera! this is not a workflow that we currently have recommended patterns for, so at a high level I'd say that we don't have a definitive answer here, and would be interested in knowing how whatever solution you end up with works for you. However, I do think that using `DbtManifestAssetSelection` might not be the ideal way to go about this, as AssetSelections should generally be static once your code is deployed (i.e. they should not resolve differently based on anything other than code changes). Another potential way of going about this might be to take advantage of the (not-yet-released) `@dbt_assets` decorator. It'll go out in this week's release, but essentially it allows you to write whatever compute function you want for your dbt assets, rather than relying on the prebuilt function that we provide. In short, you could do something along the lines of:
@dbt_assets(manifest=my_manifest)
def my_dbt_assets(context: OpExecutionContext, dbt: DbtClient):
    # get an up-to-date view of which sources are fresh;
    # wait for the command to finish so sources.json is written before the run
    dbt.cli(["source", "freshness"]).wait()
    # now just execute the ones with this status
    yield from dbt.cli(["run", "--select", "source_status:fresher+"]).stream()
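One detail the snippet above leaves implicit: dbt's `source_status` selector compares the freshly generated `sources.json` against an earlier copy, which it locates through the `--state` flag. Below is a minimal sketch of building both argument lists, assuming the previous artifact (for example, the prod copy from S3) has already been downloaded to a local directory. `freshness_commands` and `state_dir` are hypothetical names, not part of dagster-dbt.

```python
from __future__ import annotations

from pathlib import Path


def freshness_commands(
    state_dir: str, selector: str = "source_status:fresher+"
) -> list[list[str]]:
    """Build dbt CLI args: refresh sources.json, then run only fresher models."""
    # The selector needs a previous sources.json to compare against.
    if not (Path(state_dir) / "sources.json").is_file():
        raise FileNotFoundError(f"no previous sources.json found in {state_dir}")
    return [
        ["source", "freshness"],
        ["run", "--select", selector, "--state", state_dir],
    ]
```

Each returned list could then be passed to `dbt.cli(...)` inside the asset function, in order.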

Dennis Gera

05/23/2023, 12:25 PM
Thanks @owen! I look forward to reading how the `dbt_assets` decorator works and trying it out on our project. I'll report back on whatever solution we come up with.