Dennis Gera

04/17/2023, 8:56 PM
Hey team, I want to use the `load_from_manifest` function to load my dbt assets. However, I don’t commit dbt’s target directory to my GitHub repo. How can I use S3 to load and update my manifest.json file through Dagster’s dbt runs?

Tim Castillo

04/17/2023, 9:15 PM
Hey! Assuming you can get the `manifest.json` into S3 somehow (e.g. via CI/CD): in your `__init__.py`, fetch the `manifest.json` from S3 and save it to a temp file on the Dagster instance's file system. The file is then accessible to your Dagster instance, so you can read it and pass it into `load_assets_from_dbt_manifest`. You wouldn't use a resource or op for this, because it needs to happen when definitions are loaded, not during a run. Let me write out some pseudocode really quickly

Dennis Gera

04/17/2023, 9:18 PM
And how do I get Dagster to overwrite that S3 file with the newly generated manifest.json after a dbt run?

Tim Castillo

04/17/2023, 9:22 PM
What's your use case where you'd like to update the `manifest.json` after runs? Usually, you'd only need to update the `manifest.json` when dbt models change. In that case, I've seen users have CD for dbt look like this:
• merge PR
• generate `manifest.json` and write it to S3
• tell Dagster the dbt project changed and have it refresh its definitions. This is sometimes done through the GraphQL API, by triggering the `reloadWorkspace` mutation.
  ◦ If you don't need super tight robustness, tbh you can just do it from the UI.
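The "generate `manifest.json` and write it to S3" step could be sketched roughly as below, to be run from CI after dbt has produced its artifacts. This is only an illustration: the helper name `publish_manifest` is hypothetical, the bucket and key are placeholders, and the client is expected to be something like `boto3.client("s3")`.

```python
from typing import Any


def publish_manifest(s3_client: Any, manifest_path: str, bucket: str, key: str) -> str:
    """Upload a freshly generated manifest.json and return its S3 URI.

    `s3_client` is assumed to behave like boto3.client("s3"); the helper
    itself is hypothetical and not part of Dagster or dbt.
    """
    # upload_file(Filename, Bucket, Key) overwrites any existing object at that key.
    s3_client.upload_file(manifest_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example call from a CI job (placeholder bucket and key):
#   import boto3
#   publish_manifest(boto3.client("s3"), "target/manifest.json",
#                    "dbt-artifacts", "manifest.json")
```

Passing the client in as an argument keeps the helper easy to exercise without AWS credentials.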
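```python
import json

import boto3
from dagster import Definitions
from dagster_dbt import load_assets_from_dbt_manifest

# Local path where the manifest is saved when definitions are loaded.
MANIFEST_PATH = "/tmp/manifest.json"

# Fetch the manifest from S3; the bucket and key here are placeholders.
s3 = boto3.client("s3")
s3.download_file("dbt-artifacts", "manifest.json", MANIFEST_PATH)

# Parse the manifest into a dict that Dagster can build assets from.
with open(MANIFEST_PATH) as f:
    manifest = json.load(f)

dbt_assets = load_assets_from_dbt_manifest(
    manifest,
    key_prefix="best_target",
)

defs = Definitions(
    assets=dbt_assets,
)
```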
Something like this. But after typing that out, you probably don't even need to save the file to disk; you could read it from S3 directly into memory instead.
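A minimal sketch of that in-memory variant, assuming the same placeholder bucket and key as above. The helper name `load_manifest_from_s3` is hypothetical, and the client is assumed to behave like `boto3.client("s3")`.

```python
import json
from typing import Any


def load_manifest_from_s3(s3_client: Any, bucket: str, key: str) -> dict:
    """Read a dbt manifest from S3 straight into memory, with no temp file.

    `s3_client` is assumed to behave like boto3.client("s3").
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # "Body" is a streaming object; read() returns the raw JSON bytes.
    return json.loads(response["Body"].read())


# Example use when definitions are loaded (placeholder bucket and key):
#   import boto3
#   manifest = load_manifest_from_s3(boto3.client("s3"), "dbt-artifacts", "manifest.json")
#   dbt_assets = load_assets_from_dbt_manifest(manifest, key_prefix="best_target")
```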

Dennis Gera

04/17/2023, 9:33 PM
Thanks Tim! Looking into this
Hey @Tim Castillo!
> tell Dagster the dbt project changed and have it refresh its definitions. sometimes done through the GraphQL API to trigger the `reloadWorkspace` mutation
Do you have any examples on how to do this?
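One way the `reloadWorkspace` call might look from a CD script is sketched below, using only the Python standard library. This is not the example Tim refers to drafting. The helper names are hypothetical, and the endpoint URL in the usage comment assumes a local Dagster webserver on its default port. The mutation selects only `__typename`, so the sketch makes no assumptions about the fields of the result type.

```python
import json
import urllib.request

# Mutation named in the thread; selecting only __typename keeps the query
# valid without assuming anything about the result type's fields.
RELOAD_WORKSPACE_MUTATION = "mutation { reloadWorkspace { __typename } }"


def build_reload_request(graphql_url: str) -> urllib.request.Request:
    """Build the POST request that asks Dagster to reload its workspace."""
    payload = json.dumps({"query": RELOAD_WORKSPACE_MUTATION}).encode("utf-8")
    return urllib.request.Request(
        graphql_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reload_workspace(graphql_url: str) -> dict:
    """Send the mutation and return the decoded JSON response."""
    with urllib.request.urlopen(build_reload_request(graphql_url)) as resp:
        return json.load(resp)


# Example call (assumed URL for a local deployment):
#   reload_workspace("http://localhost:3000/graphql")
```

Running this as the last CD step, after the new `manifest.json` is in S3, would have Dagster re-run the definition-loading code and pick up the updated manifest.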

Tim Castillo

04/28/2023, 3:19 PM
lol just saw your Q in #dagster-support and started drafting up an example

Dennis Gera

04/28/2023, 3:20 PM
awesome, thanks! sorry for posting in two places, wasn’t sure where would be most appropriate