Dennis Gera
04/17/2023, 8:56 PM
I'm using the load_from_manifest function to load my dbt assets. However, I don't commit dbt's target directory to my GitHub repo. How can I use S3 to load and update my manifest.json file through Dagster's dbt runs?
Tim Castillo
04/17/2023, 9:15 PM
Assuming you get the manifest.json into S3 somehow (e.g. CI/CD):
In your __init__.py, make a raw fetch from S3 to get the manifest.json and save it to a temp file in the Dagster instance's file system.
Then, you'd have the manifest.json accessible to your Dagster instance, so you can now read it into Dagster and pass it into load_assets_from_dbt_manifest.
You wouldn't use a resource or op for this because it has to happen when your definitions are loaded, not during a run.
Let me write out some pseudocode really quickly.
Dennis Gera
04/17/2023, 9:18 PM
Tim Castillo
04/17/2023, 9:22 PM
Do you need to update the manifest.json after runs?
Usually, you'd only need to update the manifest.json whenever dbt models change. In which case, I've seen users have CD for dbt look like this:
• merge PR
• generate manifest.json and write to S3
• tell Dagster the dbt project changed and have it refresh its definitions. Sometimes done through the GraphQL API to trigger the reloadWorkspace mutation.
◦ If you don't need super tight robustness, tbh you can just do it from the UI.
from dagster import Definitions
from dagster_dbt import load_assets_from_dbt_manifest
import json
import boto3

MANIFEST_PATH = '/tmp/manifest.json'

# Download the manifest from S3 to a local temp file when definitions load
s3 = boto3.client('s3')
s3.download_file('dbt-artifacts', 'manifest.json', MANIFEST_PATH)

# Read the manifest into a dict
with open(MANIFEST_PATH) as f:
    manifest = json.load(f)

# Build the Dagster assets from the dbt manifest
dbt_assets = load_assets_from_dbt_manifest(
    manifest,
    key_prefix="best_target",
)

defs = Definitions(
    assets=dbt_assets,
)
Something like this. But after typing that out, you probably don't even need to download the file from S3 completely and can instead just grab it directly into memory.
Dennis Gera
04/17/2023, 9:33 PM
"tell Dagster the dbt project changed and have it refresh its definitions. sometimes done through the GraphQL API to trigger the reloadWorkspace mutation"
Do you have any examples on how to do this?
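For the reloadWorkspace question, here is a minimal sketch of what a CD step could run after uploading the new manifest. It is not taken from the thread: the function names and the http://localhost:3000/graphql URL are assumptions, so point it at wherever your Dagster webserver serves its GraphQL endpoint. It only selects __typename, so it makes no assumption about the fields of the mutation result.

```python
import json
import urllib.request

# Mutation named in the thread; only __typename is requested from the result.
RELOAD_WORKSPACE_MUTATION = "mutation { reloadWorkspace { __typename } }"


def build_reload_request(graphql_url: str) -> urllib.request.Request:
    """Build the HTTP POST that carries the reloadWorkspace mutation."""
    payload = json.dumps({"query": RELOAD_WORKSPACE_MUTATION}).encode("utf-8")
    return urllib.request.Request(
        graphql_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reload_workspace(graphql_url: str) -> str:
    """Send the mutation and return the GraphQL type name of the result."""
    request = build_reload_request(graphql_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["data"]["reloadWorkspace"]["__typename"]


# Hypothetical usage from a CD job, assuming a local webserver:
# print(reload_workspace("http://localhost:3000/graphql"))
```

If the returned type name is an error type rather than a workspace, the reload did not go through and the CD job should probably fail loudly.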
Tim Castillo
04/28/2023, 3:19 PM
Dennis Gera
04/28/2023, 3:20 PM
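Going back to Tim's remark about grabbing the manifest directly into memory, a rough sketch of that variant could look like the following. The helper name load_manifest_from_s3 is hypothetical, and the bucket and key in the usage comment are just the ones from the snippet earlier in the thread. The client is passed in so the helper does not depend on how credentials are set up.

```python
import json


def load_manifest_from_s3(s3_client, bucket: str, key: str) -> dict:
    """Read a dbt manifest.json from S3 into a dict without a temp file.

    s3_client is expected to behave like boto3.client("s3"): get_object
    returns a dict whose "Body" is a readable byte stream.
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


# Hypothetical usage in __init__.py, replacing the download_file + open steps:
# import boto3
# manifest = load_manifest_from_s3(
#     boto3.client("s3"), "dbt-artifacts", "manifest.json"
# )
```

The resulting dict can be passed to load_assets_from_dbt_manifest the same way as in the temp-file version.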