Hi all, I have a question regarding `dagster_dbt`...
# ask-community
d
Hi all, I have a question regarding `dagster_dbt`. We use `load_assets_from_dbt_project()` and have all our dbt assets in Dagster. We can materialize (dbt) assets and everything works great! However, we use `K8sRunLauncher`, so asset materialization runs in a separate (ephemeral) pod. Each dbt asset materialization generates a `manifest.json`, saved locally on the pod. We want to grab this file and store it in our S3 bucket, but since the pod is ephemeral it’s killed before we can retrieve the file. What’s the best way to solve this?
/sub @Pieter Custers
p
There is also a #dagster-dbt channel
s
cc @owen
o
hi @Dane Linssen! just taking a step back -- depending on your specific setup, it might make sense to actually generate the `manifest.json` file before deploying your code (i.e. have a dbt compile step as part of your Dockerfile, then use `load_assets_from_dbt_manifest` instead of `load_assets_from_dbt_project`). With basic usage of dbt, it's unlikely for the contents of the manifest to change in meaningful ways between runs as long as the project doesn't change. `load_assets_from_dbt_manifest` is also significantly faster. if the `manifest.json` file is part of the image, you could then set up a separate process (could even be another asset) that reads that manifest file (which will be local to the Docker image) and persists it to your s3 bucket. if this workflow doesn't work for you, let me know! would be happy to talk about other alternatives (although they'd likely require a bit of hacking)
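A minimal sketch of the "reads that manifest file and persists it to your s3 bucket" step, not a definitive implementation. The function name, path, bucket, and key are hypothetical; `boto3` is assumed to be installed with AWS credentials configured, and the client is injectable so the function can be exercised without AWS access. Wiring this into a Dagster asset or op is left out.

```python
import json
from pathlib import Path


def persist_manifest(manifest_path, bucket, key, s3_client=None):
    """Upload a dbt manifest.json to S3 and return its S3 URI."""
    if s3_client is None:
        # Assumption: boto3 is installed and AWS credentials are configured.
        import boto3

        s3_client = boto3.client("s3")

    body = Path(manifest_path).read_bytes()
    # Parse first so a truncated or invalid manifest is never uploaded.
    json.loads(body)

    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return f"s3://{bucket}/{key}"
```

Because the manifest is part of the image in this workflow, the function can run in any pod started from that image, including the ephemeral run pods.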
p
Hey @owen thanks for your response, that helped a lot! (@Dane Linssen is on a holiday, so I’m answering in his absence.)
> With basic usage of dbt, it’s unlikely for the contents of the manifest to change in meaningful ways between runs as long as the project doesn’t change.

This is an important one, we didn’t realize it before. But generating the `manifest.json` at or before deploy time is not desirable and does not fit well in our CI/CD setup. So what we eventually did was call `load_assets_from_dbt_project` and right after that upload the `manifest.json` to S3 once, so basically at Dagster repo load time. Works well 🙂
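A rough sketch of what such a load-time upload could look like, under stated assumptions: the manifest is taken from the dbt project's `target/` directory (dbt's default output location), `boto3` is installed, and the function name, bucket, and key are hypothetical. The upload is best-effort here so that a failed upload does not break repository loading; that is a design choice of this sketch, not something described in the thread.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def upload_manifest_at_load(project_dir, bucket, key, s3_client=None):
    """Best-effort upload of <project_dir>/target/manifest.json.

    Returns True if the upload call succeeded, False otherwise.
    """
    manifest_path = Path(project_dir) / "target" / "manifest.json"
    try:
        if s3_client is None:
            # Assumption: boto3 is installed and AWS credentials are configured.
            import boto3

            s3_client = boto3.client("s3")
        s3_client.upload_file(str(manifest_path), bucket, key)
        return True
    except Exception:
        # Log and continue: a failed upload should not stop the repo from loading.
        logger.warning("Could not upload %s", manifest_path, exc_info=True)
        return False


# Hypothetical usage in the repository module, after
# load_assets_from_dbt_project(...) has been called:
# upload_manifest_at_load(DBT_PROJECT_DIR, "my-bucket", "dbt/manifest.json")
```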
d
Thanks @owen! That's really insightful. Appreciate you taking the time to explain it to us! In accordance with what @Pieter Custers said I'm sure this will work for us. Happy Holidays!