Hey I'm working with the new `@dbt_assets` syntax ... (dagster #integration-dbt)

Marek Vigaš

07/26/2023, 4:14 PM

Hey, I'm working with the new `@dbt_assets` syntax and I have to provide a path to the manifest. Before migrating to the new syntax I was using `load_assets_from_dbt_project`, and `manifest.json` was in `.gitignore` and not committed to the repo. With the new syntax I have to commit the manifest, and after a few days of using it like this it is not ideal: the manifest always causes merge conflicts. Is there any plan to add functionality similar to `load_assets_from_dbt_project` to the new `@dbt_assets` decorator? I'm thinking of reimplementing `_load_manifest_for_project` and running it before `@dbt_assets`. I only have about 70 models. Is there any obvious disadvantage to this approach?


Duke

07/26/2023, 4:19 PM

Following. But another way is to save the `manifest.json` file in a different location (S3 etc.).

Marek Vigaš

07/26/2023, 4:24 PM

Good tip. I like the simplicity of `load_assets_from_dbt_project`: it runs `dbt ls` on each deploy and creates the manifest for itself, without a dependency on any external service.

rex

07/26/2023, 6:21 PM

With `dagster-dbt project scaffold`, we emulate the behavior of `load_assets_from_dbt_project` by creating the manifest at runtime. You can see the implementation here.
• We use the presence of an env var, `DAGSTER_DBT_PARSE_PROJECT_ON_LOAD`, to determine whether to build a manifest.
• This allows for development workflows where you run `DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 dagster dev` locally, so that you can work while simultaneously updating your dbt and Dagster code. In production, you can produce the manifest in your build system when you deploy your code. You don’t need to commit it to your repository.
• We recommend using a precompiled manifest because it is more efficient. Otherwise, every time the code location loads, or whenever a Dagster job is kicked off to materialize your assets, you have to compile your manifest. Depending on the scale of your project, this could be an expensive operation.
• By compiling it in the build system, you do this compilation only once for your deployed code.
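The env-var switch described above could be sketched roughly as follows. This is a standard-library approximation, not the scaffold's actual implementation: the helper name `resolve_manifest_path`, the injectable `run` parameter, and the use of `dbt parse --project-dir` are assumptions made for illustration, and it assumes a dbt version in which `dbt parse` writes `target/manifest.json`.

```python
import os
import subprocess
from pathlib import Path
from typing import Callable, Mapping

# Env var name taken from the message above.
PARSE_ENV_VAR = "DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"


def resolve_manifest_path(
    project_dir: Path,
    env: Mapping[str, str] = os.environ,
    run: Callable[..., object] = subprocess.run,
) -> Path:
    """Return the manifest path, rebuilding the manifest first if the env var is set.

    Hypothetical helper: `run` is injectable so the dbt call can be stubbed out.
    """
    manifest_path = project_dir / "target" / "manifest.json"
    # Any non-empty value counts as "set" here, including "0".
    if env.get(PARSE_ENV_VAR):
        # Development: regenerate the manifest every time this code is loaded.
        run(["dbt", "parse", "--project-dir", str(project_dir)], check=True)
    # Production: the build system is expected to have produced the file already.
    return manifest_path
```

The returned path would then be what gets handed to `@dbt_assets` as the manifest, so the file itself never needs to be committed.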

Nicolas Parot Alvarez

07/27/2023, 10:14 AM

We're not yet using the new DBT API, but we've been getting much better job loading times and fewer loading errors by using `load_assets_from_dbt_manifest` compared to `load_assets_from_dbt_project`.
• Our DBT project is an independent Git project. Its `target` directory, which receives the manifest, is git-ignored.
• It is included in our Dagster Git project as a Git submodule.
• If DBT models are updated, we create a new release of the DBT project and of the Dagster project with its updated submodule.
• Our Dagster project deployment/installation script runs `dbt ls` to generate the manifest in the `target` folder of the submodule, which remains ignored.
The only difficulty was finding the correct git commands to update the git submodules, `git submodule sync && git submodule update --init --force --recursive`, but since that finding (based on heavy archaeological and experimental work) it has been working perfectly.
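A deployment step along the lines described above might look like the sketch below. The function names, the `run` parameter, and the `--project-dir` argument are illustrative assumptions; the git commands are the ones quoted in the message, and a real script may also need to pass a profiles directory to dbt.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def deploy_commands(dbt_submodule_dir: Path) -> List[List[str]]:
    """Commands to refresh the dbt submodule and then generate its manifest."""
    return [
        ["git", "submodule", "sync"],
        ["git", "submodule", "update", "--init", "--force", "--recursive"],
        # `dbt ls` writes target/manifest.json as a side effect.
        ["dbt", "ls", "--project-dir", str(dbt_submodule_dir)],
    ]


def prepare_manifest(
    dbt_submodule_dir: Path,
    run: Callable[..., object] = subprocess.run,
) -> Path:
    """Run the commands in order (from the Dagster repo root) and return the manifest path."""
    for cmd in deploy_commands(dbt_submodule_dir):
        run(cmd, check=True)  # stop at the first failing command
    return dbt_submodule_dir / "target" / "manifest.json"
```

The returned path is the file that `load_assets_from_dbt_manifest` would read once it has been loaded as JSON.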

