@here Dagster-dbt users - would you mind helping us out with a quick poll (Dagster #integration-dbt)

sandy

06/20/2023, 8:30 PM

@here Dagster-dbt users - would you mind helping us out with a quick poll: Is your dbt project in the same git repository as the Dagster project that loads it? Respond with:

1️⃣ - my dbt project and Dagster project are in the same git repo.

2️⃣ - my dbt project and Dagster project are in separate git repos.

We're trying to put together some utilities and guides for Dagster-dbt best practices. If you're up for replying with your rationale on why 1️⃣ or 2️⃣, that would be even more helpful. Thank you!

1️⃣ 26

2️⃣ 10

Adam Bloom

06/20/2023, 8:32 PM

we have all of our data pipeline related code in a mono-repo. We don't have a large enough team to require separate permissions boundaries between orchestration, transform logic (dbt), and reporting/visualizations. We benefit massively in terms of discoverability by keeping everything colocated.

👍 4

👀 1

Binoy Shah

06/20/2023, 8:32 PM

We have adopted a mono-repo architecture, and all our workflows (a.k.a. asset modules) are cohesive enough to live together yet distinct enough to stay discernible. We have also found a decent design pattern for enabling or disabling individual workflow modules (assets + jobs + sensors + ops) in our mono-repo: we conditionally add them to global arrays of assets, jobs, sensors, ops, etc., which finally get added to the Definitions object. This gives us good control over feature releases, with each feature being a new module added to the mono-repo.

👀 1

Binoy Shah

06/20/2023, 8:44 PM

Just an example of module-level setup… == conditional enabling of workflows

__init__.py
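
For readers without access to the attachment, here is a minimal sketch of what conditional registration of workflow modules could look like. It is not the file shared above. The `WorkflowModule` class, the `ENABLE_<NAME>` environment-variable convention, and the module names are all hypothetical, and plain strings stand in for real Dagster definitions so the snippet runs without Dagster installed.

```python
import os
from dataclasses import dataclass, field


@dataclass
class WorkflowModule:
    """Hypothetical bundle of the definitions that make up one workflow."""
    name: str
    assets: list = field(default_factory=list)
    jobs: list = field(default_factory=list)
    sensors: list = field(default_factory=list)


def is_enabled(module_name, env=None):
    """Assumed flag convention: ENABLE_<MODULE>=1 switches a module on."""
    env = os.environ if env is None else env
    return env.get(f"ENABLE_{module_name.upper()}", "0") == "1"


def collect(modules, env=None):
    """Merge the definitions of every enabled module into global lists."""
    all_assets, all_jobs, all_sensors = [], [], []
    for module in modules:
        if not is_enabled(module.name, env):
            continue  # disabled modules contribute nothing
        all_assets.extend(module.assets)
        all_jobs.extend(module.jobs)
        all_sensors.extend(module.sensors)
    return {"assets": all_assets, "jobs": all_jobs, "sensors": all_sensors}


# Placeholder strings stand in for real asset/job/sensor objects.
MODULES = [
    WorkflowModule("ingest", assets=["raw_orders"], jobs=["ingest_job"]),
    WorkflowModule("reporting", assets=["daily_report"], sensors=["report_sensor"]),
]

# Only the "ingest" module is switched on in this example environment.
definitions_kwargs = collect(MODULES, env={"ENABLE_INGEST": "1"})

# In a real project the lists would hold Dagster objects and be handed over:
# defs = dagster.Definitions(**definitions_kwargs)
```

Releasing a feature then amounts to adding its module to the list and flipping its flag in the target environment.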

Riccardo Tesselli

06/20/2023, 8:46 PM

different repos for historical reasons. First came the repo for the dbt project, which grew in complexity; then we adopted Dagster and placed it in another repo for consistency with company policies and because the two have a different lifecycle and visibility (data analysts vs data engineers)

👀 1

Brendan Jackson

06/20/2023, 9:27 PM

Same repo - there's no difference in the release cycle for the two, nor any other good justification for splitting the two out.

👀 1

Rob Sicurelli

06/20/2023, 10:41 PM

we have one repo for our data warehouse (stored in BQ), where all assets in the repo are BQ tables. we ingest from a variety of sources, and allow folks to ingest upstream tables, or process downstream with either Dagster software-defined assets or dbt models. that lets us pick the right tool for the job and stitch everything together in a consistent way (Dagster partitions, asset dependencies)

👀 1

Vinnie

06/21/2023, 7:28 AM

Also running a mono-repo with separate projects. The dbt portion has its own code location to keep Docker images a little smaller, but everything lives in the same git repo.

👀 1

geoHeil

06/21/2023, 11:45 AM

We have different code locations per team. If a team uses dbt in their code location, the dbt project lives directly alongside their Dagster code

👀 1
