# integration-dbt
s
@here Dagster-dbt users - would you mind helping us out with a quick poll: Is your dbt project in the same git repository as the Dagster project that loads it? Respond with:
1️⃣ - my dbt project and Dagster project are in the same git repo.
2️⃣ - my dbt project and Dagster project are in separate git repos.
We're trying to put together some utilities and guides for Dagster-dbt best practices. If you're up for replying with your rationale on why 1️⃣ or 2️⃣, that would be even more helpful. Thank you!
1️⃣ 26
2️⃣ 10
a
we have all of our data pipeline related code in a mono-repo. We don't have a large enough team to require separate permissions boundaries between orchestration, transform logic (dbt), and reporting/visualizations. We benefit massively in terms of discoverability by keeping everything colocated.
👍 4
👀 1
b
We have adopted a mono-repo architecture. All our workflows (a.k.a. asset modules) are cohesive enough to live together, yet distinct enough to stay discernible. We have also found a decent design pattern for enabling/disabling individual workflow modules (assets + jobs + sensors + ops) in our mono-repo: we conditionally add them to global arrays of assets/jobs/sensors/ops etc., which finally get added to the Definitions object. This gives us good control over feature releases, each feature being a new module added to the mono-repo.
👀 1
Just an example of module-level setup… == conditional enabling of workflows
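For readers who want to picture the pattern, here is a minimal sketch, not the poster's actual code. It assumes each workflow module can be described as a bundle of assets, jobs and sensors, and that modules are switched on or off through environment variables. `WorkflowModule`, `is_enabled`, `collect_enabled`, `build_definitions` and the `ENABLE_<NAME>` variable convention are all hypothetical names made up for illustration. Bare ops are left out because `Definitions` takes jobs rather than ops directly.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Mapping, Sequence


@dataclass
class WorkflowModule:
    """Hypothetical bundle of the Dagster objects that make up one workflow."""

    name: str
    assets: list = field(default_factory=list)
    jobs: list = field(default_factory=list)
    sensors: list = field(default_factory=list)


def is_enabled(module: WorkflowModule, env: Mapping[str, str]) -> bool:
    # Assumed convention: ENABLE_<NAME>=0 turns a module off; default is on.
    return env.get(f"ENABLE_{module.name.upper()}", "1") != "0"


def collect_enabled(
    modules: Sequence[WorkflowModule], env: Mapping[str, str]
) -> dict:
    """Gather the objects of every enabled module into global lists."""
    collected: dict = {"assets": [], "jobs": [], "sensors": []}
    for module in modules:
        if not is_enabled(module, env):
            continue  # a disabled module contributes nothing
        collected["assets"].extend(module.assets)
        collected["jobs"].extend(module.jobs)
        collected["sensors"].extend(module.sensors)
    return collected


def build_definitions(
    modules: Sequence[WorkflowModule], env: Mapping[str, str] = os.environ
) -> Any:
    # Imported lazily so the collection logic above can be run and tested
    # without Dagster installed.
    from dagster import Definitions

    return Definitions(**collect_enabled(modules, env))
```

In a code location's entry module, something like `defs = build_definitions([ingest_module, reporting_module])` would then expose only the enabled workflows, so releasing a feature amounts to adding a module and flipping its flag.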
r
different repos for historical reasons. First came the repo for the dbt project, which grew in complexity. Then we adopted Dagster and placed it in another repo, for consistency with company policies and because the two have different lifecycles and visibility (data analysts vs data engineers)
👀 1
b
Same repo - there's no difference in the release cycle for the two, nor any other good justification for splitting the two out.
👀 1
r
we have one repo for our data warehouse (stored in BQ), where all assets in the repo are BQ tables. we ingest from a variety of sources, and allow folks to ingest upstream tables, or process downstream with either Dagster software-defined assets or dbt models. this lets us pick the right tool for the job and stitch everything together in a consistent way (Dagster partitions, asset dependencies)
👀 1
v
Also running a mono-repo with separate projects. The dbt portion has its own code location to keep Docker images a little smaller, but everything lives in the same git repo.
👀 1
g
We have different code locations per team. If a team uses dbt in their code location, the dbt project lives right alongside their Dagster code
👀 1