dagster #announcements

Deveshi

03/04/2021, 2:14 PM

Hi All, I recently built a POC for my company using Dagster and got a thumbs up to extend it to cover more data flows. The POC was a bit rushed, so I didn't organise the code very well. Are there best-practice guidelines for greenfield projects? I have questions about:
• preferred deployment methods (VM, Kubernetes, etc.)
• support for GitOps-style deployment
• preferred environments (Linux, Windows, although I assume Linux 🙂)
• best practices for storing SQL, as most of our data transformations are done in SQL (Redshift)
• preference for specifying config: YAML vs. the Python API

Thanks in advance!

👍 3

cat

03/04/2021, 8:26 PM

Hi Deveshi 👋 our recommended deployment method is kubernetes with our “out-of-the-box” helm chart (however, many dagster users do deploy on ec2 / vm directly — so that is definitely possible)

cat

03/04/2021, 8:27 PM

“support for gitOps style deployment” <- not totally sure what this means, but you can deploy all dagster components via the helm chart in ci/cd

cat

03/04/2021, 8:27 PM

linux! but we do run tests on windows too

cat

03/04/2021, 8:31 PM

“preference for specifying config: yaml Vs python API” <- yaml is often used for the dagster instance and workspace. internally, we use python apis for everything else to make it easier to compose/modify/read/test configs (although this is a bit of personal/team preference)
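To illustrate the "compose/modify/read/test" point about Python configs, here is a minimal sketch in plain Python (no Dagster import). The `merge_config` helper, the solid name `load_orders`, the host names, and the config keys are all hypothetical; only the general `resources` / `solids` nesting follows the run config layout Dagster used at the time.

```python
from copy import deepcopy


def merge_config(base, override):
    """Return a new dict with `override` recursively merged into `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge one level deeper.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the base value.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base run config shared by every environment.
BASE_RUN_CONFIG = {
    "resources": {
        "redshift": {
            "config": {"host": "localhost", "port": 5439, "database": "dev"}
        }
    },
    "solids": {"load_orders": {"config": {"schema": "staging"}}},
}

# Hypothetical production overrides: only the values that differ.
PROD_OVERRIDES = {
    "resources": {
        "redshift": {
            "config": {"host": "redshift.example.internal", "database": "analytics"}
        }
    }
}

prod_run_config = merge_config(BASE_RUN_CONFIG, PROD_OVERRIDES)
```

Because the config is ordinary Python data, it can be unit-tested and reused across environments, which is harder to do with several near-duplicate YAML files.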

cat

03/04/2021, 8:36 PM

“best practices to store SQL” <-- i’m not as sure about this one. i’ve seen this done either in-line in the solid or in-line in a method of the redshift resource. many users use dbt to version their sql
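One further option, not mentioned in the thread and offered only as a sketch, is to keep each query in its own `.sql` file under version control and load it by name from the solid or resource. The `load_sql` helper, the file name `daily_orders.sql`, and the schema names below are hypothetical; the temporary directory just stands in for a `sql/` folder in the repository.

```python
import tempfile
from pathlib import Path
from string import Template


def load_sql(sql_dir, name, **params):
    """Read `<sql_dir>/<name>.sql` and fill in `$placeholder` identifiers.

    Intended only for trusted identifiers such as schema names. Query
    values should go through the database driver's parameter binding.
    """
    text = (Path(sql_dir) / f"{name}.sql").read_text(encoding="utf-8")
    # substitute() raises KeyError if a placeholder is left unfilled.
    return Template(text).substitute(**params)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a versioned file such as sql/daily_orders.sql.
    Path(tmp, "daily_orders.sql").write_text(
        "INSERT INTO $target_schema.daily_orders\n"
        "SELECT order_date, COUNT(*) AS order_count\n"
        "FROM $source_schema.orders\n"
        "GROUP BY order_date;\n",
        encoding="utf-8",
    )
    query = load_sql(
        tmp, "daily_orders", target_schema="analytics", source_schema="staging"
    )
```

The returned string can then be passed to whatever method the Redshift resource exposes for running a query, so the SQL gets reviewed and diffed like any other source file.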

Deveshi

03/04/2021, 8:38 PM

Thanks @cat 🙂 This is very helpful
