# integration-airflow
f
In a greenfield environment where I only need something that can schedule non-parameterized interdependent tasks, provides a GUI, and bonus points for being easy to set up and operate on K8s: Airflow or Dagster? I'm leaning towards Airflow because it's more battle-tested and has a bigger ecosystem. How do the two compare in this use case? Also, Airflow requires a DB. And Dagster only needs a FS, which is an advantage IMO. Can Dagster be configured to use any S3-compatible object store, not just AWS S3?
s
For greenfield environments, we do not recommend using our Airflow integration. That library exists primarily to enable migrations.
In the greenfield case, it just adds another layer that complicates deployment and operations.
We recommend just deploying to Kubernetes.
f
Can Dagster be used stand-alone instead of Airflow?
s
yes
f
That's what I'm considering
s
Great! Dagster does require a DB, but the default one is just on-disk SQLite.
If you want to run it distributed on a cluster, you will need a database accessible from all nodes in that cluster.
f
Alright, I see. Makes sense. Is there anything I could end up missing if I go with Dagster instead of Airflow for a purely parameterless task-scheduling use case?
s
I'm not sure of all the quirks of different S3-compatible implementations. Our S3 intermediate store is built on top of boto3.
I think the biggest missing thing would be any Airflow operators you depend on.
but curious to see what you think!
s
Yes, Dagster can be configured to run with any S3-compatible store. We have a few users using MinIO, etc.
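
Since the S3 store sits on top of boto3, pointing it at an S3-compatible service generally comes down to overriding the endpoint boto3 talks to. Here is a minimal sketch of that idea at the boto3 level, assuming a MinIO server at `http://localhost:9000` with placeholder credentials. The helper names `s3_client_kwargs` and `make_s3_client` are hypothetical and not part of Dagster, and how Dagster passes these settings through is not shown here.

```python
from typing import Any, Dict


def s3_client_kwargs(
    endpoint_url: str,
    access_key: str,
    secret_key: str,
    region_name: str = "us-east-1",
) -> Dict[str, Any]:
    """Build keyword arguments for boto3.client("s3", ...) aimed at a non-AWS endpoint."""
    return {
        # endpoint_url is what redirects boto3 away from AWS to the compatible store.
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region_name,
    }


def make_s3_client(**kwargs: Any):
    """Create the client; boto3 is imported lazily so the helper above has no dependencies."""
    import boto3

    return boto3.client("s3", **kwargs)


# Assumed values for a local MinIO instance; replace with your own.
MINIO_KWARGS = s3_client_kwargs("http://localhost:9000", "minioadmin", "minioadmin")
```

With boto3 installed and the server running, `make_s3_client(**MINIO_KWARGS).list_buckets()` is a quick way to check that the endpoint and credentials work before wiring anything into a pipeline.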