#ask-community

Danylo Korostil

02/08/2024, 12:37 PM
Hi everyone! I'm fairly new to Dagster and I need some advice. I'm building an ETL orchestrator tool and don't know which abstractions fit best. What we have:
• Templates of DataSources (SQL, S3, etc.) - resources?
• DataSources (connection string) - io_manager?
• Dataset (table, file) - asset?
• Transformation (SQL, dataframe, etc.) - op?
Basic task: get a file from S3, transform it, and save it to SQL. DataSources and transformations should be parametrized. Maybe you know good examples for a similar use case. Thanks in advance!

Zach

02/08/2024, 3:15 PM
Just a clarification: the transformation happens inside the asset. An asset represents a dataset together with the transformation required to create that dataset from its dependencies (other datasets). Here's a pretty cool example a user from the community put together using mostly SQL-based transformations

Danylo Korostil

02/08/2024, 3:26 PM
Thanks, looks pretty similar.