# random
b
Hello! Adopting Dagster has allowed PUDL to distribute more helpful data tables. This has encouraged us to revisit our data modeling and naming conventions. We have some ideas on how to update our table naming conventions to accommodate our adoption of Dagster and all of our new tables, but we’d love some additional input. If you have experience with data warehouses and are interested in supporting an open source data engineering project, I’d love to talk to you! Thanks!
❤️ 4
daggy love 2
a
Hi Bennet. I stumbled across the PUDL project as I was searching for projects that were implementing dagster - nice to see your post on here! I notice that PUDL implements its own CLI as opposed to making use of the Dagster one - what was the motivation behind this? I’m curious as I’m running into some limitations with the dagster CLI and wondering if these are actual CLI limitations or just limitations in my current understanding of Dagster.
b
Hi Ali! Prior to adopting Dagster, our primary method for running our ETL was a CLI tool we implemented. We kept it around after adopting Dagster to support some backwards compatibility. We’ll probably move towards removing our CLI command and using the Dagster CLI eventually. We mostly need to feed a CLI command a YAML file with Resource configurations.
a
Hi Bennet, thanks for the context and further info - really useful. I’ve been building from the ground up with Dagster and aiming to execute various jobs from the CLI. I have the full job, which materialises all assets, as well as other jobs that materialise subsets of assets. My plan was to have one config with all my parameters and provide this to each call to
dagster job execute ...
However, when executing a job and providing a config file, if the config defines op parameters that aren’t required for the specified job (e.g. when executing a subset), then I get an error due to the “invalid” extra params. So it seems to me that I’d have to define many separate config YAML files even if the actual parameters remain the same. I’m trying to find a way around this at the moment.
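One possible workaround, as a minimal sketch rather than anything Dagster provides: keep a single shared config and filter it down to the ops a given job actually uses before handing it to the job. The helper name `subset_run_config`, the op names, and the resource name below are all hypothetical; the sketch only assumes the run config follows the usual shape with a top-level `ops` key.

```python
from copy import deepcopy


def subset_run_config(full_config, op_names):
    """Return a copy of full_config that keeps only the ops listed in op_names.

    Hypothetical helper: top-level keys other than "ops" (e.g. "resources")
    are copied through unchanged.
    """
    subset = deepcopy(full_config)
    if "ops" in subset:
        subset["ops"] = {
            name: cfg for name, cfg in subset["ops"].items() if name in op_names
        }
    return subset


# Hypothetical shared config covering every op in the full job.
full_config = {
    "resources": {"warehouse": {"config": {"path": "data.db"}}},
    "ops": {
        "extract": {"config": {"year": 2020}},
        "transform": {"config": {"drop_nulls": True}},
    },
}

# Config for a job that only runs the "extract" op.
extract_only = subset_run_config(full_config, {"extract"})
```

The filtered dict could then be written out to a temporary YAML file and passed to the CLI, or supplied directly when launching the job from Python. Whether that is less work than maintaining separate YAML files depends on how many job subsets there are.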