#ask-community

Olivier Oswald

04/27/2023, 2:13 PM
I noticed that the dagster-gcp-pandas library uses upper case column names. Is there a specific reason for this? If not, I'll open a PR so that the library uses the column names as defined in the DataFrame.

Tim Castillo

04/27/2023, 2:18 PM
Hi! Let me check with someone from the team and get back to you on this! Thanks for offering to raise the PR.

jamie

04/27/2023, 4:58 PM
This is largely done for consistency with other IO managers that store data in databases. Some of those databases require upper case column names in order for queries to work correctly, so we uppercase all column names across the board to preserve the ability to swap IO manager implementations.
👍 1
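
To make the behaviour described above concrete, here is a minimal sketch of upper-casing a DataFrame's column names before storage. The helper name `uppercase_columns` and the sample columns are hypothetical; this illustrates the effect on a DataFrame only and is not the library's actual code.

```python
import pandas as pd


def uppercase_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with upper-cased column names."""
    # rename() accepts a callable and applies it to every column label.
    return df.rename(columns=str.upper)


# Illustrative data only; the column names are made up for this sketch.
df = pd.DataFrame({"user_id": [1, 2], "Amount": [9.5, 3.0]})
stored = uppercase_columns(df)

print(list(df.columns))      # the original DataFrame is left unchanged
print(list(stored.columns))  # the labels that would reach the database
```

Under this kind of normalisation, a DataFrame with lower-case or mixed-case labels reaches the database with different column names than it had in memory, which is the mismatch raised in the first message.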

Bertrand Nouvel

04/28/2023, 10:49 AM
I came across this issue with Snowflake and wrote an internal patch to get case-sensitive column names working correctly. I think users should have the option to choose the behaviour they prefer here: in many cases when we introduce Dagster into existing pipelines, we don't want to have to fix or change columns because the IO managers normalise them to uppercase. So if the backend IO manager supports it, it is good to have.
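
One way to picture the opt-in behaviour requested here is a flag that switches the normalisation off. This is only a sketch under stated assumptions: `prepare_columns` and its `preserve_case` parameter are hypothetical names invented for illustration and are not part of any Dagster IO manager.

```python
import pandas as pd


def prepare_columns(df: pd.DataFrame, preserve_case: bool = False) -> pd.DataFrame:
    """Hypothetical helper: upper-case column names unless the caller opts out."""
    if preserve_case:
        # Keep the labels exactly as defined in the DataFrame.
        return df.copy()
    # Default: normalise to upper case, as discussed in the thread.
    return df.rename(columns=str.upper)


# Illustrative data only; pretend this matches an existing lower-case table.
existing = pd.DataFrame({"order_id": [10], "customerName": ["Ada"]})

kept = prepare_columns(existing, preserve_case=True)
normalized = prepare_columns(existing)

print(list(kept.columns))
print(list(normalized.columns))
```

Keeping upper-casing as the default would preserve the swap-ability between IO managers mentioned earlier, while the flag would let existing lower-case datasets stay untouched. Whether a given database backend accepts case-sensitive names is a separate question that the sketch does not address.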

Olivier Oswald

04/28/2023, 12:08 PM
That was also my intention, since our existing datasets are lower case. I ended up with my own BigQueryPandasIOManager variant, which also allows me to control the table schema.

jamie

04/28/2023, 8:41 PM
thanks for the input! I’ll keep this in mind for future work on these io managers