Hello, I dont understand something about assets a...
# ask-community
s
Hello, I don't understand something about assets and io_manager. Maybe you can help me. I have data to read from several MySQL instances, which I have to transform and then load into another MySQL instance. Which io_manager should I use for the source data? Which io_manager should I use to load the result into MySQL? Regards, (Sorry for the beginner-level question)
j
Hey @Sebastien we don’t have a dagster-authored MySQL I/O manager, so that’s something you would need to write yourself. Basically it boils down to subclassing the IOManager base class and implementing two methods (one to store outputs in the db and one to load values from the db). You can then make those methods as simple or complicated as you need for your use case https://docs.dagster.io/concepts/io-management/io-managers#defining-an-io-manager
🤖 1
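A minimal sketch of the two methods described above, `handle_output` to store a value and `load_input` to read it back. Everything specific here is an assumption for illustration: the class name `TableIOManager`, the `(id, value)` row shape, and the use of `sqlite3` as a stand-in for a MySQL driver (MySQL drivers typically use `%s` placeholders instead of `?`). The guarded import only lets the sketch run without Dagster installed.

```python
import sqlite3

try:
    from dagster import IOManager
except ImportError:  # fallback so the sketch also runs without Dagster installed
    IOManager = object


class TableIOManager(IOManager):
    """Hypothetical I/O manager that stores each asset as one table of rows."""

    def __init__(self, conn):
        # Any DB-API connection; sqlite3 stands in for a MySQL driver here.
        self._conn = conn

    def _table(self, context):
        # Use the last component of the asset key as the table name.
        # The name is interpolated into SQL, so it must come from trusted code.
        return context.asset_key.path[-1]

    def handle_output(self, context, obj):
        # obj is assumed to be a list of (id, value) tuples.
        table = self._table(context)
        cur = self._conn.cursor()
        cur.execute(f"DROP TABLE IF EXISTS {table}")
        cur.execute(f"CREATE TABLE {table} (id INTEGER, value TEXT)")
        cur.executemany(f"INSERT INTO {table} VALUES (?, ?)", obj)
        self._conn.commit()

    def load_input(self, context):
        # Read the whole table back as a list of tuples.
        table = self._table(context)
        cur = self._conn.cursor()
        cur.execute(f"SELECT id, value FROM {table} ORDER BY id")
        return cur.fetchall()
```

Note that `load_input` here reads the full table into memory, which is fine for small tables but not for very large ones.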
s
I understand I have to write it. I don't see at the moment how to do it or how to manage the data stream; I don't want to pull an entire copy of any source table onto my workstation. I'm thinking of a pd.DataFrame approach (SQLAlchemy engine to pd.DataFrame as the Output value), but that df would have to be held in memory, not stored as a persistent file (maybe 1 TB 🤷). Well, I will try something in the next few days.
(And maybe the pandas to_sql method with append to store the data 🤔)
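One way to avoid holding a whole table in memory is to stream it in fixed-size chunks. The sketch below does this with plain DB-API cursors (`fetchmany` and `executemany`); the pandas equivalent would be `pd.read_sql(..., chunksize=...)` combined with `DataFrame.to_sql(..., if_exists="append")`. The table names `raw_orders` and `orders_cents`, the cents conversion, and `sqlite3` as a stand-in for MySQL are all assumptions for illustration. Some MySQL drivers buffer the full result set on the client by default, so an unbuffered cursor may be needed for true streaming.

```python
import sqlite3


def copy_in_chunks(source_conn, target_conn, chunk_size=10_000):
    """Stream rows from source to target, holding one chunk in memory at a time."""
    read_cur = source_conn.cursor()
    write_cur = target_conn.cursor()
    read_cur.execute("SELECT id, amount FROM raw_orders ORDER BY id")
    total = 0
    while True:
        rows = read_cur.fetchmany(chunk_size)
        if not rows:
            break
        # Hypothetical transform step: convert amounts to integer cents.
        transformed = [(row_id, round(amount * 100)) for row_id, amount in rows]
        write_cur.executemany(
            "INSERT INTO orders_cents (id, cents) VALUES (?, ?)", transformed
        )
        # Commit per chunk so a failure loses at most one chunk of work.
        target_conn.commit()
        total += len(rows)
    return total
```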
j
Yeah, it’s hard to give specific advice without knowing the details of your requirements and setup. I would recommend looking into partitions if there are logical ways to divide your data, so that you don’t load the entire table at once. You might also find that directly creating a connection to your DB and running the SQL you need to modify the tables is a better approach, since then you don’t need to load the tables into memory.
🙏 1
For that second option, you basically wouldn’t use I/O managers at all.
👍 1
https://docs.dagster.io/tutorial/managing-your-own-io sort of like this, but connecting to your DB and running SQL
🤖 1
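A rough sketch of that second option combined with partitions: the function opens no DataFrame and returns nothing, it just runs SQL for one partition inside the database. The tables `orders` and `daily_totals`, the function name, and `sqlite3` as a stand-in for MySQL are assumptions for illustration. In a partitioned Dagster asset the key would typically come from `context.partition_key`. This only covers the case where source and target tables live in the same instance.

```python
import sqlite3


def rebuild_daily_totals(conn, partition_key):
    """Recompute one day's totals inside the database; no rows are pulled into Python."""
    cur = conn.cursor()
    # Delete the partition first so re-running the same day is idempotent.
    cur.execute("DELETE FROM daily_totals WHERE day = ?", (partition_key,))
    cur.execute(
        "INSERT INTO daily_totals (day, total) "
        "SELECT day, SUM(amount) FROM orders WHERE day = ? GROUP BY day",
        (partition_key,),
    )
    conn.commit()
```

Because each call touches only one day's rows, a failed day can be re-run without affecting the others.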