
Alex Remedios

11/01/2022, 4:12 PM
hi there, I’m interested in merging a dagster postgres database into another pre-existing dagster postgres database. Are there any docs/tips/checklists that can help with this? My current plan is:
1. try to ensure they are on the same alembic version
2. pg_dump the old db
3. insert the contents into the target database
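Step 1 of the plan above could be checked with something like the sketch below. It assumes both databases expose Alembic's standard `alembic_version` table and that you already hold DB-API connections to them (for Postgres, for example via `psycopg2`). The helper names and revision ids are hypothetical, and in-memory SQLite stands in for the two Postgres databases so the snippet runs on its own.

```python
import sqlite3


def alembic_revisions(conn):
    """Return the set of revision ids stored in the alembic_version table."""
    cur = conn.cursor()
    cur.execute("SELECT version_num FROM alembic_version")
    return {row[0] for row in cur.fetchall()}


def same_schema_revision(source_conn, target_conn):
    """True when both databases report the same Alembic revision(s)."""
    return alembic_revisions(source_conn) == alembic_revisions(target_conn)


def make_demo_db(revision):
    """Hypothetical stand-in for a real database: SQLite with one revision row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
    conn.execute("INSERT INTO alembic_version VALUES (?)", (revision,))
    return conn


# Made-up revision ids, for illustration only.
fork_db = make_demo_db("aaa111")
main_db = make_demo_db("bbb222")
print(same_schema_revision(fork_db, main_db))  # the two demo dbs differ
```

If the revisions differ, migrating the older instance first (Dagster ships `dagster instance migrate` for this) would be the natural next step before attempting any dump and restore.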

alex

11/01/2022, 4:47 PM
hmm, i think there's a good chance that you are the first person to try to do this.
• the only thing i think has a risk of colliding is if you used the same asset keys in both - “most recent asset event” data may get weird if you insert/overwrite the asset table
• you will have to deal with the auto-increment id columns - if the date ranges overlap between the two dbs this may be troublesome. We default to sorting by this id column in various places, so you may end up with some weird product experiences if you have to just add all the inbound records after the existing records, depending on how their date ranges interact

Alex Remedios

11/01/2022, 4:56 PM
thanks, I sense this may not be a fruitful endeavor

alex

11/01/2022, 5:10 PM
yea 😕 if you provide more context on the overall situation i can try to advise

Alex Remedios

11/01/2022, 5:11 PM
we created a forked dagster environment to re-architect some pipelines and now want to kill off the fork and return to a single instance. Long-term it helps having access to assets and logs, so we either (a) merge into a single db, or (b) maintain our dead fork potentially forever

alex

11/01/2022, 5:29 PM
how many runs are on the fork

Alex Remedios

11/01/2022, 5:30 PM
in the region of 1000

alex

11/01/2022, 5:32 PM
we do have dagster debug export / dagster debug import, which wasn’t really designed for this but is worth checking out
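A rough sketch of how a batch of runs might be moved with these two commands follows. It assumes the CLI shapes `dagster debug export RUN_ID OUTPUT_FILE` and `dagster debug import INPUT_FILES...` (worth confirming against `dagster debug --help` for your version) and that each instance is selected through the `DAGSTER_HOME` environment variable. The function names, directory layout, and `.gz` file naming are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def export_path(run_id, out_dir):
    """Hypothetical naming scheme: one export file per run."""
    return Path(out_dir) / f"{run_id}.gz"


def export_command(run_id, out_dir):
    """Argument list for exporting a single run from the source instance."""
    return ["dagster", "debug", "export", run_id, str(export_path(run_id, out_dir))]


def import_command(files):
    """Argument list for importing previously exported files into the target."""
    return ["dagster", "debug", "import", *[str(f) for f in files]]


def plan_migration(run_ids, out_dir, source_home, target_home):
    """Return (DAGSTER_HOME, command) pairs without executing anything."""
    steps = [(source_home, export_command(r, out_dir)) for r in run_ids]
    if run_ids:
        files = [export_path(r, out_dir) for r in run_ids]
        steps.append((target_home, import_command(files)))
    return steps


def run_plan(steps):
    """Execute each step against the instance named by its DAGSTER_HOME."""
    for home, cmd in steps:
        env = {**os.environ, "DAGSTER_HOME": str(home)}
        subprocess.run(cmd, env=env, check=True)


# Dry run: inspect the commands before touching either instance.
for home, cmd in plan_migration(["run-a", "run-b"], "exports", "/fork/home", "/main/home"):
    print(home, " ".join(cmd))
```

Keeping planning separate from execution makes it easy to review the commands first, or to import in smaller batches if a single import of about 1000 files turns out to be unwieldy.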

Alex Remedios

11/01/2022, 5:33 PM
thanks, this looks like the right sort of solution. Any known limitations? (docs are minimal here)

alex

11/01/2022, 5:35 PM
dagster debug --help is marginally better, not sure what's up with the docs page truncation
https://github.com/dagster-io/dagster/blame/master/python_modules/dagster/dagster/_cli/debug.py#L70-L97
the import uses the same methods as when the run is happening, so i think:
• the “most recent event for asset X” will get overwritten by imports and be weird until the “real” overwrite happens.
• the autoincrement sorts will show the imported runs as happening relative to the time you imported them, not when they ran. I’m not certain off the top of my head where we fall back to this ordering though - maybe the list of asset events for a given key?
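The autoincrement ordering effect can be reproduced in miniature. The sketch below uses an in-memory SQLite table with a made-up `events` schema (not Dagster's actual tables) and made-up run ids and dates. A run from the fork that actually happened between two existing runs is inserted last, so it receives the highest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, run_id TEXT, ts TEXT)"
)

# Rows already present in the target instance.
conn.executemany(
    "INSERT INTO events (run_id, ts) VALUES (?, ?)",
    [("main-1", "2022-10-01"), ("main-2", "2022-10-20")],
)

# A run imported from the fork: it ran on 2022-10-10, but is inserted now,
# so its autoincrement id is larger than every existing id.
conn.execute(
    "INSERT INTO events (run_id, ts) VALUES (?, ?)", ("fork-1", "2022-10-10")
)


def run_order(connection, column):
    """List run ids sorted by the given column (restricted to known columns)."""
    if column not in ("id", "ts"):
        raise ValueError(f"unsupported sort column: {column}")
    rows = connection.execute(f"SELECT run_id FROM events ORDER BY {column}")
    return [row[0] for row in rows]


print(run_order(conn, "id"))  # ['main-1', 'main-2', 'fork-1']
print(run_order(conn, "ts"))  # ['main-1', 'fork-1', 'main-2']
```

Any view that sorts by the id column places the imported run after everything that was already there, while a timestamp sort recovers the real chronology.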

Alex Remedios

11/01/2022, 5:44 PM
I see. Ordering is probs a nice to have. The essential stuff would be the asset catalog (and correct partition dates associated with them)

alex

11/01/2022, 5:52 PM
I believe the partition stuff should all work

Alex Remedios

11/01/2022, 6:02 PM
thanks this is looking promising
hey Alex, just confirming that metadata all looks correct but order seems incorrect as you predicted. Do you expect any issues beyond this cosmetic stuff?

alex

11/03/2022, 3:46 PM
not that i can think of - let me know if anything comes up

Alex Remedios

12/15/2022, 12:46 PM
just confirming that we migrated 500 runs in order to merge two dagster instances. I reckon this is a use case we should develop for enterprise users. Thanks for the help!