# ask-community
m
Hi all, looking for some advice on my implementation. I'm running Dagster on a Windows server to help us migrate from one ERP system to another. I'm using Dagster to sync data from our current ERP to our new one, which works great. For our new ERP system, I have two databases to maintain: a test and a production environment. The database updates need to be done through the vendor's proprietary software (it's unfortunately not just a database update), but I can start that software with Dagster.

I've written various assets/jobs/schedules in Dagster to update our test environment, and I could change the resource definition to prod to update our production data, which is great as well. However, I'm struggling to find a nice design that allows new assets/jobs/schedules to be tested in our test database while keeping production up to date. Basically, we want the test and prod databases in our new environment to stay in sync, but the test database also has different assets/jobs/schedules that are currently being tested.

I've considered copying the folder with my test deployment to a separate production deployment and, as tests succeed, copying more functions to prod. But Dagit warns me that multiple assets with the same name are found if I have both deployments on the same server. It's really unfortunate for my use case that assets in different deployments aren't considered unique if they have the same name.

All solutions I come up with are inconvenient:
• I figure I could use Docker to host both deployments of Dagster and expose them on different ports. One issue would be having to somehow expose the ERP import software, which is installed on the host machine. I figure that would be a pain to access through Docker (I'm not too familiar with Docker).
• I could install Dagster (plus the ERP import software) on a separate server. I'm not sure if I could get that arranged with our IT provider, and a separate Windows server environment would bring additional costs.
• I could rename each asset/graph when copying from test to prod to solve the duplicate name conflicts, but this would be a pain to maintain. The benefit is that both TEST and PROD would be visible in the same Dagit window and I wouldn't have to deal with Docker or arranging a new server.

I'm guessing most users don't have to deal with this, as most probably have separate systems for test/prod and/or just deal with database updates, but I'm wondering if I'm missing an obvious solution to the naming conflicts. I hope all of the above made sense, somehow. 🙂 Thanks in advance for any advice anyone might have!
a
What database are you using for dagster itself? The default SQLite or something like postgresql? If you run separate test/prod dagster databases, then you can reuse asset keys
m
I'm using the default SQLite database at the moment. I understand that I might need to customize my io_manager so prod would not take materialized assets from test, but even when I customize my io_manager to have different base_dirs, I would still get the error in dagit that the asset name is not unique. Would switching to PostgreSQL help with that?
a
So, looks like you’re trying to run a single instance of dagster with two code locations. That will require unique asset names. What I was thinking was running two separate instances of dagster (can be on the same machine as long as the ports are different) that use separate databases. If you’re using SQLite, that’d just be two DAGSTER_HOMEs I believe
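A rough sketch of what that could look like, in case it helps. It assumes the `dagit` and `dagster-daemon` commands are on the PATH and that each deployment folder contains its own workspace.yaml. The folder paths, port numbers, and helper names below are made up for illustration and are not from this thread.

```python
import os
import subprocess


def instance_env(dagster_home, base_env=None):
    """Copy the environment and point DAGSTER_HOME at one instance's folder.

    Dagster expects DAGSTER_HOME to be an absolute path, so it is normalized here.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DAGSTER_HOME"] = os.path.abspath(dagster_home)
    return env


def dagit_command(port):
    """Command line for one Dagit web server listening on its own port."""
    return ["dagit", "-p", str(port)]


def launch_instance(deployment_dir, dagster_home, port):
    """Start Dagit and a daemon for one deployment (hypothetical helper).

    Both processes run from deployment_dir so they pick up the workspace.yaml
    in that folder, and both share the same DAGSTER_HOME, so they use the same
    SQLite storage and stay isolated from the other instance.
    """
    os.makedirs(dagster_home, exist_ok=True)
    env = instance_env(dagster_home)
    web = subprocess.Popen(dagit_command(port), cwd=deployment_dir, env=env)
    daemon = subprocess.Popen(["dagster-daemon", "run"], cwd=deployment_dir, env=env)
    return web, daemon


if __name__ == "__main__":
    # Hypothetical folder layout and ports; adjust to the actual server.
    launch_instance(r"C:\dagster\test", r"C:\dagster\test\home", 3000)
    launch_instance(r"C:\dagster\prod", r"C:\dagster\prod\home", 3001)
```

With a layout like this, the same asset keys can exist in both deployments, because each instance only reads the storage under its own DAGSTER_HOME.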
m
That'd be a great solution! I'm trying this now in a test environment without a 2nd dagster_home, and it seems to be working. I'm just getting errors saying Dagster thinks I have multiple daemons running, but I wonder if those errors will go away once I get a 2nd dagster_home working.
a
Yeah, as long as they're pointed at different databases, they'll have no way of knowing about each other. workspace.yaml has some options for this sort of thing too I believe
m
Awesome, thanks a lot for your help!