# dagster-plus
Lyle:
Is there a way to have a Hybrid deployment which has some jobs running in cloud VMs and other jobs running on-prem on specific computers (e.g. computer_A_job on ComputerA, computer_B_job on ComputerB, etc.), where the on-prem jobs are upstream of the cloud assets/jobs? I want to be able to track and initiate job runs/materializations across the on-prem computers from a central Dagit instance. Do I just set up Docker Agents, for example, on each on-prem machine and create a Code Location for each of them, in a similar way to the Cloud VM agent/code location?
Ben:
Hi Lyle, one way to do this would be to run separate Dagster deployments for on-prem vs. cloud VMs; that way each deployment can be served by an agent in a different environment.
Lyle:
Hi Ben, thanks for responding! Would I need separate Deployments for each on-prem computer, or just separate Agents for each one? Also, if each on-prem computer is really executing the same work, just with a different set of data unique to each computer, do I need to do anything special to uniquely identify the Assets/Jobs in the deployment (assuming they share a deployment)? In some sense, the work they're doing can be considered materializing partitions of the same asset, but I don't think I'm ready to convert these to partitioned assets quite yet.
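As a rough illustration of the partitioned-asset framing mentioned here (not something from the thread itself), a static partition per machine could look like the sketch below; the asset and partition names are hypothetical.

```python
from dagster import StaticPartitionsDefinition, asset

# Hypothetical sketch: one partition key per on-prem machine, so "same work,
# different data" becomes partitions of a single asset.
machine_partitions = StaticPartitionsDefinition(["computer_a", "computer_b"])

@asset(partitions_def=machine_partitions)
def local_raw_data(context):
    # context.partition_key identifies which machine's data this run covers.
    context.log.info(f"Processing data for {context.partition_key}")
```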
Ben:
Right now we only support running multiple agents for a single deployment as a form of redundancy; we don't have a way to route particular jobs to particular agents within a single deployment (this is an often-requested feature, though).
In this case, since each on-prem machine has access to different data and is computing on different data, you would want to create a deployment for each.
You could have shared code loaded in each deployment if the computation is the same.
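A minimal sketch of the "shared code, different data" suggestion, assuming each per-machine agent sets a machine-specific environment variable; the variable, asset, and path names here are illustrative only, not an existing setup.

```python
import os

from dagster import Definitions, asset

# Hypothetical: each per-machine agent environment sets LOCAL_DATA_DIR, so the
# same code location can be loaded into every per-machine deployment unchanged.
LOCAL_DATA_DIR = os.getenv("LOCAL_DATA_DIR", "/data/incoming")

@asset
def preprocessed_local_data(context):
    context.log.info(f"Reading raw files from {LOCAL_DATA_DIR}")
    # ... shared preprocessing here, then upload results to the common cloud bucket ...

defs = Definitions(assets=[preprocessed_local_data])
```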
Lyle:
Hmmm. Is the targeting of individual agents within a single deployment on the roadmap for anytime this year?
Thanks for the info, Ben. To give more background on what I'm trying to accomplish: I'd like the on-prem computers to perform initial processing of the data that exists on their respective local storage devices (same code, different data), and then upload their respective results to a common cloud storage bucket. The cloud VMs would then take over processing of the combined dataset. I'm definitely interested in unifying the command and control of all of these assets end to end (local to cloud) as much as possible, but I can see there are limitations.

I guess if I go the "deployment per on-prem computer" route, I would have the cloud VMs treat the common data bucket as a source asset. I would then either observe that location for changes and/or have the on-prem instances send GQL mutations to the cloud instance to create materialization records and/or instigate downstream runs to process that data?

Am I understanding correctly that, within a single deployment, I cannot create a unique Code Location (and agent) per on-prem computer with "feigned unique" job definitions (i.e. they are actually the same logical code, but appear as differently named entities to Dagster), so that the deployment routes those "unique" jobs to execute only through the one code location (on-prem computer) that defines them? In other words, define JobA in CodeLocationA running on ComputerA, etc.

As another alternative, would it make any sense to have local Dagster (open source) instances running on each on-prem computer that simply use GQL to communicate with the cloud deployment? The cloud instance could potentially run jobs (one per on-prem computer) that simply send run requests to the on-prem instances and poll for status (mapping on-prem events to cloud events). I understand that this may forgo the integrated/automated CI/CD of the cloud deployment, but that may be an acceptable tradeoff.

Is Alex's answer to another post by Zach about "_Is it possible to construct a DagsterInstance which points to a Dagster Cloud deployment from outside of a user code deployment_" relevant to constructing a solution for my use case? https://dagster.slack.com/archives/C02LJ7G0LAZ/p1685721897394309?thread_ts=1685663863.672669&cid=C02LJ7G0LAZ
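To make the GQL idea concrete, here is a hedged sketch (following the documented dagster-graphql client pattern for Dagster Cloud, not a verified recipe for this setup) of an on-prem process launching a run in the cloud deployment and polling its status. The org/deployment URL, token environment variable, and job name are placeholders.

```python
import os
import time

from dagster_graphql import DagsterGraphQLClient, DagsterGraphQLClientError
from gql.transport.requests import RequestsHTTPTransport

# Placeholders: replace the org/deployment URL, token env var, and job name.
url = "my-org.dagster.cloud/prod"
transport = RequestsHTTPTransport(
    url=f"https://{url}/graphql",
    headers={"Dagster-Cloud-Api-Token": os.environ["DAGSTER_CLOUD_API_TOKEN"]},
)
client = DagsterGraphQLClient(url, transport=transport)

try:
    run_id = client.submit_job_execution(
        "process_combined_dataset",  # hypothetical cloud job name
        run_config={},
    )
    # Crude polling loop; a sensor on the cloud side would avoid this.
    while client.get_run_status(run_id).value not in ("SUCCESS", "FAILURE", "CANCELED"):
        time.sleep(30)
except DagsterGraphQLClientError as exc:
    raise RuntimeError(f"Failed to launch or poll the cloud run: {exc}")
```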
Ben:
This routing behavior is on the roadmap - no exact ETA but likely this quarter.
> I would have the cloud VMs treat the common data bucket as a source asset. I would then either observe that location for changes and/or have the on-prem instances send GQL mutations to the cloud instance to create materialization records and/or instigate downstream runs to process that data?
This seems like a reasonable approach to accomplish what you’re looking for before agent routing within a deployment is available. Having a single agent which ferries/brokers requests to the various on-prem computers could also work, somewhat like the second alternative you describe.
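If the on-prem side does end up reporting materializations of the shared bucket as a source asset, the cloud deployment could trigger downstream processing with an asset sensor. A minimal sketch follows, with hypothetical asset and job names.

```python
from dagster import AssetKey, RunRequest, asset_sensor, define_asset_job

# Hypothetical names: "combined_raw_data" is the source asset backed by the common
# bucket; process_combined_dataset selects the downstream cloud assets.
process_combined_dataset = define_asset_job("process_combined_dataset", selection="*")

@asset_sensor(asset_key=AssetKey("combined_raw_data"), job=process_combined_dataset)
def combined_data_sensor(context, asset_event):
    # asset_event is the materialization event recorded for the source asset.
    yield RunRequest(run_key=context.cursor)
```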
Lyle:
Thanks for following up! I'll see what I can implement in the meantime before the routing behavior is officially available.