# deployment-kubernetes
b
Hi, I have a question regarding using Dagster in an `on-prem` situation:
• Assume that I host Dagster in the cloud, maybe running under K8s or on AWS EC2.
• Assume that I have a Dask cluster hosted `on-prem`, in house.
• I don't want any sensitive raw data or assets/artifacts to flow out to the cloud from my `on-prem` environment. That is to say, it's OK for logs and DAG metadata to go to the cloud, but not OK for other kinds of data/assets, which should be kept on-prem.
So the $64K question is: does Dagster allow a hybrid deployment model for privacy-sensitive industries like healthcare and finance?
s
Hey thanks for the question! Just wanted to clarify the specific use case. This is, within the same company, hosting Dagit, instance, and other system components in AWS and business logic and raw data within an on-prem data center?
b
Yes. That is to say, we can allow some serialized form of the DAG to be sent to Dagit or the Dagster server in AWS, but not code logic or anything else. I think Prefect has this model. One of our customers (a big name) won't allow data to move out of on-prem.
s
Yes this is a natural configuration and we are thinking about this quite a bit.
With the current system it is possible but not super natural
Yeah, it's a common modality for a bunch of systems: Buildkite, Databricks, etc.
Hosted control plane separated from user compute and storage.
So the approach you would have to take currently would require a bit of contortion in “user space”, where all the solids do is remotely invoke the dask tasks. You would have to define the API/interchange format that would go over the public internet. Otherwise your on-prem computations would have to have direct access to the hosted database, which I suspect is a non-starter.
If that is true, you would have to take an approach similar to what Noah did (he spoke in our last community meeting): he built a user-space abstraction that handled this nicely, where a solid on the control plane side was matched up with a corresponding function across the process boundary.
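For readers following along, here is a rough sketch of the kind of user-space abstraction described above. It is not Noah's implementation and it does not use any Dagster or Dask API; every name in it (`ON_PREM_TASKS`, `on_prem_task`, `handle_request`, `remote_invoke`, `count_patients_solid_body`) is hypothetical. The assumption is that the cloud side sends only a task name and parameters as JSON, and the on-prem side replies with summary metadata only, so raw data never leaves the data center. The transport is an in-process function call here; in practice it would be whatever authenticated channel you define between the two sites.

```python
import json
from typing import Callable, Dict

# --- On-prem side (hypothetical) ---------------------------------------
# Registry mapping task names to functions that touch raw data locally.
ON_PREM_TASKS: Dict[str, Callable[[dict], dict]] = {}


def on_prem_task(name: str):
    """Register a function under a name the control plane can refer to."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        ON_PREM_TASKS[name] = fn
        return fn
    return register


@on_prem_task("count_patients")
def count_patients(params: dict) -> dict:
    # Stand-in for sensitive data; a real task might submit work to Dask here.
    records = [{"id": 1, "age": 70}, {"id": 2, "age": 34}, {"id": 3, "age": 81}]
    matching = [r for r in records if r["age"] >= params["min_age"]]
    # Only summary metadata is returned, never the records themselves.
    return {"row_count": len(matching), "status": "ok"}


def handle_request(payload: str) -> str:
    """On-prem entry point: decode the request, run the task, reply with metadata."""
    request = json.loads(payload)
    result = ON_PREM_TASKS[request["task"]](request["params"])
    return json.dumps(result)


# --- Control-plane side (hypothetical) ----------------------------------
def remote_invoke(task: str, params: dict,
                  transport: Callable[[str], str] = handle_request) -> dict:
    """Send only the task name and parameters across the boundary."""
    payload = json.dumps({"task": task, "params": params})
    return json.loads(transport(payload))


def count_patients_solid_body(min_age: int) -> dict:
    """What the body of the cloud-side solid would do: nothing but a remote call."""
    return remote_invoke("count_patients", {"min_age": min_age})
```

The point of the registry is the one-to-one matching mentioned above: each solid on the control plane corresponds to one named function on the other side, and the JSON request/response is the interchange format you would have to define and secure yourself.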
b
Can you send that link that Noah did?
s

https://youtu.be/lodcK3Z3TUs?t=994

b
@schrockn thanks
n
👋