Dagster serverless vs hybrid for privacy
# dagster-plus
j
Pondering a potentially silly idea here as a follow-up from this post: if confidentiality is my main concern, instead of using Dagster hybrid, could I use Dagster serverless but make external API calls within my assets into my sensitive code, which populates assets in my AWS account? That would also let me implement my short-lived tasks as AWS Lambdas, for which there is currently no agent, instead of needing to maintain an ECS cluster. Apart from IOManagers, what would be the disadvantages of doing the above vs the Dagster hybrid approach?
j
https://docs.dagster.io/dagster-cloud/deployment/serverless#when-to-choose-serverless
Serverless works best with workloads that primarily orchestrate other services or perform light computation. Most workloads fit into this category, especially those that orchestrate third-party SaaS products like cloud data warehouses and ETL tools.
In this case, if you squint a little, the lambdas where your compute is happening are just a third-party tool that’s being orchestrated. A few caveats:
• There’s still built-in ECS latency for run startup. Not sure if that matters to you.
• You lose the benefits of having your compute tightly integrated with the Dagster ecosystem (emitting events, IO Managers, resources, running locally via dagster dev, etc.)
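A minimal sketch of the pattern discussed above, where the orchestration side only triggers code in your own AWS account and keeps a non-sensitive summary of the result. Everything named here is hypothetical (`run_sensitive_step`, `lambda_invoker`, `fake_invoke`, the `process-orders` function name, and the `rows_written` / `s3_uri` response fields); the only real API assumed is boto3's Lambda `invoke` call, which is imported lazily so the sketch runs without AWS access.

```python
import json
from typing import Any, Callable, Dict

# An invoker takes (function_name, request_bytes) and returns response_bytes.
Invoker = Callable[[str, bytes], bytes]


def lambda_invoker(function_name: str, payload: bytes) -> bytes:
    """Call a real AWS Lambda synchronously. Needs boto3 and AWS credentials."""
    import boto3  # lazy import: the rest of the sketch works without boto3

    response = boto3.client("lambda").invoke(
        FunctionName=function_name, Payload=payload
    )
    return response["Payload"].read()


def run_sensitive_step(
    invoke: Invoker, function_name: str, request: Dict[str, Any]
) -> Dict[str, Any]:
    """Trigger the sensitive code and return only a non-sensitive summary."""
    raw = invoke(function_name, json.dumps(request).encode("utf-8"))
    response = json.loads(raw)
    # Assumed response shape: the Lambda reports what it wrote and where.
    # Any other fields are dropped so they never reach the orchestrator's logs.
    return {
        "rows_written": int(response["rows_written"]),
        "s3_uri": str(response["s3_uri"]),
    }


def fake_invoke(function_name: str, payload: bytes) -> bytes:
    """Local stand-in for the Lambda, useful for testing without AWS."""
    request = json.loads(payload)
    return json.dumps(
        {
            "rows_written": 3,
            "s3_uri": f"s3://example-bucket/orders/{request['date']}/",
            "debug_rows": ["sensitive", "values"],  # must not leak upstream
        }
    ).encode("utf-8")


summary = run_sensitive_step(fake_invoke, "process-orders", {"date": "2024-01-01"})
```

In a serverless code location, an asset body could call `run_sensitive_step(lambda_invoker, ...)` and attach the summary as metadata. Passing the invoker as an argument is a design choice that keeps the step testable locally, which partly offsets losing `dagster dev` coverage of the real compute.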
j
My eyes are already very tiny, if I squint I might not see anything 🤣. Sounds like my suggested approach is a viable pattern, I just need to bear the loss of "tight integration" in mind. Can you elaborate a little on emitting events with an example?
j
you can emit events directly to dagster’s event log (instead of just relying on stderr/stdout)
• https://docs.dagster.io/concepts/logging/loggers
• https://docs.dagster.io/concepts/ops-jobs-graphs/op-events