What is your experience with how big the servers need to be for Dagster? #announcements

Kaspars

10/23/2020, 9:57 AM

Hi, what is your experience with how "big" the servers (for a Kubernetes cluster) need to be for Dagster? Which component needs what min/optional/max resources (CPU, RAM)? I understand that this could depend on the operations inside the pipelines. We want to:
- get data from the source (PostgreSQL);
- do data transformations with pandas/numpy (cleaning, filtering, joining, sorting, splitting, deduplication, summarization);
- save the results to a database.

sashank

10/23/2020, 3:14 PM

You’re right that this depends on the operations within your solids and pipelines; that would be the biggest factor here. The setup and processes that run on your laptop are pretty similar to the ones that run in k8s, so the Dagster processes themselves are pretty lightweight.

sashank

10/23/2020, 3:14 PM

cc @cat or @nate who might have more insight here

sashank

10/23/2020, 3:16 PM

I’ve been fine using nodes that have low specs (such as t2.medium on AWS), but I’ve also been running light workloads.

nate

10/23/2020, 5:56 PM

yeah, I think @cat has thought about this more than I have, but as Sashank mentioned the overhead of Dagster components is fairly small, so this will mostly be relative to the size of your workloads. Unless you’re working on really massive datasets in pandas/numpy, I think you’ll be able to get away with fairly small nodes
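A rough way to check whether a pandas workload fits on a small node is a back-of-the-envelope memory estimate. The sketch below is only an illustration, not something from Dagster itself: the function names, the 5x overhead factor for intermediate copies during joins and sorts, and the 1 GiB reserved for the OS and other processes are all assumptions to tune against your own measurements. The 4 GiB figure is the RAM of a t2.medium instance.

```python
GIB = 1024 ** 3


def estimate_peak_bytes(rows, bytes_per_row, overhead_factor=5.0):
    """Rough peak memory for an in-memory pandas step.

    overhead_factor is an assumed multiplier for the temporary copies
    that joins, sorts, and deduplication can create; it is not measured.
    """
    return int(rows * bytes_per_row * overhead_factor)


def fits_on_node(rows, bytes_per_row, node_ram_gib,
                 overhead_factor=5.0, reserved_gib=1.0):
    """Return True if the estimated peak fits in the node's usable RAM.

    reserved_gib is an assumed allowance for the OS, kubelet, and the
    Dagster processes running on the same node.
    """
    usable_bytes = (node_ram_gib - reserved_gib) * GIB
    peak = estimate_peak_bytes(rows, bytes_per_row, overhead_factor)
    return peak <= usable_bytes


if __name__ == "__main__":
    # Hypothetical workloads at ~200 bytes per row on a 4 GiB node.
    for rows in (2_000_000, 20_000_000):
        verdict = "fits" if fits_on_node(rows, 200, node_ram_gib=4) else "too big"
        print(f"{rows:>12,} rows: {verdict}")
```

If you already have a representative DataFrame loaded, measuring its real size (for example with `df.memory_usage(deep=True).sum()`) gives a better `rows * bytes_per_row` input than a guess.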

Kaspars

10/25/2020, 12:46 PM

Thanks @sashank and @nate 😉
