Hi there. I have been trawling through the documentation, but I can’t seem to find much about how to control which environments a run executes in. Say I first want to use 100 machines to distribute the preparation of a dataset using a custom Docker image, and then train a neural network on a single multi-GPU machine. How do I define such environments and system requirements? And might this use case be a bad fit for Dagster?
01/28/2023, 3:53 PM
How are you planning to deploy? Kubernetes? Bare metal with Docker?
Helge Munk Jacobsen
01/28/2023, 4:04 PM
Kubernetes is a possibility
01/28/2023, 4:16 PM
You can configure the Kubernetes jobs associated with each of your Dagster jobs to select nodes that meet your criteria. For example, for your large parallel job, use labels or taints to choose nodes that support autoscaling; for your training job, target a node with GPU support. You could ask for some help over at #dagster-kubernetes. If you know who is going to supply your Kubernetes service (GCP, AWS, on-prem, etc.), that information would be useful to include.
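To make that concrete, here is a minimal sketch of how per-step Kubernetes config can be expressed with the `dagster-k8s/config` tag that the dagster-k8s library reads. The image name, node labels, and toleration key below are placeholders for whatever your own cluster uses, not values from this thread:

```python
# Sketch: raw Kubernetes overrides that dagster-k8s applies to the pods
# it launches, attached per-op via tags={"dagster-k8s/config": ...}.
# All concrete names (image, labels, toleration key) are assumptions.

# Dataset-preparation step: custom Docker image, scheduled onto an
# autoscaling CPU node pool selected by a node label.
PREP_K8S_CONFIG = {
    "container_config": {
        "image": "my-registry/prep-image:latest",  # placeholder image
    },
    "pod_spec_config": {
        "node_selector": {"pool": "autoscale-cpu"},  # placeholder label
    },
}

# Training step: request a GPU and tolerate a GPU-node taint so the pod
# lands on a GPU-capable node.
TRAIN_K8S_CONFIG = {
    "container_config": {
        "resources": {"limits": {"nvidia.com/gpu": "1"}},
    },
    "pod_spec_config": {
        "node_selector": {"accelerator": "nvidia-gpu"},  # placeholder label
        "tolerations": [
            {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}
        ],
    },
}

# In a Dagster job these would be attached roughly like:
#   @op(tags={"dagster-k8s/config": PREP_K8S_CONFIG})
#   def prepare_dataset(): ...
#   @op(tags={"dagster-k8s/config": TRAIN_K8S_CONFIG})
#   def train_model(): ...
```

With the `k8s_job_executor`, each op runs in its own pod, so the prep op can fan out across the autoscaling pool while the training op gets a single GPU node; exact key names should be checked against the dagster-k8s docs for your version.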