Can Dagster run a composite of several graphs?
# ask-community
a
1: Can Dagster run a composite of several graphs? 2: If so, can these sub-graphs be placed on different clusters? My goal is to build loosely coupled graphs with different kinds of tasks and place each graph on different machine resources. I am wondering if the following snippet is possible.
import dagster

@dagster.graph
def preprocess():
    # Do ELT on a multi-node CPU cluster in this graph.
    # Push data into an S3 bucket.
    ...

@dagster.graph
def train_model(s3bucket_to_preprocessed_data):
    # Train on a GPU or TPU cluster in this graph.
    # Save model files, push them to an S3 bucket, then publish the S3
    # path of the model files to a pub/sub messaging topic for each epoch,
    # e.g. a topic value of "<s3://model-bin/epoch_001>".
    ...

@dagster.graph
def infer_model(s3bucket_to_validation_data, s3bucket_to_model_file):
    # Subscribe to S3 bucket paths from the messaging topic above.
    # Evaluate the model on a cluster separate from the ones used by the
    # preprocess and train_model graphs.
    ...

@dagster.graph
def entire_train_and_eval_job():
    # Run the above three graphs once in a while.
    ...
If dagster.graph only allows composition of dagster.op instances, my question is moot…
z
You can nest graphs in sub-graphs, but I'm not aware of a way to configure separate hardware resources at the graph level. This can be achieved at the op level, though, using a custom step launcher, then configuring the step launcher resources at the job level. Essentially, each op in a graph that you want to launch on a particular cluster would have a required resource key that maps to a StepLauncher (configured on the job), which does the work of moving/submitting the code to your cluster, monitoring the job, and notifying Dagster when it has completed or failed. You can have different graphs operating on different clusters by having different groups of ops (your sub-graphs) use different StepLaunchers. StepLaunchers can be a little complicated to write, but I've done a couple now and it's not too bad once you get the concepts. Here's another thread where @owen gives some explanations around StepLaunchers, and here's a GitHub discussion that provides some details as well. I found that working from the databricks_pyspark_step_launcher as an example was super helpful for understanding the different pieces involved.
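A minimal sketch of that wiring, assuming hypothetical cpu_cluster_launcher / gpu_cluster_launcher resources; each would return an instance of a custom StepLauncher subclass (not shown here), the way the databricks_pyspark_step_launcher does:

from dagster import graph, job, op, resource

# Hypothetical launcher resources: a real implementation would return an
# instance of a custom StepLauncher subclass that submits the step to its
# cluster and monitors it.
@resource
def cpu_cluster_launcher(_init_context):
    raise NotImplementedError("return a StepLauncher for the CPU cluster")

@resource
def gpu_cluster_launcher(_init_context):
    raise NotImplementedError("return a StepLauncher for the GPU cluster")

# Each op declares which launcher it needs; the step launcher resource
# takes over execution of that step and runs it on its cluster.
@op(required_resource_keys={"cpu_cluster_launcher"})
def extract_and_load():
    ...

@op(required_resource_keys={"gpu_cluster_launcher"})
def train(preprocessed_data):
    ...

# Sub-graphs group the ops destined for each cluster.
@graph
def preprocess():
    return extract_and_load()

@graph
def train_model(preprocessed_data):
    return train(preprocessed_data)

# The job maps each resource key to a concrete launcher.
@job(
    resource_defs={
        "cpu_cluster_launcher": cpu_cluster_launcher,
        "gpu_cluster_launcher": gpu_cluster_launcher,
    }
)
def entire_train_and_eval_job():
    train_model(preprocess())

The cluster assignment lives on the resource key, so moving a sub-graph to a different cluster is a job-level configuration change rather than a change to the graphs themselves.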
a
Thanks so much! 😁 It actually is a bit complicated, as you mentioned.
z
Yeah, it's just the nature of coordinating remote code execution... the StepLauncher abstraction is pretty helpful in getting it set up, though. Let me know if you decide to go down that rabbit hole and I'd be happy to provide any insight I can for your journey.