# ask-community
d
Is it possible to fan out multiple times? For instance, we're chunking a list of keys out to 60 Docker instances to run. We then want to run a for loop over each batch of keys and run additional ops. Could I potentially call that like this:
```
result = (
    chunking_op(recruiter_teams_to_score)
    .map(
        lambda batch: append_scores(
            keys_in_batch(batch)
            .map(
                lambda key: score_batch(
                    scored_data_for_key(key),
                    trained_model,
                )
            )
            .collect()
        )
    )
    .collect()
)
return append_scores(result)
```
It looks like this fails with the following error:
```
dagster._core.errors.DagsterInvalidDefinitionError: op 'scored_data_for_key' cannot be downstream of more than one dynamic output. It is downstream of both "chunking_op:result" and "keys_in_batch:result"
```
Is there a different way to accomplish what I'm trying to do?
s
Hi @Danny Steffy - alas, this isn't currently possible. Here's where we're tracking it: https://github.com/dagster-io/dagster/issues/4364
Would it be possible to flatten these into a single level of keys?
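A minimal sketch of what that single level of keys could look like with Dagster's dynamic outputs; the op names and the `fetch_all_keys` placeholder below are illustrative assumptions, not code from this thread:
```
from dagster import DynamicOut, DynamicOutput, graph, op


def fetch_all_keys():
    # Placeholder; in the thread these keys come from recruiter_teams_to_score.
    return [f"key_{i}" for i in range(30_000)]


@op(out=DynamicOut())
def emit_keys():
    # One dynamic output per key: a single, flat level of fan-out.
    for key in fetch_all_keys():
        # mapping_key must be alphanumeric/underscore; sanitize real keys as needed.
        yield DynamicOutput(value=key, mapping_key=str(key))


@op
def score_key(key):
    # Stand-in for the per-key work (scored_data_for_key + score_batch).
    ...


@op
def append_scores(scores):
    # Fan-in: combine results once all mapped steps finish.
    ...


@graph
def score_all_keys():
    append_scores(emit_keys().map(score_key).collect())
```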
d
Hm, like instead of chunking the keys into batches initially, just run it on each individual key?
We have 30k+ keys; I was hoping to batch them out and then run them one at a time inside the step container. But if that's not possible, I can rethink this strategy.
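Note that the batch-then-loop shape described above needs only one level of dynamic fan-out, because the per-key loop is plain Python inside the mapped op; the earlier error only arises when the loop is itself a second `.map`. A sketch under those assumptions (`BATCH_SIZE`, `process_batch`, and `score_one_key` are hypothetical names):
```
from dagster import DynamicOut, DynamicOutput, graph, op

BATCH_SIZE = 500  # assumed; yields ~60 batches for 30k keys


def score_one_key(key):
    # Placeholder for the real per-key scoring logic.
    ...


@op(out=DynamicOut())
def chunk_keys(keys):
    # One dynamic output per batch, so only ~60 mapped steps are created.
    for i in range(0, len(keys), BATCH_SIZE):
        yield DynamicOutput(value=keys[i : i + BATCH_SIZE], mapping_key=str(i))


@op
def process_batch(batch):
    # The per-key "for loop" runs inside a single step: no nested fan-out.
    return [score_one_key(key) for key in batch]


@graph
def score_all_batched(keys):
    return chunk_keys(keys).map(process_batch).collect()
```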
s
Right - because you'll end up with 30k steps either way, right?
d
Right. Is there a way to specify that certain steps should be processed with a different run launcher? I don't want every step to be in its own Docker container; I think that would be too much overhead.
s
> Is there a way to specify that certain steps should be processed with a different run launcher? I don't want every step to be in its own Docker container; I think that would be too much overhead.
Currently, all steps within a run are launched in the same way.
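For context, step launching is configured once per job via its executor, so there is no per-step override; a minimal sketch of where that setting lives, using the built-in multiprocess executor (the `dagster-docker` package also provides a `docker_executor`, which puts every step of a run in its own container):
```
from dagster import graph, multiprocess_executor, op


@op
def do_work():
    ...


@graph
def my_graph():
    do_work()


# The executor applies to every step in the run; individual steps
# cannot opt into a different launching mechanism.
my_job = my_graph.to_job(executor_def=multiprocess_executor)
```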