# ask-community
d
Is it possible to fan out multiple times? For instance, we're chunking a list of keys out to 60 Docker instances to run. We then want to run a for loop over each batch of keys and run additional ops. Could I potentially call that like this:
```
result = (
    chunking_op(recruiter_teams_to_score)
    .map(
        lambda batch: append_scores(
            keys_in_batch(batch)
            .map(
                lambda key: score_batch(
                    scored_data_for_key(key),
                    trained_model,
                )
            )
            .collect()
        )
    )
    .collect()
)
return append_scores(result)
```
It looks like this fails with the following error:
```
dagster._core.errors.DagsterInvalidDefinitionError: op 'scored_data_for_key' cannot be downstream of more than one dynamic output. It is downstream of both "chunking_op:result" and "keys_in_batch:result"
```
Is there a different way to accomplish what I'm trying to do?
s
Hi @Danny Steffy - alas, this isn't currently possible. Here's where we're tracking it: https://github.com/dagster-io/dagster/issues/4364
Would it be possible to flatten these into a single level of keys?
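A minimal sketch of what that single level of keys could look like with Dagster's dynamic outputs; the op names and the `fetch_all_keys` placeholder below are illustrative assumptions, not code from this thread:
```
from dagster import DynamicOut, DynamicOutput, graph, op


def fetch_all_keys():
    # Placeholder; in the thread these keys come from recruiter_teams_to_score.
    return [f"key_{i}" for i in range(30_000)]


@op(out=DynamicOut())
def emit_keys():
    # One dynamic output per key: a single, flat level of fan-out.
    for key in fetch_all_keys():
        # mapping_key must be alphanumeric/underscore; sanitize real keys as needed.
        yield DynamicOutput(value=key, mapping_key=str(key))


@op
def score_key(key):
    # Stand-in for the per-key work (scored_data_for_key + score_batch).
    ...


@op
def append_scores(scores):
    # Fan-in: combine results once all mapped steps finish.
    ...


@graph
def score_all_keys():
    append_scores(emit_keys().map(score_key).collect())
```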
d
Hm, like instead of chunking the keys into batches initially, just run it on each individual key?
We have 30k+ keys; I was hoping to batch them out and then run them one at a time inside the step container. But if that's not possible, I can rethink this strategy.
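Note that the batch-then-loop shape described above needs only one level of dynamic fan-out, because the per-key loop is plain Python inside the mapped op; the earlier error only arises when the loop is itself a second `.map`. A sketch under those assumptions (`BATCH_SIZE`, `process_batch`, and `score_one_key` are hypothetical names):
```
from dagster import DynamicOut, DynamicOutput, graph, op

BATCH_SIZE = 500  # assumed; yields ~60 batches for 30k keys


def score_one_key(key):
    # Placeholder for the real per-key scoring logic.
    ...


@op(out=DynamicOut())
def chunk_keys(keys):
    # One dynamic output per batch, so only ~60 mapped steps are created.
    for i in range(0, len(keys), BATCH_SIZE):
        yield DynamicOutput(value=keys[i : i + BATCH_SIZE], mapping_key=str(i))


@op
def process_batch(batch):
    # The per-key "for loop" runs inside a single step: no nested fan-out.
    return [score_one_key(key) for key in batch]


@graph
def score_all_batched(keys):
    return chunk_keys(keys).map(process_batch).collect()
```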
s
Right - because you'll end up with 30k steps either way, right?
d
Right. Is there a way to specify that certain steps should be processed with a different run launcher? I don't want every step to be in its own Docker container; I think that would be too much overhead.
s
> Is there a way to specify that certain steps should be processed with a different run launcher? I don't want every step to be in its own Docker container; I think that would be too much overhead.
Currently, all steps within a run are launched in the same way.
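For context, step launching is configured once per job via its executor, so there is no per-step override; a minimal sketch of where that setting lives, using the built-in multiprocess executor (the `dagster-docker` package also provides a `docker_executor`, which puts every step of a run in its own container):
```
from dagster import graph, multiprocess_executor, op


@op
def do_work():
    ...


@graph
def my_graph():
    do_work()


# The executor applies to every step in the run; individual steps
# cannot opt into a different launching mechanism.
my_job = my_graph.to_job(executor_def=multiprocess_executor)
```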