Zach
06/26/2023, 5:28 PM
TypeError: 'NoneType' object is not iterable
  File "/opt/venv/lib/python3.9/site-packages/dagster/_grpc/impl.py", line 372, in get_external_execution_plan_snapshot
    create_execution_plan(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/api.py", line 1003, in create_execution_plan
    return ExecutionPlan.build(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1023, in build
    return plan_builder.build()
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 210, in build
    executable_map, resolvable_map = _compute_step_maps(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1461, in _compute_step_maps
    _update_from_resolved_dynamic_outputs(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1176, in _update_from_resolved_dynamic_outputs
    resolved_steps += resolvable_step.resolve(dynamic_mappings)
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/step.py", line 337, in resolve
    for mapped_key in mappings[self.resolved_by_step_key][self.resolved_by_output_name]:
This graph has conditional branching logic like this:
op1 -> (op2, op3) -> op4
where op2, op3 are conditional in that one or both may get executed. In this case only op2 was executed. Is it not possible to re-run from failure when one of the upstream ops was conditionally not executed?
Zach
06/26/2023, 5:54 PM
sean
06/26/2023, 10:33 PM
sean
06/30/2023, 2:57 PM
> Is it not possible to re-run from failure when one of the upstream ops was conditionally not executed?
I’m not understanding this since an op will only be executed if its inputs are supplied. So, if only 1 of (op2, op3) gets executed, then op4 should never execute in the first place, so I don’t understand how it could be re-executed.
Zach
07/05/2023, 9:49 PM
The .outs property of the op definition is used to create a dictionary mapping output definition names to the actual outputs. I'm not sure I quite understand how it works, but it appears that this allows the graph to map over the output definition names regardless of whether the outputs were actually produced. When an output wasn't produced, its mapped steps essentially get skipped, and the subsequent .collect() call just produces an empty list, which is passed as an input to the downstream op (op4) so that op4 has all its inputs and can execute. Here's the code for the graph to try to make it a bit more clear:
@graph
def htp_cross_platform_graph():
    generated_params = dict(
        zip(
            generate_params_cross_platform.outs.keys(),
            generate_params_cross_platform(),
        )
    )
    mapped_databricks = generated_params["databricks"].map(
        htp_cross_platform_databricks.alias("htp_databricks")
    )
    mapped_dnanexus = generated_params["dnanexus"].map(
        htp_cross_platform_dnanexus.alias("htp_dnax")
    )
    output_file_paths = postprocess(
        analysis_name=generated_params["analysis_name"],
        databricks_values=mapped_databricks[0].collect(),
        dnanexus_values=mapped_dnanexus[0].collect(),
        output_location=generated_params["output_location"],
        base_spec=generated_params["base_spec"],
        databricks_result_metadata=mapped_databricks[1].collect(),
        dnanexus_result_metadata=mapped_dnanexus[1].collect(),
    )
    return output_file_paths
I'll also see if I can't throw together a simple version of the underlying ops so you can have a working example.