Zach
06/26/2023, 5:28 PM
TypeError: 'NoneType' object is not iterable
  File "/opt/venv/lib/python3.9/site-packages/dagster/_grpc/impl.py", line 372, in get_external_execution_plan_snapshot
    create_execution_plan(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/api.py", line 1003, in create_execution_plan
    return ExecutionPlan.build(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1023, in build
    return plan_builder.build()
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 210, in build
    executable_map, resolvable_map = _compute_step_maps(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1461, in _compute_step_maps
    _update_from_resolved_dynamic_outputs(
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/plan.py", line 1176, in _update_from_resolved_dynamic_outputs
    resolved_steps += resolvable_step.resolve(dynamic_mappings)
  File "/opt/venv/lib/python3.9/site-packages/dagster/_core/execution/plan/step.py", line 337, in resolve
    for mapped_key in mappings[self.resolved_by_step_key][self.resolved_by_output_name]:
This graph has conditional branching logic like this:
op1 -> (op2, op3) -> op4
where op2, op3 are conditional in that one or both may get executed. In this case only op2 was executed. Is it not possible to re-run from failure when one of the upstream ops was conditionally not executed?
Zach
06/26/2023, 5:54 PM
sean
06/26/2023, 10:33 PM
sean
06/30/2023, 2:57 PM
> Is it not possible to re-run from failure when one of the upstream ops was conditionally not executed?
I’m not understanding this since an op will only be executed if its inputs are supplied. So, if only 1 of (op2, op3) gets executed, then op4 should never execute in the first place, so I don’t understand how it could be re-executed.
Zach
07/05/2023, 9:49 PM
The .outs property of the op definition is used to create a dictionary mapping output definition names to the actual outputs. I'm not sure I quite understand how it works, but it appears that this allows the graph to map over the output definition names regardless of whether the outputs were actually produced. When an output wasn't produced, its mapped steps essentially get skipped, and the subsequent .collect() call just produces an empty list, which is passed as an input to the downstream op (op4) so that op4 has all its inputs and can execute. Here's the code for the graph to try to make it a bit more clear:
@graph
def htp_cross_platform_graph():
    generated_params = dict(
        zip(
            generate_params_cross_platform.outs.keys(),
            generate_params_cross_platform(),
        )
    )
    mapped_databricks = generated_params["databricks"].map(
        htp_cross_platform_databricks.alias("htp_databricks")
    )
    mapped_dnanexus = generated_params["dnanexus"].map(
        htp_cross_platform_dnanexus.alias("htp_dnax")
    )
    output_file_paths = postprocess(
        analysis_name=generated_params["analysis_name"],
        databricks_values=mapped_databricks[0].collect(),
        dnanexus_values=mapped_dnanexus[0].collect(),
        output_location=generated_params["output_location"],
        base_spec=generated_params["base_spec"],
        databricks_result_metadata=mapped_databricks[1].collect(),
        dnanexus_result_metadata=mapped_dnanexus[1].collect(),
    )
    return output_file_paths
I'll also see if I can't throw together a simple version of the underlying ops so you can have a working example.