# ask-community
c
Hey guys, running into some issues on 0.14.9 after upgrading from 0.14.6 when testing a dynamic output job w/ an inner graph locally (running dagit as a Python process; no other infra). The same job works on 0.14.9 when deployed on K8s, and it works locally on 0.14.6. Any thoughts?
dagster.check.CheckError: Invariant failed.
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
    yield
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/inputs.py", line 607, in _load_input_with_input_manager
    value = input_manager.load_input(context)
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/storage/fs_io_manager.py", line 152, in load_input
    context.add_input_metadata({"path": MetadataValue.path(os.path.abspath(filepath))})
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/context/input.py", line 325, in add_input_metadata
    if self.asset_key:
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/context/input.py", line 216, in asset_key
    check.invariant(len(matching_input_defs) == 1)
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/check/__init__.py", line 1167, in invariant
    raise CheckError("Invariant failed.")
o
hmm not off the top of my head. @Chris Evans do you have a code snippet showing the structure of this job/graph? I can dig in a bit
c
Graph looks like the following:
@graph
def execute_x_di(
    chunk: Any,
) -> bool:
    apis = Factory.build(start_after=chunk)
    Manager.execute(apis, chunk)


@graph(
    description="",
)
def x_graph():
    chunks = Manager.orchestrate()
    chunks.map(execute_x_di)
Additional error msg:
dagster.core.errors.DagsterExecutionLoadInputError: Error occurred while loading input "chunk" of step "execute_x_di.build[0]":
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/execute_plan.py", line 232, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/execute_step.py", line 306, in core_dagster_event_sequence_for_step
    for event_or_input_value in ensure_gen(step_input.source.load_input_object(step_context)):
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/inputs.py", line 295, in load_input_object
    yield from _load_input_with_input_manager(input_manager, load_input_context)
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/inputs.py", line 607, in _load_input_with_input_manager
    value = input_manager.load_input(context)
  File "/Users/chrisevans/.pyenv/versions/3.9.7/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/chrisevans/Repositories/data-platform/dags/bi/.venv/lib/python3.9/site-packages/dagster/core/execution/plan/utils.py", line 73, in solid_execution_error_boundary
    raise error_cls(
o
I tried to replicate this error, without much luck:
from dagster import DynamicOut, DynamicOutput, graph, job, op


def get_dynamic_j():
    @op(out=DynamicOut())
    def things():
        for i in range(3):
            yield DynamicOutput(str(i), str(i))

    @op
    def do(thing):
        return thing

    @graph
    def dos(thing):
        x = do(thing)
        do(x)

    @job
    def j():
        ts = things()
        ts.map(dos)

    return j


def test_dynamic_with_file_manager():
    from dagster.core.test_utils import instance_for_test
    from dagster import reconstructable, execute_pipeline

    with instance_for_test() as instance:
        result = execute_pipeline(
            reconstructable(get_dynamic_j),
            instance=instance,
        )
        assert result.success
I'm thinking whatever's going on might have to do with this Factory.build() thing. Does this construct an op, or is build just a property that returns some static op definition?
I ask because the error seems to imply that this op definition has something strange going on with the names of its inputs (my guess is that it didn't have an input with a name that we expected it to have, but it could also mean that there were multiple inputs with the same name)
so if there's something dynamic going on inside the build() function, that could potentially explain how it got into that weird state
cc @claire as this hits add_input_metadata path
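To make the guess about input names concrete, here's a rough plain-Python sketch of what a check like len(matching_input_defs) == 1 is guarding against. This is not Dagster's actual code: InputDef and find_unique_input are hypothetical stand-ins, and the only thing taken from the traceback is the "exactly one match" condition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputDef:
    # Hypothetical stand-in for an op's input definition; not Dagster's real class.
    name: str


def find_unique_input(input_defs: List[InputDef], name: str) -> InputDef:
    """Return the single input definition called `name`.

    Fails if there are zero matches or more than one, which are the two
    cases the invariant in the traceback would reject.
    """
    matching = [d for d in input_defs if d.name == name]
    if len(matching) != 1:
        raise AssertionError(
            f"Invariant failed: expected 1 input named {name!r}, found {len(matching)}"
        )
    return matching[0]


# Exactly one input named "chunk": the lookup succeeds.
ok = find_unique_input([InputDef("apis"), InputDef("chunk")], "chunk")

# Either of these would raise AssertionError:
#   find_unique_input([InputDef("start_after")], "chunk")          # no match
#   find_unique_input([InputDef("chunk"), InputDef("chunk")], "chunk")  # duplicates
```

So the failure could come from either direction: the op definition being consulted has no input with the expected name, or it has more than one.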
c
The build is an op (staticmethod decorated w/ op). I took the build out of the equation and just ran the other execute op in the inner graph w/ no luck. What is really odd to me is that it works w/ K8s w/ version 0.14.9 and w/ 0.14.6 locally
o
yeah that's extremely odd to me as well... I assume if you run the code I linked it would work for you, so I'm struggling to figure out the difference between the test and reality
when you say taking build out of the equation didn't work, what exactly did you change/what was the resulting error?
c
I changed Factory.build(start_after=chunk) to Factory.build() in the inner graph. start_after is an ins=In(Nothing) kwarg. This resulted in Factory.build running at the top level and only the Manager.execute op running dynamically. The Manager.execute op failed w/ the same error.
o
hi @Chris Evans! I was finally able to replicate this issue -- it turns out that the root cause of this issue has actually been fixed for the version that's going out tomorrow (0.14.10), which was part of the reason it was hard to replicate 🙂. This bug also only seems to crop up in your specific case of dynamic outputs + ops as static methods
but long story short, this issue should be gone in tomorrow's release
c
Awesome! Thanks for the deep dive man