# dagster-support

David Erukhimovich

06/06/2022, 6:38 AM
Hello dagsters, a question for you: let's say I am re-running a job from a certain op, call it op A. Downstream of op A I have op B, which has another input, from op C. Now op C has nothing to do with op A, so when re-running from A it won't be executed, and op B will fail (`dagster.core.errors.DagsterInvariantViolationError: Attempting to access run_id, but it was not provided when constructing the OutputContext`). Any idea how to get around it? Thanks!

yuhan

06/06/2022, 5:34 PM
Hi David, you can try `+B`, which should re-execute B with both A and C.

David Erukhimovich

06/06/2022, 6:05 PM
And... If I have several of such Bs? And if I use the rerun from failure feature? (Sorry about all the questions, but I really want to avoid them all being on one line, just because of that...)

yuhan

06/06/2022, 7:00 PM
If C has succeeded, B should automatically load the inputs from previous result — you don’t need to re-execute C every time. Could you provide more details about your DAG?

David Erukhimovich

06/07/2022, 4:58 AM
You are actually right. I implemented the hypothetical example I gave you, and everything works as you said

yuhan

06/07/2022, 6:46 PM
Glad it worked!

David Erukhimovich

06/07/2022, 6:48 PM
Thanks! Apparently I had an unrelated bug in my code 🙃
Hi @yuhan, sorry for the delayed response. Apparently I presented the issue incorrectly, but now I am able to reproduce it in a simple DAG. The problem is the following: assume an early op has a non-required output that goes into a later op, and in this particular run that output was not returned. If we then re-execute the DAG from the middle, the later op fails with:
No previously stored outputs found for source StepOutputHandle(step_key='if_op', output_name='enable_b', mapping_key=None). This is either because you are using an IO Manager that does not depend on run ID, or because all the previous runs have skipped the output in conditional execution.
from dagster import Out, Output, job, op


@op
def before1():
    return 1


@op
def before2(arg):
    return arg


@op(
    out={
        "enable_a": Out(is_required=False),
        "enable_b": Out(is_required=False)
    }
)
def if_op():
    # Only "enable_a" is yielded, so "enable_b" is never produced in this run.
    yield Output(0, "enable_a")


@op
def a(_enable, arg):
    return arg


@op
def b(_enable, arg):
    return arg


@job
def test():
    arg = before1()
    arg = before2(arg)
    enable_a, enable_b = if_op()
    a(enable_a, arg)
    b(enable_b, arg)
After re-executing from "before2":

yuhan

06/14/2022, 5:49 AM
I believe this is expected, because `b`’s input wasn’t previously persisted (i.e. `a` didn’t yield anything so the re-execution can’t load it)

David Erukhimovich

06/14/2022, 5:50 AM
But this is true for the first run as well, no?
I would expect that re-execution will behave the same as the original execution if nothing has changed

yuhan

06/14/2022, 5:52 AM
what would you expect in this re-execution case? do you expect `b` to follow the current `a`’s output? meaning when the current `a` doesn’t yield anything, `b` should skip instead of failing; when `a` in the re-execution yields something, `b` will run regardless of its previous state?
"I would expect that re-execution will behave the same as the original execution if nothing has changed"
i see - yea that makes sense. i think i was mistaken - you are right. this seems to be a bug as you included `a` in the re-execution so `b` shouldn’t error.

David Erukhimovich

06/14/2022, 5:54 AM
Original execution: before1 -> before2 -> a (because if_op returned `enable_a`); b is skipped. Re-execution: before2 -> a; b is skipped.

yuhan

06/14/2022, 5:55 AM
@Dagster Bot issue re-execution with conditional branching results in unexpected “can’t load” state

Dagster Bot

06/14/2022, 5:55 AM

David Erukhimovich

06/14/2022, 5:55 AM
Cool bot :)

yuhan

06/14/2022, 5:55 AM
cool - filing an issue for bug tracking
btw which dagster version are you on?

David Erukhimovich

06/14/2022, 5:56 AM
14 or 15 I think
Do you believe it is a good candidate for fixing soon?

yuhan

06/14/2022, 5:59 AM
I fixed a similar bug related to re-execution in 0.14.17 - which may be tangential but let me see if i can repro this issue in recent version

David Erukhimovich

06/14/2022, 6:00 AM
Pretty sure I saw it in the latest version as well

yuhan

06/14/2022, 6:17 AM
yea - i was able to repro it on master.
i just realized `b` depends on both `before2` and `if_op`, so in order to have `b` correctly being skipped in re-execution, you’ll need to include `if_op` as well. otherwise, in the re-execution, the execution machinery would try to find `if_op`’s output for `b` to load, and in your case it failed. `if_op*, before2*` should work in your case.
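To make the selection strings in this thread concrete, here is a minimal sketch of how `+b`, `before2*`, and `if_op*, before2*` expand over the example job. It is a toy model written for illustration, not Dagster's implementation: the `DOWNSTREAM` table, `_walk`, and `resolve` are hypothetical names, and only a single leading or trailing `*` or `+` per clause is handled.

```python
from collections import deque

# Edges of the example job from this thread: upstream op -> downstream ops.
DOWNSTREAM = {
    "before1": ["before2"],
    "before2": ["a", "b"],
    "if_op": ["a", "b"],
    "a": [],
    "b": [],
}
# Reverse edges: op -> its direct upstream ops.
UPSTREAM = {
    op: [up for up, downs in DOWNSTREAM.items() if op in downs]
    for op in DOWNSTREAM
}


def _walk(start, edges, max_depth=None):
    """Breadth-first walk from start along edges, optionally depth-limited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen


def resolve(selection):
    """Expand a comma-separated selection such as "if_op*, before2*"."""
    selected = set()
    for clause in (part.strip() for part in selection.split(",")):
        name = clause.strip("*+")
        selected.add(name)
        if clause.startswith("*"):  # all upstream ancestors
            selected |= _walk(name, UPSTREAM)
        if clause.startswith("+"):  # one layer upstream
            selected |= _walk(name, UPSTREAM, max_depth=1)
        if clause.endswith("*"):  # all downstream descendants
            selected |= _walk(name, DOWNSTREAM)
        if clause.endswith("+"):  # one layer downstream
            selected |= _walk(name, DOWNSTREAM, max_depth=1)
    return selected


print(sorted(resolve("before2*")))
print(sorted(resolve("if_op*, before2*")))
print(sorted(resolve("+b")))
```

In this model, `before2*` leaves `if_op` out of the re-execution, while `if_op*, before2*` and `+b` both bring it back in, which is the difference the message above points at.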

David Erukhimovich

06/14/2022, 6:41 AM
But I can't really do it from dagit right?
And - do you consider it as the expected behavior or a workaround?
y

yuhan

06/14/2022, 6:44 AM
oh you can do it in dagit. you can use the same selection syntax here and the `Re-execute` button would reflect whatever you selected
i’d consider it as an expected behavior because only then does dagster know exactly what `b`’s inputs are, e.g. when the source of the input is in the same run, it would load from the same run; when the source is not being executed, it would try to load from the previous run, where in your case the output didn’t exist. but all that being said, i do think the error message is confusing and we should improve that.
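The loading rule described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Dagster's execution code: `INPUTS`, `ORDER`, `yields`, `run`, and `MissingStoredOutput` are hypothetical names, and the output name `"result"` is used here for every op other than `if_op`. An input whose source runs in the same run is read from that run (a missing output means skip); an input whose source is outside the selection must already exist in the parent run, otherwise the step errors.

```python
class MissingStoredOutput(Exception):
    """An input had to come from the parent run, but nothing was stored there."""


# Each step's inputs as (source step, output name) pairs, mirroring the job above.
INPUTS = {
    "before1": [],
    "before2": [("before1", "result")],
    "if_op": [],
    "a": [("if_op", "enable_a"), ("before2", "result")],
    "b": [("if_op", "enable_b"), ("before2", "result")],
}
ORDER = ["before1", "before2", "if_op", "a", "b"]


def yields(step):
    """Outputs a step produces when it runs; if_op only yields enable_a."""
    return {"enable_a"} if step == "if_op" else {"result"}


def run(selected, parent_outputs=None):
    """Simulate one run over the selected steps; return (outputs, statuses)."""
    produced, status = set(), {}
    for step in ORDER:
        if step not in selected:
            continue
        skip = False
        for src, out in INPUTS[step]:
            if src in selected:
                # Source is part of this run: a missing output means "skip".
                if (src, out) not in produced:
                    skip = True
            elif parent_outputs is None or (src, out) not in parent_outputs:
                # Source is outside the selection: it must load from the parent run.
                raise MissingStoredOutput(f"{step} needs {src}.{out}")
        if skip:
            status[step] = "skipped"
        else:
            status[step] = "success"
            produced |= {(step, out) for out in yields(step)}
    return produced, status


first_outputs, first_status = run(set(ORDER))
print(first_status)  # b is skipped in the original run

try:
    run({"before2", "a", "b"}, first_outputs)  # selection "before2*"
except MissingStoredOutput as err:
    print("re-execution failed:", err)

_, status = run({"if_op", "before2", "a", "b"}, first_outputs)
print(status)  # with if_op included, b is skipped again instead of failing
```

The design point the model isolates is that "skipped" is only knowable when the source step runs in the same run. Once `if_op` is outside the selection, the model can no longer tell a skipped output from one that was never stored, so it raises.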

David Erukhimovich

06/14/2022, 6:49 AM
Why wouldn't you just assume that if the output does not exist, the op should be skipped? Why raise an exception?

yuhan

06/14/2022, 6:59 AM
actually you are right - this is expected as in a known limitation in the system, not expected as in we think it’s the right thing to do. detailed explanation about this limitation is here: https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/core/execution/context/system.py#L662-L676
we recently made some changes to the core which makes me think that the path to improving this behavior isn’t as hard as before. let me take a look and i will get back to you soon. to answer your question, i believe it’s a good candidate to fix in our next bug bash. but the team doesn’t have too much bandwidth at the moment so i can’t promise a quick fix tho.

David Erukhimovich

06/14/2022, 7:05 AM
I understand, thank you. I just wonder: I would expect this to be a common use case, one that will bother more users. What is special about what I do that is different from most? Is it the conditions, or the re-executions?

yuhan

06/14/2022, 7:10 AM
my hunch is this is a bit of an edge case where you had conditional branching and re-execution together. also, conditional branching isn’t a commonly known execution pattern to my knowledge, which i believe is attributable to a lack of good docs around it. users who do encounter it might be re-executing with a different selection such as `+b` or `if_op*, before2*`
i think what you’re doing is a solid use case. once more users know of conditional branching, i believe this would become a more prominent limitation of the system. so, we should fix this.

David Erukhimovich

06/14/2022, 7:23 AM
I am glad to hear that. Thank you!