Danny

07/03/2020, 3:34 PM
I created a general purpose fan-in solid:
# Adapted from:
# https://github.com/dagster-io/dagster/blob/e59a93a5d5e7e15a19086a7e773e2357f174e9b1/examples/legacy_examples/dagster_examples/toys/fan_in_fan_out.py
from dagster import Any, InputDefinition, List, Optional, Output, OutputDefinition, solid

@solid(
    input_defs=[
        InputDefinition('main_fan', Any),
        InputDefinition('other_fans', Optional[List[Any]], default_value=None),
    ],
    output_defs=[
        OutputDefinition(name='main_fan_return_value', dagster_type=Any)
    ],
)
def fan_in(_, main_fan, other_fans):
    # other_fans exists only to create dependencies; its values are ignored.
    yield Output(main_fan, 'main_fan_return_value')
To be used like this:
@pipeline
def my_pipeline():
    main_id, other_id = solid2(solid1()) # other_id is an optional output
    fan1 = solid3(main_id)
    fan2 = solid4(other_id)
    main_id = fan_in(main_id, [fan1, fan2])
    solid5(main_id)
This almost works. The only problem is that if `other_id` is not output by solid2, `fan_in` doesn't run, because Dagster complains that its dependencies (fan2 from solid4) didn't produce an output. Clearly I'm using Optional incorrectly here, but I can't really see how to fix this. Any advice?
The intended result is that `fan_in` should be skipped for unsatisfied dependencies only if `main_fan` isn't there. For any of the `other_fans`, it's OK if they never executed; `fan_in` should just ignore them.

Leor

07/03/2020, 3:44 PM
maybe Optional[List[Optional[Any]]]?
(or maybe List[Optional[Any]])

Danny

07/03/2020, 4:20 PM
Neither one worked :(

Leor

07/03/2020, 4:20 PM
hmmmmm, unsure then
ahh, you probably need an is_required?

Danny

07/03/2020, 4:27 PM
Looks like that's only for OutputDefinition, not InputDefinition https://docs.dagster.io/_apidocs/solids#dagster.InputDefinition

Leor

07/03/2020, 4:28 PM
That takes a default value, though -- I think the issue is that it's not quite a default?

Danny

07/03/2020, 4:28 PM
You mean the problem is that it's being set to `None`? I'll try changing it, one sec

Leor

07/03/2020, 4:28 PM
No -- the list overall is present, it's just missing a value
so that default value isn't being used

Danny

07/03/2020, 4:36 PM
Not sure I follow. The `other_fans` list in this my_pipeline example would always be present, as you say, since even if solid4/fan2 doesn't run, the list will still contain fan1. So I don't really see `default_value` coming into play at all. What appears to be happening is that either I'm using Optional wrong (I tried all the combinations you suggested, still not working), or Optional can't denote that the items in a list input are optional, only that the input itself is optional. Once the input is present, as the list is in this case, Dagster seems to enforce that every item in that list is actually produced. If an upstream solid that was supposed to produce one of them didn't run, the fan_in solid doesn't run either.
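
The rule described in this message can be modeled outside Dagster. The following is a minimal sketch in plain Python, not Dagster's actual implementation: `should_run` and the output keys such as `solid4.result` are hypothetical names chosen for illustration. It assumes that every upstream output wired into an input, including each element of a list input, counts as a hard dependency.

```python
from typing import Dict, List, Set


def should_run(step_inputs: Dict[str, List[str]], produced: Set[str]) -> bool:
    """Return True only if every upstream output feeding any input was produced.

    Assumption: list inputs are flattened, so each element is its own dependency.
    """
    return all(
        source in produced
        for sources in step_inputs.values()
        for source in sources
    )


# Hypothetical wiring of fan_in from the my_pipeline example.
fan_in_inputs = {
    "main_fan": ["solid2.main_id"],
    "other_fans": ["solid3.result", "solid4.result"],
}

# solid4 was skipped, so its output never appeared.
produced = {"solid2.main_id", "solid3.result"}

print(should_run(fan_in_inputs, produced))  # False
```

Under this model, marking the `other_fans` input as Optional changes nothing, because the check never consults the input's type, only whether each wired source produced a value.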

Leor

07/03/2020, 4:36 PM
My suspicion is it's the latter
(you can't currently denote the optionality of elements in a list trivially)

Danny

07/03/2020, 4:37 PM
Got it
Oh, when you mentioned `is_required` above, did you mean I should put that on the outputs of the solids that feed into that list of fans? I.e., if solid4's fan2 output were `is_required=False`, it should work? Trying now...

Leor

07/03/2020, 4:39 PM
It may, unsure

Danny

07/03/2020, 4:42 PM
"(you can't currently denote the optionality of elements in a list trivially)"
Do you mean it's hard to do because this hasn't yet been implemented in the current InputDefinition API (but might be easy to implement in Dagster), or because implementing this in the API would be hard to do?

Leor

07/03/2020, 4:42 PM
Unsure

Danny

07/03/2020, 4:42 PM
👍
@Leor I tried constructing a bunch of fan_in solids at runtime for various numbers of fans: `fan_in_2`, `fan_in_6`, etc. The implementation worked, and dagit correctly recognizes that all fans except the first are optional.
This still has the exact same problem.

Leor

07/03/2020, 6:10 PM
weird, even though they're recognized as optional the solid breaks if they're missing?

Danny

07/03/2020, 6:11 PM
Yes. This confirms our suspicion above: Dagster enforces that if any of those optional inputs come from solids that were skipped, the fan_in solid gets skipped as well.
Should I create a bug named "solids are skipped when optional inputs are missing due to upstream solids being skipped" or is this by design?
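
To make the difference between the observed and the intended behavior concrete, here is a toy executor in plain Python. It is a sketch under stated assumptions, not Dagster code: `MY_PIPELINE`, `run`, and the `ignore_missing_optional` flag are hypothetical, and solid2's two outputs are flattened into separate nodes for simplicity.

```python
from typing import List, Set, Tuple

# Toy graph node: (name, required upstream names, optional upstream names).
Node = Tuple[str, List[str], List[str]]

# Hypothetical flattening of my_pipeline, listed in topological order.
MY_PIPELINE: List[Node] = [
    ("solid1", [], []),
    ("solid2.main_id", ["solid1"], []),
    ("solid2.other_id", ["solid1"], []),
    ("solid3", ["solid2.main_id"], []),
    ("solid4", ["solid2.other_id"], []),
    ("fan_in", ["solid2.main_id"], ["solid3", "solid4"]),
    ("solid5", ["fan_in"], []),
]


def run(nodes: List[Node], not_emitted: Set[str],
        ignore_missing_optional: bool) -> Set[str]:
    """Walk nodes in order and return the names that completed."""
    completed: Set[str] = set()
    for name, required, optional in nodes:
        if name in not_emitted:
            continue  # e.g. an optional output that was never yielded
        # Strict mode treats optional upstreams like required ones.
        needed = required if ignore_missing_optional else required + optional
        if all(dep in completed for dep in needed):
            completed.add(name)
    return completed


observed = run(MY_PIPELINE, {"solid2.other_id"}, ignore_missing_optional=False)
desired = run(MY_PIPELINE, {"solid2.other_id"}, ignore_missing_optional=True)

print(sorted(observed))  # ['solid1', 'solid2.main_id', 'solid3']
print(sorted(desired))   # ['fan_in', 'solid1', 'solid2.main_id', 'solid3', 'solid5']
```

In strict mode the skip of solid4 propagates through `fan_in` to solid5, which matches what the thread reports. With the flag set, only a missing `main_fan` source would stop `fan_in`, which is the behavior being asked for.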

Leor

07/03/2020, 6:12 PM
Unsure

Danny

07/03/2020, 6:12 PM
Ok, I'll create it then