Hi all! Very basic question as I'm just getting st...
# ask-community
b
Hi all! Very basic question as I'm just getting started with Dagster, but what does Output() do? I'm seeing it used very often in the Hacker News example, but I can't seem to get it to work (it tells me it's looking for a 'results' but my data is called something else), nor do I understand what it does. How is it different from simply returning, say, a dataframe?
s
It's kind of confusing, but it's basically a wrapper around the result of an op execution. By default, you can just return your `df` without messing with `Output`:
```python
import pandas
from dagster import op


@op
def foo_op() -> pandas.DataFrame:
    df = ...
    return df
```
Dagster will infer from this code that you have one `Output` of type `pandas.DataFrame`. When it passes `df` to your next op, it names it `result` by default. So you'll see in the logs, "pickling result from `foo_op`", etc. The only time I've had to mess with manually calling `Output` is when using multiple outputs or DynamicOutputs. In those cases, you may have to add names/keys/additional metadata when returning it from your function, like this:
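```python
import pandas
from dagster import DynamicOut, DynamicOutput, op


@op(out=DynamicOut(pandas.DataFrame))
def foo_op():
    list_of_dfs = ...
    for ii, df in enumerate(list_of_dfs):
        # Each dynamic output needs its own unique mapping_key.
        yield DynamicOutput(df, mapping_key=str(ii))
```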
j
Hey @Barry Sun! Stephen is 100% correct about what `Out` and `Output` do! I'd be curious to see what you're trying that isn't working, to help get you to a working solution. In general, if you are returning multiple outputs from an op, you can do any of the following.
The most verbose:
```python
from dagster import Out, Output, op


@op(
    out={"first_out": Out(dagster_type=int), "second_out": Out(dagster_type=str)}
)
def this_will_work():
    return Output(value=1, output_name="first_out"), Output(value="a", output_name="second_out")
```
Slightly less verbose:
```python
from dagster import Out, op


@op(
    out={"first_out": Out(dagster_type=int), "second_out": Out(dagster_type=str)}
)
def this_will_also_work():
    return 1, "a"
```
Even less verbose:
```python
from dagster import Out, op


@op(
    out={"first_out": Out(), "second_out": Out()}
)
def this_will_also_also_work():
    return 1, "a"
```
For the first case, being explicit about the names of the outputs when you return them allows you to return them in any order. For example, I could `return Output(value="a", output_name="second_out"), Output(value=1, output_name="first_out")` and Dagster would still ensure that the correct value is attached to the right output name. This really becomes helpful if you want to use `yield` syntax to emit outputs at different points in the op, but that's a more complex use case, and you probably won't need to worry about it for a while.
In the second case, by not returning the outputs wrapped in the `Output` class, you need to make sure you return the values in the order specified in the `out` dict.
The first and second cases include the `dagster_type` arg in `Out`, which Dagster will use to run type checks on the outputs that are returned, to ensure that they are the type you expect them to be. But that is optional, as seen in the third case.
happy to clarify any of this more! it's a lot of information!
b
Thanks both Stephen and Jamie 😄 That's a lot of information - a lot more than I was expecting 😂. I think essentially, for single outputs (which is most of my outputs) I can do without using `Output`. For the example I was trying, I didn't specify the `out` dict so I had an error saying the name of my object wasn't found...
j
cool! glad you got it working! no worries about not digging into all the gritty details of outs and outputs right away, just wanted to make sure the information was available depending on what kind of issue/use case you were running into
b
Yep, the information is very useful and I'm sure I'll refer back to it in the future! Thanks a lot Jamie and Stephen 🙂
a
Hi there! @jamie @Stephen Bailey I'm resurrecting this topic to ask about this behaviour: I know recent py versions preserve dict order, but I wonder if it would be good to add an option to output a dict with outs keys instead of a list.