Barry Sun
05/19/2022, 8:27 AM
Stephen Bailey
05/19/2022, 11:07 AM
`df` with messing with `Output`:
import pandas
from dagster import op

@op
def foo_op() -> pandas.DataFrame:
    df = ...
    return df
Dagster will infer from this code that you have one `Output` of type `pandas.DataFrame`. When it passes `df` to your next op, it names it `result` by default. So you'll see in the logs, "pickling result from `foo_op`", etc.
The only time I've had to mess with manually calling the `Output` is when using multiple outputs or DynamicOutputs. In those cases, you may have to add names/keys/additional metadata when returning it from your function, like this:
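import pandas
from dagster import DynamicOut, DynamicOutput, op

@op(out=DynamicOut(pandas.DataFrame))
def foo_op():
    list_of_dfs = ...
    # Each dynamic output needs its own unique mapping_key.
    for ii, df in enumerate(list_of_dfs):
        yield DynamicOutput(df, mapping_key=str(ii))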
jamie
05/19/2022, 3:31 PM
from dagster import Out, Output, op

@op(
    out={"first_out": Out(dagster_type=int), "second_out": Out(dagster_type=str)}
)
def this_will_work():
    return Output(value=1, output_name="first_out"), Output(value="a", output_name="second_out")
slightly less verbose
@op(
    out={"first_out": Out(dagster_type=int), "second_out": Out(dagster_type=str)}
)
def this_will_also_work():
    return 1, "a"
even less verbose
@op(
    out={"first_out": Out(), "second_out": Out()}
)
def this_will_also_also_work():
    return 1, "a"
For the first case, being explicit about the names of the outputs when you return them allows you to return them in any order. For example, I could return `Output(value="a", output_name="second_out"), Output(value=1, output_name="first_out")` and Dagster would still ensure that the correct value is attached to the right output name. This really becomes helpful if you want to use `yield` syntax to emit outputs at different points in the op, but that's a more complex use case, and you probably won't need to worry about it for a while.
In the second case, by not returning the outputs wrapped in the `Output` class, you need to make sure you return the values in the order specified in the `out` dict.
The first and second cases include the `dagster_type` arg in `Out`, which Dagster will use to run type checks on the outputs that are returned to ensure that they are the type you expect them to be. But that is optional, as seen in the third case.
Barry Sun
05/20/2022, 6:31 AM
`Output`. For the example I was trying, I didn't specify the `out` dict, so I got an error saying the name of my object wasn't found...
jamie
05/20/2022, 3:46 PM
Barry Sun
05/23/2022, 12:21 AM
Airton Neto
11/09/2022, 6:04 PM