Noah Trueblood

08/22/2019, 9:39 PM
Hi everyone, I ran into an error that I am having trouble understanding: The error occurs when I yield an OutputDefinition with a dagster_type of DataFrame (from dagster_pandas) from a solid. For example, when executing the following solid:
```
@solid(output_defs=[OutputDefinition(name='df',dagster_type=DataFrame)])
def get_df(context):
  yield OutputDefinition(pd.DataFrame([1, 2, 3]), 'df')
```
The following error occurs:
```
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/errors.py", line 104, in user_code_error_boundary
    yield
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/engine/engine_inprocess.py", line 568, in _user_event_sequence_for_step_compute_fn
    for event in gen:
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/execution/plan/compute.py", line 75, in _execute_core_compute
    for step_output in _yield_compute_results(compute_context, inputs, compute_fn):
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/execution/plan/compute.py", line 52, in _yield_compute_results
    for event in user_event_sequence:
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/definitions/decorators.py", line 343, in compute
    for item in result:
  File "/Users/noahtrueblood/dappr/data/transform/derive.py", line 57, in get_df
    yield OutputDefinition(pd.DataFrame([1, 2, 3]), 'df')
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/definitions/output.py", line 31, in __init__
    self._runtime_type = check.inst(resolve_to_runtime_type(dagster_type), RuntimeType)
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/types/runtime.py", line 538, in resolve_to_runtime_type
    dagster_type = remap_python_type(dagster_type)
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/dagster/core/types/mapping.py", line 13, in remap_python_type
    if type_annotation == int:
  File "/Users/noahtrueblood/.local/share/virtualenvs/dappr-Fw8AAVno/lib/python3.7/site-packages/pandas/core/generic.py", line 1478, in __nonzero__
    .format(self.__class__.__name__))
```
I see in `dagster/python_modules/dagster/dagster/core/types/mapping.py` that an equality is performed. For example: `type_annotation == int`. Which makes me think that type_annotation is of type pd.DataFrame instead of type DataFrame. Any ideas?
Interestingly, the error does not occur when I use a return instead of a yield:
```
@solid(output_defs=[OutputDefinition(name='df',dagster_type=DataFrame)])
def get_df(context):
  return pd.DataFrame([1, 2, 3])
```
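Reading the traceback, the value that reaches the `type_annotation == int` check appears to be the DataFrame instance itself, passed as the first positional argument to `OutputDefinition`, rather than the `pd.DataFrame` class. The sketch below reproduces that behaviour with pandas alone. `remap_like_check` is a hypothetical stand-in that only mimics the shape of the comparison; it is not Dagster's function.

```python
import pandas as pd


def remap_like_check(type_annotation):
    """Hypothetical stand-in for a check shaped like `if type_annotation == int:`."""
    if type_annotation == int:
        return "int"
    return "other"


# Passing the class is harmless: comparing two classes yields a plain bool.
class_result = remap_like_check(pd.DataFrame)

# Passing an instance fails: `df == int` is an elementwise comparison that
# produces another DataFrame, and `if` cannot reduce that to a single bool.
try:
    remap_like_check(pd.DataFrame([1, 2, 3]))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(class_result)
print(error_message)
```

So a `pd.DataFrame` versus `DataFrame` type mix-up would not, on its own, trigger this particular error; a DataFrame instance landing where a type is expected would.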

nate

08/22/2019, 9:51 PM
It might just be that Dagster expects you to `yield Output(pd.DataFrame([1, 2, 3]), 'df')`, but we should clearly give a better error here 🙂

Noah Trueblood

08/22/2019, 9:55 PM
Oh, right, thank you! That is embarrassing haha

nate

08/22/2019, 9:56 PM
ha no worries! We should definitely provide an error along the lines of “you yielded something else X but we expect an `Output` object” - will take a look
👍 1
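The kind of check described above could look roughly like the following. This is a minimal sketch in plain Python, not Dagster's implementation: the `Output` class here is a hypothetical stand-in, `checked_outputs` is an invented helper name, and the sketch ignores any other event types a real solid may be allowed to yield.

```python
from typing import Any, Iterable, Iterator, NamedTuple


class Output(NamedTuple):
    """Hypothetical stand-in for an Output: a value plus an output name."""

    value: Any
    output_name: str = "result"


def checked_outputs(items: Iterable[Any]) -> Iterator[Output]:
    """Pass Output objects through; reject anything else with a descriptive error."""
    for item in items:
        # isinstance avoids calling == or bool() on the yielded value.
        if not isinstance(item, Output):
            raise TypeError(
                f"Compute function yielded {type(item).__name__}, "
                "but an Output object was expected."
            )
        yield item


def good_compute():
    yield Output([1, 2, 3], "df")


def bad_compute():
    # Yields a bare value instead of wrapping it in Output.
    yield [1, 2, 3]


good = list(checked_outputs(good_compute()))

try:
    list(checked_outputs(bad_compute()))
    bad_error = None
except TypeError as exc:
    bad_error = str(exc)

print(good)
print(bad_error)
```

Checking the type of each yielded item up front names the offending object directly, instead of letting it fail later inside an unrelated comparison.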

max

11/05/2019, 9:49 PM
thanks for this report @Noah Trueblood, we have a diff out to make the error message in this case more useful

Noah Trueblood

11/08/2019, 5:35 PM
@max thank you!