dwall

02/14/2020, 9:43 PM
I see a ton of these awesome dataframe constraints built out in the dagster-pandas library, but I'm unclear on how to properly leverage them in a real world pipeline: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-pandas/dagster_pandas_tests/test_constraints.py#L37

max

02/14/2020, 9:45 PM
@abhi are you on a plane 🙂

dwall

02/14/2020, 10:25 PM
so it looks like my DataFrame is failing a type check, but here is the full traceback:
dagster.core.engine.engine_inprocess.DagsterTypeCheckDidNotPass: Type check failed for step output test_dataframe of type TestDataFrame.
  File "/Users/dwall/.local/share/virtualenvs/dataland-dagster-Z2VR7MFq/lib/python3.8/site-packages/dagster/core/engine/engine_inprocess.py", line 274, in dagster_event_sequence_for_step
    for step_event in check.generator(_core_dagster_event_sequence_for_step(step_context)):
  File "/Users/dwall/.local/share/virtualenvs/dataland-dagster-Z2VR7MFq/lib/python3.8/site-packages/dagster/core/engine/engine_inprocess.py", line 556, in _core_dagster_event_sequence_for_step
    for evt in _create_step_events_for_output(step_context, user_event):
  File "/Users/dwall/.local/share/virtualenvs/dataland-dagster-Z2VR7MFq/lib/python3.8/site-packages/dagster/core/engine/engine_inprocess.py", line 581, in _create_step_events_for_output
    for output_event in _type_checked_step_output_event_sequence(step_context, output):
  File "/Users/dwall/.local/share/virtualenvs/dataland-dagster-Z2VR7MFq/lib/python3.8/site-packages/dagster/core/engine/engine_inprocess.py", line 502, in _type_checked_step_output_event_sequence
    raise DagsterTypeCheckDidNotPass(
am I missing something or is it not telling me which type check failed?

max

02/14/2020, 10:27 PM
hm, is that in a test using execute_pipeline?

dwall

02/14/2020, 10:28 PM
yes
but traceback looks identical in dagit as well

max

02/14/2020, 10:29 PM
you should be able to look at metadata_entries on the exception object
and those should display in dagit.. are they not?
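
A minimal sketch of the suggestion above, assuming the raised exception carries a `metadata_entries` attribute (as max describes) and possibly a `description` attribute (an assumption, not confirmed in this thread). `summarize_type_check_failure` and `FakeTypeCheckError` are hypothetical names used only for illustration; the stand-in class replaces the real dagster exception so the snippet runs without dagster installed.

```python
def summarize_type_check_failure(exc):
    """Collect whatever type check details an exception carries."""
    # getattr with defaults keeps this safe if an attribute is missing.
    description = getattr(exc, "description", None)
    entries = getattr(exc, "metadata_entries", None) or []
    return {
        "error": type(exc).__name__,
        "description": description,
        "metadata_entries": list(entries),
    }


class FakeTypeCheckError(Exception):
    """Hypothetical stand-in for the dagster exception, for illustration only."""

    def __init__(self, description, metadata_entries):
        super().__init__(description)
        self.description = description
        self.metadata_entries = metadata_entries


try:
    # In a real test this would be the call that runs the pipeline.
    raise FakeTypeCheckError("Column must be unique.", [])
except Exception as exc:
    summary = summarize_type_check_failure(exc)
    print(summary)
```

In a real test, the `try` body would wrap the pipeline execution and the `except` clause would catch the dagster type check exception instead of the stand-in.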

dwall

02/14/2020, 10:30 PM
let me take another look
ah!!!! okay they appear before the exception
2020-02-14 14:27:06 - dagster - DEBUG - test_gsheet_client - 5668136f-0f45-46ec-9942-be014cbcae95 - STEP_OUTPUT - Yielded output "test_dataframe" of type "TestDataFrame". Warning! Type check failed.
 event_specific_data = {"intermediate_materialization": null, "step_output_handle": ["gsheet_to_dataframe.compute", "test_dataframe"], "type_check_data": [false, "test_dataframe", "Violated ColumnTypeConstraint (Column dtype must be {'datetime64[ns]'}) for Column Name (date) ", []]}
nice
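
Since the failure reason is buried in the `event_specific_data` payload of the log entry, a small helper can pull it out. This is a rough sketch using only the standard library; the layout of `type_check_data` as `[success, label, description, metadata_entries]` is inferred from the log entry above and is an assumption, and `type_check_description` is a hypothetical helper name.

```python
import json

MARKER = "event_specific_data = "


def type_check_description(log_text):
    """Return (success, description) from a STEP_OUTPUT log entry, or None."""
    start = log_text.find(MARKER)
    if start == -1:
        return None
    payload = log_text[start + len(MARKER):]
    # raw_decode parses the first JSON value and ignores any trailing text.
    data, _ = json.JSONDecoder().raw_decode(payload)
    check = data.get("type_check_data")
    if not check:
        return None
    # Assumed layout: [success, label, description, metadata_entries]
    success, description = check[0], check[2]
    return success, description.strip()


sample = (
    '2020-02-14 14:27:06 - dagster - DEBUG - test_gsheet_client - '
    '5668136f-0f45-46ec-9942-be014cbcae95 - STEP_OUTPUT - Yielded output '
    '"test_dataframe" of type "TestDataFrame". Warning! Type check failed.\n'
    ' event_specific_data = {"intermediate_materialization": null, '
    '"step_output_handle": ["gsheet_to_dataframe.compute", "test_dataframe"], '
    '"type_check_data": [false, "test_dataframe", '
    '"Violated ColumnTypeConstraint (Column dtype must be {\'datetime64[ns]\'}) '
    'for Column Name (date) ", []]}'
)

result = type_check_description(sample)
print(result)
```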

abhi

02/14/2020, 10:44 PM
Glad everything got sorted out here. When you get something working in your investigations, I would love to chat about your thoughts on the API and improvements we could make there.

dwall

02/14/2020, 11:34 PM
@max btw the metadata attached to the type check output is NOT showing up in dagit
I see this in console:
2020-02-14 15:32:21 - dagster - DEBUG - test_gsheet_client - dfed1330-31aa-45fc-ade9-1fe05f5af8bb - STEP_OUTPUT - Yielded output "test_dataframe" of type "TestDataFrame". Warning! Type check failed.
 event_specific_data = {"intermediate_materialization": null, "step_output_handle": ["gsheet_to_dataframe.compute", "test_dataframe"], "type_check_data": [false, "test_dataframe", "Violated UniqueColumnConstraint (Column must be unique.) for Column Name (id) The offending (index, row values) are the following:         date  amount   id           notes  approved category\n3 2020-04-01  600.34  345  another string      True   type_1", []]}
and this in Dagit:
Output
Yielded output "test_dataframe" of type "TestDataFrame". Warning! Type check failed.

max

02/14/2020, 11:43 PM
that's... not intended behavior
thanks for this
will have a diff up shortly

dwall

02/15/2020, 1:22 AM
👍 👍