#ask-community

Vipul Shekhawat

07/25/2023, 3:57 PM
hi there, I’m trying to figure out the best way to set up Dagster Types to check the schemas of pandas dataframes that are inputs and outputs to ops in my pipeline. in the docs, I see that using dagster types in python type annotations is discouraged: https://docs.dagster.io/concepts/types#dagstertypes-vs-python-types-mypy-type-checking. what’s the reason for this? this seems contradictory to the way dagster types are inferred from type hints, which suggests that you can type check results using python type annotations

sandy

07/25/2023, 8:20 PM
in general, Python expects type annotations to be valid Python types, but DagsterType instances are not valid Python types, and thus don't work well with most Python static type checkers

Vipul Shekhawat

08/09/2023, 4:39 PM
for posterity, I wanted to share a solution we implemented at my company in case it’s helpful to anyone else. we created a simple wrapper class that automatically converts pandera schemas to dagster types for use in type checking. the end result is you can just use this wrapper for type annotations in dagster ops, and the inputs and outputs will be type-checked automatically without needing to manually convert pandera schemas to dagster types everywhere, and without having to specify the ins= and out= decorator parameters:
@op()
def my_compute_op(input_df: DataFrame[InputSchema]) -> DataFrame[OutputSchema]:
    """
    input_df will be validated using InputSchema, and the result will be
    validated using OutputSchema. All typing information parsed from the
    schemas will also automatically show up in dagit
    """
    pass

sandy

08/09/2023, 5:42 PM
thanks for sharing Vipul! fyi if you put this in a Github discussion, it will be more likely to show up in google search results