# ask-community
Hi folks, I have an asset materializing a moderately sized Pandas DataFrame. I'm using `dagster-pandera` to define a Dagster type based on a Pandera schema that should validate the materialization result. When I save the DataFrame to a pickle, the file is a bit over 800 MB. The problem is, when I add a `dagster_type` argument to my `asset` decorator with this Dagster type, the process very quickly consumes all of my 64 GB of RAM and then my machine freezes. So I wonder: is this a known problem? Is there any way to make the validation more efficient? Thanks in advance for any help!
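Here's roughly what my setup looks like (schema and names are simplified placeholders, not my real ones):

```python
import pandas as pd
import pandera as pa
from dagster import asset
from dagster_pandera import pandera_schema_to_dagster_type

# Hypothetical schema standing in for my real one (the real frame
# has many more columns and rows).
trades_schema = pa.DataFrameSchema(
    {
        "price": pa.Column(float, pa.Check.ge(0)),
        "volume": pa.Column(int, pa.Check.gt(0)),
    }
)

# dagster-pandera wraps the Pandera schema in a DagsterType whose
# type check, as I understand it, validates the materialized frame
# against the schema.
TradesDagsterType = pandera_schema_to_dagster_type(trades_schema)

@asset(dagster_type=TradesDagsterType)
def trades() -> pd.DataFrame:
    # In reality this builds the ~800 MB DataFrame.
    return pd.DataFrame({"price": [101.5, 99.2], "volume": [10, 25]})
```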
By the way, the same thing happens when I load the DataFrame in a downstream asset if I use the Dagster type for input validation. If I remove the validation, the DataFrame loads with no issues.
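For the downstream case, I'm attaching the same type to the input via `AssetIn`, something like this (again simplified, building on the definitions above):

```python
from dagster import asset, AssetIn

# Attaching the same Dagster type to the input: the type check runs
# on the loaded DataFrame before the asset body executes.
@asset(ins={"trades": AssetIn(dagster_type=TradesDagsterType)})
def trades_summary(trades: pd.DataFrame) -> pd.DataFrame:
    return trades.describe()
```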