Hello, want to check if I am misunderstanding, or ...
# ask-community
s
Hello, want to check if I am misunderstanding, or seeing a bug. My understanding is auto-materialization only works on latest partition, i.e. it never tries to backfill for us. Given •
weather_data_df
has the latest partition materialized •
latest_weather_data_df
is set to eager •
latest_weather_data_df
has never been materialized before I expect
Materialization is missing
to be checked. Though that is not the case after 29 evaluations. I suspect this is due to usage of
LastPartitionMapping
to clear the partition. For completeness, the previous 2 skips were due to
Waiting on upstream data
Extra context: I have • dynamic partitioned
domain_model
• daily partitioned
weather_data_df
I want
domain_model
to build with the latest materialized partition of
weather_data_df
I believe this is currently not possible with partition mapping. What I end up doing is creating a
latest_weather_data_df
using
LastPartitionMapping
that eagerly updates.
d
you can set
max_materializations_per_minute
to a higher value to achieve the backfill effect
s
Thanks Daniel, I tried backfilling the first partition of
weather_data
, but I wasn’t able to get the
weather_data_df
to trigger any Materialization condition. Given •
weather_data
’s first partition is there • `weather_data_df`’s first partition has not materialized before • I waited for the
Evaluation History
count to increment I expect
Materialization is missing
to be true, but that is not the case.
2 questions mixed in there 1. Why is
Materialization is missing
not triggered for backfill case, and 2. Why is
Materialization is missing
not triggered from the most recent partition for
latest_weather_data_df
, which I suspect is related to
LastPartitionMapping
usage
I solved the 2nd Q. It was a PEBCAK. I either neglected to set
LastPartitionMapping
or lost it through aggressive ctrl-z-ing. Setting
LastPartitionMapping
made it work with only the latest partition available as expected. Without specifying
LastPartitionMapping
in the
ins
, the default behavior is to pass all partitions in a dict. So I can appreciate why it didn’t start, but hard to figure that out by staring. Trying to figure out 1 now.