When I run a multi-partition backfill as a single ...
# ask-community
c
When I run a multi-partition backfill as a single job, the asset materializes for each partition. Does this mean the asset code is being executed for each partition? My goal was to run the asset op only once, using the partition range, so I'm not making multiple requests to the API (which accepts a date range). My asset is using context.asset_partition_key_range_for_output(). To make things more confusing, each materialization has the records for all partitions. E.g. each partition should have 26 rows, but when I backfill for a 3-day date range, instead of getting a single materialization of 78 rows, I get 3 separate materializations of 78 rows, which I assume means the asset code is being run (and the API is being hit) three times.
j
when you run the backfill, are you selecting the "launch as a single run" option?
c
yep, it was run in a single run, but with three materializations inside that run
j
hmm ok, that might be expected - @claire do you know if the single run backfill will still emit AssetMaterialization events for the number of partitions being materialized as part of that single run?
c
@jamie I actually think it is expected. I just added a print statement inside my asset code, and when I ran the same single-run backfill for three days, it only printed to stdout once - meaning my code is only run once.
b
This does seem to happen - running an incremental dbt model over a time range in a single run generates lots of those events
j
cool. it makes sense that we would still log the three AssetMaterialization events to communicate with the db that all the partitions were materialized by this run. This chunk of the code base is not my area of expertise though, so good to get other validation that that’s what’s going on
c
Agreed, it looks like the return statement is being re-run because I have a markdown preview of my result dataframe as metadata, which displays five random rows using df.sample(), and when I look at that preview for each materialization, the rows displayed are different. Never mind, I was wrong about that - they are the same! So the code isn't being re-run at all, which makes sense. The materialization is just being displayed three times in the UI.
c
Yes--it is expected. You will get a separate asset materialization event for each partition that is selected for a single run backfill
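For anyone landing on this thread later, here is a rough plain-Python model of the behavior described above. It is not Dagster code and makes no claim about Dagster internals: `fetch_rows`, `run_single_run_backfill`, the event dictionaries, and the 2023 dates are all hypothetical names and values made up for illustration. It only shows why one execution over a 3-day range can appear as three materializations that each report the same 78 rows.

```python
from datetime import date, timedelta

ROWS_PER_DAY = 26  # figure quoted in the thread
api_calls = []     # records every simulated API request


def partition_keys_in_range(start: str, end: str) -> list[str]:
    """Expand an inclusive daily key range into individual partition keys."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    span = (last - first).days
    return [(first + timedelta(days=i)).isoformat() for i in range(span + 1)]


def fetch_rows(start: str, end: str) -> list[dict]:
    """Hypothetical stand-in for an API that accepts a date range."""
    api_calls.append((start, end))
    return [
        {"day": key, "row": n}
        for key in partition_keys_in_range(start, end)
        for n in range(ROWS_PER_DAY)
    ]


def run_single_run_backfill(start: str, end: str) -> list[dict]:
    """Model only: the asset body runs once, then one event is recorded per partition."""
    rows = fetch_rows(start, end)        # single request covering the whole range
    metadata = {"row_count": len(rows)}  # computed once, shared by every event
    return [
        {"partition": key, "metadata": metadata}
        for key in partition_keys_in_range(start, end)
    ]


events = run_single_run_backfill("2023-01-01", "2023-01-03")
print(len(api_calls))  # number of simulated API requests
print([(e["partition"], e["metadata"]["row_count"]) for e in events])
```

In this model the per-partition events share one metadata object, which matches the observation that the df.sample() preview showed identical rows on all three materializations.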