# ask-community
p
I have an asset for which I can fetch a daily file all the way up to 2021. Before that, the same data is provided in quarterly zip files, so the whole of 2020's daily data is provided in 4 zip files. I can create 2 separate assets, the first using `DailyPartitionsDefinition` and the second using `StaticPartitionsDefinition`, but how can I "hide" this implementation detail from downstream assets? Is it possible to "merge" these 2 assets into a single one that spans the whole time range using `DailyPartitionsDefinition`?
Alternatively, I could use a single asset and rely on the new "Pass partition ranges to single run" feature, but that makes my asset implementation a lot more brittle, since it needs to figure out which zip file(s) to download depending on what's being requested.
It's also kind of a one-off: once the "historical" asset has been materialized, I don't really need this code anymore.
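The "figure out which zip file(s)" step is mostly date arithmetic. A minimal sketch of mapping a requested partition range to quarterly archives (the `data_YYYY_QN.zip` naming scheme here is hypothetical):

```python
from datetime import date

def quarterly_zips_for_range(start: date, end: date) -> list[str]:
    """Return the quarterly zip files covering [start, end].

    Assumes a hypothetical naming scheme like data_2020_Q1.zip.
    """
    zips = []
    year, quarter = start.year, (start.month - 1) // 3 + 1
    # Walk quarter by quarter until we pass the end of the range.
    while (year, quarter) <= (end.year, (end.month - 1) // 3 + 1):
        zips.append(f"data_{year}_Q{quarter}.zip")
        quarter += 1
        if quarter > 4:
            quarter, year = 1, year + 1
    return zips
```

For example, a range spanning February through August 2020 resolves to the Q1, Q2, and Q3 archives.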
s
if it's just a one-off thing, might it make sense to write a job that fills in that historical data? i.e. one that writes the data and then logs an `AssetMaterialization` for each partition it's filling in. Here's a potentially useful piece of code: https://github.com/dagster-io/dagster/discussions/12561
p
Ah! yeah, that makes sense. Is there a way for a job to interact directly with an IO Manager (this would be for the part that writes the data out)?
s
op outputs can be handled by an IO manager, but a challenge is that the context object passed to it won't have the right attributes (e.g. it will be missing the relevant asset key and partition). you could instantiate your IO manager directly and invoke `handle_output` on it, passing a context created with `build_output_context`?
p
Alright, thanks for the insights! I'll fiddle around with this; it shouldn't be too bad to mimic what the IO manager is doing.