What is the analog for Airflow's `LatestOnlyOperat...
# ask-community
d
What is the analog for Airflow's
LatestOnlyOperator
in Dagster? for example, I have a job that inferences an ML model and pushes some values to Redis. I only want to push the result to Redis for the latest partition (or, in Airflow terms, for the latest execution_date), so the production won't be affected by backfills, only the partitioned assets along the job. What would be a solution in Dagster?
đź‘€ 1
s
Good question! My first thought is to do something like
Copy code
if context.get_tag("backfill"):
    <http://context.log.info|context.log.info>("Is backfill run. Skipping...")
else:
    ... do not-backfill things
if you could write some code to get the latest partition for the job, you could also do a comparison of
context.get_tag("partition") = latest_partition
, but I'm not sure how you would do that exactly.
d
the problem here is accessing
latest_partition
... @sandy here is another reason to expose the exact partitions set (both global and mapped) to the execution context
s
I'm not familiar with the LatestOnlyOperator, so trying to wrap my head around this a little more - are you saying that, when you run a backfill, there are some assets that you want to update every partition of, but other assets that essentially are unpartitioned and thus you only want to execute a single step for?
d
Right! For example, usually in the very end of a model inference pipeline we have an asset with a “latest only” execution, like deploying model predictions for the current day to a key-value storage. we want to always make sure only the latest partition of this task is actually runnable.