Is there a way to deal with self-referential asset...
# ask-community
d
Is there a good way to deal with self-referential assets? In Airflow, I would often do something like:
# ensure table exists in db
last_record = ...  # query db for the latest record of a timeseries
fill_data = ...  # query api for data between last_record and the current time
# optionally do a very long-running processing task
# append fill_data to database
In Dagster, I'm not sure how to model this. It would usually be a time partition, but it might be, e.g., minute-partitioned data that I want to schedule every 5 minutes. The range-based nature of the Airflow query means the setup/teardown work gets amortized for free in longer queries.
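For reference, the Airflow-style "fill from the last record" pattern described above could look roughly like this in plain Python. This is only a sketch: the `readings` table, the `fetch_between` stand-in for the API call, the integer minute timestamps, and the 5-minute fallback for an empty table are all assumptions made for illustration, not anything from a real pipeline.

```python
import sqlite3


def fetch_between(start, end):
    """Hypothetical stand-in for the API query: one row per minute in (start, end]."""
    return [(ts, float(ts)) for ts in range(start + 1, end + 1)]


def fill_gap(conn, now):
    """Append everything between the latest stored record and `now`."""
    # Ensure the table exists.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (ts INTEGER PRIMARY KEY, value REAL)"
    )
    # Find the latest record; for an empty table, look back 5 minutes (arbitrary).
    (last,) = conn.execute("SELECT MAX(ts) FROM readings").fetchone()
    if last is None:
        last = now - 5
    # One ranged query covers the whole gap, so setup/teardown happens once.
    rows = fetch_between(last, now)
    conn.executemany("INSERT INTO readings (ts, value) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
first = fill_gap(conn, now=100)   # empty table: fills minutes 96..100
second = fill_gap(conn, now=107)  # resumes after 100: fills minutes 101..107
third = fill_gap(conn, now=107)   # already up to date: nothing to append
```

The point of the range query is that a late or skipped run just produces a longer gap on the next run, rather than a separate unit of work per minute.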
t
Would scheduling multiple partitions into one job/op be a solution?
d
oh, interesting. I hadn't thought of that
s
You can have self-referential dependencies in time-partitioned assets:
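from dagster import AssetIn, DailyPartitionsDefinition, TimeWindowPartitionMapping, asset

@asset(
    partitions_def=DailyPartitionsDefinition(start_date="2020-01-01"),
    ins={
        "a": AssetIn(
            # each partition reads the previous day's partition of this same asset
            partition_mapping=TimeWindowPartitionMapping(start_offset=-1, end_offset=-1)
        )
    },
)
def a(a):
    ...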
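To make the `start_offset=-1, end_offset=-1` mapping concrete, here is a small plain-Python illustration of which partition each daily partition would read from. It is not Dagster's implementation; `upstream_partition` is a hypothetical helper, and returning `None` for the first partition is an assumption of this sketch (how Dagster itself treats the first partition is not covered in this thread).

```python
from datetime import date, timedelta

START = date(2020, 1, 1)  # mirrors start_date in the asset definition above


def upstream_partition(key, offset=-1):
    """Return the daily partition key that `key` depends on, or None before START."""
    day = date.fromisoformat(key) + timedelta(days=offset)
    return day.isoformat() if day >= START else None


prev_of_third = upstream_partition("2020-01-03")  # the previous day's key
prev_of_first = upstream_partition("2020-01-01")  # nothing exists before START
```

With both offsets at -1, the window is exactly one partition wide and shifted back by one day, so each run sees only the immediately preceding partition.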