#dagster-feedback

Zachary Bluhm

01/11/2023, 4:30 PM
Backfilling assets that have a self-dependency appears to add overhead. I couldn't pinpoint exactly what was going on, but there seems to be some lag between queuing a run and the run starting, on top of the existing overhead. I had no other jobs/tasks running when testing this. I tested with a single asset:
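import time

from dagster import AssetIn, HourlyPartitionsDefinition, TimeWindowPartitionMapping, asset

# DEFAULT_TIMEZONE is a constant defined elsewhere in the poster's project.
@asset(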
    partitions_def=HourlyPartitionsDefinition(start_date="2023-01-01-00:00", timezone=DEFAULT_TIMEZONE),
    ins={
        "hourly_depends_on_past": AssetIn(
            partition_mapping=TimeWindowPartitionMapping(start_offset=-1, end_offset=-1)
        )
    }
)
def hourly_depends_on_past(hourly_depends_on_past):
    time.sleep(3)
    return 1
I then backfilled 10 partitions at a time. In theory, I would expect this to finish extremely quickly locally, with no K8s overhead (maybe < 90 seconds), but in reality it took closer to 5.5 minutes. What's interesting is that each individual op took only ~6.5 seconds.
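For reference, the size of the gap can be worked out from the figures quoted above. This is only a rough sketch: it assumes the 10 partitions ran strictly one after another (each hour depends on the previous one) and treats "~6.5 seconds" and "5.5 minutes" as exact. The variable names are illustrative and do not come from Dagster.

```python
# Figures quoted in the message above, treated as exact for illustration.
partitions = 10
op_seconds = 6.5
observed_total_seconds = 5.5 * 60

# Assumption: the self-dependency forces partitions to run sequentially,
# so the time spent inside ops simply adds up.
expected_op_seconds = partitions * op_seconds

# Whatever is left over is time not spent inside the ops themselves
# (queuing, run launch, and so on).
unexplained_seconds = observed_total_seconds - expected_op_seconds
unexplained_per_partition = unexplained_seconds / partitions

print(f"time inside ops:           {expected_op_seconds:.1f} s")
print(f"time outside ops:          {unexplained_seconds:.1f} s")
print(f"outside-op time per run:   {unexplained_per_partition:.1f} s")
```

Under these assumptions, the ops account for roughly a fifth of the wall-clock time, which is consistent with the lag being between runs rather than inside them.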

sandy

01/11/2023, 7:05 PM
When you say you backfilled 10 partitions at a time, how did you kick these off? Did you do this manually by clicking the Materialize button in Dagit?
And are you saying it takes 5.5 minutes to backfill 10 partitions?

Zachary Bluhm

01/11/2023, 7:17 PM
Hey Sandy - I kicked off a backfill for 10 partitions using the "materialize selected..." button. And yeah, it took 5.5 minutes for the backfill to complete.

sandy

01/11/2023, 7:28 PM
got it - will investigate
Ok, I discovered the issue and have a fix in mind: https://github.com/dagster-io/dagster/issues/11649. I don't think it will make it into this week's release, but it should make it into next week's release.
Thanks for reporting this.

Zachary Bluhm

01/11/2023, 11:13 PM
No rush, thanks for taking a look