03/17/2022, 12:08 PM
Good question. I guess it depends on the product logic behind the data and how often you have to intervene on the pipelines day to day. In my case, I get data from vendors with broken APIs or FTP dumps, so backfills are useful for rebuilding the missing bits from the UI. Likewise, when the commercial team asks for a rebuild outside of the schedule, you rematerialize only the subset of partitions that you need.
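A rough sketch of what "only the subset of partitions that you need" could look like, assuming daily partitions keyed by ISO date. The function names and the sample dates are hypothetical and plain Python, not any orchestrator's API:

```python
from datetime import date, timedelta


def daily_partition_keys(start: date, end: date) -> list[str]:
    """Return ISO-date partition keys for every day from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


def missing_partitions(start: date, end: date, materialized: set[str]) -> list[str]:
    """Return the keys in the range that are not materialized yet, in date order."""
    return [key for key in daily_partition_keys(start, end) if key not in materialized]


# Hypothetical case: the vendor dumps for two days in the range never arrived.
already_loaded = {"2022-03-01", "2022-03-03", "2022-03-05"}
to_backfill = missing_partitions(date(2022, 3, 1), date(2022, 3, 5), already_loaded)
print(to_backfill)
```

The resulting list is what you would select for the backfill, instead of rerunning the whole range.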

George Pearse

03/18/2022, 11:10 AM
I think if I used a cloud vendor I'd push for it (backfills within time partitions) and run a lot of them in parallel, but I'm on-prem, so that's not a reasonable option.