Eric Cheminot
03/31/2022, 9:52 AM
sandy
03/31/2022, 10:15 PM
PartitionMapping abstraction.
In practice, it's common to bound partition dependencies with the equivalent of a "watermark", i.e. you declare that data arriving more than X minutes/hours/days after its event time gets ignored. This makes it more tractable to keep the derived asset up-to-date as new data arrives for the upstream assets. It essentially corresponds to the "standard deviation" header in Maxime's post.
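As a rough illustration of the watermark idea, here is a minimal plain-Python sketch. It does not use any framework API; `MAX_LATENESS`, `within_watermark`, and `partitions_to_refresh` are hypothetical names, and the 6-hour bound and daily partition keys are assumptions chosen only for the example.

```python
from datetime import datetime, timedelta

# Hypothetical lateness bound: the "X" in the message above.
MAX_LATENESS = timedelta(hours=6)


def within_watermark(event_time, arrival_time, max_lateness=MAX_LATENESS):
    """Return True if the record arrived no more than max_lateness after its event time."""
    return arrival_time - event_time <= max_lateness


def partitions_to_refresh(records, max_lateness=MAX_LATENESS):
    """Collect the daily partition keys touched by records that are not too late.

    Each record is an (event_time, arrival_time) pair. Records that arrive
    past the watermark are ignored, so their partitions are not refreshed.
    """
    keys = set()
    for event_time, arrival_time in records:
        if within_watermark(event_time, arrival_time, max_lateness):
            keys.add(event_time.date().isoformat())
    return sorted(keys)


records = [
    # Arrived 2 hours after its event time: inside the watermark.
    (datetime(2022, 3, 30, 23, 0), datetime(2022, 3, 31, 1, 0)),
    # Arrived 37 hours after its event time: outside the watermark, ignored.
    (datetime(2022, 3, 29, 12, 0), datetime(2022, 3, 31, 1, 0)),
]
stale = partitions_to_refresh(records)
```

With a bound like this, only a limited window of recent downstream partitions ever needs to be recomputed when upstream data shows up late.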
For versioned derived assets, one approach is to have a 2-D partitioning where one of the dimensions is the version.
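A small sketch of what a 2-D partition space could look like, again in plain Python rather than any framework API. The `(date, version)` tuple key, the helper names, and the sample dates and version labels are all assumptions for illustration.

```python
from itertools import product

# Hypothetical dimensions: a time dimension and a version dimension.
dates = ["2022-03-30", "2022-03-31"]
versions = ["v1", "v2"]


def partition_keys(dates, versions):
    """Build every (date, version) key in the 2-D partition space."""
    return list(product(dates, versions))


def keys_for_version(keys, version):
    """Select the slice of the partition space that belongs to one version."""
    return [key for key in keys if key[1] == version]


all_keys = partition_keys(dates, versions)
v2_keys = keys_for_version(all_keys, "v2")
```

Adding a new version then amounts to adding one value to the version dimension and materializing that slice, while the existing slices stay untouched.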
Does that answer your question?
Eric Cheminot
04/01/2022, 8:19 AM
victor
04/05/2022, 8:55 AM
sandy
04/05/2022, 2:58 PM
Eric Cheminot
04/06/2022, 9:16 AM