Prasad Chalasani
01/28/2021, 7:59 PM
dbt to manage a workflow consisting of transforming a dataset from BigQuery, then running ML on it, and putting the results back in BigQuery. I know this was asked before, but one question: when re-running a pipeline containing heavy computations, are there mechanisms for avoiding re-computation of parts whose inputs haven't changed? Say the solids' inputs [outputs] are results read from [dumped to] files. Essentially, some type of caching/invalidation mechanism.
johann
01/28/2021, 8:01 PM

chris
01/28/2021, 8:03 PM

Prasad Chalasani
01/28/2021, 8:03 PM

chris
01/28/2021, 8:04 PM

Prasad Chalasani
01/28/2021, 8:04 PM

cat
01/29/2021, 12:53 AM

Prasad Chalasani
01/29/2021, 1:12 AM

cat
01/29/2021, 1:18 AM