Antonio Bertino
07/30/2022, 2:21 PM

owen
08/01/2022, 4:39 PM
Outputs are stored at "<base path>/<run id>/<step key>/<output name>", so it is possible to reconstruct this path at runtime and load from it (even if it's a bit of a pain). In this case, you can pass in the upstream run id as configuration when launching your downstream job from your run status sensor. The step key will be the name of the step whose output you want to load (generally the same as the name of the op), and the output name will be "result" by default.
You could do this using a root input manager, or just do that loading within the body of an op.
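Here is a minimal sketch of the "load within the body of an op" approach, assuming the upstream job stored a pickled output under the `<base path>/<run id>/<step key>/<output name>` layout owen describes. The bucket name, base path (`dagster/storage`), step key (`upstream_op`), and output name are hypothetical placeholders:

```python
import pickle

import boto3
from dagster import job, op


@op(config_schema={"upstream_run_id": str})
def load_upstream_output(context):
    # Rebuild the key the S3 IO manager wrote to:
    # <base path>/<run id>/<step key>/<output name>.
    # Bucket, base path, step key, and output name below are placeholders.
    run_id = context.op_config["upstream_run_id"]
    key = f"dagster/storage/{run_id}/upstream_op/result"
    obj = boto3.client("s3").get_object(Bucket="my-bucket", Key=key)
    return pickle.loads(obj["Body"].read())


@op
def do_other_stuff(df):
    # Placeholder downstream logic on the loaded dataframe.
    return df.describe()


@job
def downstream_job():
    do_other_stuff(load_upstream_output())
```

The run status sensor would supply the finished run's id when it builds the RunRequest, e.g. as run config under `ops: load_upstream_output: config: upstream_run_id`.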
Antonio Bertino
08/01/2022, 8:14 PM

Roei Jacobovich
08/02/2022, 7:16 AM

Gabriel Montañola
08/02/2022, 6:29 PM
We're using the s3_io_manager. Are there any examples of how we can load the result of a past job run in another one?
Ex:
Job A -> outputs a pretty pandas dataframe
Job B -> loads this pretty pandas dataframe from S3 (AFAIK it's pickled) and does other stuff

owen
08/02/2022, 8:23 PM
You can make a custom input manager based on the s3_io_manager's code, modifying its _get_path function to create the correct path. This input manager will need a config schema that lets you pass in the run id of the upstream output you want to load (because the path depends on that run id).
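A sketch of that custom input manager using the root_input_manager API from this era of Dagster. The bucket and base path mirror the `<base path>/<run id>/<step key>/<output name>` pattern above and are placeholders, and the output name is assumed to be the default "result":

```python
import pickle

import boto3
from dagster import In, job, op, root_input_manager


@root_input_manager(input_config_schema={"run_id": str, "step_key": str})
def past_run_s3_loader(context):
    # Rebuild the same key that the S3 IO manager's _get_path produces:
    # <base path>/<run id>/<step key>/<output name>.
    # Bucket and base path are placeholders for your deployment's values.
    key = (
        f"dagster/storage/{context.config['run_id']}"
        f"/{context.config['step_key']}/result"
    )
    obj = boto3.client("s3").get_object(Bucket="my-bucket", Key=key)
    return pickle.loads(obj["Body"].read())


@op(ins={"df": In(root_manager_key="past_run_loader")})
def do_other_stuff(df):
    # Placeholder downstream logic.
    return df.describe()


@job(resource_defs={"past_run_loader": past_run_s3_loader})
def job_b():
    do_other_stuff()
```

Because `df` has no upstream op in `job_b`, it is a root input, so Dagster calls `past_run_s3_loader` to load it; the upstream run id and step key are then provided in the run config under `ops: do_other_stuff: inputs: df`.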
Gabriel Montañola
08/02/2022, 8:25 PM
Make a custom root_input_manager using code from the *s3_io_manager* 🙂
Antonio Bertino
08/03/2022, 3:57 AM

Gabriel Montañola
08/03/2022, 12:12 PM

owen
08/03/2022, 4:51 PM