Hello, I’ve been looking at `observable_source_ass...
# ask-community
Hello, I’ve been looking at
but it’s not clear how I can later access the
in code.
For context:
• Every day, an external job writes a new file in an S3 bucket (e.g.
). We want to process this file with a bunch of downstream assets.
• I modeled this with
which returns the date of the latest available dump, by listing the bucket and finding the max date.
• The downstream assets are as follows:
def second_layer(dump: bytes) -> pd.DataFrame:
    ...
• I believe I must now implement a custom IOManager that knows how to load the
asset? Must I reimplement the logic to find the latest version? Is there a way to read the version from the
?
Let me know if that makes sense or if another model would be better. Thanks!
@sean - are you able to help out with this one?
Hi Louis, currently the IO manager and the observation function are separate. If you want your IO manager to load the most recent file and the observation function to record it as a data version, I recommend extracting the logic that detects the most recent file into a function and calling it from both the IO manager and the observation function. Then, if you run the observable on a schedule, it will tell you when a new file is available for your downstream assets.
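A minimal sketch of that shared-function idea, in plain Python so it runs without Dagster or S3. Everything here is an assumption for illustration: the key layout `dumps/YYYY-MM-DD.bin`, the helper names (`latest_dump_date`, `dump_key`, `observe_dump`, `load_dump`), and the `list_keys` / `read_key` callables that stand in for real bucket access. The thread does not give the actual key format.

```python
import re
from datetime import date
from typing import Callable, Iterable, Optional

# Hypothetical key layout, e.g. "dumps/2023-05-01.bin".
# The real bucket layout is not shown in the thread.
_DUMP_KEY = re.compile(r"^dumps/(\d{4}-\d{2}-\d{2})\.bin$")


def latest_dump_date(keys: Iterable[str]) -> Optional[date]:
    """Shared logic: most recent date among dump keys, or None if none match."""
    dates = []
    for key in keys:
        match = _DUMP_KEY.match(key)
        if match:
            dates.append(date.fromisoformat(match.group(1)))
    return max(dates, default=None)


def dump_key(day: date) -> str:
    """Build the (hypothetical) object key for a given dump date."""
    return f"dumps/{day.isoformat()}.bin"


def observe_dump(list_keys: Callable[[], Iterable[str]]) -> Optional[str]:
    """Stand-in for the observation function: the string to record as the data version."""
    latest = latest_dump_date(list_keys())
    return latest.isoformat() if latest is not None else None


def load_dump(
    list_keys: Callable[[], Iterable[str]],
    read_key: Callable[[str], bytes],
) -> bytes:
    """Stand-in for the IO manager's load step: read the most recent dump."""
    latest = latest_dump_date(list_keys())
    if latest is None:
        raise FileNotFoundError("no dump found in bucket listing")
    return read_key(dump_key(latest))
```

In a real project, the string from `observe_dump` would presumably be wrapped in a `DataVersion` inside the observation function, and `load_dump` would be called from the IO manager's `load_input`, with `list_keys` and `read_key` backed by an S3 client. Note that the two callers list the bucket independently, so if a new file lands between the observation and the load, the IO manager may read a newer dump than the one that was recorded.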