# ask-community
r
hopefully a simple question: I have two assets: `A -> B`. `A` generates a pandas dataframe and returns it to `B`, which does some more stuff with it and then loads to BigQuery via an I/O manager. Locally this works great ... but when I deploy to k8s, `B` fails because it can't find the dataframe on the filesystem (at `$DAGSTER_HOME/storage/...`). This makes sense; it's k8s, so each job execution may not be looking at the same filesystem. How do I configure `A` (and/or `B`) to properly hand off / receive the dataframe on k8s?
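roughly this shape, if it helps (simplified sketch; the real transforms and the BigQuery load are elided):
```python
import pandas as pd
from dagster import asset

@asset
def A() -> pd.DataFrame:
    # build the dataframe (placeholder data)
    return pd.DataFrame({"x": [1, 2, 3]})

@asset
def B(A: pd.DataFrame) -> pd.DataFrame:
    # do some more stuff with A's dataframe; the result goes out
    # through the BigQuery I/O manager
    return A.assign(y=A["x"] * 2)
```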
z
You'll need to use an I/O manager that writes to cloud storage, like the GCS I/O manager
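Something in this shape could be a starting point (untested sketch, assuming a recent Dagster with dagster-gcp's pythonic `GCSPickleIOManager`; the project, bucket, and prefix are placeholders, and `B`'s BigQuery output can keep whatever I/O manager you already use on its own key):
```python
import pandas as pd
from dagster import Definitions, asset
from dagster_gcp.gcs import GCSPickleIOManager, GCSResource

# stubs standing in for the real A and B
@asset
def A() -> pd.DataFrame:
    return pd.DataFrame({"x": [1, 2, 3]})

@asset
def B(A: pd.DataFrame) -> pd.DataFrame:
    return A.assign(y=A["x"] * 2)

defs = Definitions(
    assets=[A, B],
    resources={
        # Outputs are pickled to GCS instead of the pod-local filesystem,
        # so B can load A's dataframe from any pod.
        "io_manager": GCSPickleIOManager(
            gcs=GCSResource(project="my-gcp-project"),  # placeholder project id
            gcs_bucket="my-dagster-io-bucket",          # placeholder bucket
            gcs_prefix="dagster-io",
        ),
    },
)
```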
r
@Zach sounds good. I wonder if there's a world where, by default, each k8s pod spins up a small gRPC server and passes the objects that way (pod to pod directly). That seems cleaner, since it allows Dagster to own Dagster data instead of relying on an external system.
I guess pods are ephemeral by nature, so this wouldn't work