I have a graph with 2 ops that are being launched ...
# ask-community
j
I have a graph with 2 ops that are being launched via `k8s_job_executor`. The second op takes one of the outputs (`model_path`) of the first op as an input. It fails when starting the second op:
dagster.core.errors.DagsterExecutionLoadInputError: Error occurred while loading input "model_path" of step "prediction_op":
The above exception was caused by the following exception:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/dagster/dagster_home/storage/ebc695a2-524b-4781-8e9b-9402880c6a49/training_op/model_path'
I don’t understand why it’s referring to `/opt/dagster/dagster_home/storage/ebc695a2-524b-4781-8e9b-9402880c6a49/training_op/model_path`. `model_path` should just be a string (a location in gcloud). Is there anything I have to define in my `K8sRunLauncher`?
d
Seems like you are not using a gcs IO manager. The string looks like a local filesystem path.
j
thanks! I’m probably just using the default. So far I haven’t used `k8s_job_executor`, so all my jobs ran in one pod. Now it’s different I guess, let me take a look at the GCS IO manager you mentioned.
y
You can set up GCS IO manager to enable cross-node data passing: https://docs.dagster.io/deployment/guides/gcp#using-gcs-for-io-management
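For reference, a minimal sketch of the run config shape that the linked guide describes, built here as a plain Python dict. It assumes the job's `resource_defs` maps `gcs` to `gcs_resource` and `io_manager` to `gcs_pickle_io_manager` (both from the `dagster_gcp` package). The bucket name and prefix below are hypothetical placeholders, and the helper function name is made up for illustration.

```python
import json


def gcs_io_manager_run_config(bucket: str, prefix: str) -> dict:
    """Build the `resources` section that configures the GCS IO manager.

    Assumes the job registers `gcs_pickle_io_manager` under the
    resource key "io_manager".
    """
    return {
        "resources": {
            "io_manager": {
                "config": {
                    "gcs_bucket": bucket,  # bucket that stores op outputs
                    "gcs_prefix": prefix,  # key prefix inside that bucket
                }
            }
        }
    }


# Hypothetical values; replace with your own bucket and prefix.
run_config = gcs_io_manager_run_config("example-bucket", "dagster-io")
print(json.dumps(run_config, indent=2))
```

With this in place, each op output is written to the bucket instead of the step pod's local disk, so a downstream step running in a different pod can load it.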
j
thanks, that seems to be working now!
There is one thing I don’t understand though: when I run the pipeline I can see that a `dagster-run..` pod is created, which then in turn launches every op in a separate `dagster-step..` pod. Why is there a need to write the outputs to gcloud using the GCS IO manager? Couldn’t they just be written to the local filesystem of the `dagster-run..` pod?
y
Great question. Each step is executed within its own ephemeral Kubernetes pod, so with the default IO manager its outputs are written to the local filesystem of the `dagster-step …` pod instead of the `dagster-run …` pod, and a later step running in a different pod cannot read them. I believe the `dagster-run …` pod is created by the run launcher (`K8sRunLauncher`) once per Dagster job run, which is separate from the executor that runs your user code. Here’s the architecture diagram of a Dagster deployment: https://docs.dagster.io/deployment/overview#architecture, where the executor is responsible for running user code (ops, assets, jobs, etc.), while the run launcher handles the orchestration.
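To make the failure mode concrete, here is a simplified model in plain Python. It is not Dagster's actual implementation: three temporary directories stand in for the two step pods' filesystems and a shared bucket, and `write_output` / `load_input` are hypothetical helpers that mimic a filesystem-style IO manager pickling each output under `<root>/<step>/<output name>`. The `gs://` value is a made-up example.

```python
import pickle
from pathlib import Path
from tempfile import TemporaryDirectory


def write_output(root: Path, step: str, name: str, value) -> None:
    """Pickle a step output to <root>/<step>/<name>."""
    path = root / step / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(value))


def load_input(root: Path, step: str, name: str):
    """Load an upstream output from <root>/<step>/<name>."""
    return pickle.loads((root / step / name).read_bytes())


def simulate() -> dict:
    """Compare per-pod local storage with shared storage."""
    value = "gs://example-bucket/models/model.pkl"  # hypothetical path
    results = {}
    with TemporaryDirectory() as train_pod, TemporaryDirectory() as predict_pod, TemporaryDirectory() as bucket:
        # Local storage: training_op writes inside its own pod only.
        write_output(Path(train_pod), "training_op", "model_path", value)
        try:
            load_input(Path(predict_pod), "training_op", "model_path")
            results["local"] = "loaded"
        except FileNotFoundError:
            # prediction_op's pod has a different filesystem.
            results["local"] = "FileNotFoundError"

        # Shared storage: both steps read and write the same location.
        write_output(Path(bucket), "training_op", "model_path", value)
        results["shared"] = load_input(Path(bucket), "training_op", "model_path")
    return results


results = simulate()
print(results)
```

This also shows why the error appears even though `model_path` is just a string: the value itself is persisted by the IO manager between steps, so the string has to be stored somewhere both pods can reach.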