# announcements
f
I would like to better understand the execution model of dagster: RunLauncher, ExecutionPlan, Executor, etc. Where should I start?
a
the link in the docs is busted but here's a picture from the repo

https://github.com/dagster-io/dagster/blob/master/docs/sections/api/apidocs/internal/execution_flow.png

that was a snapshot of how things were layered as of March 2020
I would use this as a map then cross reference the API docs https://docs.dagster.io/docs/apidocs/ or just the source code itself
f
Thanks, I will try to have a look also at the libraries
Is there any library currently implementing the RunStep?
a
RunStep?
a
pyspark uses the StepLauncher, if that's what you mean
f
Maybe it's just my misunderstanding of it, but it seems that you can create runners at different levels, like step or plan?
I'm not familiar enough with it yet... need to read a bit more code to get familiar
a
ya I think part of what you are referring to is the recently added StepLauncher https://dagster.phacility.com/D2688
f
What should I implement if I want to create a custom runner? For example I want to write something that will take a whole compiled pipeline (is this what is called an ExecutionPlan?) and then execute it using its own internal mechanism.
a
likely an Engine, so the examples to reference would be the celery and dask engines
👍 1
f
How is that different from the k8s one?
a
they are all similar in nature - they are effectively handling how to execute each individual step, usually the key aspect being federating out the work somewhere and managing that
f
So, is it fair to assume that every Launcher will execute something out of process?
Regardless if it's a StepLauncher or RunLauncher?
a
so our k8s deployment example uses a celery_k8s_job_executor, which submits tasks to a celery queue for each step that will in turn submit k8s jobs
f
Ok, I think I understand better... but at the same time, does the k8s deployment launch a dagster job to orchestrate the pipeline execution, or is that done from the original process?
a
so the RunLauncher determines where the Executor (I called it Engine above by mistake) or run master is operating, then the Executor decides how to handle each step, and a StepLauncher is a way to special case steps from the default Executor behavior - the current version does this to ship pyspark solids to a spark cluster
👍 1
the k8s deployment uses the K8sRunLauncher to launch the run master as its own k8s job
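A rough way to picture the three layers described above is the toy Python sketch below. It is only an illustration of the layering, not dagster's real API: `Step`, `ToyStepLauncher`, `ToyExecutor`, `ToyRunLauncher` and `run_example` are all hypothetical names, and everything runs in one process while a log records where each piece would notionally run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class ToyStepLauncher:
    """Hypothetical: special-cases a step away from the executor's default."""

    def __init__(self, target: str):
        self.target = target

    def launch(self, step: "Step", log: List[str]) -> str:
        # A real launcher would ship the step elsewhere; here we only log it.
        log.append(f"step {step.key}: shipped to {self.target}")
        return step.fn()


@dataclass
class Step:
    """Hypothetical stand-in for one step of an execution plan."""

    key: str
    fn: Callable[[], str]
    launcher: Optional[ToyStepLauncher] = None


class ToyExecutor:
    """Hypothetical: decides how each step of the plan is handled."""

    def execute(self, plan: List[Step], log: List[str]) -> Dict[str, str]:
        results = {}
        for step in plan:
            if step.launcher is not None:
                # The step opts out of the default behavior.
                results[step.key] = step.launcher.launch(step, log)
            else:
                log.append(f"step {step.key}: default executor behavior")
                results[step.key] = step.fn()
        return results


class ToyRunLauncher:
    """Hypothetical: decides where the executor (the run master) operates."""

    def __init__(self, location: str):
        self.location = location

    def launch_run(self, plan: List[Step], executor: ToyExecutor,
                   log: List[str]) -> Dict[str, str]:
        log.append(f"run master started in {self.location}")
        return executor.execute(plan, log)


def run_example():
    log: List[str] = []
    plan = [
        Step("load", lambda: "loaded"),
        Step("transform", lambda: "transformed",
             launcher=ToyStepLauncher("spark cluster")),
    ]
    launcher = ToyRunLauncher("its own k8s job")
    return launcher.launch_run(plan, ToyExecutor(), log), log
```

Calling `run_example()` returns the step results plus a log whose order mirrors the layering: the run launcher entry comes first, then one entry per step from either the executor's default path or the step launcher.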
f
I think I get it, that's a very helpful starter! 👍
Thank you Alex
Once I fully understand it I might try to write it down and open an MR to the docs
👍 2
a
no problem - good luck!
t
it seems like this diagram link is 404ing now -- did that image move?
a

https://github.com/dagster-io/dagster/blob/master/docs/next/public/assets/images/apidocs/internal/execution_flow.png

ya it got moved to fix the broken link in the docs site
t
danke