# ask-community
m
Silly question. What's the difference between a Dagster StepLauncher and Executor?
> StepLauncher is responsible for executing steps, either in-process or in an external process.

vs

> For the given context and execution plan, orchestrate a series of sub plan executions in a way that satisfies the whole plan being executed.
Does an `Executor` ultimately end up invoking a `StepLauncher`?
I'm currently most familiar with `k8s_job_executor`, which is a `StepDelegatingExecutor`. That has its own `StepHandler` that launches steps. For k8s, that's the `K8sStepHandler`.
But the `K8sStepHandler` just directly invokes the K8s API. No `StepLauncher` to be seen.
My confusion is based on this github discussion about the EMR step launcher.
It makes it sound like it could be a `StepHandler` or indeed even an `Executor`.
For example, the `emr_pyspark_step_launcher` has the op run inside an EMR cluster, instead of the process that the executor would normally execute it inside.
It sounds like the `StepLauncher` executes first, and for the EMR launcher, it delegates to EMR instead of invoking the `Executor`?
as far as I can tell there are only really EMR and Databricks step launchers.
`StepExecutionContext` has a `step_launcher` in it... whereas `PlanOrchestrationContext` and `StepOrchestrationContext` have an `executor`.
j
Yep you’ve caught some overlapping concepts. The current reality of how these are used:
- `executor` is responsible for launching all steps (step = an execution of an asset or op)
- the `StepDelegatingExecutor` is a specific executor which takes `StepHandler`s and uses them to launch the steps. Ideally all executors would use the `StepDelegatingExecutor` framework to dedupe more logic. We’re getting around to that slowly
- `StepLauncher`s are fairly distinct and have a confusing name given the above. They came before `StepDelegatingExecutor` or they’d be called something different. They are a way to override execution for a particular step by shipping it off to EMR or Databricks
🌈 1
❤️ 1
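To make the distinction concrete, here's a rough plain-Python sketch of the three concepts above. These classes are simplified stand-ins for illustration only, not Dagster's real classes or signatures:

```python
# Conceptual sketch only -- simplified stand-ins, not Dagster's real APIs.

class Step:
    """A single step: one execution of an op/asset."""
    def __init__(self, key, fn):
        self.key = key
        self.fn = fn

class StepHandler:
    """Knows how to launch one step somewhere (a k8s pod, a docker
    container, ...). K8sStepHandler would call the k8s API here."""
    def launch_step(self, step):
        return step.fn()

class StepDelegatingExecutor:
    """An executor that launches every step through a pluggable StepHandler."""
    def __init__(self, step_handler):
        self.step_handler = step_handler

    def execute_plan(self, steps):
        # Real executors also handle ordering, retries, health checks, etc.
        return {s.key: self.step_handler.launch_step(s) for s in steps}

# A StepLauncher, by contrast, attaches to a *particular* step and overrides
# where that step's user code actually runs (e.g. EMR or Databricks).
class StepLauncher:
    def launch_step(self, step):
        # e.g. submit step.fn to an EMR cluster and poll until it finishes
        return step.fn()

executor = StepDelegatingExecutor(StepHandler())
results = executor.execute_plan([Step("add_one", lambda: 1 + 1)])
print(results)  # {'add_one': 2}
```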
m
> have a confusing name given the above
haha, you're telling me 😛
interesting. would it be fair to say that, if we want to customize step "launching" behavior, we should look into Executors these days, rather than StepLaunchers? and specifically we should look into `StepHandler`?
j
A lovely endstate would be: all executors use the `StepDelegatingExecutor` with different default `StepHandler`s. Instead of `StepLauncher`s, you can override the `StepHandler` for a given step
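That hypothetical endstate could be sketched like this. Everything here is made up to illustrate the idea of a per-step handler override; none of these names are real Dagster APIs:

```python
# Purely hypothetical sketch of the "endstate" described above: one
# delegating executor, with a default handler that individual steps can
# override. Illustrative names only -- not real Dagster APIs.

def execute_plan(steps, default_handler, overrides=None):
    """Launch each step with its override handler if present, else default."""
    overrides = overrides or {}
    return {
        key: overrides.get(key, default_handler)(fn)
        for key, fn in steps.items()
    }

in_process = lambda fn: fn()            # default: run the op locally
emr_handler = lambda fn: f"emr:{fn()}"  # pretend: ship the op to EMR

results = execute_plan(
    {"local_step": lambda: 1, "spark_step": lambda: 2},
    default_handler=in_process,
    overrides={"spark_step": emr_handler},
)
print(results)  # {'local_step': 1, 'spark_step': 'emr:2'}
```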
m
jinx!
We're looking into setting up a Kubernetes Spark (as opposed to EMR, Databricks) step launcher thingy. And all the examples I had seen for Spark were `StepLauncher`s, although all my experience has been with `k8s_job_executor`.
It's a little unfortunate that all the specific Spark interop examples are in an interface I can't / shouldn't use, but I think this clarifies the direction for me a bit.
j
Got it. For this use case I think I’d actually recommend a StepLauncher for now, since it sticks with the pattern we currently have going
👍 1
m
yeah, that definitely makes sense
j
The `StepHandler`s so far have only been used for K8s and Docker, and I’d be worried about hitting some limitation with the current health monitoring API or something like that, versus the step launcher pattern, which is more tried and tested.
👍 1
m
got it. thank you!
j
Np!
m
I guess a related question: do you imagine an `Executor` and a `StepLauncher` would play nice within the same job?
j
Yes
m
perfect 🙂
j
All jobs have an executor. Some ops have a step launcher
👍 1
Under the hood, when you’re using a step launcher:
- the executor still spins up a process/container/pod for the given step
- that process/container/pod makes the call out to Databricks/EMR, instead of invoking the Python code locally. It polls until whatever it launched completes
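The submit-then-poll flow described above might look roughly like this. The external-system names (`fake_external_runs`, the run states, the run id) are invented for the sketch, not real EMR/Databricks or Dagster APIs:

```python
# Hedged sketch of what the step process does under a step launcher:
# instead of running the op's Python locally, it submits the work to an
# external system and polls until that remote run finishes.
import time

def fake_external_runs():
    """Stand-in for EMR/Databricks: each poll advances the remote run."""
    states = iter(["PENDING", "RUNNING", "RUNNING", "COMPLETED"])
    return lambda run_id: next(states)

def run_step_via_launcher(poll_interval=0.0):
    get_run_state = fake_external_runs()
    run_id = "external-run-123"  # would come back from the submit call
    while True:
        state = get_run_state(run_id)
        if state in ("COMPLETED", "FAILED"):
            return state
        time.sleep(poll_interval)  # the step process just waits and polls

print(run_step_via_launcher())  # COMPLETED
```

This is the overhead mentioned below: the step process does no real work itself, but keeping it alive gives the executor something local to monitor.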
m
Ahhh interesting. So there is still a step running in the background.
j
Correct
It adds a bit of overhead but makes monitoring the launched thing easier
m
We're interested in removing that step pod, because it has a habit of getting killed and our jobs lose all Spark progress. This is just an artifact of our K8s clusters aggressively cycling nodes. Run pods are stable, but step pods are treated as fault tolerant / can be descheduled. I guess we could just use the Child Process Executor for Spark steps, so the step would run on the run pod.
OK, well either way, this gives me some very useful context to start figuring out our strategy.
👍 1
j
If you absolutely needed to remove the step pod and didn’t want to do the multiprocess executor, then yeah I’d recommend trying out a step handler. We currently only support one `StepHandler` per job though, so the job would need to be purely Spark.