Kevin
06/23/2020, 5:47 PM

max
06/23/2020, 5:51 PM

Kevin
06/23/2020, 6:01 PM

alex
06/23/2020, 6:24 PM
`RunLauncher` and `Engine` components are pluggable, so we can create new ones that attempt to meet your constraints.

Fran Sanchez
06/23/2020, 6:30 PM

alex
06/23/2020, 6:44 PM
An `Engine` would take the `ExecutionPlan` and distribute the steps to your Argo cluster. The `RunLauncher` would come into play for handling the process where the `Engine` would execute, if that should be on a different box than where `dagit` is hosted.

Fran Sanchez
06/23/2020, 7:28 PM

alex
06/23/2020, 7:29 PM

Fran Sanchez
06/23/2020, 7:29 PM
`StepLauncher`

Kevin
06/23/2020, 7:29 PM

alex
06/23/2020, 7:30 PM
The `Engine` determines the default behavior, and then you can use the `StepLauncher` machinery to special case some steps: e.g. `pyspark` steps, submitting them to the appropriate cluster instead of handling those steps like the others.

Fran Sanchez
06/23/2020, 7:30 PM
Can that be done in the `Engine`, or do you need to provide a special `StepEngine`-kind of class?

alex
06/23/2020, 7:32 PM
The `Engine` (ideally; there may be some quirks currently, it is very new as I said)

Fran Sanchez
06/23/2020, 7:32 PM
This `Engine` could be fairly simple, pretty much conversion to yaml and submission + monitoring of the remote workflow I guess.

alex
06/23/2020, 7:33 PM
https://docs.dagster.io/assets/images/apidocs/internal/execution_flow.png
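A toy sketch of the pluggable dispatch alex describes: an engine walks the execution plan and special-cases certain steps (the role the `StepLauncher` machinery plays), here keyed off a per-step tag. This is not Dagster's actual API; all class and tag names below are illustrative.

```python
# Illustrative only: an "engine" that dispatches plan steps, special-casing
# pyspark-tagged steps to a remote cluster instead of running them in process.
from dataclasses import dataclass, field

@dataclass
class Step:
    key: str
    tags: dict = field(default_factory=dict)

def execute_plan(steps):
    """Dispatch each step: in-process by default, special-cased by tag."""
    results = []
    for step in steps:
        if step.tags.get("kind") == "pyspark":
            # Hand off to the appropriate cluster instead of handling this
            # step like the others.
            results.append((step.key, "submitted-to-spark-cluster"))
        else:
            results.append((step.key, "ran-in-process"))
    return results

plan = [Step("extract"), Step("transform", {"kind": "pyspark"}), Step("load")]
print(execute_plan(plan))
```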
Fran Sanchez
06/23/2020, 7:33 PM
`dagster-graphql` (I think this is what I need to run in these pods, right?)

alex
06/23/2020, 7:34 PM
> fairly simple
Ya I think the hard part will be figuring out how to do the translation
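The "translation" in question is turning a Dagster execution plan into an Argo Workflow manifest. A rough sketch of that conversion is below; the manifest shape follows Argo's Workflow CRD, but the image name, the `executePlan` query name, and the exact `dagster-graphql` arguments are assumptions, not confirmed API.

```python
# Sketch: convert a list of plan step keys into an Argo Workflow manifest.
# One container template per step, run serially for simplicity. The image
# and the dagster-graphql invocation are illustrative guesses.
import json

def plan_to_argo_workflow(run_id, step_keys, image="my-dagster-image:latest"):
    templates = [
        {
            "name": step_key,
            "container": {
                "image": image,
                "command": ["dagster-graphql"],
                # Each pod would execute its step via the graphql CLI
                # (hypothetical predefined query and variables).
                "args": [
                    "-p", "executePlan",
                    "-v", json.dumps({"runId": run_id, "stepKeys": [step_key]}),
                ],
            },
        }
        for step_key in step_keys
    ]
    entry = {
        "name": "run",
        "steps": [[{"name": k, "template": k}] for k in step_keys],
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"dagster-run-{run_id}-"},
        "spec": {"entrypoint": "run", "templates": templates + [entry]},
    }

wf = plan_to_argo_workflow("abc123", ["extract", "transform", "load"])
```

Monitoring the remote workflow would then amount to polling the created `Workflow` object's status, which this sketch does not cover.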
Fran Sanchez
06/23/2020, 7:34 PM

alex
06/23/2020, 7:35 PM
> I think this is what I need to run in these pods, right?
I don’t know enough about Argo to say anything with confidence.

Fran Sanchez
06/23/2020, 7:36 PM
`dagster-graphql`, so I guess that this is what I need to run in every step to stick to dagster expectations

alex
06/23/2020, 7:37 PM

Fran Sanchez
06/23/2020, 7:38 PMargs=[
'-p',
'executeRunInProcess',
'-v',
seven.json.dumps(
{
'runId': run.run_id,
'repositoryName': external_pipeline.handle.repository_name,
'repositoryLocationName': external_pipeline.handle.location_name,
}
),
],
...
command=['dagster-graphql'],
args=args,
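Assembled, the container spec above boils down to the pod invoking `dagster-graphql` with a predefined query name (`-p`) and JSON-encoded variables (`-v`). A self-contained sketch (stdlib `json` stands in for dagster's `seven.json` compatibility shim, and the variable values are placeholders):

```python
# Sketch of the command line the pod ends up running. The concrete values
# below are placeholders for run.run_id and the external_pipeline handle.
import json

variables = {
    "runId": "abc123",                       # run.run_id
    "repositoryName": "my_repo",             # external_pipeline.handle.repository_name
    "repositoryLocationName": "my_location", # external_pipeline.handle.location_name
}
args = ["-p", "executeRunInProcess", "-v", json.dumps(variables)]
command = ["dagster-graphql"] + args
print(" ".join(command))
```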
alex
06/23/2020, 7:46 PM
There is a `CeleryK8sJobEngine` which will end up submitting each step as a separate K8s job - but does so via celery queues to provide a means for global resource constraints. We don't yet have a `K8sJobEngine` which just directly submits the jobs, but plan to in the near future.

Fran Sanchez
06/23/2020, 7:47 PM
`Engine` handling all the steps

alex
06/23/2020, 7:48 PM

Fran Sanchez
06/23/2020, 7:49 PM
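The distinction alex draws at 7:46 PM, direct submission versus routing through celery queues, is about global resource limits: a queue with N workers means at most N step jobs are in flight at once, across all runs. A toy model of that throttling (no celery or Kubernetes involved; a bounded thread pool stands in for the queue's workers):

```python
# Toy model: routing step-job submissions through a bounded worker pool caps
# how many "K8s jobs" are in flight at once, regardless of how many steps
# all the runs produce. Purely illustrative; no celery or K8s here.
from concurrent.futures import ThreadPoolExecutor
import threading

in_flight = 0
peak = 0
lock = threading.Lock()

def submit_step_job(step_key):
    global in_flight, peak
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    # ... here the engine would create the K8s job and wait for it ...
    with lock:
        in_flight -= 1
    return step_key

# A "queue" with 2 workers: at most 2 step jobs in flight for 10 steps.
with ThreadPoolExecutor(max_workers=2) as pool:
    done = list(pool.map(submit_step_job, [f"step_{i}" for i in range(10)]))

print(peak)  # never exceeds 2
```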