Our pipeline uses SGE to distribute jobs on a compute cluster. Can Dagster be used as a layer on top of SGE to submit jobs? Can a Dagster solid run a qsub shell command and somehow understand when the SGE job is finished?
We don’t have any active work on SGE, but Dagster is pretty flexible about which compute substrate it runs on. You might be able to make something work with Dask, using dask-jobqueue (https://jobqueue.dask.org/en/latest/) together with dagster_dask (https://docs.dagster.io/docs/apidocs/libraries/dagster_dask).
Thanks for the info on Dask. I was thinking of just defining a solid that runs a shell command to submit the job, then polls for completion every 5 minutes until the job is done. Are there more idiomatic ways to do job status polling in Dagster solids?
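For reference, the submit-and-poll idea could look roughly like the sketch below. It is plain Python that could be called from the body of a solid, not a Dagster feature. The helper names (`submit_job`, `job_is_finished`, `wait_for_job`) are made up for illustration, and it assumes that `qsub -terse` prints only the job id and that `qstat -j <id>` exits non-zero once the scheduler no longer knows the job, which is worth checking on your cluster.

```python
import subprocess
import time


def submit_job(script_path, run=subprocess.run):
    """Submit a script with qsub and return the job id as a string."""
    # -terse asks qsub to print just the job id instead of a full sentence.
    result = run(
        ["qsub", "-terse", script_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def job_is_finished(job_id, run=subprocess.run):
    """Return True once qstat no longer reports the job."""
    # Assumption: qstat -j exits non-zero for jobs that have left the queue.
    result = run(["qstat", "-j", job_id], capture_output=True, text=True)
    return result.returncode != 0


def wait_for_job(job_id, is_finished=job_is_finished, poll_seconds=300,
                 timeout_seconds=None, sleep=time.sleep, clock=time.monotonic):
    """Block until the job is finished, checking every poll_seconds."""
    start = clock()
    while not is_finished(job_id):
        if timeout_seconds is not None and clock() - start > timeout_seconds:
            raise TimeoutError(f"SGE job {job_id} still running after timeout")
        sleep(poll_seconds)
    return job_id
```

Inside a solid this would be something like `wait_for_job(submit_job("job.sh"))`. Note that this only tells you the job left the queue, not whether it succeeded; checking the exit status (for example with `qacct -j`) would be a separate step.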
I'm also using Dagster in HPC and am using Dask-Jobqueue's
to start a scheduler/workers, then deploying pipelines on them as @nate mentioned
that way Dask/Dagster handle orchestration and there's no need to write your own status polling mechanism
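A rough sketch of that setup, in case it helps. It assumes `SGECluster` from dask-jobqueue (picked here only because the original question is about SGE), and the queue name, core count, and memory are placeholder values. The run config shape follows the "existing cluster" form in the dagster_dask docs for the solid/pipeline API, so check it against the version you have installed. `start_sge_cluster` and `dask_run_config` are illustrative names, not part of either library.

```python
def start_sge_cluster(n_jobs=4, queue="all.q", cores=4, memory="16GB"):
    """Start a Dask scheduler locally and request workers as SGE jobs."""
    # Imported lazily so this module loads even without dask_jobqueue installed.
    from dask_jobqueue import SGECluster

    # Placeholder resources: adjust queue, cores and memory for your site.
    cluster = SGECluster(queue=queue, cores=cores, memory=memory)
    cluster.scale(jobs=n_jobs)  # each job is submitted to SGE on your behalf
    return cluster


def dask_run_config(scheduler_address):
    """Build a run config pointing Dagster's Dask executor at a running scheduler."""
    # Assumption: config layout from the dagster_dask docs ("existing" cluster).
    return {
        "execution": {
            "dask": {
                "config": {
                    "cluster": {"existing": {"address": scheduler_address}}
                }
            }
        }
    }
```

Usage would be along the lines of `cluster = start_sge_cluster()` followed by passing `dask_run_config(cluster.scheduler_address)` as the run config when executing the pipeline. Dask-jobqueue takes care of the `qsub` calls, so the solids themselves stay free of scheduler logic.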
Thanks Wes. I'll look into jobqueue in Dask.
