
Andy H

10/27/2020, 2:01 PM
Hi all, I'm trying to get dagster to run through dask; I have a dask scheduler and worker running via docker locally. I can't seem to get the config right, any pointers?

alex

10/27/2020, 3:27 PM
can you describe more about the state you are in? overall i would expect the `execution` section of the run_config to be filled out to select dask and point it at the cluster you have already set up

Andy H

10/27/2020, 4:21 PM
Hi @alex Thanks for the quick reply. The wall I'm running into is how to configure the dagster-dask client to talk to an already running scheduler. `dagit` does not like me adding `client` to the context config.
For example, the docs specify to do this, but `dagit` warns that it is not valid (undefined field `address`):
execution:
  dask:
    config:
      address: "dask_scheduler.dns_name:8787"
alex

you need to select which type of cluster at the top first
it looks like our docs example is inaccurate

Andy H

10/27/2020, 5:00 PM
I originally had:
execution:
  dask:
    config:
      cluster:
        slurm:
          cores: 2
          memory: "2GB"
          client:
            address: "127.0.0.1:8786"
But it doesn't seem to like `client` there either. I believe I could construct the context code-side (in python), but it seems like there should be a way to pass it in through the config.
My ultimate goal here is to hand specific pipelines off to a slurm HPC stack from dagster; maybe I'm going about it the wrong way? Since there's no direct `dagster-slurm` module, I figured piping through dask made the most sense.

alex

10/27/2020, 6:57 PM
edit: never mind i think that was wrong
i think the ability to connect to an existing scheduler was accidentally removed when we added support for other cluster types - I sent out a diff https://dagster.phacility.com/D4925 to fix
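(For context, a sketch of the kind of run_config that fix should restore; the `existing` cluster key and exact address format here are assumptions based on the discussion, not confirmed syntax:)

```yaml
# hypothetical: assumes the fix adds an `existing` cluster type
# that points dagster-dask at an already running scheduler
execution:
  dask:
    config:
      cluster:
        existing:
          # dask scheduler port (8786), not the 8787 dashboard port
          address: "tcp://dask_scheduler.dns_name:8786"
```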

Andy H

10/27/2020, 8:06 PM
In case this is helpful, alex, I've tried this config after digging into the code a bit:
execution:
  dask:
    config:
      cluster:
        slurm:
          scheduler_options:
            dashboard_address: ':8787'
            host: ':8786'
          memory: '2GB'
          cores: 1
but it wants to bind a new service at those addresses

alex

10/27/2020, 8:06 PM
yea that lines up with what i have discovered

Andy H

10/27/2020, 8:06 PM
Oh, that's great alex, thanks. I'll apply the diff locally and see if that gets me there or not. Thanks so much for looking into this; I'll get back to you with results.

alex

10/27/2020, 8:11 PM
sounds good - we release on thursdays so this will likely go out then

Andy H

10/27/2020, 8:11 PM
ah, very cool

alex

10/27/2020, 8:11 PM
but do let me know if the patch works, and thanks for the report and for following up

Andy H

10/27/2020, 8:11 PM
absolutely, excited to see if this works
Hey @alex my jobs are making it to the dask queue now. Now it's just a matter of getting the python requirements set up correctly on that end. Your patch seems to work for me

alex

10/27/2020, 8:45 PM
great, thanks for the update

Andy H

10/27/2020, 8:45 PM
for sure
thanks for the quick fix man!
Hey Alex, is there a release today that will include your changes for adding an `existing` cluster?

alex

10/29/2020, 6:39 PM
yep, later today - a post fires in #general when it's up

Andy H

10/29/2020, 6:49 PM
That's rad, thanks alex!