
Andy H

10/27/2020, 2:01 PM
Hi all, I'm trying to get dagster to run through dask; I have a dask scheduler and worker running via docker locally. I can't seem to get the config right, any pointers?

alex

10/27/2020, 3:27 PM
can you describe more about the state you are in? overall i would expect the `execution` section of the run_config to be filled out to select dask and point it at the cluster you have already set up

Andy H

10/27/2020, 4:21 PM
Hi @alex Thanks for the quick reply. The wall I'm running into is how to configure the dagster-dask client to talk to an already running scheduler. `dagit` does not like me adding `client` to the context config.
For example, the docs specify to do this, but `dagit` warns that it is not valid (undefined field `address`):
execution:
  dask:
    config:
      address: "dask_scheduler.dns_name:8787"
alex

you need to select which type of cluster at the top first
it looks like our docs example is inaccurate

Andy H

10/27/2020, 5:00 PM
I originally had:
execution:
  dask:
    config:
      cluster:
        slurm:
          cores: 2
          memory: "2GB"
          client:
            address: "127.0.0.1:8786"
But it doesn't seem to like `client` there either. I believe I could construct the context code-side (in python), but it seems like there should be a way to pass it in through the config.
My ultimate goal here is to hand specific pipelines off to a slurm HPC stack from dagster; maybe I'm going about it the wrong way? Since there's no direct `dagster-slurm` module, I figured piping through dask made the most sense.

alex

10/27/2020, 6:57 PM
edit: never mind i think that was wrong
i think the ability to connect to an existing scheduler was accidentally removed when we added support for other cluster types - I sent out a diff https://dagster.phacility.com/D4925 to fix
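(For context, a sketch of the kind of run_config that fix should restore; the `existing` cluster key and exact address format here are assumptions based on the discussion, not confirmed syntax:)

```yaml
# hypothetical: assumes the fix adds an `existing` cluster type
# that points dagster-dask at an already running scheduler
execution:
  dask:
    config:
      cluster:
        existing:
          # dask scheduler port (8786), not the 8787 dashboard port
          address: "tcp://dask_scheduler.dns_name:8786"
```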

Andy H

10/27/2020, 8:06 PM
In case this is helpful, alex, I've tried this config after digging into the code a bit:
execution:
  dask:
    config:
      cluster:
        slurm:
          scheduler_options:
            dashboard_address: ':8787'
            host: ':8786'
          memory: '2GB'
          cores: 1
but it wants to bind a new service at those addresses

alex

10/27/2020, 8:06 PM
yea that lines up with what i have discovered

Andy H

10/27/2020, 8:06 PM
Oh, that's great alex, thanks. I'll apply the diff locally and see if that gets me there or not. Thanks so much for looking into this; I'll get back to you with results.

alex

10/27/2020, 8:11 PM
sounds good - we release on thursdays so this will likely go out then

Andy H

10/27/2020, 8:11 PM
ah, very cool

alex

10/27/2020, 8:11 PM
but do let me know if the patch works, and thanks for the report and for following up

Andy H

10/27/2020, 8:11 PM
absolutely, excited to see if this works
Hey @alex my jobs are making it to the dask queue now. Now it's just a matter of getting the python requirements set up correctly on that end. Your patch seems to work for me

alex

10/27/2020, 8:45 PM
great, thanks for the update

Andy H

10/27/2020, 8:45 PM
for sure
thanks for the quick fix man!
Hey Alex, is there a release today that will include your changes for adding an `existing` cluster?

alex

10/29/2020, 6:39 PM
yep, later today - a post fires in #general when it's up

Andy H

10/29/2020, 6:49 PM
That's rad, thanks alex!