#announcements
j

Justin Swaney

03/19/2020, 9:47 PM
I'm up and running a simple pipeline deployed with Dask on SGE, but I'm not seeing the run progress bars turn green in dagit anymore
a

alex

03/19/2020, 9:51 PM
alright let's see
where are the dask workers running?
for context: the way that we have things set up currently is that the source of truth for the “event stream” is the event storage as configured on the dagster instance. The worker nodes are expected to have the same instance configuration so that they can write the event stream to the same place
j

Justin Swaney

03/19/2020, 9:53 PM
I read that, but I mostly focused on the Dask-specific page which was really helpful
I am using dagster_dask and pointing to a scheduler I started in the config
So the workers are running in AWS along with the scheduler on a master node connected through dask
I guess based on your explanation I should verify that the workers have S3 access too, which I haven't checked because most of my stuff is just on a shared EBS volume
a

alex

03/19/2020, 9:58 PM
do you have the instance configured where you are initiating the run?
can you hover over the version number in dagit?
j

Justin Swaney

03/19/2020, 10:00 PM
I entered the config in the dagit playground, but I also have a YAML file next to my module
When I hover over the version, a big tooltip keeps flashing but not staying long enough for me to read it
a

alex

03/19/2020, 10:02 PM
huh - haven't seen the flashing bug before
how about
cat $DAGSTER_HOME/dagster.yaml
j

Justin Swaney

03/19/2020, 10:04 PM
a

alex

03/19/2020, 10:05 PM
ah ok ya you haven’t set up your instance yet - so the “source of truth” is a temp directory created when you launched dagit
if you go through that first link i sent you
j

Justin Swaney

03/19/2020, 10:05 PM
Gotcha, thanks for all your help!
a

alex

03/19/2020, 10:05 PM
and set up for example an RDS database in AWS and set your run_storage and event_storage to point at that - you should get everything showing up in dagit
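A minimal sketch of what that instance setup might look like, written as a small Python helper that renders a dagster.yaml into DAGSTER_HOME. The YAML layout (the `event_log_storage` key, the `dagster_postgres` module and class names, and the `postgres_url` field) is an assumption based on the dagster-postgres package and should be checked against the docs for your installed version; the helper names and the connection URL are hypothetical.

```python
from pathlib import Path

# Assumed dagster.yaml layout for Postgres-backed run and event storage.
# Verify the keys and class names against the Dagster docs for your version.
DAGSTER_YAML_TEMPLATE = """\
run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_url: "{url}"

event_log_storage:
  module: dagster_postgres.event_log
  class: PostgresEventLogStorage
  config:
    postgres_url: "{url}"
"""


def render_dagster_yaml(postgres_url: str) -> str:
    """Fill the template so both storages point at the same database."""
    return DAGSTER_YAML_TEMPLATE.format(url=postgres_url)


def write_dagster_yaml(dagster_home: str, postgres_url: str) -> Path:
    """Write dagster.yaml into the given DAGSTER_HOME directory and return its path."""
    home = Path(dagster_home)
    home.mkdir(parents=True, exist_ok=True)
    target = home / "dagster.yaml"
    target.write_text(render_dagster_yaml(postgres_url))
    return target
```

Pointing both storages at one database reachable from dagit and from every worker is what lets them all write to, and read from, the same event stream.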
j

Justin Swaney

03/19/2020, 10:06 PM
That makes more sense
a

alex

03/19/2020, 10:08 PM
thanks for working through this all - you are the first person to show up in slack who has tried the Dask integration so i'm excited to see how it all works once you get it set up!
j

Justin Swaney

03/19/2020, 10:21 PM
🎉
🎉 2
For future reference, I had to get DAGSTER_HOME propagated to all dask-workers for this to work. Can be configured in jobqueue.yaml on the Dask side:
# in ~/.config/dask/jobqueue.yaml
sge:
  job-extra: ['-v DAGSTER_HOME=/shared/dagster']
Here /shared is a volume that all workers have access to. Also, you have to make sure to start new jobs if you update your source code, otherwise the dask-workers will be running stale pipelines
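One way to confirm that DAGSTER_HOME actually reached every worker is to run a small check on each of them. The sketch below uses only the standard library; the function name `describe_dagster_home` is hypothetical and not part of Dagster or Dask.

```python
import os
from pathlib import Path


def describe_dagster_home(environ=None):
    """Report whether this process sees DAGSTER_HOME and a dagster.yaml inside it."""
    # Accept an explicit mapping for testing; default to the real environment.
    environ = os.environ if environ is None else environ
    home = environ.get("DAGSTER_HOME")
    config = Path(home) / "dagster.yaml" if home else None
    return {
        "dagster_home": home,
        "has_dagster_yaml": config is not None and config.is_file(),
    }
```

With `dask.distributed`, something like `Client("tcp://scheduler-host:8786").run(describe_dagster_home)` (the address here is a placeholder) should return one result per worker, keyed by worker address, so a worker that is missing the variable or cannot see the shared volume stands out.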
a

alex

03/20/2020, 8:01 PM
would you be interested in sending a PR for the dagster-dask README with these notes and any others you have?
j

Justin Swaney

03/20/2020, 8:56 PM
done
🤩 1
a

alex

03/20/2020, 9:49 PM
the fix from above is out now in 0.7.5