Hi first of all congratulations to that amazing framework Wh dagster #announcements

Daniel H

03/14/2021, 3:12 PM

Hi, first of all, congratulations on this amazing framework! While building an environment based on the deploy_docker example, I ran into the following problem: the pipeline launched in a separate container via DockerRunLauncher needs access to volumes to manipulate data on the host machine. While the container running the gRPC server (using the same image as the worker container) has access to the volumes (specified in the docker-compose.yaml), the launched worker container does not. Is there a way to specify volumes for the pipeline? What would be a possible solution to this problem? Thanks, Daniel

daniel

03/14/2021, 3:47 PM

Hi Daniel - if you’re ok having the volume available to all launched runs in your deployments, it would just require a small change to the DockerRunLauncher to accept volumes as part of its config in the dagster.yaml file and pass those through to the launched container. It would get a little bit trickier if you wanted to only have certain volumes available to certain pipelines (is that the case?) but should still be doable.
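
A rough sketch of the pass-through idea, not the actual DockerRunLauncher code: the `volumes` config key and the `build_container_kwargs` helper are hypothetical names chosen for illustration. The resulting dict is meant to be handed to docker-py's `client.containers.create(...)`, which accepts `volumes` as a list of `host_path:container_path[:mode]` strings.

```python
from typing import Any, Dict, List


def build_container_kwargs(
    image: str,
    command: List[str],
    launcher_config: Dict[str, Any],
) -> Dict[str, Any]:
    """Assemble keyword arguments for docker-py's containers.create().

    `launcher_config` stands in for the run launcher's config block from
    dagster.yaml. The "volumes" key is an assumed addition, not an
    option confirmed by this thread.
    """
    kwargs: Dict[str, Any] = {"image": image, "command": command}
    volumes = launcher_config.get("volumes")
    if volumes:
        # Bind mounts are passed as "host_path:container_path[:mode]" strings.
        kwargs["volumes"] = list(volumes)
    return kwargs


# Hypothetical config: every launched run gets the same host directory mounted.
config = {"volumes": ["/host/data:/opt/data:rw"]}
kwargs = build_container_kwargs("my_pipeline_image", ["echo", "run"], config)
# With a Docker daemon available: docker.from_env().containers.create(**kwargs)
```

Because the volumes live in the launcher config, they apply to every launched run; per-pipeline volumes would need an extra lookup (for example from run tags), which this sketch leaves out.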

Daniel H

03/14/2021, 4:21 PM

Hi, yeah, in my case it would be fine for all launched runs to have the same mounts available. I just dug into the DockerRunLauncher and I think I know what direction you're thinking in. Another way of doing it would be to use the "volumes_from" argument of the docker-py start() call. That way the launched run could inherit the volumes from any other container, specified by its name.
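
A sketch of the `volumes_from` variant, under some assumptions: in recent docker-py versions the option is, as far as I know, passed as `volumes_from` (a list of container names or IDs) to `containers.create()` / `containers.run()` or to `create_host_config()` rather than to `start()`. The helper `with_inherited_volumes` and the container name `dagster_grpc_server` are made up for illustration.

```python
from typing import Any, Dict, List


def with_inherited_volumes(
    container_kwargs: Dict[str, Any],
    source_containers: List[str],
) -> Dict[str, Any]:
    """Return a copy of the kwargs with `volumes_from` set.

    `source_containers` are names (or IDs) of containers whose volumes the
    launched run container should inherit.
    """
    merged = dict(container_kwargs)
    existing = list(merged.get("volumes_from", []))
    # Append only names that are not already listed, preserving order.
    merged["volumes_from"] = existing + [
        name for name in source_containers if name not in existing
    ]
    return merged


base_kwargs = {"image": "my_pipeline_image", "command": ["echo", "run"]}
run_kwargs = with_inherited_volumes(base_kwargs, ["dagster_grpc_server"])
# With a Docker daemon available: docker.from_env().containers.create(**run_kwargs)
```

The trade-off compared with listing volumes explicitly is that the launched run depends on the named container existing when the run container is created.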

Daniel H

03/14/2021, 9:58 PM

I implemented the changes to the DockerRunLauncher (and the dagster.yaml) and it works perfectly fine. Shall I create a PR? I wanted to add some tests, but unfortunately I'm not able to make the docker tests run on my machine.

daniel

03/14/2021, 10:31 PM

That would be great!
