Hi, first of all, congratulations on this amazing framework!
While building up an environment based on the deploy_docker example, I ran into the following problem:
The pipeline launched in a separate container via DockerRunLauncher needs access to volumes in order to manipulate data that lives on the host machine.
While the container running the gRPC-Server (using the same image as the worker-container) has access to the volumes (specified in the docker-compose.yaml), the launched worker-container does not.
Is there a way to enable/specify volumes for the pipeline? What would be a possible solution for that problem?
03/14/2021, 3:47 PM
Hi Daniel - if you’re ok having the volume available to all launched runs in your deployments, it would just require a small change to the DockerRunLauncher to accept volumes as part of its config in the dagster.yaml file and pass those through to the launched container. It would get a little bit trickier if you wanted to only have certain volumes available to certain pipelines (is that the case?) but should still be doable.
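A minimal sketch of that idea, not the actual DockerRunLauncher code: it assumes the launcher config parsed from dagster.yaml carries a hypothetical `volumes` list, and it only assembles the keyword arguments that would be handed to docker-py's `client.containers.run()`. The helper name, image name, and paths are all made up for illustration.

```python
from typing import Any, Dict, List, Optional


def build_container_kwargs(
    image: str,
    command: List[str],
    volumes: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for docker-py's client.containers.run().

    Hypothetical helper: `volumes` stands in for a new, optional launcher
    config field and is only forwarded when it is set.
    """
    kwargs: Dict[str, Any] = {"image": image, "command": command, "detach": True}
    if volumes:
        # docker-py accepts bind mounts as "host_path:container_path[:mode]" strings.
        kwargs["volumes"] = list(volumes)
    return kwargs


# Hypothetical launcher config, as it might look after parsing dagster.yaml.
launcher_config = {"volumes": ["/data/on/host:/opt/data:rw"]}

kwargs = build_container_kwargs(
    image="my_pipeline_image",  # placeholder image name
    command=["echo", "run"],    # placeholder command
    volumes=launcher_config.get("volumes"),
)
```

Because the field is optional, deployments that do not set it would launch containers exactly as before.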
03/14/2021, 4:21 PM
Yeah, in my case it would be fine for all launched runs to have the same mounts available.
I just dug into the DockerRunLauncher, and I think I see the direction you have in mind.
Another way of doing it would be by using the "volumes_from" argument from the docker-py start() call. That way the launched run could inherit the volumes from any other container specified by its name.
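Roughly what the volumes_from variant could look like, as a sketch only: the helper name and the container name `dagster_grpc_server` are hypothetical. As far as I know, current docker-py releases take `volumes_from` when the container is created (for example via `client.containers.run()`), so the sketch passes it there.

```python
from typing import Any, List


def run_with_inherited_volumes(
    client: Any,
    image: str,
    command: List[str],
    source_container: str,
) -> Any:
    """Start a detached container that mounts every volume of source_container.

    `client` is expected to be a docker-py DockerClient, e.g. docker.from_env().
    """
    return client.containers.run(
        image,
        command=command,
        volumes_from=[source_container],  # list of container names or IDs
        detach=True,
    )


# Usage (requires a running Docker daemon and the docker package):
#   import docker
#   run_with_inherited_volumes(
#       docker.from_env(), "my_pipeline_image", ["echo", "run"], "dagster_grpc_server"
#   )
```

The upside is that the mounts are declared once, in the docker-compose.yaml; the downside is that the launched run depends on a container with that name existing.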
I implemented the changes to the DockerRunLauncher (and the dagster.yaml) and it works perfectly fine.
Shall I create a PR? I wanted to add some tests, but unfortunately I'm not able to get the Docker tests to run on my machine.