#ask-community

chrispc

05/15/2022, 4:53 PM
Hi team! @daniel @prha @alex thanks in advance. I was wondering if you could help me with this. I am using the DockerRunLauncher, so I set the volumes in dagster.yaml (I want to store a few files on the host machine) like this:
run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_PG_USERNAME
      - DAGSTER_PG_PASSWORD
      - DAGSTER_PG_HOST
      - DAGSTER_PG_DB
      - PGPORT_BACKEND
    network: dagster_network
    container_kwargs:
      auto_remove: true
      volumes: { 'pipelines_storage':
                   { 'bind': '/opt/dagster/app/local_artifact_storage/storage', 'mode': 'rw' },
                 'finapp/data':
                   { 'bind': '/opt/dagster/app/finapp/data', 'mode': 'rw' }}
and in the docker-compose:
services:
  finapp_pipelines:
    build:
      context: .
      dockerfile: ./Dockerfile_pipelines
    container_name: finapp_pipelines
    image: finapp_pipelines
#    restart: on-failure:2
    expose:
      - 4000
    ports:
      - "4000:4000"
    environment:
      DAGSTER_PG_PASSWORD: ${DAGSTER_PG_PASSWORD}
      DAGSTER_PG_USERNAME: ${DAGSTER_PG_USERNAME}
      DAGSTER_PG_DB: ${DAGSTER_PG_DB}
      DAGSTER_PG_HOST: ${DAGSTER_PG_HOST}
      DAGSTER_PG_OP_DB: ${DAGSTER_PG_OP_DB}
      PGPORT_BACKEND: ${POSTGRES_DEV_PORT}
      DAGSTER_CURRENT_IMAGE: ${DAGSTER_CURRENT_IMAGE}
      FINNUB_KEY: ${FINNUB_KEY}
    volumes:
      - pipelines_storage:/opt/dagster/app/local_artifact_storage/storage
      - ./finapp/data:/opt/dagster/app/finapp/data
    networks:
      - dagster_network
    depends_on:
      - op_db
volumes:
  pipelines_storage:
    external: false
pipelines_storage is a Docker volume, but ./finapp/data is a bind mount. However, when I execute a pipeline that saves data to the bind mount, I get this:
docker.errors.APIError: 400 Client Error for <http+docker://localhost/v1.41/containers/create>: Bad Request ("create finapp/data: "finapp/data" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path")

  File "/usr/local/lib/python3.8/site-packages/dagster/core/instance/__init__.py", line 1450, in launch_run
    self._run_launcher.launch_run(LaunchRunContext(pipeline_run=run, workspace=workspace))
  File "/usr/local/lib/python3.8/site-packages/dagster_docker/docker_run_launcher.py", line 176, in launch_run
    container = client.containers.create(
  File "/usr/local/lib/python3.8/site-packages/docker/models/containers.py", line 878, in create
    resp = self.client.api.create_container(**create_kwargs)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 428, in create_container
    return self.create_container_from_config(config, name)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 439, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 274, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.8/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
How can I set the volumes in DockerRunLauncher without passing the absolute path? Or is there a way to inject environment variables (${PWD}) into the dagster.yaml file? Thanks so much!
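To illustrate why the `finapp/data` key is rejected while `pipelines_storage` is accepted, here is a minimal sketch of the rule described in the error message above. The function name `classify_volume_source` is hypothetical, and the regular expression only approximates Docker's check, built from the character class quoted in the error; it is not Docker's actual implementation.

```python
import re
from pathlib import PurePosixPath

# Approximation of the volume-name rule quoted in the error message:
# a leading alphanumeric character followed by alphanumerics, "_", "." or "-".
_VOLUME_NAME = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_.-]+$")


def classify_volume_source(source: str) -> str:
    """Classify the key of a volumes mapping (hypothetical helper).

    Returns "host_path" for an absolute path, "named_volume" for a valid
    volume name, and "invalid" for anything else (for example a relative
    path such as "finapp/data").
    """
    if PurePosixPath(source).is_absolute():
        return "host_path"
    if _VOLUME_NAME.match(source):
        return "named_volume"
    return "invalid"


if __name__ == "__main__":
    # "/home/user/project/finapp/data" is an illustrative path, not from the thread.
    for source in ("pipelines_storage", "finapp/data", "/home/user/project/finapp/data"):
        print(source, "->", classify_volume_source(source))
```

Unlike docker-compose, which resolves `./finapp/data` relative to the compose file, the key in `container_kwargs.volumes` is passed to the Docker API as written, so a relative path ends up being treated as an invalid volume name.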

daniel

05/16/2022, 3:45 PM
Hi @chrispc - welcome back. Before we dig into the answer, just wanted to note that we have a support rotation now that's going through posts and making sure they get answered, so there's no need to tag-mention specific team members. Right now the volume needs to be an absolute path - the dagster.yaml file is loaded within a Docker container, which doesn't have any knowledge of the file structure of the machine that launched the container. We can investigate whether there's a way to specify a list of environment variables for the volumes though (instead of putting them inside container_kwargs) - then maybe you could set that environment variable within the docker-compose file. So maybe it could look something like:
run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_PG_USERNAME
      - DAGSTER_PG_PASSWORD
      - DAGSTER_PG_HOST
      - DAGSTER_PG_DB
      - PGPORT_BACKEND
    network: dagster_network
    volumes:
       pipelines_storage:
         bind:
           env: PIPELINES_STORAGE_LOCATION
         mode: rw
and then you would set the value of PIPELINES_STORAGE_LOCATION to the right absolute location within the docker-compose file? That's not possible today, but I don't think it would be too hard to add.

chrispc

05/16/2022, 8:45 PM
Oh, thanks @daniel. I really appreciate your help. I am trying to teach Dagster to others in an MLOps bootcamp, which is why I am dealing with this. But, for sure, I am not going to tag-mention anymore. I will take a look and see if I can do this so I can make a PR.

daniel

05/16/2022, 10:09 PM
I think that could be a great PR if you're interested - the steps would be: add a new "volumes" key here: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-docker/dagster_docker/docker_run_launcher.py#L61-L62 (it's ok if the new key is just in the RunLauncher config for now), make its type match the format that Docker volumes expect, and then use StringSource as the type for any of the strings (like "bind") so that environment variables can be set there.
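As a rough illustration of the idea, the sketch below resolves `{env: NAME}` references inside a volumes mapping before it would be handed to Docker. It is plain Python and does not use Dagster's config system; the helper name `resolve_env_refs` and the example path are hypothetical. It only mimics the behaviour of a StringSource-style field, which accepts either a literal string or a reference to an environment variable.

```python
import os
from typing import Any, Mapping


def resolve_env_refs(value: Any, environ: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace {"env": "NAME"} mappings with environ["NAME"].

    Hypothetical helper: literal strings and other values pass through
    unchanged, and mapping keys are never rewritten.
    """
    if isinstance(value, Mapping):
        if set(value.keys()) == {"env"}:
            name = value["env"]
            if name not in environ:
                raise KeyError(f"environment variable {name!r} is not set")
            return environ[name]
        return {key: resolve_env_refs(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item, environ) for item in value]
    return value


if __name__ == "__main__":
    # Mirrors the proposed YAML from earlier in the thread.
    proposed = {
        "pipelines_storage": {
            "bind": {"env": "PIPELINES_STORAGE_LOCATION"},
            "mode": "rw",
        }
    }
    # Illustrative value; in practice it would be set in the docker-compose file.
    fake_environ = {"PIPELINES_STORAGE_LOCATION": "/srv/pipelines_storage"}
    print(resolve_env_refs(proposed, fake_environ))
```

One thing to keep in mind when choosing the final schema: in the dictionary form that the Docker Python SDK accepts, the key is the host path or volume name and `bind` is the path inside the container, so a host directory taken from an environment variable would most likely need to be resolvable in the key position as well.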

chrispc

05/17/2022, 3:47 PM
Hi @daniel, I am curious about the best place to add the volumes keyword. I am looking at this: https://github.com/dagster-io/dagster/blob/80153e0ecbf58f6e92754921c113a4fc8de556f[…]python_modules/libraries/dagster-docker/dagster_docker/utils.py and this: https://github.com/dagster-io/dagster/blob/80153e0ecbf58f6e92754921c113a4fc8de556f[…]es/libraries/dagster-docker/dagster_docker/container_context.py I am not sure why you have the "network" keyword in one place and "networks" in the other. Is there any difference between the two?

daniel

05/17/2022, 4:05 PM
I think adding it to DOCKER_CONTAINER_CONTEXT_SCHEMA makes sense - network was added before networks, so it's just a backwards compatibility thing.

chrispc

05/17/2022, 4:20 PM
@David Aronchick