# announcements
Andrew
Hello, I adapted the configuration from the Docker multi-container deployment example linked in your docs, and currently have it hosted on a modest EC2 instance for development. We plan on moving to ECS later. Dagit is finding all the repos successfully under the Status tab. However, when I attempt to run a pipeline, I get an error: "No docker image specified by the instance config or repository" (see attached screenshot). I have set the $COMPOSE_PROJECT_NAME environment variable correctly via the `.env` file, which is loaded by `docker-compose.yml`. I have also verified the $DAGSTER_CURRENT_IMAGE env var is set in the pipelines container by shelling into it while it's running. Here are the running containers as output by `docker ps`:
```
CONTAINER ID   IMAGE                                          COMMAND                  CREATED        STATUS        PORTS                    NAMES
9360afd792f3   dagster-data-orchestration_dagster_daemon      "dagster-daemon run"     18 hours ago   Up 18 hours                            dagster_daemon
606fbbba144c   dagster-data-orchestration_dagster_dagit       "dagit -h 0.0.0.0 -p…"   18 hours ago   Up 18 hours   0.0.0.0:3000->3000/tcp   dagster_dagit
85b25377d7c3   postgres:11                                    "docker-entrypoint.s…"   18 hours ago   Up 18 hours   0.0.0.0:5432->5432/tcp   dagster_postgresql
51c5d0f47b5b   dagster-data-orchestration_dagster_pipelines   "dagster api grpc -h…"   18 hours ago   Up 18 hours   0.0.0.0:4000->4000/tcp   dagster_pipelines
```
And here is the value of $DAGSTER_CURRENT_IMAGE inside the running `dagster-data-orchestration_dagster_pipelines` container (retrieved by shelling in with `docker exec -it 51c5d0f47b5b /bin/sh`):
```
# echo $DAGSTER_CURRENT_IMAGE
dagster-data-orchestration_dagster_pipelines
```
And here is my full `docker-compose.yml`:
```yaml
version: "3.7"

services:
  # This service runs the postgres DB used by dagster for run storage, schedule storage,
  # and event log storage.
  dagster_postgresql:
    image: postgres:11
    container_name: dagster_postgresql
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: "postgres_user"
      POSTGRES_PASSWORD: "postgres_password"
      POSTGRES_DB: "postgres_db"
    networks:
      - dagster_network

  # This service runs the gRPC server that loads and executes your pipelines, in both dagit
  # and dagster-daemon. By setting DAGSTER_CURRENT_IMAGE to its own image, we tell the
  # run launcher to use this same image when launching runs in a new container as well.
  # Multiple containers like this can be deployed separately - each just needs to run on
  # its own port, and have its own entry in the workspace.yaml file that's loaded by dagit.
  dagster_pipelines:
    build:
      context: .
      dockerfile: ./Dockerfile_pipelines
    container_name: dagster_pipelines
    ports:
      - "4000:4000"
    env_file:
      - .env
    environment:
      DAGSTER_POSTGRES_USER: "postgres_user"
      DAGSTER_POSTGRES_PASSWORD: "postgres_password"
      DAGSTER_POSTGRES_DB: "postgres_db"
      DAGSTER_CURRENT_IMAGE: "${COMPOSE_PROJECT_NAME}_dagster_pipelines"
    networks:
      - dagster_network

  # This service runs dagit, which loads the pipelines from the user code container.
  # Since our instance uses the QueuedRunCoordinator, any runs submitted from dagit will be put on
  # a queue and later dequeued and launched by dagster-daemon.
  dagster_dagit:
    build:
      context: .
      dockerfile: ./Dockerfile_dagster
    entrypoint:
      - dagit
      - -h
      - "0.0.0.0"
      - -p
      - "3000"
      - -w
      - workspace.yaml
    container_name: dagster_dagit
    expose:
      - "3000"
    ports:
      - "3000:3000"
    env_file:
      - .env
    environment:
      DAGSTER_POSTGRES_USER: "postgres_user"
      DAGSTER_POSTGRES_PASSWORD: "postgres_password"
      DAGSTER_POSTGRES_DB: "postgres_db"
    networks:
      - dagster_network
    depends_on:
      - dagster_postgresql
      - dagster_pipelines

  # This service runs the dagster-daemon process, which is responsible for taking runs
  # off of the queue and launching them, as well as creating runs from schedules or sensors.
  dagster_daemon:
    build:
      context: .
      dockerfile: ./Dockerfile_dagster
    entrypoint:
      - dagster-daemon
      - run
    container_name: dagster_daemon
    env_file:
      - .env
    environment:
      DAGSTER_POSTGRES_USER: "postgres_user"
      DAGSTER_POSTGRES_PASSWORD: "postgres_password"
      DAGSTER_POSTGRES_DB: "postgres_db"
    volumes: # Make docker client accessible so we can launch containers using host docker
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - dagster_network
    depends_on:
      - dagster_postgresql
      - dagster_pipelines

networks:
  dagster_network:
    driver: bridge
    name: dagster_network
```
I am also using the QueuedRunCoordinator in the daemon.
Daniel
Hi Andrew - Very strange! Did you also leave the `workspace.yaml` pretty much the same?
Andrew
I did modify that to better organize our repos. Perhaps I'm doing this incorrectly. Here's my `workspace.yaml`:
```yaml
load_from:
  - python_file: 
      relative_path: workspaces/canonical/canonical_repo.py
  - python_file:
      relative_path: workspaces/salesforce/salesforce_repo.py
```
Both repos and their pipelines, solids, etc. are being found and shown correctly in Dagit. It's just the execution that's failing.
Daniel
Aha! So that's your problem actually. By changing it from `grpc_server` to `python_file`, you're no longer using your `dagster_pipelines` container to serve the pipeline information in dagit (instead, dagit loads the files in its own subprocess to serve them, which has no way of knowing which image to use). For this example at least, you'll want to keep it pointed at your containers to load the pipeline information.
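For reference, a minimal sketch of that, assuming the single `dagster_pipelines` container from the compose file above serving on port 4000 (the `location_name` here is illustrative):

```yaml
# workspace.yaml - load pipeline information from the running gRPC
# container rather than from python files in a local dagit subprocess
load_from:
  - grpc_server:
      host: dagster_pipelines
      port: 4000
      location_name: "pipelines"
```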
Andrew
I should have caught that from the example. facepalm Learning the hard way. Thanks for the help! So if I'm understanding correctly, I should have a separate image for each of `salesforce` and `canonical`, with a corresponding Dockerfile, right?
And then each Dockerfile will be responsible for launching its own gRPC server from the respective repo file?
Daniel
yeah, that's exactly right. you can also have a container that serves more than one repository, if you don't mind updating/deploying them together
(it will pick up any repositories defined in the file that you point the grpc server at)
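A sketch of that layout, with hypothetical service names, Dockerfiles, and ports (the repo paths come from the `workspace.yaml` above) - each user code container serves its own repo file on its own port and sets `DAGSTER_CURRENT_IMAGE` to its own image:

```yaml
# docker-compose.yml fragment (nested under services:) -
# one gRPC server per repository
  dagster_canonical:
    build:
      context: .
      dockerfile: ./Dockerfile_canonical
    command: ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4000",
              "-f", "workspaces/canonical/canonical_repo.py"]
    environment:
      DAGSTER_CURRENT_IMAGE: "${COMPOSE_PROJECT_NAME}_dagster_canonical"
    networks:
      - dagster_network

  dagster_salesforce:
    build:
      context: .
      dockerfile: ./Dockerfile_salesforce
    command: ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4001",
              "-f", "workspaces/salesforce/salesforce_repo.py"]
    environment:
      DAGSTER_CURRENT_IMAGE: "${COMPOSE_PROJECT_NAME}_dagster_salesforce"
    networks:
      - dagster_network
```

and the workspace file points one `grpc_server` entry at each:

```yaml
# workspace.yaml
load_from:
  - grpc_server:
      host: dagster_canonical
      port: 4000
      location_name: "canonical"
  - grpc_server:
      host: dagster_salesforce
      port: 4001
      location_name: "salesforce"
```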
Andrew
OK, awesome!
I have another question related to this. I now have this set up and running, but it appears environment variables are not passed to the ephemeral container when it's created upon pipeline execution. I'm using an env file with the images in the docker-compose file. Is there a solution for this, or should I be handling environment config differently?
Daniel
there's an `env_vars` field on the config for the DockerRunLauncher where you can list environment variables that you want passed in to the ephemeral container - would that work for this?
er, that assumes the variables are already set in the container that launches the run, though
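For reference, a minimal sketch of that stanza in `dagster.yaml` (the variable list here is illustrative):

```yaml
# dagster.yaml - forward these env vars from the launching container
# into each ephemeral run container
run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_POSTGRES_USER
      - DAGSTER_POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
    network: dagster_network
```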
Andrew
Perfect, that worked. Thanks!
Kieran
Hi, I am also getting this same error (`Exception: No docker image specified by the instance config or repository`). Any help would be appreciated. `docker-compose.yml`:
```yaml
version: "3.7"

services:
  docker_example_postgresql:
    image: postgres:11
    container_name: docker_postgresql
    environment:
      DAGSTER_POSTGRES_USER: "${POSTGRES_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      DAGSTER_POSTGRES_DB: "${POSTGRES_DB}"
      DAGSTER_POSTGRES_HOSTNAME: "${POSTGRES_HOSTNAME}"
    networks:
      - docker_example_network

  docker_example_pipelines:
    build:
      context: .
      dockerfile: ./pipelines.Dockerfile
    container_name: docker_example_pipelines
    image: docker_example_pipelines_image
    environment:
      DAGSTER_POSTGRES_USER: "${POSTGRES_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      DAGSTER_POSTGRES_DB: "${POSTGRES_DB}"
      DAGSTER_POSTGRES_HOSTNAME: "${POSTGRES_HOSTNAME}"
      DAGSTER_DAGSTER_CURRENT_IMAGE: "docker_example_pipelines_image"
    networks:
      - docker_example_network

  docker_example_dagit:
    build:
      context: .
      dockerfile: ./dagster.Dockerfile
    entrypoint:
      - dagit
      - -h
      - "0.0.0.0"
      - -p
      - "3000"
      - -w
      - workspace.yaml
    container_name: docker_example_dagit
    expose:
      - "3000"
    ports:
      - "3000:3000"
    environment:
      DAGSTER_POSTGRES_USER: "${POSTGRES_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      DAGSTER_POSTGRES_DB: "${POSTGRES_DB}"
      DAGSTER_POSTGRES_HOSTNAME: "${POSTGRES_HOSTNAME}"
    volumes: 
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - docker_example_network
    depends_on:
      - docker_example_postgresql
      - docker_example_pipelines
  
  docker_example_daemon:
    build:
      context: .
      dockerfile: ./dagster.Dockerfile
    entrypoint:
      - dagster-daemon
      - run
    container_name: docker_example_daemon
    restart: on-failure
    environment:
      DAGSTER_POSTGRES_USER: "${POSTGRES_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      DAGSTER_POSTGRES_DB: "${POSTGRES_DB}"
      DAGSTER_POSTGRES_HOSTNAME: "${POSTGRES_HOSTNAME}"
    volumes: 
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - docker_example_network
    depends_on:
      - docker_example_postgresql
      - docker_example_pipelines

networks:
  docker_example_network:
    driver: bridge
    name: docker_example_network
```
`workspace.yml`:
```yaml
load_from:
  - grpc_server:
      host: docker_example_pipelines
      port: 4000
      location_name: "example_pipelines"
```
`dagster.yaml`:
```yaml
scheduler:
  module: dagster.core.scheduler
  class: DagsterDaemonScheduler

run_coordinator:
  module: dagster.core.run_coordinator
  class: QueuedRunCoordinator

run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_POSTGRES_HOSTNAME
      - DAGSTER_POSTGRES_USER
      - POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
    network: docker_example_network

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      hostname: 
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432

schedule_storage:
  module: dagster_postgres.schedule_storage
  class: PostgresScheduleStorage
  config:
    postgres_db:
      hostname:  
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432

event_log_storage:
  module: dagster_postgres.event_log
  class: PostgresEventLogStorage
  config:
    postgres_db:
      hostname:  
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432
```
Daniel
Hi Kieran - you want your docker_example_pipelines container to set the `DAGSTER_CURRENT_IMAGE` env var so that dagster knows which image to load
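(In the compose file above, the variable is spelled `DAGSTER_DAGSTER_CURRENT_IMAGE`, so presumably the fix is just renaming it in the `docker_example_pipelines` service:)

```yaml
    environment:
      # was DAGSTER_DAGSTER_CURRENT_IMAGE - dagster only reads DAGSTER_CURRENT_IMAGE
      DAGSTER_CURRENT_IMAGE: "docker_example_pipelines_image"
```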
Marc
Hi Daniel & Kieran, I am completely new to Docker and Dagster. Could you tell me which line you changed and what value you set it to? Or just post the version of the code that you used to get it to work.
Daniel
Hi Marc! I think this example is a good overview of a docker setup: https://github.com/dagster-io/dagster/tree/0.12.15/examples/deploy_docker and the accompanying docs: https://docs.dagster.io/deployment/guides/docker#multi-container-docker-deployment In Kieran's case, he needed to set DAGSTER_CURRENT_IMAGE in his user code container, like the example does here: https://github.com/dagster-io/dagster/blob/0.12.15/examples/deploy_docker/docker-compose.yml#L32
Marc
Thank you! I'll give these an in-depth look