Hello, everyone! I'm new to Dagster and working on...
# deployment-ecs
j
Hello, everyone! I'm new to Dagster and working on getting a version working on ECS. I'm following the tutorial, but I am not using docker-compose. I have Docker images for dagit, the daemon, and the user code, all on ECR. I'm scripting out my services and task definitions as JSON. I have dagit running as an ECS service, and I have ECS tasks running for dagit, the daemon, and the user code. The dagit UI and daemon are working, but I am running into an error with my user code task. In the UI, under the `Deployment` tab and `Code locations`, _example_jobs_ has a Failed status with this error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "DNS resolution failed for user_code:4000: C-ares status is not ARES_SUCCESS qtype=A name=user_code is_balancer=0: Domain name not found"
The `workspace.yaml` looks like:
load_from:
  # Each entry here corresponds to a service in the docker-compose file that exposes jobs.
  - grpc_server:
      host: user_code
      port: 4000
      location_name: "example_jobs"
I have no experience with or knowledge of gRPC, so my guess is that I have to set up the networking for the user code to communicate with the daemon, but I'm not sure where to even start with that. Does anyone have an idea or any resources to point me in the right direction? Let me know if there's extra info I can add.
Additional information:
• I have my dagit service running with a target group: Protocol: HTTP, Port: 80, Protocol version: HTTP1
• Here's a redacted version of the user code task definition:
{
  "family": "user_code",
  "executionRoleArn": "<TaskRole>",
  "taskRoleArn": "<TaskRole>",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "user_code",
      "image": "<ECR REPO>/deploy_ecs/user_code:latest",
      "environment": [
        {
          "name": "DAGSTER_CURRENT_IMAGE",
          "value": "<ECR REPO>/deploy_ecs/user_code"
        }
      ],
      "secrets": [
        {
          "valueFrom": "arn:aws:secretsmanager:XXX",
          "name": "DAGSTER_POSTGRES_PASSWORD"
        },
        {
          "valueFrom": "arn:aws:secretsmanager:XXX",
          "name": "DAGSTER_POSTGRES_HOSTNAME"
        },
        {
          "valueFrom": "arn:aws:secretsmanager:XXX",
          "name": "DAGSTER_POSTGRES_USER"
        }
      ]
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "256",
  "memory": "512"
}
• Here is my `dagster.yaml`:
---
scheduler:
  module: dagster.core.scheduler
  class: DagsterDaemonScheduler

run_coordinator:
  module: dagster.core.run_coordinator
  class: QueuedRunCoordinator

run_launcher:
  module: dagster_aws.ecs
  class: EcsRunLauncher
  config:
    include_sidecars: true
    secrets_tag: ""

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name: dagster_test
      port: 5432

schedule_storage:
  module: dagster_postgres.schedule_storage
  class: PostgresScheduleStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name: dagster_test
      port: 5432

event_log_storage:
  module: dagster_postgres.event_log
  class: PostgresEventLogStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name: dagster_test
      port: 5432
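For anyone debugging the same `DNS resolution failed for user_code:4000` error, here is a minimal sketch (not from the tutorial) of a check that could be run from inside the dagit or daemon container. It separates "the name `user_code` does not resolve" from "the name resolves but nothing is listening on port 4000". The function name `check_grpc_host` is made up for illustration; it only uses the Python standard library and does not speak gRPC, so it only tests DNS and a plain TCP connect.

```python
import socket


def check_grpc_host(host, port, timeout=3.0):
    """Return a short diagnosis string for host:port (hypothetical helper)."""
    # Step 1: DNS. This is the step that fails in the error above.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"DNS resolution failed for {host}: {exc}"

    # First resolved address, kept only for the report.
    addr = infos[0][4][0]

    # Step 2: plain TCP connect. This does not perform a gRPC handshake.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return f"{host} resolved to {addr} and port {port} is open"
    except OSError as exc:
        return f"{host} resolved to {addr} but port {port} is not reachable: {exc}"


# Example call with the values from workspace.yaml:
# print(check_grpc_host("user_code", 4000))
```

If the first message comes back, the problem is name resolution between the tasks rather than the gRPC server itself, which matches the `Domain name not found` detail in the traceback.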
p
Hi Jeff. I think you have to set up a bridge network for the user code containers to be able to communicate with the dagit/daemon containers. You should be able to look at the Docker docs for setting this up: https://docs.docker.com/network/network-tutorial-standalone/#use-user-defined-bridge-networks This would correspond to the networks section in the `docker-compose.yml` file.
j
Thanks @prha! I'll take a look. I'm also wondering if my target group is incorrect. I'm using the HTTP1 protocol version, but I'm noticing now that there is also HTTP2 or gRPC. Should I use one of these?
I also see in the AWS docs that "For Amazon ECS tasks on Fargate, the awsvpc network mode is required."
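The AWS requirement quoted above can be turned into a quick sanity check on a scripted task definition. This is only a sketch: `fargate_network_mode_ok` is a hypothetical helper, and the dictionary is a trimmed copy of the task definition earlier in the thread.

```python
def fargate_network_mode_ok(task_def):
    """Return True unless the task targets Fargate without awsvpc networking."""
    needs_fargate = "FARGATE" in task_def.get("requiresCompatibilities", [])
    # Fargate tasks must use awsvpc; other launch types are not checked here.
    return (not needs_fargate) or task_def.get("networkMode") == "awsvpc"


# Trimmed version of the user_code task definition from this thread.
user_code_task = {
    "family": "user_code",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
}

print(fargate_network_mode_ok(user_code_task))
```

A task definition that set `"networkMode": "bridge"` together with `"FARGATE"` would fail this check.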