#dagster-ecs

Brendan Couche

01/26/2023, 4:55 PM
Hi everyone, I'm working on getting a pretty bare-bones deployment running in ECS on Fargate. I started with the fully-featured template project, replaced the dbt models with our own, and set up a basic config of the `EcsRunLauncher`. I've packaged everything up in the same container (`dagit`, `dagster-daemon`, and pipeline code). Infrastructure is managed with Terraform and everything seems to deploy cleanly. Unfortunately, when I try to kick off a task I'm greeted with the following exception:
dagster._check.ParameterCheckError: Param "image" is not a str. Got None which is type <class 'NoneType'>.
  File "/opt/venv/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1966, in launch_run
    self.run_launcher.launch_run(LaunchRunContext(pipeline_run=run, workspace=workspace))
  File "/opt/venv/lib/python3.10/site-packages/dagster_aws/ecs/launcher.py", line 330, in launch_run
    run_task_kwargs = self._run_task_kwargs(run, image, container_context)
  File "/opt/venv/lib/python3.10/site-packages/dagster_aws/ecs/launcher.py", line 513, in _run_task_kwargs
    task_definition_config = DagsterEcsTaskDefinitionConfig.from_task_definition_dict(
  File "/opt/venv/lib/python3.10/site-packages/dagster_aws/ecs/tasks.py", line 120, in from_task_definition_dict
    return DagsterEcsTaskDefinitionConfig(
  File "/opt/venv/lib/python3.10/site-packages/dagster_aws/ecs/tasks.py", line 53, in __new__
    check.str_param(image, "image"),
  File "/opt/venv/lib/python3.10/site-packages/dagster/_check/__init__.py", line 1365, in str_param
    raise _param_type_mismatch_exception(obj, str, param_name, additional_message)
I'm using the default run coordinator and the following config for `EcsRunLauncher`:
run_launcher:
  module: dagster_aws.ecs
  class: EcsRunLauncher
  config:
    container_name: "dagster_task"
    include_sidecars: true
Any help would be much appreciated 🙂
From some digging in the code and the Dagster DB, I'm guessing that this is the culprit, from `runs.run_body` (DB) on one of the tasks:
"pipeline_code_origin": {
    "__class__": "PipelinePythonOrigin",
    "pipeline_name": "__ASSET_JOB",
    "repository_origin": {
        "__class__": "RepositoryPythonOrigin",
        "code_pointer": {
            "__class__": "PackageCodePointer",
            "attribute": "emotive_repository",
            "module": "data_director",
            "working_directory": "/home/app/dagster_app"
        },
        "container_context": {},
        "container_image": null,
        "entry_point": ["dagster"],
        "executable_path": "/opt/venv/bin/python"
    }
}
Unfortunately, I'm not sure what I need to do to get `container_image` populated 🙂
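For anyone wanting to check their own runs for the same symptom, here is a minimal sketch of how the stored run body could be scanned for an unset `container_image`. It assumes the `run_body` column holds JSON text; the helper name `find_key` and the trimmed sample are illustrative only and not part of Dagster.

```python
import json
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth of parsed JSON."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the key may also appear deeper in the tree.
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Trimmed copy of the fragment above, wrapped in braces so that it parses.
run_body = json.loads("""
{
  "pipeline_code_origin": {
    "__class__": "PipelinePythonOrigin",
    "pipeline_name": "__ASSET_JOB",
    "repository_origin": {
      "__class__": "RepositoryPythonOrigin",
      "container_context": {},
      "container_image": null,
      "entry_point": ["dagster"]
    }
  }
}
""")

images = list(find_key(run_body, "container_image"))
# JSON null becomes Python None, which is what the str check in the traceback rejects.
missing = [image for image in images if not isinstance(image, str)]
print(f"found {len(images)} container_image value(s), {len(missing)} unset")
```

The search is recursive on purpose, since the fragment does not show how deeply `pipeline_code_origin` is nested inside the full run body.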

Mike Atlas

01/26/2023, 5:15 PM
`image` is one of the fields in an ECS Task Definition (which are kinda like Kubernetes manifests)
in this case it should be set to your code location image
it's under `container_definitions`

Brendan Couche

01/26/2023, 5:16 PM
I'm using the same image for everything at the moment...and it appears to be defined in the Terraform 😕
Quick rundown of what I'm doing:
1. I'm building a single Dockerfile and reusing it with different entry points
2. I've got two ECS services (`dagster-daemon` and `dagster-web`)
3. I'm using the `DefaultRunCoordinator` to spin tasks up immediately with the above `EcsRunLauncher` config. My understanding was that this would mean inspecting the existing task definition and running a duplicate (with a different entrypoint) for tasks that are spun up.
I can paste a redacted version of the task definition JSON

Mike Atlas

01/26/2023, 5:20 PM
https://docs.dagster.io/deployment/guides/aws#launching-runs-in-ecs you can pass in your own task definition (by ARN:version)
at least, that's what we (my team runs an ecs deployment ourselves) are doing

Brendan Couche

01/26/2023, 5:26 PM
Cool, I'll give that a shot shortly. For reference, here's the redacted copy of the task definition JSON.
{
    "taskDefinitionArn": "arn:aws:ecs:us-west-2:<ACCT>:task-definition/dagster-daemon:15",
    "containerDefinitions": [
        {
            "name": "dagster-daemon",
            "image": "<ACCT>.dkr.ecr.us-west-2.amazonaws.com/dagster:<COMMIT>",
            "cpu": 0,
            "portMappings": [
                {
                    "containerPort": 8000,
                    "hostPort": 8000,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "command": [
                "dagster-daemon",
                "run"
            ],
            "environment": [
                ...
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "linuxParameters": {
                "initProcessEnabled": true
            },
            "secrets": [
                ...
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/dev/data-platform/dagster-daemon",
                    "awslogs-region": "us-west-2",
                    "awslogs-stream-prefix": "dagster-daemon"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "ps -eax | grep -v grep | grep dagster-daemon || exit 1"
                ],
                "interval": 60,
                "timeout": 30,
                "retries": 3,
                "startPeriod": 5
            }
        }
    ],
    "family": "dev-dagster-daemon",
    "taskRoleArn": "arn:aws:iam::<ACCT>:role/<role-name>",
    "executionRoleArn": "arn:aws:iam::<ACCT>:role/<role-name>",
    "networkMode": "awsvpc",
    "revision": 15,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.container-health-check"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "ecs.capability.secrets.ssm.environment-variables"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
        },
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.25"
        },
        {
            "name": "ecs.capability.extensible-ephemeral-storage"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "4096",
    "memory": "30720",
    "ephemeralStorage": {
        "sizeInGiB": 40
    },
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "2023-01-26T16:20:28.900Z",
    "registeredBy": "arn:aws:sts::<ACCT>:assumed-role/release/aws-go-sdk-1674750028660011218",
    "tags": [
        ...
    ]
}

Mike Atlas

01/26/2023, 5:27 PM
your task image doesn't need the daemon
err, you don't want to deploy the daemon image alone

Brendan Couche

01/26/2023, 5:27 PM
Right, I've got a task definition each for the daemon and dagit
but they use the same image
I can create another definition for the tasks easily enough, will do that, specify it manually, and report back 🙂
Welp, this approach gives me a different error, but I think it helps narrow down what's happening. It would appear that `boto3.client("ecs").describe_task_definition(...)` isn't returning anything in the `containerDefinitions` property and, as a result, Dagster freaks out
I've got no idea why that'd be, but it gives me something to pick at
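One thing worth ruling out when `containerDefinitions` looks empty is where it sits in the response: boto3 nests the task definition under a top-level `taskDefinition` key. A small sketch of listing the container names follows; the helper name `container_names` is hypothetical, and the sample response is hand-built from the redacted JSON earlier in the thread rather than a real API result.

```python
from typing import Any, Dict, List


def container_names(response: Dict[str, Any]) -> List[str]:
    """List container names from a describe_task_definition style response."""
    # The definition itself lives under "taskDefinition", not at the top level.
    task_definition = response.get("taskDefinition", {})
    return [c["name"] for c in task_definition.get("containerDefinitions", [])]


# With AWS credentials available, the response would come from something like:
#   import boto3
#   response = boto3.client("ecs").describe_task_definition(
#       taskDefinition="<family>:<revision>"
#   )
# Here a hand-built stand-in is used so the sketch runs offline.
response = {
    "taskDefinition": {
        "family": "dev-dagster-daemon",
        "containerDefinitions": [
            {
                "name": "dagster-daemon",
                "image": "<ACCT>.dkr.ecr.us-west-2.amazonaws.com/dagster:<COMMIT>",
            }
        ],
    }
}

print(container_names(response))
```

An empty list from this helper would point at the task definition itself, while a non-empty list that lacks the expected name would point at the launcher config.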

johann

01/26/2023, 11:22 PM
Does your manually specified task def include a container definition?

Brendan Couche

01/26/2023, 11:22 PM
it does
I've got a potential facepalm moment that I'm checking right now 🙂
There's definitely a container definition defined - but the container name may not be aligned
Progress! That does appear to be it 😞

Mitchell Hynes

01/27/2023, 2:31 AM
I also ran into this minutes ago... I’m using the `docker compose` ECS example. I’ve changed very little, except that my `user_code` is a separate compose definition, so we can have multiple code locations
The first compose looks like this:
services:
  user_code:
    platform: linux/amd64
    build:
      context: .
      dockerfile: ./Dockerfile
      target: user_code
    image: "$REGISTRY_URL/deploy_ecs/user_code"
    container_name: user_code
    command: "./env-wrap.sh dagster api grpc -h 0.0.0.0 -p 4000 -f pipelines.py"
Here’s the other one
services:
  dagit:
    platform: linux/amd64
    build:
      context: .
      dockerfile: ./Dockerfile
      target: dagit
    image: "$REGISTRY_URL/deploy_ecs/dagit"
    container_name: dagit
    command: "./env-wrap.sh dagit -h 0.0.0.0 -p 3000 -w workspace.yaml"
    ...
  daemon:
    platform: linux/amd64
    build:
      context: .
      dockerfile: ./Dockerfile
      target: dagster
    image: "$REGISTRY_URL/deploy_ecs/daemon"
    container_name: daemon
    command: "./env-wrap.sh dagster-daemon run"
Sorry for muddying up the thread

Brendan Couche

01/27/2023, 3:09 AM
The issue I encountered was that the `container_name` specified for the `EcsRunLauncher` was different from the one defined in the `containerDefinitions` property of the ECS task definition. I'm not trying the compose route, so I'm not sure how much help that'll be 😞
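A pre-flight check along these lines might catch the mismatch before a run is launched. This is only a sketch under assumptions: `image_for_container` is a hypothetical helper, not Dagster's own lookup, and the names below are reused from earlier in the thread purely as sample data.

```python
from typing import Any, Dict


def image_for_container(task_definition: Dict[str, Any], container_name: str) -> str:
    """Return the image of the named container, or fail with a readable message."""
    containers = task_definition.get("containerDefinitions", [])
    for container in containers:
        if container.get("name") == container_name:
            image = container.get("image")
            if not isinstance(image, str):
                raise ValueError(f"container {container_name!r} has no image set")
            return image
    known = [c.get("name") for c in containers]
    raise LookupError(
        f"container_name {container_name!r} not found; task definition defines {known}"
    )


# Sample data: the container name from the redacted task definition above.
task_definition = {
    "containerDefinitions": [
        {
            "name": "dagster-daemon",
            "image": "<ACCT>.dkr.ecr.us-west-2.amazonaws.com/dagster:<COMMIT>",
        }
    ]
}

# "dagster_task" is the container_name from the launcher config earlier on.
for name in ("dagster-daemon", "dagster_task"):
    try:
        print(name, "->", image_for_container(task_definition, name))
    except LookupError as err:
        print(name, "->", err)
```

Failing with the list of known names makes the misalignment obvious, which is easier to act on than a `None` image surfacing later in a type check.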