# deployment-ecs

Bianca Rosa

03/13/2023, 5:06 PM
When specifying the task_definition_arn in DAGSTER_CONTAINER_CONTEXT, do I need to specify the task revision? Currently hitting
botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the RunTask operation: TaskDefinition not found.
without the task rev.
Also, is there any way to see more logs to check which task def name was used?

johann

03/13/2023, 8:14 PM
Could you share the full stack trace for the "TaskDefinition not found." error? Wondering if it’s in launch_run

Bianca Rosa

03/13/2023, 8:15 PM
It is
botocore.errorfactory.InvalidParameterException: An error occurred (InvalidParameterException) when calling the RunTask operation: TaskDefinition not found.
  File "/usr/local/lib/python3.10/site-packages/dagster/_daemon/run_coordinator/queued_run_coordinator_daemon.py", line 335, in _dequeue_run
    instance.run_launcher.launch_run(LaunchRunContext(dagster_run=run, workspace=workspace))
  File "/usr/local/lib/python3.10/site-packages/dagster_aws/ecs/launcher.py", line 394, in launch_run
    response = self.ecs.run_task(**run_task_kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.10/site-packages/ddtrace/contrib/botocore/patch.py", line 377, in patched_api_call
    result = original_func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)

johann

03/13/2023, 8:15 PM
And what dagster version?

Bianca Rosa

03/13/2023, 8:15 PM
Latest!
Lemme confirm
dagit = "1.1.21" # "1.0.12"
dagster = "1.1.21" # "1.0.12" # "0.15.5"
dagster-aws = "0.17.21" # "0.16.12"
dagster-docker = "0.17.21" # "0.16.12"
dagster-mysql = "0.17.21" # "0.16.12"

johann

03/13/2023, 8:25 PM
And how are you deployed? Are you using https://github.com/dagster-io/dagster/tree/1.2.1/examples/deploy_ecs or something else?

Bianca Rosa

03/13/2023, 8:25 PM
So we have a mix of Terraformed ECS config and deploys through CircleCI.
We had been using 0.15.9 with one code location and its custom task def, but then upgraded to 1.1.21 to use 2 code locations, each with its own custom task def.
So I’m trying to get this setup working.

johann

03/13/2023, 8:28 PM
Could you share what you’re passing for container context? (with anything sensitive removed)

Bianca Rosa

03/13/2023, 8:34 PM
yep
ENV TASK_DEF_ARN=arn:aws:ecs:us-west-2:<redacted>:task-definition/$ENVIRONMENT-ml-workflows-runs
ENV CONTAINER_NAME=$ENVIRONMENT-ml-workflows-runs
ENV DAGSTER_CONTAINER_CONTEXT='{"ecs":{"task_definition_arn":"'$TASK_DEF_ARN'","container_name":"'$CONTAINER_NAME'"}}'
EXPOSE 4000
Screen Shot 2023-03-13 at 17.35.27.png
(we print DagsterInstance.get().info_dict() on startup) ⬆️
I tried adding the task def version too, but it looks as though it didn't have any effect.
We also have this policy allowing dagsterdaemon to describe task defs:
# Account wide settings. These resources cannot be filtered.
      {
        Action = [
          "ec2:DescribeNetworkInterfaces",
          "ecs:ListAccountSettings",
          "ecs:DescribeTaskDefinition",
          "ecs:RegisterTaskDefinition",
          "secretsmanager:ListSecrets"
        ],
        Resource = [
          "*"
        ],
        Effect = "Allow"
      },
Any way I can increase logs to have more information here?
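One way to get more visibility without extra Dagster logging might be to inspect the environment variable directly inside the code location container. The sketch below is not part of Dagster: inspect_container_context and ARN_PATTERN are hypothetical names, and the pattern is an assumption about the shape of a task definition ARN (a family name, optionally followed by :revision). It parses DAGSTER_CONTAINER_CONTEXT as JSON and reports whether the ARN looks well-formed, which would flag a variable such as $ENVIRONMENT that was never expanded.

```python
import json
import os
import re

# Assumed shape of an ECS task definition ARN, with an optional ":<revision>" suffix.
ARN_PATTERN = re.compile(
    r"^arn:aws[\w-]*:ecs:(?P<region>[\w-]+):(?P<account>\d{12}):"
    r"task-definition/(?P<family>[A-Za-z0-9_-]+)(?::(?P<revision>\d+))?$"
)


def inspect_container_context(raw):
    """Summarize the "ecs" section of a DAGSTER_CONTAINER_CONTEXT JSON string."""
    context = json.loads(raw)
    ecs = context.get("ecs", {})
    arn = ecs.get("task_definition_arn", "")
    match = ARN_PATTERN.match(arn)
    return {
        "task_definition_arn": arn,
        "container_name": ecs.get("container_name"),
        # False when the ARN contains characters such as "$" or "<",
        # for example an unexpanded shell variable.
        "well_formed": match is not None,
        "family": match.group("family") if match else None,
        "revision": match.group("revision") if match else None,
    }


if __name__ == "__main__":
    # Run inside the code location container to see what it actually received.
    raw = os.environ.get("DAGSTER_CONTAINER_CONTEXT", "{}")
    print(inspect_container_context(raw))
```

On the revision question: as far as the ECS RunTask API is concerned, a task definition given without a revision resolves to the latest ACTIVE revision, so a missing revision alone would not normally be expected to produce "TaskDefinition not found."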

johann

03/14/2023, 6:18 PM
Not currently, unfortunately. So you have a qa-ml-workflows-runs task def specified in your run launcher, and in the container context of one grpc server you’re overriding that with another?
What happens if you don’t pass the container context (and therefore use the task def configured on the run launcher)?

Bianca Rosa

03/14/2023, 6:31 PM
I think I’ve removed the one in the run_launcher and left just the container context.
The issue here is that the run launcher gives one single task def, and we would like each code repository to have its own task def.
If I successfully remove the one from the run_launcher, it won't show here, right? https://dagster.slack.com/archives/C014UDS8LAV/p1678739857226159?thread_ts=1678727160.406249&cid=C014UDS8LAV

johann

03/14/2023, 6:47 PM
That’s correct. I was just wondering whether setting the task def via the run launcher was working?

Bianca Rosa

03/14/2023, 8:26 PM
Oh yes!
It works. We’ve been using it forever and the upgrade didn't change that.

johann

03/14/2023, 8:47 PM
Got it. I’m curious whether setting the new task def (the one you’re trying to use in the container context) on the run launcher will work, or whether it will fail with the task def not found. That would tell us whether the problem is with the new task def itself or with the container context.

Bianca Rosa

03/14/2023, 8:48 PM
Oooh gotcha! I just tried that and it looks like it’s working.
Screen Shot 2023-03-14 at 17.48.57.png
It failed for some other unrelated reason but it grabbed the task def!
This is just the run_launcher setting, without the context override.

johann

03/14/2023, 8:53 PM
Strange. So the same task_definition_arn works when set in the run launcher, but not in container context?

Bianca Rosa

03/14/2023, 8:59 PM
Okay, so I think I found the problem (the questions and walkthrough were super helpful, btw). I checked another env, not the one I’ve been messing around with, and it looks like using a regular Docker ENV instead of ONBUILD ENV might have caused the issue.
I am also wondering if we are using the task def for the wrong purpose, because it looks like we just want to override the container image and make sure it's grabbing DAGSTER_CURRENT_IMAGE from the current repo.
Can I change the start timeout from 180s to another value? 🤔

Arnoud van Dommelen

03/15/2023, 10:29 AM
Hi Bianca,
ENV DAGSTER_GRPC_TIMEOUT_SECONDS=300
This environment variable should do this, if I am not mistaken! I also use this variable to increase the schedule evaluation period :)