# dagster-plus
t
Hi! I am using Dagster Cloud Hybrid (on AWS ECS, Dagster version 1.2.4). The code location server shuts down at a certain time because of the TTL setting. When I run a job while the code location server is shut down, I get the following error:
dagster._core.errors.DagsterUserCodeUnreachableError: Timed out waiting for call to user code GET_SUBSET_EXTERNAL_PIPELINE_RESULT [{value}]
New or dormant Branch Deployments can take time to become ready, try again in a little bit.
I know the cause of the error, but I do not know how to change the timeout setting. Could you please tell me how to do that?
d
Hello! You can configure this TTL by setting the following field in your EcsUserCodeLauncher config in your dagster.yaml:
user_code_launcher:
  module: dagster_cloud.workspace.ecs
  class: EcsUserCodeLauncher
  config:
    ...
    server_ttl:
      branch_deployments: <your value here in seconds - the default is 24 hours>
The tradeoff here is that the servers will stay around for longer in your cluster if you increase this setting.
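As a concrete illustration, the resulting dagster.yaml block with the TTL raised to two hours (the 7200 value here is only an example, not a recommendation) would look like:
user_code_launcher:
  module: dagster_cloud.workspace.ecs
  class: EcsUserCodeLauncher
  config:
    server_ttl:
      # Keep idle branch deployment servers alive for 2 hours (the default is 24 hours).
      branch_deployments: 7200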
t
Thank you for your response. Yes, I have made that setting, and I believe that when it is enabled the code location server should shut down (which is the intended behavior). However, after the server shuts down, I encounter the error I mentioned earlier when I try to execute a job. I think this is because the wait for the server to start back up is timing out, which is why I would like to adjust this timeout. I apologize if my English is not clear.
d
Ah, I don't think it's because the startup wait is timing out - I think it's because ECS tasks can take a few minutes to start up. What I'd expect to happen is that if you try again in a couple of minutes, the task will have started up and the job will start.
As soon as you load the code location for the branch deployment in Dagit, it will send a signal to your agent and start spinning the ECS task back up - the error you're seeing will happen if you start a job right away while the task is still starting.
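For what it's worth, if you launch runs from a script rather than from Dagit, one way to absorb that startup window (a minimal sketch, not an official pattern - the hostname, job, repository, and location names below are all placeholders) is to retry the submission with dagster-graphql's DagsterGraphQLClient, which raises DagsterGraphQLClientError when the submission fails:
import time

from dagster_graphql import DagsterGraphQLClient, DagsterGraphQLClientError

# Placeholder endpoint - point this at your own Dagit host.
client = DagsterGraphQLClient("dagit.example.com", port_number=3000)


def submit_with_retry(max_attempts=10, delay_seconds=30):
    # Retry the submission while the ECS task for the code server spins back up.
    for attempt in range(1, max_attempts + 1):
        try:
            # The job, location, and repository names are hypothetical examples.
            return client.submit_job_execution(
                "my_job",
                repository_location_name="my_code_location",
                repository_name="my_repository",
            )
        except DagsterGraphQLClientError:
            # Still failing - wait and try again, re-raising on the last attempt.
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)


run_id = submit_with_retry()
print(f"Launched run {run_id}")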
t
I understand. That's right - after encountering this error once, the job can be executed normally. I had assumed that even if the server is not running, it would start automatically when the job is executed. From now on, I plan to explicitly start the server before running the job (for example, from the deployment menu).
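If that explicit "start the server first" step ever needs to be scripted, one possibility (a sketch under the same placeholder names as above; whether the reload call blocks until the server is fully up may depend on your deployment, so treat this as a starting point) is the reload_repository_location call on the same client, which asks the agent to spin the code server back up before the run is submitted:
from dagster_graphql import DagsterGraphQLClient, ReloadRepositoryLocationStatus

# Placeholder endpoint - same assumptions as the previous sketch.
client = DagsterGraphQLClient("dagit.example.com", port_number=3000)

# Ask the agent to (re)load the code location; for a dormant server this
# kicks off the ECS task, much like opening the location in Dagit does.
result = client.reload_repository_location("my_code_location")
if result.status == ReloadRepositoryLocationStatus.SUCCESS:
    # The location reloaded - submit the job now that the server is reachable.
    run_id = client.submit_job_execution(
        "my_job",
        repository_location_name="my_code_location",
        repository_name="my_repository",
    )
    print(f"Launched run {run_id}")
else:
    print(f"Reload failed: {result.message}")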