# deployment-ecs
t
Hopefully last time I have to post here 🙂. We are trying to use the ECS Run Launcher to run our jobs in ECS, but I get the following error when launching the run from the Launchpad...
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_graphql/implementation/utils.py", line 126, in _fn
    return fn(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 35, in launch_pipeline_execution
    return _launch_pipeline_execution(graphene_info, execution_params)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 69, in _launch_pipeline_execution
    run = do_launch(graphene_info, execution_params, is_reexecuted)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 54, in do_launch
    return graphene_info.context.instance.submit_run(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster/_core/instance/__init__.py", line 2079, in submit_run
    submitted_run = self._run_coordinator.submit_run(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster/_core/run_coordinator/default_run_coordinator.py", line 40, in submit_run
    self._instance.launch_run(pipeline_run.run_id, context.workspace)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster/_core/instance/__init__.py", line 2132, in launch_run
    self.run_launcher.launch_run(LaunchRunContext(dagster_run=run, workspace=workspace))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_aws/ecs/launcher.py", line 370, in launch_run
    run_task_kwargs = self._run_task_kwargs(run, image, container_context)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_aws/ecs/launcher.py", line 557, in _run_task_kwargs
    self._get_current_task(),
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_aws/ecs/launcher.py", line 483, in _get_current_task
    current_task_metadata = self._get_current_task_metadata()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_aws/ecs/launcher.py", line 478, in _get_current_task_metadata
    self._current_task_metadata = get_current_ecs_task_metadata()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/dagster_aws/ecs/tasks.py", line 255, in get_current_ecs_task_metadata
    task_metadata_uri = _container_metadata_uri() + "/task"
Our config block in the dagster.yaml (AWS acct redacted) is...
run_launcher:
  module: "dagster_aws.ecs"
  class: "EcsRunLauncher"
  config:
    use_current_ecs_task_config: false
    run_task_kwargs:
      cluster: arn:aws:ecs:us-west-2:XXXXXXXXX:cluster/produsa-data-science
      launchType: "FARGATE"
Am I missing something in the config? I can post the code for the actual job if needed but figured I would start with this first.
d
Hi Timothy - am I correct in thinking that the process launching the run is not in an ECS task? (which is fine, just confirming)
i think we're not handling the case well where you set "use_current_ecs_task_config" to False but don't configure the task definition to use in some way via the "task_definition" field
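For readers following the traceback: the last frame adds `"/task"` to whatever `_container_metadata_uri()` returns, and the `TypeError` says that value was `None`. Below is a minimal sketch of that failure mode, assuming the helper looks up the container metadata environment variables that ECS injects into tasks (`ECS_CONTAINER_METADATA_URI_V4`, `ECS_CONTAINER_METADATA_URI`). The function names here are hypothetical stand-ins, not the actual `dagster_aws` code.

```python
import os
from typing import Mapping, Optional


def container_metadata_uri(env: Mapping[str, str] = os.environ) -> Optional[str]:
    # Hypothetical stand-in for the library helper. ECS sets these variables
    # inside containers it starts; on a plain EC2 host neither exists, so
    # this returns None.
    return env.get("ECS_CONTAINER_METADATA_URI_V4") or env.get(
        "ECS_CONTAINER_METADATA_URI"
    )


def task_metadata_uri(env: Mapping[str, str] = os.environ) -> str:
    base = container_metadata_uri(env)
    if base is None:
        # Without this guard, `None + "/task"` raises the TypeError
        # seen in the traceback.
        raise RuntimeError(
            "Not running inside an ECS task: no container metadata URI found"
        )
    return base + "/task"
```

Calling `task_metadata_uri({})` raises the explicit `RuntimeError` instead of the opaque `TypeError`, which is roughly what "handling the case well" could look like.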
t
@daniel You probably are. I'm having to decipher code that another group built lol. Below is the @op code snippet with the sensitive stuff redacted...
@op
def extract_tableau(orgs):

    image=f"xxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/dagster_jobs:tableau_extractor-v0.2.0",
    env_vars=[
            f"AWS_ACCESS_KEY_ID={aws_access_key_id}",
            f"AWS_SECRET_ACCESS_KEY={aws_secret_access_key}",
            "EXTRACT_PATH=/opt/extract.hyper",
            "PARQUET_PATH=xxxxxxxxxx/donor_bom.parquet",
            "DATASOURCE_NAME=dagster_test",
            "ORG_EXCLUDE=",
            f"TABLEAU_USER={username}",
            f"TABLEAU_PASS={password}",
            f"TABLEAU_PROJECT=QA",
        ],
    resources={
            "limit_memory": "12Gi",
            "request_memory": "12Gi",
            "limit_cpu": 2,
            "request_cpu": 2,
        },
    labels={
            "fanthreesixty.com/role": "datasci",
            "fanthreesixty.com/app": "airflow",
            "fanthreesixty.com/purpose": "",
            "fanthreesixty.com/environment": "produsa",
        },
    timeout=600,

    print(f"Tableau extract: {orgs}")
d
Got it - where's the run being launched from though?
Or ok here's a question - what prompted you (or the person before you) to set use_current_ecs_task_config to False?
t
I have no idea, to be honest. From what I read, that setting is optional and could probably be taken out. I will say that I get the error whether or not it is in the dagster.yaml file. Also, we have the Dagster daemon and Dagit running on an EC2 instance and want the jobs to spin up and execute in an ECS Fargate cluster.
d
Ah ok - if it's in an EC2 instance and not in an ECS task then you probably do want it to be False. I think this will go away if you set the task_definition field to something non-empty, and i'll come up with a real fix separately. So even something that doesn't actually do anything like
task_definition:
  sidecar_containers: []
(which is a no-op) should work
i.e. try it with this:
run_launcher:
  module: "dagster_aws.ecs"
  class: "EcsRunLauncher"
  config:
    use_current_ecs_task_config: false
    task_definition:
      sidecar_containers: []
    run_task_kwargs:
      cluster: arn:aws:ecs:us-west-2:XXXXXXXXX:cluster/produsa-data-science
      launchType: "FARGATE"
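As a sanity check before restarting Dagit, the rule described above (if `use_current_ecs_task_config` is false, give the launcher a non-empty `task_definition`) can be sketched as a small pre-flight check over the parsed `config` block. This is an illustration only: `check_ecs_launcher_config` is a hypothetical helper, not part of `dagster_aws`, and treating a missing `use_current_ecs_task_config` as true is an assumption.

```python
from typing import Any, Dict, List


def check_ecs_launcher_config(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in an EcsRunLauncher config dict."""
    problems: List[str] = []
    # Assumption: when the key is absent the launcher tries to reuse the
    # current ECS task's configuration.
    uses_current_task = config.get("use_current_ecs_task_config", True)
    if not uses_current_task and not config.get("task_definition"):
        problems.append(
            "use_current_ecs_task_config is false but task_definition is "
            "missing or empty; set it, e.g. to {'sidecar_containers': []}"
        )
    return problems


# The config suggested in this thread, as it would look after YAML parsing.
suggested = {
    "use_current_ecs_task_config": False,
    "task_definition": {"sidecar_containers": []},
    "run_task_kwargs": {
        "cluster": "arn:aws:ecs:us-west-2:XXXXXXXXX:cluster/produsa-data-science",
        "launchType": "FARGATE",
    },
}
original = {k: v for k, v in suggested.items() if k != "task_definition"}
```

Here `check_ecs_launcher_config(suggested)` returns an empty list, while `check_ecs_launcher_config(original)` reports the missing `task_definition`.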
t
I'll give it a shot and report back!
That got me past that error, thanks! Dealing with the proverbial "Param image is not a str" error but I'll dig through here and see how others have resolved that.
d
Do you know what image you want to use to launch the run in?
t
Well, I know that we have our own container images. In the code above they are trying to run image=f"xxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/dagster_jobs:tableau_extractor-v0.2.0"
d
How is dagit loading your Dagster job code currently?
in a lot of ECS deployments you would have a separate ECS task running a grpc server with the code on it
t
Currently our code locations/repos are all located on that EC2 instance. We don't have Dagster/Dagit running in ECS.
d
so you have the code locally on the EC2 box on the same image that dagit / the daemon are using?
we could probably add an "image" key under "task_definition" that could work for this - it's a little risky though, ideally you would have the same image powering both the code that dagit loads and the code that gets run when the job launches, or it's easy for them to get out of sync
we don't have this currently, but i wonder if we had an ecs_task_op like the k8s_job_op here if it might be a better fit for what you're trying to do here: https://docs.dagster.io/_apidocs/libraries/dagster-k8s#dagster_k8s.k8s_job_op
it seems like that's kind of what the op that you're replacing was doing
Does the image that you're hoping to use the run launcher for have the same dagster jobs/assets/etc. defined that dagit is loading?
t
Yeah the biggest problem is that we will have about 40-50 different images for about the same number of jobs that will be getting deployed at random intervals
They have a ton of Airflow jobs they'll be porting over to Dagster. We can point these to a K8s cluster to utilize Fargate profiles if need be, but we were hoping to use Fargate "natively" without the overhead that the K8s API, etc. can bring
Looking at this and with some of what you mentioned it might be better for us to run these in EKS with a Fargate profile. Appreciate the assistance and info @daniel!