When I try to cancel a pipeline I'm running into a...
# ask-community
When I try to cancel a pipeline I'm running into an error that looks like:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/implementation/execution/__init__.py", line 108, in terminate_pipeline_execution
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/run_coordinator/queued_run_coordinator.py", line 252, in cancel_run
    return self._instance.run_launcher.terminate(run_id)
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 675, in run_launcher
    launcher = cast(InstanceRef, self._ref).run_launcher
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/instance/ref.py", line 491, in run_launcher
    return self.run_launcher_data.rehydrate() if self.run_launcher_data else None
  File "/usr/local/lib/python3.10/site-packages/dagster/_serdes/config_class.py", line 99, in rehydrate
    return klass.from_config_value(self, check.not_none(result.value))
  File "/usr/local/lib/python3.10/site-packages/dagster_aws/ecs/launcher.py", line 311, in from_config_value
    return EcsRunLauncher(inst_data=inst_data, **config_value)
  File "/usr/local/lib/python3.10/site-packages/dagster_aws/ecs/launcher.py", line 127, in __init__
    task_definition = self.ecs.describe_task_definition(taskDefinition=self.task_definition)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 943, in _make_api_call
    http, parsed_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 966, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/usr/local/lib/python3.10/site-packages/botocore/endpoint.py", line 119, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python3.10/site-packages/botocore/endpoint.py", line 198, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/usr/local/lib/python3.10/site-packages/botocore/endpoint.py", line 134, in create_request
  File "/usr/local/lib/python3.10/site-packages/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/signers.py", line 105, in handler
    return self.sign(operation_name, request)
  File "/usr/local/lib/python3.10/site-packages/botocore/signers.py", line 189, in sign
  File "/usr/local/lib/python3.10/site-packages/botocore/auth.py", line 418, in add_auth
    raise NoCredentialsError()
but I'm having trouble narrowing down exactly where to look. I follow that it's an issue with AWS credentials but not sure exactly what dagster is trying to do here. Can anyone offer any insight? Thanks very much in advance!
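One way to narrow down a traceback like the one above is to find the innermost frame that is not inside botocore, since that is the application code that triggered the AWS call. The snippet below is only a rough sketch: the helper name `innermost_frame_outside` and the regular expression are illustrative assumptions, not part of Dagster or botocore. Applied to the traceback above, it would point at `dagster_aws/ecs/launcher.py`, line 127, in `__init__`, where the launcher calls `describe_task_definition` while being rehydrated.

```python
import re

# Matches standard Python traceback frame lines:
#   File "<path>", line <n>, in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def innermost_frame_outside(traceback_text, package="botocore"):
    """Return the last frame whose path is not inside `package`, or None.

    Hypothetical helper for illustration. Python tracebacks list the most
    recent call last, so the last matching frame is the innermost one.
    """
    frames = [m.groupdict() for m in FRAME_RE.finditer(traceback_text)]
    marker = f"/site-packages/{package}/"
    outside = [f for f in frames if marker not in f["path"]]
    return outside[-1] if outside else None
```

The returned dictionary has `path`, `line` and `func` keys, all as strings.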
which dagster version are you using?
Hi @Bernardo Cortez I'm running 1.1.21
Are you using the default step launcher?
yes i am
I had a similar problem that was solved by this PR: https://github.com/dagster-io/dagster/pull/11421
Hi! Is the pipeline that you're canceling using any AWS resources?
Hi @Tim Castillo the dags themselves are running on AWS ECS and writing files to a few different S3 buckets. When I let the pipeline run through I am able to see the files in those s3 buckets
Thanks for the response! Sounds like two possible vectors:
• the ECS instances (likely)
• the connections that write to the S3 buckets (less likely)
Let me see what I can dig up about this.
ok great! Thanks very much for the help
Hi @Tim Castillo I haven't been able to get to the bottom of this issue. Wondering if you were able to find anything?
Let me follow up with the team to see if they can solve this!
Thank you!
Hi @Tim Castillo I was wondering if you were able to find anything out here?
Hi @Tim Castillo just checking in on the above to see if you were able to uncover anything...I'm totally stumped
Hi again! Sorry for not replying last time, following up with the team again.
No problem @Tim Castillo I appreciate the help!
@Ben Wilson how do you have dagster deployed and how are you providing aws credentials for it to launch the ECS tasks in the first place?
The first thing that comes to mind is that runs are launched by the daemon process, and cancellations take place in the dagit process. Maybe only the daemon has credentials
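A quick way to test that theory is to compare which credential sources each process can see. The sketch below is a minimal check under stated assumptions: the function name `visible_credential_sources` is made up for illustration, and the variable list is a non-exhaustive subset of what botocore's default credential chain consults (it does not look at `~/.aws/credentials` or instance metadata). On ECS, a container whose task has a task role gets `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` set, so its absence in the dagit container would be consistent with the `NoCredentialsError` above.

```python
import os

# Environment variables that botocore's default credential chain consults.
# Not exhaustive: shared credential files and instance metadata are ignored.
CREDENTIAL_HINTS = {
    "AWS_ACCESS_KEY_ID": "static keys in the environment",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI": "ECS task role",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI": "container credentials endpoint",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "web identity token",
    "AWS_PROFILE": "named profile",
}


def visible_credential_sources(environ=None):
    """List the credential-related variables set in `environ`.

    Hypothetical helper for illustration; defaults to the current
    process environment when no mapping is passed.
    """
    env = os.environ if environ is None else environ
    return [
        f"{name} ({description})"
        for name, description in CREDENTIAL_HINTS.items()
        if env.get(name)
    ]
```

Running `print(visible_credential_sources())` inside both the daemon container and the dagit container and comparing the output should show whether only one of them has credentials available.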
Ah okay thanks @johann I can check on that. I am deploying Dagster on ECS/Fargate with roles defined in terraform but didn't appreciate that separation of duties
Very helpful. I am seeing that I have a task role defined on the daemon with the following permissions, but no task role for the task running dagit. Is there a baseline set of permissions I should use for the dagit instance?
I don’t think we have a list compiled; you could file an issue for that. I think most users don’t run into it since they use the docker compose solution https://github.com/dagster-io/dagster/tree/1.3.2/examples/deploy_ecs. I’d imagine that this particular action needs
Posting back in case it's helpful to someone else. Adding the permissions
seems to address the issue and allows the dagit process/assigned role to cancel the process. Thanks @johann and @Tim Castillo very much for giving me some helpful guidance along the way!