# ask-community
s
Hey Team, our Dagster deployment was fine and then started throwing this error (I haven't encountered it before and I'm not sure what it's telling me):
Error loading artemis_dagster: {'__typename': 'PythonError', 'message': 'Exception: Timed out after waiting 315s for server artemisdagster-prod-b471b4.serverless-agents-namespace-4:4000.\n\nTask logs:\n{"time": "02/Aug/2023:19:03:45 +0000", "log": "_server - Exception calling application: <_InactiveRpcError of RPC that terminated with:\\n\\tstatus = StatusCode.UNAVAILABLE\\n\\tdetails = \\"failed to connect to all addresses\\"\\n\\tdebug_error_string = \\"{\\"created\\":\\"@1691003025.587928869\\",\\"description\\":\\"Failed to pick subchannel\\",\\"file\\":\\"src/core/ext/filters/client_channel/client_channel.cc\\",\\"file_line\\":3260,\\"referenced_errors\\":[{\\"created\\":\\"@1691003025.587927973\\",\\"description\\":\\"failed to connect to all addresses\\",\\"file\\":\\"src/core/lib/transport/error_utils.cc\\",\\"file_line\\":167,\\"grpc_status\\":14}]}\\"\\n>\\nTraceback (most recent call last):\\n  File \\"/usr/local/lib/python3.10/site-packages/grpc/_server.py\\", line 443, in _call_behavior\\n    response_or_iterator = behavior(argument, context)\\n  File \\"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\\", line 147, in Ping\\n    return self._query(\\"Ping\\", request, context)\\n  File \\"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\\", line 133, in _query\\n    )._get_response(api_name, request)\\n  File \\"/dagster/dagster/_grpc/client.py\\", line 130, in _get_response\\n    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)\\n  File \\"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\\", line 946, in __call__\\n    return _end_unary_response_blocking(state, call, False, None)\\n  File \\"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\\", line 849, in _end_unary_response_blocking\\n    raise _InactiveRpcError(state)\\ngrpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:\\n\\tstatus = StatusCode.UNAVAILABLE\\n\\tdetails = \\"failed to connect to all 
addresses\\"\\n\\tdebug_error_string = \\"{\\"created\\":\\"@1691003025.587928869\\",\\"description\\":\\"Failed to pick
It seems to have been resolved with a redeploy, but I'd love to learn why it just crashed and whether there's a way to mitigate it.
t
You're on Dagster Cloud Serverless, right? We can look and see what went wrong.
s
Yep, that's correct. We're on Dagster Cloud Serverless.
t
I'll peek, but feel free to drop a ping if it happens again
👍 1
s
Hey Dagster team, we're getting the same error again:
Exception: Timed out after waiting 315s for server artemisdagster-prod-58c4c2.serverless-agents-namespace-4:4000.

Task logs:
{"time": "03/Aug/2023:18:07:47 +0000", "log": "_server - Exception calling application: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1691086067.818166503\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3260,\"referenced_errors\":[{\"created\":\"@1691086067.818165518\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":167,\"grpc_status\":14}]}\"\n>\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_server.py\", line 443, in _call_behavior\n    response_or_iterator = behavior(argument, context)\n  File \"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\", line 147, in Ping\n    return self._query(\"Ping\", request, context)\n  File \"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\", line 133, in _query\n    )._get_response(api_name, request)\n  File \"/dagster/dagster/_grpc/client.py\", line 130, in _get_response\n    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 946, in __call__\n    return _end_unary_response_blocking(state, call, False, None)\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 849, in _end_unary_response_blocking\n    raise _InactiveRpcError(state)\ngrpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1691086067.818166503\",\"description\":\"Failed to pick 
subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3260,\"referenced_errors\":[{\"created\":\"@1691086067.818165518\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":167,\"grpc_status\":14}]}\"\n>", "status": "ERROR", "logger": "grpc._server"}
{"time": "03/Aug/2023:18:07:48 +0000", "log": "_server - Exception calling application: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1691086068.826795498\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3260,\"referenced_errors\":[{\"created\":\"@1691086068.826794541\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":167,\"grpc_status\":14}]}\"\n>\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_server.py\", line 443, in _call_behavior\n    response_or_iterator = behavior(argument, context)\n  File \"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\", line 147, in Ping\n    return self._query(\"Ping\", request, context)\n  File \"/dagster-cloud/dagster_cloud/pex/grpc/server/server.py\", line 133, in _query\n    )._get_response(api_name, request)\n  File \"/dagster/dagster/_grpc/client.py\", line 130, in _get_response\n    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 946, in __call__\n    return _end_unary_response_blocking(state, call, False, None)\n  File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 849, in
t
Triaging it over!
s
I think I had this same error locally and had to add this to my dagster.yaml file:
code_servers:
  local_startup_timeout: 500
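For context, here's the setting in full in my dagster.yaml, with what I believe it does (the default value in the comment is my understanding, worth double-checking against the docs):

```yaml
# dagster.yaml -- Dagster instance config (open-source / local deployments)
code_servers:
  # Seconds to wait for a code location's gRPC server to start before
  # timing out. Raising it helps when definitions are slow to load,
  # e.g. heavy imports or API calls made at startup.
  local_startup_timeout: 500
```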
t
What version of Dagster are you on? And do you happen to be using any integrations like dbt?
s
I added this to my dagster-cloud.yaml file, hoping it does the same thing:
locations:
  - location_name: artemis_dagster
    code_source:
      module_name: artemis_dagster
code_servers:
  local_startup_timeout: 500
Yeah, we're using dbt-cloud, which I think takes a while for Dagster to start up. We're hoping to migrate over to dbt-core/Snowflake soon but haven't gotten around to it yet.
We're on version 1.3.7
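Putting the two configs side by side as I currently understand them (a sketch; whether dagster-cloud.yaml honors a top-level code_servers block at all is my assumption to verify, not something I've confirmed):

```yaml
# dagster.yaml -- instance config, read by open-source / local Dagster
code_servers:
  local_startup_timeout: 500   # seconds to wait for the code server to start

# dagster-cloud.yaml -- code-location config, read by Dagster Cloud
locations:
  - location_name: artemis_dagster
    code_source:
      module_name: artemis_dagster   # Python module containing the definitions
```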
👍🏽 1