#ask-community

Beau Albiston

06/29/2023, 5:17 AM
Hello! Running into an issue with Hybrid Branch Deployments... I'm probably missing something obvious. My GitHub Action Workflow is based on dagster-io/dagster-cloud-hybrid-quickstart. The prod deployment works fine when merging a PR into main. However, branch deployments fail during on: pull_request. I get the following error with step "Deploy to Dagster Cloud":
Exception: Invalid image <MY DOCKER IMAGE>. Only images managed by Dagster Cloud can be used in Serverless deployments.
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1308, in _reconcile
    self._check_for_image(code_deployment_metadata)
  File "/dagster-cloud-serverless-agent/dagster_cloud_serverless_agent/serverless/user_code_launcher.py", line 95, in _check_for_image
    raise Exception(
The agent dagster.yaml looks like this (note branch_deployments: true):
# dagster.yaml

instance_class:
  module: dagster_cloud.instance
  class: DagsterCloudAgentInstance

dagster_cloud_api:
  agent_token: <MY AGENT TOKEN>
  branch_deployments: true
  deployment: prod

user_code_launcher:
  module: dagster_cloud.workspace.docker
  class: DockerUserCodeLauncher
  config:
    networks:
      - dagster_cloud_agent
    server_ttl:
      enabled: true
      ttl_seconds: 7200   # 2 hours
The branch deployments seem to remain configured for Serverless -- they have a "Managed by Dagster Cloud" agent assigned to them. BUT... A couple of times I have seen both my hybrid agent AND the serverless agent assigned to the branch deployment. Thanks!
Update... Walked away last night, and came back this morning to find the following error and screenshot:
docker.errors.APIError: 500 Server Error for <http+docker://localhost/v1.43/containers/60b492f372a6bc2dc74bb243b14574befc9388ed3abc8fdcf7b5245c2a9fc311/start>: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: sethostname: invalid argument: unknown")

  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1310, in _reconcile
    new_dagster_servers[to_update_key] = self._start_new_dagster_server(
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1605, in _start_new_dagster_server
    return self._start_new_server_spinup(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 298, in _start_new_server_spinup
    container, server_endpoint = self._launch_container(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 240, in _launch_container
    container.start()
  File "/usr/local/lib/python3.10/site-packages/docker/models/containers.py", line 406, in start
    return self.client.api.start(self.id, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/api/container.py", line 1127, in start
    self._raise_for_status(res)
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/usr/local/lib/python3.10/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e

The above exception was caused by the following exception:
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: <http+docker://localhost/v1.43/containers/60b492f372a6bc2dc74bb243b14574befc9388ed3abc8fdcf7b5245c2a9fc311/start>

  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 268, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)

prha

06/29/2023, 6:56 PM
Hi Beau. We typically only support either Hybrid or Serverless deployments for an org. I think you’re in a state right now where you have agents up for both. Should we spin down your Serverless agent?
I think by default, branch deployments will get directed to the serverless agent if it exists, which may explain what you are seeing

Beau Albiston

06/29/2023, 6:59 PM
My intent is to use Hybrid for both "prod" and branch deployments. As mentioned above, I'm using the latest GitHub Workflow dagster-cloud-hybrid-quickstart, which I would expect to set up all deployments as Hybrid...
Is there some configuration I'm missing to get into the above desired state?

prha

06/29/2023, 8:07 PM
I just spun down your serverless agent. I believe your branch deployments should now be directed to your hybrid agent.

Beau Albiston

06/29/2023, 8:32 PM
Thank you. Checking it out now.
Hi @prha, I see the serverless agent is gone... Thanks. Now I'm getting this error only for branch deployments (prod works):
docker.errors.APIError: 500 Server Error for <http+docker://localhost/v1.43/containers/dc1870de17e0e757e80077f57ed07a561ca7d9ced6511a6ed7082defe32a1804/start>: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: sethostname: invalid argument: unknown")

  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1310, in _reconcile
    new_dagster_servers[to_update_key] = self._start_new_dagster_server(
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1605, in _start_new_dagster_server
    return self._start_new_server_spinup(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 298, in _start_new_server_spinup
    container, server_endpoint = self._launch_container(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 240, in _launch_container
    container.start()
  File "/usr/local/lib/python3.10/site-packages/docker/models/containers.py", line 406, in start
    return self.client.api.start(self.id, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/api/container.py", line 1127, in start
    self._raise_for_status(res)
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/usr/local/lib/python3.10/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e

The above exception was caused by the following exception:
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: <http+docker://localhost/v1.43/containers/dc1870de17e0e757e80077f57ed07a561ca7d9ced6511a6ed7082defe32a1804/start>

  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 268, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)

prha

06/29/2023, 9:30 PM
Can you try restarting the agent and redeploying to one of your branch deployments? It’s curious that the hostname is localhost even though you have a network configured in your user code launcher.

Beau Albiston

06/29/2023, 9:51 PM
Same error... Would it be helpful if I send the agent logs directly to you?
I see where it's pulling the image for the branch deployment:
2023-06-29 21:46:51 +0000 - dagster_cloud.user_code_launcher - INFO - Pulling image <MY IMAGE URL>
But, as you've noticed already, immediately after pulling the image, my agent errors with:
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: <http+docker://localhost/v1.43/containers/a63fae1297789ace53ec6722e15c8f142c0f24fd106f3c801ebea382e4fc7831/start>

prha

06/29/2023, 10:06 PM
And just to confirm, on your GH repo, there’s only the hybrid workflow file?

Beau Albiston

06/29/2023, 10:11 PM
Yes.

prha

06/29/2023, 10:11 PM
Apologies… I’m going to have to dig a little more and confer with some of my colleagues

Beau Albiston

06/29/2023, 10:12 PM
No problem. Thank you.
@Rusty Zarse @Pawel Gucik @prha hybrid branch deployments are still broken for us... Google results seem to indicate the following error may be related to the length of the host name exceeding a 64 character limit.
docker.errors.APIError: 500 Server Error for <http+docker://localhost/v1.43/containers/dc1870de17e0e757e80077f57ed07a561ca7d9ced6511a6ed7082defe32a1804/start>: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: sethostname: invalid argument: unknown
I'm seeing two names floating around in the docker logs related to the dagster docker agent:
dataorchestration-8086304f9bdf87e07c7d838ac9ba292ad80e89f5-b751bb
dataorchestration-prod-e9dc38
I believe the branch deployment "name" is the first one, and prod the second -- obviously. Also, to be clear, prod deployments are presently working fine... dataorchestration-prod-e9dc38 is less than 64 characters. The branch deployment docker host name, on the other hand, is 66 characters. Anyhoo... any help would be greatly appreciated.
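The length theory can be checked directly against the two names from the docker logs. This is a minimal sketch, assuming Docker uses each name as the container hostname and that the relevant bound is the 64-byte Linux hostname limit (HOST_NAME_MAX). The helper exceeds_limit is hypothetical and not part of dagster-cloud or the Docker SDK.

```python
# Linux rejects hostnames longer than 64 bytes (HOST_NAME_MAX); sethostname
# fails with "invalid argument" above that. Assumed to be the limit in play here.
HOST_NAME_MAX = 64

# The two names quoted from the docker logs in this thread.
names = [
    "dataorchestration-8086304f9bdf87e07c7d838ac9ba292ad80e89f5-b751bb",
    "dataorchestration-prod-e9dc38",
]


def exceeds_limit(name: str, limit: int = HOST_NAME_MAX) -> bool:
    """Return True if the name is too long to be set as a hostname."""
    return len(name.encode("utf-8")) > limit


for name in names:
    status = "too long" if exceeds_limit(name) else "ok"
    print(f"{len(name):3d} chars  {status}  {name}")
```

Only the branch deployment name is flagged, which matches prod working while branch deployments fail.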

daniel

08/28/2023, 4:01 PM
Hey Beau - from your description this looks like a bug on our side with how we truncate container names - should be a quick fix, I'll keep you posted

Beau Albiston

08/28/2023, 4:29 PM
Hi @daniel, thank you! Let me know if you need any help testing.

daniel

08/28/2023, 5:07 PM
Ok, I can reproduce and have a quick fix out for the release later this week. In the short term, a workaround would be to shorten the name of your code location a bit to keep the length of that container name <= 63 characters (the DNS hostname limit).
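For illustration only, here is one way a generated name could be kept within 63 characters while preserving its trailing unique suffix. This is a sketch under stated assumptions and not the fix that shipped in dagster-cloud: the function shorten_name and the "trim the head, keep the last hyphen-separated segment" strategy are hypothetical.

```python
# Hypothetical helper, not part of dagster-cloud: keeps a generated container
# name within the 63-character limit mentioned in the workaround.
DNS_LABEL_LIMIT = 63


def shorten_name(name: str, limit: int = DNS_LABEL_LIMIT) -> str:
    """Return name unchanged if it fits; otherwise trim it, keeping the last segment."""
    if len(name) <= limit:
        return name
    head, sep, suffix = name.rpartition("-")
    keep = limit - len(suffix) - 1  # room left for the head once "-suffix" is kept
    if not sep or not suffix or keep <= 0:
        # No usable suffix to preserve: fall back to a plain cut.
        return name[:limit]
    # Trim the head and avoid a doubled hyphen at the join.
    return head[:keep].rstrip("-") + "-" + suffix


long_name = "dataorchestration-8086304f9bdf87e07c7d838ac9ba292ad80e89f5-b751bb"
short_name = shorten_name(long_name)
print(short_name, len(short_name))
```

Keeping the final segment is a design choice for this sketch: it is the part most likely to distinguish one container from another, so trimming the head is less likely to cause name collisions.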
❤️ 1

Beau Albiston

08/30/2023, 5:00 AM
Thank you, @daniel!