# dagster-plus
a
hello! trying to set up the Hybrid deployment… I have a single code location that I want to use with Dagster Cloud managed agents and my own k8s agents. ran into the issue where the agent on k8s cannot pull the image (401 error) that has been deployed to serverless:
dagster-cloud serverless deploy-docker --from "dagster_cloud.yaml" --location-name "px_dagster_gitlab_ci" --base-image "python:3.11-bookworm"
dagster_cloud.yaml:
locations:
  - location_name: px_dagster_gitlab_ci
    code_source:
      module_name: px_dagster.definitions
error from the agent k8s pod:
2024-02-23 15:38:51 +0000 - dagster_cloud.user_code_launcher - INFO - Waiting for new grpc server for ('prod', 'px_dagster_gitlab_ci') for (image=CodeDeploymentMetadata(image='657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1', python_file=None, package_name=None, module_name='px_dagster.definitions', working_directory=None, executable_path=None, attribute=None, git_metadata=None, container_context={}, cloud_context_env={'DAGSTER_CLOUD_DEPLOYMENT_NAME': 'prod', 'DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT': 0, 'DAGSTER_CLOUD_LOCATION_NAME': 'px_dagster_gitlab_ci'}, pex_metadata=None, agent_queue=None)) to be ready...
2024-02-23 15:39:30 +0000 - dagster_cloud.user_code_launcher - ERROR - Error while waiting for server for prod:px_dagster_gitlab_ci for (image=CodeDeploymentMetadata(image='657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1', python_file=None, package_name=None, module_name='px_dagster.definitions', working_directory=None, executable_path=None, attribute=None, git_metadata=None, container_context={}, cloud_context_env={'DAGSTER_CLOUD_DEPLOYMENT_NAME': 'prod', 'DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT': 0, 'DAGSTER_CLOUD_LOCATION_NAME': 'px_dagster_gitlab_ci'}, pex_metadata=None, agent_queue=None)) to be ready: Exception: Error creating deployment for pxdagstergitlabci-prod-48641e.
Debug information for pod pxdagstergitlabci-prod-48641e-74d6dcf5ff-pkdjg:

Pod status: Pending
Container 'dagster' status: Waiting: ImagePullBackOff: Back-off pulling image "657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1"

No logs for container 'dagster'.

Warning events for pod:
Failed: Failed to pull image "657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1": rpc error: code = Unknown desc = failed to pull and unpack image "657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1": failed to resolve reference "657821118200.dkr.ecr.us-west-2.amazonaws.com/serverless-agent-89825961-af96-37ab-9eef-070083e09ac2:prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1": pulling from host 657821118200.dkr.ecr.us-west-2.amazonaws.com failed with status code [manifests prod-px_dagster_gitlab_ci-e6f276dab1ba48748a7e7c019097dde1]: 401 Unauthorized (x3)
Failed: Error: ErrImagePull (x3)
Failed: Error: ImagePullBackOff (x2)

For more information about the failure, run `kubectl describe pod pxdagstergitlabci-prod-48641e-74d6dcf5ff-pkdjg` or `kubectl describe deployment pxdagstergitlabci-prod-48641e` in your cluster.

Stack Trace:
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1420, in _reconcile
    self._wait_for_new_server_ready(
  File "/dagster-cloud/dagster_cloud/workspace/kubernetes/launcher.py", line 507, in _wait_for_new_server_ready
    wait_for_deployment_complete(
  File "/dagster-cloud/dagster_cloud/workspace/kubernetes/utils.py", line 317, in wait_for_deployment_complete
    raise Exception(error_message)
if I manually build the image and push it to my GitLab registry, and then provide `imagePullSecrets` in the agent helm chart, the k8s agent works, but the Dagster Cloud-managed agent complains that it can only run images that were built using serverless. what am I doing wrong? is there no way to have the same code location be used on cloud-managed agent and k8s agent?
m
Hi @Alex Prykhodko, you can't use the serverless images to run within kubernetes. There are significant differences in how the k8s and serverless agents launch and manage code locations and runs, and for that reason you should use the CI/CD specific to hybrid to build and upload your images: https://docs.dagster.io/dagster-cloud/getting-started#step-4-configure-cicd-for-your-project
If I am misunderstanding what you are trying to do or your use case, help me understand it.
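For illustration only, a hybrid `dagster_cloud.yaml` for this location might look roughly like the sketch below. It assumes the image is built in CI and pushed to the GitLab registry mentioned later in this thread, and that the pull secret already exists in the agent's namespace. The secret name `gitlab-registry-credentials` and the `./` build directory are hypothetical placeholders, not values from this thread.

```yaml
# Sketch of a hybrid code location (assumptions are marked below).
locations:
  - location_name: px_dagster_gitlab_ci
    code_source:
      module_name: px_dagster.definitions
    build:
      # Assumption: the Dockerfile lives at the repository root.
      directory: ./
      # Registry the hybrid CI/CD pushes to; the k8s agent pulls from here.
      registry: registry.gitlab.com/px-data/px-data-dagster
    container_context:
      k8s:
        image_pull_secrets:
          # Hypothetical secret name; it must exist in the agent's namespace.
          - name: gitlab-registry-credentials
```

An image referenced this way is pulled by the k8s agent. As the error further down shows, the Dagster Cloud-managed (serverless) agent rejects it.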
a
@Mathieu Larose so, to confirm, using `dagster-cloud ci` I can build an image that can be used on both k8s agent and Dagster Cloud-managed agent? I did not get a clear confirmation in the docs that a single code location can be deployed to both types of agents. this is the error that the Dagster Cloud-managed agent produces:
dagster._core.errors.DagsterUserCodeUnreachableError: Failure loading server endpoint for prod:px_dagster_gitlab_ci:
Exception: Invalid image registry.gitlab.com/px-data/px-data-dagster:runner-latest. Only images managed by Dagster Cloud can be used in Serverless deployments.
(k8s agent works as expected with the same image)
m
"using `dagster-cloud ci` I can build an image that can be used on both k8s agent and Dagster Cloud-managed agent?"
no, unfortunately you can't. can you explain what your use case is? are you running both serverless and hybrid on k8s?
fyi - you can't run both serverless and hybrid from the same organization concurrently. you would have to deactivate one to switch to the other.
you can do this from your agent tab:
a
my use case: run 90% of the jobs on Dagster Cloud agent. for the jobs that require 32gb+ ram (about 500 minutes per month) – run them on my k8s agent. use the same control plane. it’s weird because I am able to run two agents (legacy plan) – they both show up even though I’m in ‘serverless’ mode (found the ‘switch to hybrid’ button – will be experimenting).
thanks for your help @Mathieu Larose
m
i think the daemons with which the agents interact will have different code paths in each mode. i can't guarantee that it will work. in fact, we don't think it should but happy to be surprised.
out of curiosity, if you went through the hoops of setting up a k8s cluster and agent, why do you keep serverless?
usually the impetus for serverless is avoiding that work and management