#dagster-feedback

Mark Fickett

06/01/2022, 1:13 PM
When adding a Dagster Cloud location, if I pass --module-name the CLI times out waiting for the update and the UI just says "Loading" for hours; I don't see an error from the agent. The code location docs say to use --package-name and don't mention --module-name. Would it make sense to remove the --module-name option from the dagster-cloud CLI entirely? It would be nice to either get an error message, or not have the option to do it the wrong way.

daniel

06/01/2022, 1:19 PM
Hi Mark - module-name and package-name should be identical. To confirm, it's working for you when you set package-name but not module-name?
Do you have a link to the location in cloud that you could share (here or via DM)?
should behave identically, that is

Mark Fickett

06/01/2022, 1:21 PM
Correct, it worked with package-name, but when I previously tried module-name it wouldn't load; same ECR image. It's the tech-data-pipeline-ecr location in https://formenergy.dagster.cloud/prod/workspace .
I just created tech-data-pipeline-ecr-with-module-name-for-debug with the same location definition except back to module_name.

daniel

06/01/2022, 1:23 PM
Do you know what version of the dagster-cloud library you're running?

Mark Fickett

06/01/2022, 1:24 PM
$ pip list | grep dagster-cloud
dagster-cloud                            0.14.15

daniel

06/01/2022, 1:25 PM
What about in the agent itself?

Mark Fickett

06/01/2022, 1:25 PM
Is the agent hosted by Dagster Cloud? Or, how do I check?

daniel

06/01/2022, 1:26 PM
You're hosting the agent - the answer depends on whether you're in K8s/ECS/etc.

Mark Fickett

06/01/2022, 1:27 PM
Ah, Kubernetes hosted by AWS.
(I'm not very practiced with k8s so I may be using the wrong terminology here.)

daniel

06/01/2022, 1:33 PM
No problem - I believe we added support for module_name in a version of the agent that's later than what you have, but I'm surprised that you're not getting an error back from the Dagster Cloud CLI when you try to use module_name. We'll look into that. In the meantime, though, I suspect running a helm upgrade of the agent in your k8s cluster will allow you to use module_name, similar to the instructions at step 4 here: https://docs.dagster.cloud/agents/kubernetes/setup

Mark Fickett

06/01/2022, 1:33 PM
I vaguely remember starting the agent, but haven't updated it since fall 2021 at least. Here's the status. When I "view metadata" it says:
{
  "type": null,
  "version": null
}

daniel

06/01/2022, 1:34 PM
Once you upgrade I suspect that version will populate as well

Mark Fickett

06/01/2022, 2:41 PM
I attempted to upgrade:
$ helm repo update
...
$ helm upgrade \
>    --install user-cloud dagster-cloud/dagster-cloud-agent \
>    --namespace dagster-cloud \
>    --set dagsterCloud.deployment=prod
Release "user-cloud" has been upgraded. Happy Helming!
NAME: user-cloud
LAST DEPLOYED: Wed Jun  1 10:13:51 2022
NAMESPACE: dagster-cloud
STATUS: deployed
REVISION: 8
TEST SUITE: None
and I see:
$ helm list --namespace dagster-cloud
NAME       NAMESPACE     REVISION UPDATED                                 STATUS   CHART                       APP VERSION
user-cloud dagster-cloud 8        2022-06-01 10:13:51.449548477 -0400 EDT deployed dagster-cloud-agent-0.14.17 0.14.17
but I'm not seeing any active agent (old one became inactive, new one has not appeared after 30m). 😬
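For a Helm-managed agent like this one, the deployed app version can be pulled out of the `helm list` output above. This is a rough sketch that assumes APP VERSION is the last column; the function name `release_app_version` is made up for illustration. It reports what Helm deployed, not whether an agent pod is actually running.

```shell
# Hypothetical helper: print the APP VERSION column for one Helm release.
# Reads `helm list` output on stdin and assumes APP VERSION is the last field.
release_app_version() {
  awk -v release="$1" '$1 == release { print $NF }'
}

# Usage (requires helm and cluster access):
#   helm list --namespace dagster-cloud | release_app_version user-cloud
```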

daniel

06/01/2022, 2:42 PM
60 seconds should be plenty for it to appear. Any logs from the agent pod?
My best guess would be needing to set the agent token secret (step 3 here: https://docs.dagster.cloud/agents/kubernetes/setup) - it's possible back when you set this up late last year it was done slightly differently and you included the secret in the helm upgrade command?

Mark Fickett

06/01/2022, 2:52 PM
Still looking for the logs. Seems like the secret already existed:
$ kubectl create secret generic dagster-cloud-agent-token --from-literal=DAGSTER_CLOUD_AGENT_TOKEN=agent-<mytoken> --namespace=dagster-cloud
error: failed to create secret secrets "dagster-cloud-agent-token" already exists
$ kubectl get secret --namespace dagster-cloud
NAME                                         TYPE                                  DATA   AGE
dagster-cloud-agent-token                    Opaque                                1      216d
However, I deleted and re-created it.
$ kubectl get secret --namespace dagster-cloud
NAME                                         TYPE                                  DATA   AGE
dagster-cloud-agent-token                    Opaque                                1      54s
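The delete-and-recreate step could be wrapped up roughly like this. It is only a sketch: the secret name, key, and namespace are the ones used in this thread, and the function name `recreate_agent_token_secret` is hypothetical.

```shell
# Hypothetical helper: replace the agent token secret with a new token value.
# Secret name, key, and default namespace are taken from this thread.
recreate_agent_token_secret() {
  token="$1"
  namespace="${2:-dagster-cloud}"
  # --ignore-not-found keeps the delete from failing if the secret is absent.
  kubectl delete secret dagster-cloud-agent-token \
    --namespace "$namespace" --ignore-not-found &&
  kubectl create secret generic dagster-cloud-agent-token \
    --from-literal=DAGSTER_CLOUD_AGENT_TOKEN="$token" \
    --namespace "$namespace"
}

# Usage (requires kubectl and cluster access):
#   recreate_agent_token_secret "<your agent token>"
```

Re-creating the secret only helps if the token value itself is valid for the upgraded agent.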

daniel

06/01/2022, 2:54 PM
Got it - try "kubectl get pods" and then hopefully there's a pod with "agent" in the name - if it's not running, "kubectl logs <that pod ID>" or "kubectl describe pod <that pod ID>" will hopefully have some clues
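A minimal sketch of that triage, assuming the default `kubectl get pods` column order (NAME, READY, STATUS, RESTARTS, AGE). The function name `unhealthy_agent_pods` is made up for illustration.

```shell
# Hypothetical helper: list pods with "agent" in the name that are not Running.
# Reads `kubectl get pods` output on stdin (NAME READY STATUS RESTARTS AGE).
unhealthy_agent_pods() {
  awk '$1 != "NAME" && $1 ~ /agent/ && $3 != "Running" { print $1 }'
}

# Usage (requires kubectl and cluster access):
#   kubectl get pods --namespace dagster-cloud | unhealthy_agent_pods |
#   while read -r pod; do
#     kubectl logs "$pod" --namespace dagster-cloud
#     kubectl describe pod "$pod" --namespace dagster-cloud
#   done
```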

Mark Fickett

06/01/2022, 2:55 PM
user-cloud-dagster-cloud-agent-agent-5689bb4d6c-bc92s    0/1     CrashLoopBackOff   12         40m
$ kubectl logs user-cloud-dagster-cloud-agent-agent-5689bb4d6c-bc92s --namespace dagster-cloud
Traceback (most recent call last):
  File "/usr/local/bin/dagster-cloud", line 33, in <module>
    sys.exit(load_entry_point('dagster-cloud', 'console_scripts', 'dagster-cloud')())
  File "/usr/local/lib/python3.8/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/dagster-cloud/dagster_cloud/agent/cli/__init__.py", line 179, in run
    run_local_agent_in_environment(dagster_home)
  File "/dagster-cloud/dagster_cloud/agent/cli/__init__.py", line 57, in run_local_agent_in_environment
    run_local_agent()
  File "/dagster-cloud/dagster_cloud/agent/cli/__init__.py", line 40, in run_local_agent
    with DagsterCloudAgentInstance.get() as instance:
  File "/dagster-cloud/dagster_cloud/instance/__init__.py", line 200, in get
    instance = DagsterInstance.get()
  File "/dagster/dagster/core/instance/__init__.py", line 409, in get
    return DagsterInstance.from_config(dagster_home_path)
  File "/dagster/dagster/core/instance/__init__.py", line 424, in from_config
    return DagsterInstance.from_ref(instance_ref)
  File "/dagster/dagster/core/instance/__init__.py", line 436, in from_ref
    return klass(  # type: ignore
  File "/dagster-cloud/dagster_cloud/instance/__init__.py", line 60, in __init__
    assert self.dagster_cloud_url
  File "/dagster-cloud/dagster_cloud/instance/__init__.py", line 111, in dagster_cloud_url
    raise DagsterInvariantViolationError(
dagster.core.errors.DagsterInvariantViolationError: Could not derive Dagster Cloud URL from agent token. Create a new agent token or set the `url` field under `dagster_cloud_api` in your `dagster.yaml`.
I see if I generate a new agent token the format is different. I'll reset that.

daniel

06/01/2022, 2:57 PM
Cool, two options here. Re-run the helm upgrade command with an additional:
--set dagsterCloud.organization=<name of your organization>
or yeah generate a new token
either will work, the latter will make it not happen again in the future
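The first option amounts to the earlier upgrade command with one more --set flag. This sketch reuses the release name, chart, namespace, and deployment from Mark's command; the function name `upgrade_agent_with_org` is hypothetical.

```shell
# Hypothetical helper: re-run the upgrade from this thread with the organization set.
upgrade_agent_with_org() {
  organization="$1"
  helm upgrade \
    --install user-cloud dagster-cloud/dagster-cloud-agent \
    --namespace dagster-cloud \
    --set dagsterCloud.deployment=prod \
    --set dagsterCloud.organization="$organization"
}

# Usage (requires helm and cluster access):
#   upgrade_agent_with_org "<name of your organization>"
```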

Mark Fickett

06/01/2022, 3:03 PM
Great, after updating the agent token, the pod retried and was able to run, so I've got an updated agent going. Thanks! And it does show metadata:
{
  "type": "K8sUserCodeLauncher",
  "version": "0.14.17"
}