Liezl Puzon
08/03/2022, 4:28 PM
Jim Nisivoccia
08/08/2022, 6:36 PM
Binoy Shah
08/09/2022, 2:39 PM
stdout and stderr, I do not want to store logs in any place. What kind of logging manager should be applied in this case?
praveen
08/12/2022, 6:41 AM
Charlie Bini
08/18/2022, 1:44 PM
datagun_etl_gsheets (fac0512e-a510-4ab9-87f7-528dd3b81fa8) started a new run worker while the run was already in state DagsterRunStatus.STARTED. This most frequently happens when the run worker unexpectedly stops and is restarted by the cluster. Marking the run as failed.
Pretty sure I traced the event that led to this in the GKE logs:
Scale-down: removing node gk3-dagster-cloud-nap-1jnp3gdh-282a8794-g2v8, utilization: {0.7951653944020356 0.2775837641283469 0 cpu 0.7951653944020356}, pods to reschedule: dagster-cloud/dagster-run-fac0512e-a510-4ab9-87f7-528dd3b81fa8-x27xz
I'm running GKE Autopilot with the multiprocess executor. Not sure why it decided to scale down the node in the middle of a run, but any idea? Is there a good way to handle this? I have retries enabled, but that errored for another reason I'll post in the thread.
bitsofinfo
08/22/2022, 5:10 PM
bitsofinfo
08/22/2022, 5:12 PM
bitsofinfo
08/22/2022, 7:06 PM
Matt Menzenski
08/25/2022, 9:35 PM
Scott Hood
08/30/2022, 3:02 PM
Binoy Shah
08/31/2022, 6:18 PM
dagsterApiGrpcArgs
dagsterApiGrpcArgs:
  - "--package-name"
  - "program_data.program_data"
  - "--package-name"
  - "survey_data.survey_data"
does not seem to be working
Binoy Shah
08/31/2022, 8:46 PM
dagsterApiGrpcArgs? My single Docker image has code for multiple packages, each with its own @repository method
Eegan K
09/07/2022, 2:27 PM
Nick Narcise
09/14/2022, 7:57 PM
Caio Tavares
09/20/2022, 2:57 PM
dagster. All of our pipelines/jobs should run on specific namespaces rather than the dagster one. How do I configure the configMap user-env to be deployed and created within each namespace where the jobs must run? What is happening is that once I trigger a pipeline, the job attempts to start on the target namespace but then I get an error:
configmap "xxx-dagster-dagster-user-deployments-yyy-user-env" not found: CreateContainerConfigErro
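One possible workaround for the missing ConfigMap is to create a copy of it in every namespace where runs are launched. The sketch below is only an illustration in plain Python: `copy_configmap_for_namespace` and the example names are hypothetical, and it assumes the manifest was exported from the `dagster` namespace as a dictionary (for instance from `kubectl get configmap ... -o json`). It has not been confirmed that this is the approach the Helm chart intends.

```python
import copy

# Metadata fields that should not be carried over into a new object:
# most are set by the API server, and ownerReferences cannot point
# across namespaces.
FIELDS_TO_DROP = (
    "uid",
    "resourceVersion",
    "creationTimestamp",
    "managedFields",
    "selfLink",
    "ownerReferences",
)


def copy_configmap_for_namespace(manifest: dict, namespace: str) -> dict:
    """Return a copy of a ConfigMap manifest retargeted at another namespace."""
    clone = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    metadata = clone.setdefault("metadata", {})
    for field in FIELDS_TO_DROP:
        metadata.pop(field, None)
    metadata["namespace"] = namespace
    return clone


# Hypothetical example manifest; the real name is redacted in the thread.
source = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {
        "name": "example-user-env",
        "namespace": "dagster",
        "uid": "1234",
        "resourceVersion": "42",
    },
    "data": {"SOME_ENV_VAR": "value"},
}

copies = [copy_configmap_for_namespace(source, ns) for ns in ("team-a", "team-b")]
```

Each resulting dictionary could then be serialized and applied to its namespace. The copies would need to be refreshed whenever the original ConfigMap changes.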
Dusty Shapiro
09/23/2022, 3:03 PM
CeleryK8sRunLauncher? I am getting 403 errors when attempting to run jobs and it appears it’s trying to create jobs in the default namespace.
Dusty Shapiro
09/28/2022, 4:06 PM
WARNING - Could not load location data-team-dag-code to check for schedules due to the following error: dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server
Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_daemon/workspace.py", line 131, in _load_location
    location = self._create_location_from_origin(origin)
  File "/usr/local/lib/python3.7/site-packages/dagster/_daemon/workspace.py", line 150, in _create_location_from_origin
    return origin.create_location()
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/origin.py", line 333, in create_location
    return GrpcServerRepositoryLocation(self)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/repository_location.py", line 561, in __init__
    list_repositories_response = sync_list_repositories_grpc(self.client)
  File "/usr/local/lib/python3.7/site-packages/dagster/_api/list_repositories.py", line 19, in sync_list_repositories_grpc
    api_client.list_repositories(),
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 169, in list_repositories
    res = self._query("ListRepositories", api_pb2.ListRepositoriesRequest)
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 115, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e
The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1664381076.964556226","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1664381076.964555536","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
>
Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 112, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
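The "failed to connect to all addresses" detail in the trace above means the client could not open a connection at all, so a first check is whether anything is listening on the user code server's host and port. Below is a minimal sketch using only the Python standard library. `can_connect` is a hypothetical helper, not part of Dagster, and it only tests TCP reachability, not whether the gRPC server behind the port is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the connection;
        # the context manager closes the socket again on success.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Run from a pod in the same cluster, a call such as `can_connect("<service-name>", <port>)` returning `False` would suggest the server process never started or the service name or port is wrong. The service name and port are placeholders to fill in from your own deployment.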
Dusty Shapiro
09/28/2022, 4:09 PM
FROM python:3.7-slim
RUN pip install dagster dagit dagster-postgres
COPY . /
Dusty Shapiro
09/28/2022, 4:10 PM
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "data-team-dag-code"
      image:
        repository: "redacted/data-dagster"
        tag: 9282022v2
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "--python-file"
        - "./src/test_repo.py"
      port: 3030
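When one image contains several packages, one option is to declare a separate deployment entry per package, all pointing at the same image. The sketch below is a hedged illustration: it assumes each gRPC server loads a single target, which should be checked against the chart and CLI documentation for your version. It builds the same structure as the values above in plain Python. `deployment_entry` and `user_deployments` are hypothetical helpers, and the package names are borrowed from the earlier message in this channel.

```python
import json


def deployment_entry(package: str, repository: str, tag: str, port: int = 3030) -> dict:
    """Build one deployment entry that loads a single Python package."""
    # Derive a Kubernetes-friendly name from the top-level package name.
    name = package.split(".")[0].replace("_", "-")
    return {
        "name": name,
        "image": {"repository": repository, "tag": tag, "pullPolicy": "Always"},
        "dagsterApiGrpcArgs": ["--package-name", package],
        "port": port,
    }


def user_deployments(packages: list, repository: str, tag: str) -> dict:
    """Build the dagster-user-deployments values with one entry per package."""
    return {
        "dagster-user-deployments": {
            "enabled": True,
            "deployments": [deployment_entry(p, repository, tag) for p in packages],
        }
    }


values = user_deployments(
    ["program_data.program_data", "survey_data.survey_data"],
    repository="redacted/data-dagster",
    tag="9282022v2",
)

# JSON is also valid YAML, so this output can be saved as a values file.
print(json.dumps(values, indent=2))
```

Each entry becomes its own code location, at the cost of running one server pod per package.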
Dusty Shapiro
09/28/2022, 4:13 PM
Dusty Shapiro
09/30/2022, 12:26 PM
Ismael Rodrigues
10/04/2022, 3:38 AM
Slackbot
10/04/2022, 5:23 AM
Dusty Shapiro
10/07/2022, 12:33 PM
dagsterApiGrpcArgs to the workspace YAML? Edit: Locally I’m able to specify multiple repositories in the workspace.yaml, but curious how to mimic that when deploying via Helm.
Jordan Wolinsky
10/14/2022, 7:50 PM
k8s_job_op as an @asset (k8s_job_asset)
https://docs.dagster.io/_apidocs/libraries/dagster-k8s#ops
Dusty Shapiro
10/17/2022, 7:24 PM
chris
10/19/2022, 8:56 PM
Dusty Shapiro
10/21/2022, 12:09 AM
dagster-user-deployments.deployments
The web UI only shows the error stating that it can’t connect to the gRPC server, but curious the best way to surface the exception that is causing the server to fail, which I do now by kubectl logs <pod-name>
Thanks!
Eegan K
10/21/2022, 3:59 PM
Mark Fickett
10/25/2022, 2:26 PM
eks.amazonaws.com/compute-type=fargate:NoSchedule taint. I think this is a similar issue.
Is it straightforward / the right solution to just add a toleration to the agent, and if so how would I do that? Or is that a change that needs to be built into the chart?
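For reference, a toleration that matches a taint written in `key=value:effect` form can be derived mechanically. The sketch below is plain Python with a hypothetical helper, `toleration_for_taint`, that produces the dictionary shape Kubernetes expects in a pod spec's `tolerations` list. Whether the agent chart exposes a field for tolerations is not confirmed here and would need to be checked in the chart's values.

```python
def toleration_for_taint(taint: str) -> dict:
    """Convert a taint in 'key=value:effect' or 'key:effect' form to a toleration."""
    key_value, sep, effect = taint.rpartition(":")
    if not sep:
        raise ValueError(f"taint has no effect part: {taint!r}")
    key, has_value, value = key_value.partition("=")
    if has_value:
        # Match this exact key/value pair.
        return {"key": key, "operator": "Equal", "value": value, "effect": effect}
    # No value on the taint: tolerate any value for the key.
    return {"key": key, "operator": "Exists", "effect": effect}


fargate_toleration = toleration_for_taint(
    "eks.amazonaws.com/compute-type=fargate:NoSchedule"
)
print(fargate_toleration)
```

The resulting dictionary would go under `tolerations` in the pod spec of whichever workload needs to schedule onto the tainted nodes.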