# ask-community
d
Hello team, I am encountering the following `Unexpected GraphQL error` in my Dagster instance:
Operation name: RunsRootQuery
Message: Cannot return null for non-nullable field Run.mode.
Path: ["pipelineRunsOrError","results",2,"mode"]
Locations: [{"line":34,"column":3}]
Is anyone else seeing something like this?
t
Have you done any upgrades or version changes recently? And is this in local or prod?
j
I’m getting the same error. I just updated to 1.4.5. This is in my stage environment in AWS ECS.
This doesn’t appear in my local docker-compose version though
My error is slightly different however:
Operation name: RunsRootQuery

Message: Cannot return null for non-nullable field Run.mode.

Path: ["pipelineRunsOrError","results",0,"mode"]

Locations: [{"line":26,"column":3}]
d
@Tim Castillo @Josh Lloyd We've done no upgrades recently. I've added new pipelines, but even after removing them I am not able to fix this problem. So far we only have the prod version. In my case, the problem seems to be related to the Dagster GraphQL API: whenever I try to retrieve information about the runs or any details on the pipeline status, it raises this error and then refreshes the page. Also, before I forget, in my case it is hosted on GCP. A new version was released today, so I am guessing that's the problem? Were you able to fix it?
j
Sorry to confuse this thread, because maybe this is a separate issue I’m having, but I realized that my dagit and daemon were on a different version than my pipeline code was. Once I got them both deployed on the same version (now 1.4.6) the error changed:
Operation name: InstanceWarningQuery

Message: Cannot query field 'backfillId' on type 'PartitionBackfill'.

Path: 

Locations: [{"line":10,"column":9}]
but other than this error appearing in the top right corner every few seconds, the rest of the UI appears to be working 🤷
a
are the `dagster`, `dagster-graphql`, and `dagster-webserver` / `dagit` packages all on the same version? Some of the errors observed in this thread arise when these packages are not in sync
d
@alex Yes! They're all on the same version. However, I must say that we have a lot of pipelines in different gRPC repos. But even if I remove all of them and keep a single one, I still get the same error.
a
can you share the output of `pip list | grep dagster`?
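For anyone who cannot run `pip` inside the container, a rough Python equivalent of that command is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`), and the helper names `filter_dagster` and `installed_dagster_packages` are made up for this illustration. Like the `grep`, it matches on the `dagster` prefix only, so a separately installed `dagit` package would not show up.

```python
from importlib import metadata  # standard library, Python 3.8+


def filter_dagster(pairs):
    """Keep (name, version) pairs whose distribution name starts with 'dagster'."""
    kept = [
        (name, version)
        for name, version in pairs
        if name and name.lower().startswith("dagster")
    ]
    # Sort by name only, so odd version values never take part in comparisons.
    return dict(sorted(kept, key=lambda pair: pair[0]))


def installed_dagster_packages():
    """Return {package name: version} for dagster distributions in this environment."""
    pairs = ((dist.metadata["Name"], dist.version) for dist in metadata.distributions())
    return filter_dagster(pairs)


if __name__ == "__main__":
    for name, version in installed_dagster_packages().items():
        print(f"{name}=={version}")
```

Running this in each image (webserver, daemon, and code server) gives the lists to compare against each other.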
d
@alex Initially, I had deployed Dagster version `1.0.6` using Helm, and the pipeline was utilizing Dagster version `1.0.16`. This configuration had been functioning without any issues, and multiple pipelines were operational. However, a problem arose when I encountered a GraphQL error. In an attempt to resolve this issue, I upgraded the Helm version of Dagster to the latest release, which was `1.4.7`. This upgrade successfully addressed the GraphQL error. However, an unintended consequence of this upgrade was that it caused all the other pipelines to fail. This was due to the fact that those pipelines were designed to work with older versions of Dagster. Subsequently, I reverted back to the previous version, `1.0.6`, and the older pipelines began functioning properly once again. Unfortunately, this reinstated the GraphQL error. Upon closer examination, it became clear that this error was primarily associated with the historical run data. Specifically, whenever I attempted to access information about pipelines in progress or completed pipelines, the error would manifest, causing disruptions to the user interface.
When I inspect the logs of the pod, I have this:
/usr/local/lib/python3.7/site-packages/dagster/_core/workspace/context.py:561: UserWarning: Error loading repository location us-opendata-etl:dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server

Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/workspace/context.py", line 556, in _load_location
    location = self._create_location_from_origin(origin)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/workspace/context.py", line 480, in _create_location_from_origin
    return origin.create_location()
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/origin.py", line 329, in create_location
    return GrpcServerRepositoryLocation(self)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/repository_location.py", line 547, in __init__
    list_repositories_response = sync_list_repositories_grpc(self.client)
  File "/usr/local/lib/python3.7/site-packages/dagster/_api/list_repositories.py", line 19, in sync_list_repositories_grpc
    api_client.list_repositories(),
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 169, in list_repositories
    res = self._query("ListRepositories", api_pb2.ListRepositoriesRequest)
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 115, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1692608615.831746170","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1692608615.831745169","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
>

Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/client.py", line 112, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)

  location_name=location_name, error_string=error.to_string()
WARNING:  Invalid HTTP request received.
/usr/local/lib/python3.7/site-packages/dagster/_core/storage/pipeline_run.py:288: UserWarning: Found unhandled arguments from stored PipelineRun: dict_keys(['has_repository_load_data'])
  "Found unhandled arguments from stored PipelineRun: {args}".format(args=kwargs.keys())
Traceback (most recent call last):
[...]
complete_nonnull_value
    path=path,
graphql.error.base.GraphQLError: Cannot return null for non-nullable field Run.mode.

WARNING:  Invalid HTTP request received.
WARNING:  Invalid HTTP request received.
WARNING:  Invalid HTTP request received.
WARNING:  Invalid HTTP request received.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py", line 481, in complete_value_catching_error
    exe_context, return_type, field_asts, info, path, result
[...]
  File "/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py", line 547, in complete_value
    exe_context, return_type, field_asts, info, path, result
  File "/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py", line 748, in complete_nonnull_value
    path=path,
graphql.error.base.GraphQLError: Cannot return null for non-nullable field Run.mode.
Is the GraphQL address incorrect? Or has some internal variable changed?
a
Could not reach user code server
This indicates that the code gRPC server (running the “pipelines” image) was unreachable. It likely failed to start, and there is probably a relevant error in those pod logs.
graphql.error.base.GraphQLError: Cannot return null for non-nullable field Run.mode
The details of this issue are here: https://github.com/dagster-io/dagster/issues/15087#issuecomment-1679067325
• within the `webserver` and `daemon` images, all dagster packages must be on the same version
• the version that is running in the `webserver` and `daemon` should be the same
• the version that is running in the `webserver` and `daemon` should be greater than what's running in your “pipeline” or code gRPC server image. Put another way, the version used in the “pipelines” / gRPC code server can be out of sync with, and older than, what the `webserver` / `daemon` is using.
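The version rules above can be sketched as a small Python check. This is only an illustration of the rules as stated in this thread, not anything Dagster ships: `parse_version` and `check_versions` are hypothetical helpers, the code location name is invented, and treating an equal version as acceptable is an assumption. It handles plain `X.Y.Z` release strings only.

```python
def parse_version(text):
    """Turn a plain release string like '1.4.6' into a comparable tuple (1, 4, 6)."""
    return tuple(int(part) for part in text.split("."))


def check_versions(webserver, daemon, code_servers):
    """Return a list of problems; an empty list means the rules from the thread hold.

    code_servers maps a (hypothetical) code location name to its dagster version.
    """
    problems = []
    if parse_version(webserver) != parse_version(daemon):
        problems.append(f"webserver ({webserver}) and daemon ({daemon}) differ")
    # Compare code servers against the older of the two host components.
    host = min(parse_version(webserver), parse_version(daemon))
    for name, version in code_servers.items():
        if parse_version(version) > host:
            problems.append(
                f"code server '{name}' ({version}) is newer than the webserver/daemon"
            )
    return problems


if __name__ == "__main__":
    # Versions reported in the thread: Helm deployment on 1.0.6, pipeline code on 1.0.16.
    print(check_versions("1.0.6", "1.0.6", {"my-pipelines": "1.0.16"}))
    # With the Helm deployment on 1.4.7 the check passes; it says nothing
    # about other compatibility problems between the two releases.
    print(check_versions("1.4.7", "1.4.7", {"my-pipelines": "1.0.16"}))
```

Versions are compared as integer tuples rather than strings because a plain string comparison would rank `1.0.6` above `1.0.16`.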