Hi! Have you ever encountered this error: `dagster...
# ask-community
g
Hi! Have you ever encountered this error:
dagster._check.CheckError: Invariant failed. Description: Parent pipeline snapshot id out of sync with passed parent pipeline snapshot
? How can I overcome it?
a
in what context did you encounter this error?
g
Hi @alex I am running dagit locally and changing some configs in one of my assets and requesting backfills. It was working until I changed "something", then I started getting the error
I am trying to revert the changes and check what caused it. I am starting to suspect it is related to my asset's metadata
here's my asset definition code:
@asset(
        name="Way2_ETL",
        description=description or "Way2 ETL",
        partitions_def=partitions_def,
        compute_kind="Way2 API",
        key_prefix=key_prefix,
        required_resource_keys={
            way2_api_connector_key,
            sqlalchemy_engine_key,
        },
        # metadata={
        #     "scada_id": scada_id,
        #     "pontoId_mapping": pontoId_mapping,
        #     "grandezas_mapping": grandezas_mapping,
        #     "resource_key_mapping": {
        #         "way2_api_connector_key": way2_api_connector_key,
        #         "sqlalchemy_engine_key": sqlalchemy_engine_key,
        #     },
        # },
    )
    def way2_etl(context: OpExecutionContext):
        ...
It works with `metadata` commented out, but stops working when I uncomment it
a
what precisely stops working? the backfills?
g
yeah,
a
what version are you on?
g
let me check
dagster 1.3.7, dagit 1.3.7, dagster-graphql 1.3.7
I am uncommenting part of my metadata to isolate what's causing the error
a
are you using `dagster dev` or manually running dagit and the daemon? You could try restarting the daemon
g
dagster dev
a
a full stack trace would be helpful if you are receiving one
g
sure
just a sec
btw, I guess I found the problem
I am passing a `dict[int, int]` to my metadata
No errors when running as below
dagster._check.CheckError: Invariant failed. Description: Parent pipeline snapshot id out of sync with passed parent pipeline snapshot

  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/utils.py", line 126, in _fn
    return fn(*args, **kwargs)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/utils.py", line 57, in _fn
    result = fn(self, graphene_info, *args, **kwargs)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/schema/roots/mutation.py", line 281, in mutate
    return create_execution_params_and_launch_pipeline_exec(graphene_info, executionParams)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/schema/roots/mutation.py", line 259, in create_execution_params_and_launch_pipeline_exec
    return launch_pipeline_execution(
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 32, in launch_pipeline_execution
    return _launch_pipeline_execution(graphene_info, execution_params)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 66, in _launch_pipeline_execution
    run = do_launch(graphene_info, execution_params, is_reexecuted)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/execution/launch_execution.py", line 49, in do_launch
    dagster_run = create_valid_pipeline_run(graphene_info, external_job, execution_params)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster_graphql/implementation/execution/run_lifecycle.py", line 79, in create_valid_pipeline_run
    dagster_run = graphene_info.context.instance.create_run(
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1395, in create_run
    dagster_run = self._construct_run_with_snapshots(
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1118, in _construct_run_with_snapshots
    self._ensure_persisted_job_snapshot(job_snapshot, parent_job_snapshot)
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1165, in _ensure_persisted_job_snapshot
    check.invariant(
  File "/home/gustavo/miniconda3/lib/python3.10/site-packages/dagster/_check/__init__.py", line 1654, in invariant
    raise CheckError(f"Invariant failed. Description: {desc}")
Full stack trace
a
hmm, I find that surprising if it's just a regular `dict[int,int]`. is it something slightly more complex than that?
g
not at all
with open(root / "json_definitions/circuits.json") as file:
    circuits = json.load(file)

pontoId_mapping = {circuit["tag"]: circuit["id"] for circuit in circuits}
here's the code defining it
[
 {
  "id": 100,
  "tag": 5174
 },
 {
  "id": 101,
  "tag": 5173
 },
 {
  "id": 102,
  "tag": 5172
 },
 ...
]
this is `circuits.json`
i will try to build a quick MWE
ok, the MWE did not report the error
I must clarify that I made changes to `pontoId_mapping`, btw.
Before (the asset could backfill even with the `dict[int, int]` in the metadata):
pontoId_mapping = {circuit["id"]: circuit["tag"] for circuit in circuits}
After (the asset backfill started giving an error):
pontoId_mapping = {circuit["tag"]: circuit["id"] for circuit in circuits}
a
that before and after look the same unless I'm missing something
g
sry
I swapped `tag` and `id` (edited above accordingly)
a
hmm, does `circuits.json` have any collisions in tag values?
g
let me check, it should not
a
if you had a string value and int value for the same key, that would cause issues going through serialize->deserialize, something like this:
>>> {4: 'x', '4': 'y'}
{4: 'x', '4': 'y'}
>>> d = {4: 'x', '4': 'y'}
>>> import json
>>> json.dumps(d)
'{"4": "x", "4": "y"}'
>>> json.loads(json.dumps(d))
{'4': 'y'}
g
got it, but there are no collisions in the data
a
got it. I haven't been able to produce a repro either
g
neither for `id` nor `tag`, each field is "all uniques"
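A check like the one described above could be sketched as follows. This is only an illustration: the helper names `duplicated_values` and `mixed_type_collisions` are hypothetical, and the sample records mirror the shape of `circuits.json` shown earlier in the thread.

```python
from collections import Counter


def duplicated_values(records, field):
    """Return the values of `field` that appear in more than one record."""
    counts = Counter(record[field] for record in records)
    return [value for value, n in counts.items() if n > 1]


def mixed_type_collisions(records, field):
    """Return values that are distinct in memory but identical once
    converted to text (for example 4 and '4'), which is what JSON object
    keys would turn them into."""
    types_by_text = {}
    for record in records:
        value = record[field]
        types_by_text.setdefault(str(value), set()).add(type(value).__name__)
    return sorted(text for text, types in types_by_text.items() if len(types) > 1)


# Sample shaped like the circuits.json excerpt from the thread.
circuits = [
    {"id": 100, "tag": 5174},
    {"id": 101, "tag": 5173},
    {"id": 102, "tag": 5172},
]

for field in ("id", "tag"):
    print(field, duplicated_values(circuits, field), mixed_type_collisions(circuits, field))
```

Both helpers return empty lists for data like the excerpt, which matches the "all uniques" observation.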
I believe it is something that is persisted somewhere and is giving me the snapshot id conflict
however, I have already cleared my entire $DAGSTER_HOME directory (multiple times actually), but the error is still happening
can this be stored somewhere else?
a
I believe the desync is happening between how this is loaded in the code server process, which loads the definitions directly, and the dagit/daemon host processes, which operate on serialized representations that are fetched from the code server
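To make that hypothesis concrete, here is a minimal sketch of how an id derived from an in-memory object can stop matching one derived from its serialized copy. It does not use or reproduce Dagster's actual snapshot logic, which is not shown in this thread; `fingerprint` is a hypothetical stand-in, and plain `json` is assumed as the serialization step.

```python
import hashlib
import json


def fingerprint(metadata: dict) -> str:
    """Hypothetical id: a hash of the in-memory structure, key types included."""
    canonical = repr(sorted(metadata.items(), key=lambda kv: str(kv[0])))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# What the process that loads the definitions holds (int keys, like pontoId_mapping).
in_code_server = {5174: 100, 5173: 101, 5172: 102}

# What a process working from a JSON round trip would see: keys become strings.
in_host_process = json.loads(json.dumps(in_code_server))

print(in_host_process)
print(fingerprint(in_code_server) == fingerprint(in_host_process))
```

The round trip turns every int key into a string, so the two fingerprints differ even though no keys collide.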
g
hmmm
that would make sense given the clash between str and int
but I couldn't reproduce it in the MWE
a
are you unblocked by using `dict[str,int]`?
g
yes
if I do this:
"pontoId_mapping": {str(k): v for k, v in pontoId_mapping.items()},
it works
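The workaround above can be generalized to nested metadata. The sketch below is an assumption-laden illustration, not a Dagster API: `stringify_keys` and `survives_json_round_trip` are hypothetical helper names, and plain `json` stands in for whatever serialization the tooling performs.

```python
import json
from typing import Any


def stringify_keys(value: Any) -> Any:
    """Recursively convert dict keys to str; tuples become lists, as JSON would make them."""
    if isinstance(value, dict):
        return {str(key): stringify_keys(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_keys(item) for item in value]
    return value


def survives_json_round_trip(value: Any) -> bool:
    """True if serializing and deserializing gives back an equal object."""
    return json.loads(json.dumps(value)) == value


# Values taken from the circuits.json excerpt in the thread.
pontoId_mapping = {5174: 100, 5173: 101, 5172: 102}
metadata = {"pontoId_mapping": pontoId_mapping}

print(survives_json_round_trip(metadata))
print(survives_json_round_trip(stringify_keys(metadata)))
```

With int keys the round trip changes the object; after `stringify_keys` it comes back unchanged.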
a
would you be willing to file an issue for this? https://github.com/dagster-io/dagster/issues
g
Yeah, I would! What kind of data should I provide, since I couldn't isolate a MWE?
a
including what you know and what you tried will be helpful