Gabriel Fioravante

11/02/2022, 8:41 PM
Stack Trace:
  File "/home/aydev/.local/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_plan.py", line 299, in dagster_event_sequence_for_step
    raise dagster_user_error.user_exception
  File "/home/aydev/.local/lib/python3.10/site-packages/dagster/_core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
    yield
  File "/home/aydev/.local/lib/python3.10/site-packages/dagster/_utils/__init__.py", line 430, in iterate_with_context
    next_output = next(iterator)
  File "/home/aydev/.local/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 557, in _gen_fn
    gen_output = output_manager.handle_output(output_context, output.value)
  File "/home/aydev/.local/lib/python3.10/site-packages/dagster/_core/storage/fs_io_manager.py", line 176, in handle_output
    pickle.dump(obj, write_obj, PICKLE_PROTOCOL)
  File "stringsource", line 2, in _catboost._PoolBase.__reduce_cython__
Hello all, getting the above issue on py31 with dagster latest (1.0.15). Seems like an issue with pickling C code, not sure how to go about that. Any tips? :]

alex

11/02/2022, 8:46 PM
you will likely need to use a custom io manager to be able to persist complex objects https://docs.dagster.io/concepts/io-management/io-managers#io-managers
that or change the value thats being passed as an output to something simpler
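A minimal sketch of the second option, using only the standard library. NativePool is a hypothetical stand-in for an object that refuses to be pickled (it is not the catboost class), and can_pickle is an illustrative helper, not part of dagster. The idea is that an op returning plain Python data avoids the error regardless of which IO manager stores it:

```python
import pickle


class NativePool:
    """Hypothetical stand-in for an extension type that refuses to pickle."""

    def __init__(self, rows):
        self.rows = rows

    def __reduce__(self):
        # Mimics an object whose native (C-level) state has no pickle support.
        raise TypeError("NativePool wraps native state and cannot be pickled")


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


pool = NativePool([[1.0, 2.0], [3.0, 4.0]])

# "Something simpler": pass along the plain data the object was built from.
simple_output = {"rows": pool.rows}

print(can_pickle(pool))           # the wrapper object itself
print(can_pickle(simple_output))  # the plain dict of lists
```

A downstream op would then rebuild the heavy object from the plain data instead of receiving it directly.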

Gabriel Fioravante

11/03/2022, 12:29 PM
Thanks for the tips @alex. Using a simpler value seems like the least-effort solution, but I wonder what would be “simple” in this case: the object in question is a small (strings with max 1 level nested) dict passed into execute_in_process as resource. Are we using the dagster API in a non-standard/discouraged way here? I would like to know so we can avoid this type of issue in future version bumps.
Just to give a bit more context: this is just a unit test running a @graph with execute_in_process to assert on the execution result.

alex

11/03/2022, 2:13 PM
hm - whats the exception message? I only see the stack trace above.
the object in question is a small (strings with max 1 level nested) dict passed into execute_in_process as resource
are you saying the op returns this passed in value directly? the error above has to do with the output of an op

Gabriel Fioravante

11/03/2022, 2:41 PM
def my_test():
    result = my_graph.execute_in_process(resources={...required resources...})
    assert result.success
full stack trace
We use this pattern for other unit tests and it’s working fine
it seems to be something specific to this particular set of resources and ops
resources in this case is the bare minimum the graph needs to run

alex

11/03/2022, 2:43 PM
what type of object is train_pool?

Gabriel Fioravante

11/03/2022, 2:43 PM
resources={
    "pre_computed": {
        "schema": "3",
        "all_features": traffic_shaping_server_features,
    },
    "health_check": health_check_dict,
    "s3": s3_mock,
    "clickhouse": clickhouse_client,
    "customer": customer_response_dict,
    "ch_cached_stats": {
        "max_timestamp_customer": "2022-08-12 00:00:00",
        "timezone": "UTC",
    },
    "limits": {
        "limit_train": 1_000_000,
        "limit_eval": 1_000_000,
        "limit_test": 1_000_000,
    },
    "clock": mock_clock,
}
Oh, this is from a third-party package, catboost. It’s a class, Pool, which has some C code in it (possibly why it’s breaking?)

alex

11/03/2022, 2:54 PM
yea, that's the object that can't be serialized via pickle and stored on the filesystem
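For reference, a rough sketch of the custom IO manager idea mentioned earlier in the thread, written without importing dagster. In dagster, an IO manager implements handle_output and load_input. The JsonFileStore class below only mirrors that two-hook shape with simplified arguments. Its name, its method signatures, and the build_pool step key are all assumptions made for illustration. It stores plain rows in an explicit format (JSON here) instead of pickling the object:

```python
import json
import os
import tempfile


class JsonFileStore:
    """Hypothetical stand-in that mirrors an IO manager's two hooks."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def _path(self, step_key, output_name):
        # One file per (step, output) pair.
        return os.path.join(self.base_dir, f"{step_key}.{output_name}.json")

    def handle_output(self, step_key, output_name, rows):
        # Persist plain data in an explicit format rather than pickling.
        with open(self._path(step_key, output_name), "w") as f:
            json.dump(rows, f)

    def load_input(self, step_key, output_name):
        # Read the data back; the consumer rebuilds the heavy object from it.
        with open(self._path(step_key, output_name)) as f:
            return json.load(f)


store = JsonFileStore(tempfile.mkdtemp())
store.handle_output("build_pool", "train_pool", [[1.0, 2.0], [3.0, 4.0]])
restored = store.load_input("build_pool", "train_pool")
print(restored)
```

A real version would subclass dagster's IOManager and derive the file path from the context object passed to each hook; whether the third-party object offers its own save/load format is something to check in its documentation.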