nate (08/12/2020, 10:52 PM)
nate (08/12/2020, 10:52 PM)
nate (08/13/2020, 4:05 PM)
King Chung Huang (08/13/2020, 4:10 PM)
nate (08/13/2020, 4:11 PM)
nate (08/13/2020, 4:12 PM)
Michiel Ghyselinck (11/04/2020, 10:26 AM)
ModuleNotFoundError: No module named 'dagster_dask'
File "/usr/local/lib/python3.8/site-packages/dagster/cli/api.py", line 380, in _execute_run_command_body
for event in execute_run_iterator(recon_pipeline, pipeline_run, instance):
File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/api.py", line 727, in __iter__
for event in self.iterator(
File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/api.py", line 665, in _pipeline_execution_iterator
for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
File "/usr/local/lib/python3.8/site-packages/dagster_dask/executor.py", line 256, in execute
for future, result in iterate_with_context(raise_interrupts_immediately, futures):
File "/usr/local/lib/python3.8/site-packages/dagster/utils/__init__.py", line 443, in iterate_with_context
next_output = next(iterator)
File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 4475, in __next__
return self._get_and_raise()
File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 4466, in _get_and_raise
raise exc.with_traceback(tb)
File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
I think this means I need to add the package 'dagster_dask' to my Dask workers, is that correct?
Andy H (11/23/2020, 8:47 PM)
Charles Lariviere (06/24/2021, 9:54 PM)
dd.to_delayed() representation and loads it back as dd.from_delayed(), but those still seemed to include lambda functions which Python’s pickle can’t handle.
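Plain pickle serializes functions by reference (module plus qualified name), so a lambda, which has no importable name, fails at dump time. A minimal stdlib sketch of the failure being described here (Dask itself relies on cloudpickle, which can serialize lambdas, but anything forced through plain pickle cannot contain them):

```python
import pickle

square = lambda x: x * x  # lambdas have no importable module-level name

try:
    pickle.dumps(square)
except (pickle.PicklingError, AttributeError) as exc:
    # Plain pickle cannot serialize a lambda by reference.
    print("plain pickle rejects lambdas:", type(exc).__name__)
```

Replacing the lambda with a module-level `def` (which pickle can look up by name) is the usual workaround.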
For context, the Dagster multiprocess bit is unrelated to Dask per se; I’m enriching a subset of rows by querying an API, which I’m attempting to do in batches to:
1. speed it up by processing multiple batches at a time
2. reduce the risk of losing everything in case of failure since API calls are paid — if 1 batch fails, I can simply replay that single batch.
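The batch-and-replay approach described above can be sketched in plain Python; the names `make_batches` and `enrich_batch` and the batch size are hypothetical stand-ins (the real version would call the paid API inside `enrich_batch`):

```python
def make_batches(rows, batch_size):
    """Split rows into fixed-size batches so each can be retried on its own."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

def enrich_batch(batch):
    # Hypothetical stand-in for the paid enrichment API call.
    return [{"row": row, "enriched": True} for row in batch]

rows = list(range(250))
succeeded, failed = {}, []
for index, batch in enumerate(make_batches(rows, batch_size=100)):
    try:
        succeeded[index] = enrich_batch(batch)
    except Exception:
        # Record the failed batch index so only it is replayed,
        # instead of re-running (and re-paying for) every call.
        failed.append(index)

print(len(succeeded), "batches ok;", len(failed), "to replay")
```

Because each batch is independent, a failure costs only that batch's API calls, and the recorded indices make the replay targeted.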