Yuan Cheng

02/17/2023, 8:20 PM
Hi guys, we're hitting the issue below when running multiprocess execution. Whichever multiprocess run it is, one random job always fails to execute, although the jobs that didn't execute run fine when launched manually. Here's the error message we got, and thanks for the help:

dagster._core.errors.DagsterSubprocessError: During multiprocess execution errors occurred in child processes:
In process 15201: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Stack Trace:
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/executor/child_process_executor.py", line 79, in _execute_command_in_child_process
    for step_event in command.execute():
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/executor/multiprocess.py", line 78, in execute
    repository_load_data=self.repository_load_data,
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/execution/api.py", line 943, in create_execution_plan
    pipeline_def = pipeline.get_definition()
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 229, in get_definition
    return self.repository.get_definition().get_maybe_subset_job_def(
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 117, in get_definition
    return repository_def_from_pointer(self.pointer, self.repository_load_data)
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 782, in repository_def_from_pointer
    target = def_from_pointer(pointer)
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 672, in def_from_pointer
    target = pointer.load_target()
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/code_pointer.py", line 250, in load_target
    module = load_python_module(self.module, self.working_directory)
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/code_pointer.py", line 138, in load_python_module
    return importlib.import_module(module_name)
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/dagster/rdbms_code_repository/repositories.py", line 3, in <module>
    from rdbms_code_repository.assets import (
  File "/home/dagster/rdbms_code_repository/assets/dbt.py", line 60, in <module>
    dbt_assets = dbt_assets_factory(**dbt_project)
  File "/home/dagster/rdbms_code_repository/assets/dbt.py", line 42, in dbt_assets_factory
    display_raw_sql=True,
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster_dbt/asset_defs.py", line 614, in load_assets_from_dbt_project
    project_dir, profiles_dir, target_dir, select, exclude
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster_dbt/asset_defs.py", line 78, in _load_manifest_for_project
    return json.load(f), cli_output
  File "/usr/local/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None

The above exception occurred during handling of the following exception:
StopIteration: 0

Stack Trace:
  File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/execution/api.py", line 992, in pipeline_execution_iterator
    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
  File "/home/dagster/.local/lib/python3.7/site-packages/dagster/_core/executor/multiprocess.py", line 316, in execute
    subprocess_error_infos=list(errs.values()),

owen

02/17/2023, 9:33 PM
hi @Yuan Cheng! what version of dagster/dbt-core are you on? there were some compatibility issues with dbt-core 1.4 that resulted in similar-looking errors (these have been resolved in more recent versions of dagster-dbt). another possibility is that load_assets_from_dbt_project is being called on the same project in multiple processes simultaneously, so the manifest.json file ends up being written by two separate processes at the same time. you could get around this by switching over to load_assets_from_dbt_manifest, which uses a pre-compiled manifest (and should also give you some significant performance improvements)
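For illustration, here is a minimal standard-library sketch of why a concurrent rewrite of manifest.json could produce this exact error. It does not use dagster or dbt, and the file and variable names are hypothetical. Opening a file for writing truncates it first, so a reader that arrives before the writer has written anything sees an empty file.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the dbt target directory.
workdir = tempfile.mkdtemp()
manifest_path = os.path.join(workdir, "manifest.json")

# Simulate a writer that has opened (and therefore truncated) the file
# but has not written any JSON yet.
open(manifest_path, "w").close()

# A second process reading at this moment sees an empty file.
try:
    with open(manifest_path) as f:
        json.load(f)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

print(error_message)  # Expecting value: line 1 column 1 (char 0)
```

The message matches the one in the trace above, which is consistent with (though not proof of) a child process reading the manifest while another process was rewriting it.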

Yuan Cheng

02/17/2023, 10:26 PM
Here are our dagster-dbt and dbt-core versions:
dagster-dbt==0.17.15
dbt-core==1.1.3
Hi @owen, the issue happens on any multiprocess execution.

owen

02/17/2023, 10:57 PM
then in that case i think it's the second suggestion -- using load_assets_from_dbt_manifest is probably the safest solution. for reference, load_assets_from_dbt_project will result in your entire dbt project being compiled whenever an op is executed (regardless of whether it's a dbt-related op or not)
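A rough sketch of what that switch might look like. It assumes the manifest has already been compiled ahead of time (for example at deploy time), that the installed dagster-dbt version accepts the parsed manifest dict as the first positional argument, and the helper names and path are hypothetical:

```python
import json


def read_manifest(manifest_path):
    """Load an already-compiled dbt manifest.json into a dict."""
    with open(manifest_path) as f:
        return json.load(f)


def build_dbt_assets(manifest_path):
    # Imported lazily so read_manifest works without dagster-dbt installed.
    from dagster_dbt import load_assets_from_dbt_manifest

    # Assumption: the parsed manifest dict is the first positional argument.
    # Nothing is compiled here, so child processes only ever read the file.
    return load_assets_from_dbt_manifest(read_manifest(manifest_path))
```

In the repository module this would be used as something like `dbt_assets = build_dbt_assets("path/to/target/manifest.json")`, with the path pointing at wherever the pre-compiled manifest lives. Since the file is only read at import time, concurrent child processes no longer race to rewrite it.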

Yuan Cheng

02/17/2023, 11:54 PM
oh ic, let me try thx