Harpal
06/24/2022, 10:34 AM
2022-06-24 11:30:57 +0100 - dagster.daemon.SensorDaemon - INFO - Not checking for any runs since no sensors have been started.
2022-06-24 11:31:59,643 96676 utils Executing command: dbt --no-use-color --log-format json ls --project-dir ./dbt --profiles-dir ./dbt/config --select tag:hold --resource-type model --output json
Does anyone know why this could be happening and how to make it stop?
I’m pretty sure it has something to do with software defined assets because only that job is being logged. Specifically this line of code:
dbt_assets = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select=f"tag:{DATASET_TYPE}")
See more info in the comment section 🎉
sandy
06/24/2022, 3:52 PM
Harpal
07/14/2022, 5:48 PM
executor_def can help us create a new pod to run our DBT-software-defined-asset jobs on. But is there a way we can control their resource usage? I don’t want to have to pay for extra memory because dagster does some work and outputs the spammy logs that I didn’t request…
You know how @ops let the user assign resources, request limits, etc. Can I do that for jobs defined by AssetGroups()?
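For reference, the op-level resource settings mentioned above are normally carried by the dagster-k8s/config tag. Below is a minimal sketch of building such a tags dict in plain Python. The helper name k8s_resource_tags and the CPU/memory values are hypothetical placeholders, and the thread does not confirm whether jobs built from AssetGroups or from load_assets_from_dbt_project in 0.15.x accept the same tag.

```python
from typing import Any, Dict

# Tag key that dagster-k8s reads when it builds Kubernetes pods.
K8S_CONFIG_TAG = "dagster-k8s/config"


def k8s_resource_tags(
    cpu_request: str, memory_request: str, cpu_limit: str, memory_limit: str
) -> Dict[str, Any]:
    """Build a tags dict carrying Kubernetes resource requests and limits.

    Hypothetical helper for illustration; it only assembles the nested dict.
    """
    return {
        K8S_CONFIG_TAG: {
            "container_config": {
                "resources": {
                    "requests": {"cpu": cpu_request, "memory": memory_request},
                    "limits": {"cpu": cpu_limit, "memory": memory_limit},
                }
            }
        }
    }


# Placeholder values, not a sizing recommendation.
DBT_STEP_TAGS = k8s_resource_tags("250m", "512Mi", "500m", "1Gi")

# With dagster installed, the dict would be attached to an op like this:
#   @op(tags=DBT_STEP_TAGS)
#   def my_op(): ...
```

Keeping the dict in one helper makes it easy to reuse the same requests and limits across several ops or jobs.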
Limiting their resources should at least stop them from requiring as many resources.
owen
07/14/2022, 6:04 PM
Harpal
07/14/2022, 7:38 PM
owen
07/14/2022, 8:39 PM
load_assets_from_dbt_manifest (which uses a pre-compiled manifest.json file, rather than compiling the dbt project on the fly, as load_assets_from_dbt_project does)
Harpal
07/14/2022, 8:54 PM
v0.15.5 these logs are still visible upon spinning up the node named dagster-dagster-user-deployments-moonfire-dagster-repo-6cdm5f8h.
When running the job, it spins up a new pod after I added executor_def=k8s_job_executor, BUT now the jobs don’t even run. Sometimes it hangs; sometimes the user-deployments pod dies. It gets as far as this before it stops responding 😞
2022-07-15 15:53:27,032 1 utils dbt exited with return code 0
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - RUN_START - Started execution of run for "sector_cls_all_assets".
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - ENGINE_EVENT - Starting execution with step handler K8sStepHandler.
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - run_dbt_moonfire_dbt_cb1dc - STEP_WORKER_STARTING - Executing step "run_dbt_moonfire_dbt_cb1dc" in Kubernetes job dagster-step-1da2935a7c85a426bf5be7a94bb252d9.
2022-07-15 15:54:04 +0000 - dagster - ERROR - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - _assets - Dependencies for step _assets failed: ['run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc']. Not executing.
2022-07-15 15:54:05 +0000 - dagster - ERROR - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - RUN_FAILURE - Execution of run for "sector_cls_all_assets" failed. Steps failed: ['run_dbt_moonfire_dbt_cb1dc'].
{"__class__": "DagsterEvent", "event_specific_data": null, "event_type_value": "PIPELINE_START", "logging_tags": {}, "message": "Started execution of run for \"sector_cls_all_assets\".", "pid": 1, "pipeline_name": "sector_cls_all_assets", "solid_handle": null, "step_handle": null, "step_key": null, "step_kind_value": null}
Any idea why it’s not working post-update or what else could be causing these failures?
owen
07/15/2022, 4:27 PM
dbt ls command that dagster executes to create the dbt manifest.json file seems to be taking nearly 10 seconds to complete (15:34:13,002 to 15:34:22,814). This is a pretty high cost for something that will get executed every time that a pod/process containing your dagster code is spun up, so I would definitely recommend trying out load_assets_from_dbt_manifest, as I could definitely imagine worlds where this delay could have bad knock-on effects.
0.15.5 on the executor you were originally using?
Harpal
07/18/2022, 10:39 AM
dagster-user-deployments)
dagster_dbt.errors.DagsterDbtCliOutputsNotFoundError: Expected to find file at path ./dbt/target/run_results.json
See the full error message below.
executor_def.)
It looks like the pod won’t run without the executor_def. The logs still show, and the job fails with or without the executor_def (without = memory failure, with = fails otherwise).
│ 2022-07-18 11:07:43 +0000 - dagster - INFO - resource:dbt - edcb3e45-a10d-416f-9e95-910504e5d7d7 - run_dbt_moonfire_dbt_78d03 - dbt exited with return code 2 │
│ 2022-07-18 11:07:43 +0000 - dagster - ERROR - metrics_assets_job - edcb3e45-a10d-416f-9e95-910504e5d7d7 - 28 - run_dbt_moonfire_dbt_78d03 - STEP_FAILURE - Execution of step "run_dbt_moonfire_dbt_78d03" failed. │
│ │
│ dagster_dbt.errors.DagsterDbtCliOutputsNotFoundError: Expected to find file at path ./dbt/target/run_results.json │
│ │
│ Stack Trace: │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_plan.py", line 224, in dagster_event_sequence_for_step │
│ for step_event in check.generator(step_events): │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_step.py", line 353, in core_dagster_event_sequence_for_step │
│ for user_event in check.generator( │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_step.py", line 69, in _step_output_error_checked_user_event_sequence │
│ for user_event in user_event_sequence: │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/compute.py", line 174, in execute_core_compute │
│ for step_output in _yield_compute_results(step_context, inputs, compute_fn): │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/compute.py", line 142, in _yield_compute_results │
│ for event in iterate_with_context( │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/utils/__init__.py", line 406, in iterate_with_context │
│ next_output = next(iterator) │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/asset_defs.py", line 254, in dbt_op │
│ dbt_output = DbtOutput(result=context.resources.dbt.get_run_results_json()) │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/cli/resources.py", line 286, in get_run_results_json │
│ return parse_run_results(project_dir, target_path) │
│ File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/cli/utils.py", line 169, in parse_run_results │
│ raise DagsterDbtCliOutputsNotFoundError(path=run_results_path)
owen
07/18/2022, 5:03 PM
2022-07-18 10:40:14 +0000 - dagster - ERROR - resource:dbt - bcb26efb-1263-4974-be56-d707cc5fbc6a - run_dbt_moonfire_dbt_78d03 - Encountered an error:
_relations_cache_for_schemas() takes 2 positional arguments but 3 were given
dbt-core and your dbt-... library (e.g. dbt-postgres or dbt-snowflake etc.)
Harpal
07/18/2022, 5:07 PM
ubuntu@dagster-run-11183473-89c2-4f1d-8a2a-199723aec1b7-gpl25:/moonfire$ dbt --version
installed version: 1.0.1-rc1
latest version: 1.1.1
Your version of dbt is out of date! You can find instructions for upgrading here:
<https://docs.getdbt.com/docs/installation>
Plugins:
- postgres: 1.0.1rc1
ubuntu@dagster-run-11183473-89c2-4f1d-8a2a-199723aec1b7-gpl25:/moonfire$
Also I’m installing dbt-core and dbt-postgres in my Dockerfile as follows. Is this best practice?
# install dbt-core & dbt-postgres here in the Dockerfile as it is not used locally on mac m1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/dbt-core@v1.0.1#egg=dbt-core&subdirectory=core"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/dbt-core@v1.0.1#egg=dbt-postgres&subdirectory=plugins/postgres"
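A quick way to sanity-check the dbt-core / adapter pairing discussed above is to compare the installed package versions from inside the container. This is only a sketch: the helper names are hypothetical, and it assumes that matching major.minor versions is the relevant compatibility criterion, which the thread does not confirm. For the versions printed above (1.0.1-rc1 and 1.0.1rc1) this comparison would report a match.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.0.1' or '1.0.1rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_minor(core_version: str, adapter_version: str) -> bool:
    """True when both versions share the same major.minor pair."""
    return major_minor(core_version) == major_minor(adapter_version)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    core = installed_version("dbt-core")
    adapter = installed_version("dbt-postgres")
    if core is None or adapter is None:
        print("dbt-core or dbt-postgres is not installed in this environment")
    else:
        print(f"dbt-core {core}, dbt-postgres {adapter}, same minor: {same_minor(core, adapter)}")
```

Running it in the same image that executes the dbt step shows what that pod actually has installed, which may differ from a local environment.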
owen
07/18/2022, 5:25 PM
pip install dbt-core==1.0.1)?
Harpal
07/18/2022, 7:02 PM
_assets section of the job (in charge of looping over the assets and uploading specific files to Google Cloud Storage).
Here is the code I am executing and the logs below. Do you have any idea why it isn’t finishing the job and uploading the files to GCS? If not, any tips on how I can debug this in more detail would be awesome 🙂
owen
07/18/2022, 8:31 PM
_assets step is getting executed and emitting outputs -- are you viewing this run in dagit? If so, mind sharing a screenshot?
Harpal
07/18/2022, 9:50 PM
owen
07/18/2022, 9:50 PM