# ask-community
h
Hey my fellow Dagsterinos! Does anyone know why my dagster deployment is outputting a huge amount of logs every few minutes DESPITE not executing any jobs/sensors? It appears to run the following command every few minutes and the output is a description of every model/macro associated with one of my dagster jobs.
2022-06-24 11:30:57 +0100 - dagster.daemon.SensorDaemon - INFO - Not checking for any runs since no sensors have been started.
2022-06-24 11:31:59,643  96676   utils                Executing command: dbt --no-use-color --log-format json ls --project-dir ./dbt --profiles-dir ./dbt/config --select tag:hold --resource-type model --output json
Does anyone know why this could be happening and how to make it stop? I’m pretty sure it has something to do with software defined assets because only that job is being logged. Specifically this line of code:
dbt_assets = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select=f"tag:{DATASET_TYPE}")
See more info in the comment section 🎉
s
hey @Harpal - thanks for reporting this. I filed an issue to track addressing this: https://github.com/dagster-io/dagster/issues/8601. I suspect we should be able to do something by next week
h
Hi @sandy, has there been any progress on this? The spam is manageable, but Dagster/dbt ends up producing these logs on each of our pods across our deployment. This spikes our RAM usage, causes a multi-minute delay when spinning up pods, and can stop other jobs from running because it kills our pods (OOM error). It looks like executor_def could help us create a new pod to run our dbt software-defined-asset jobs on, but is there a way we can control their resource usage? I don't want to have to pay for extra memory because dagster does some work and outputs spammy logs that I didn't request… You know how ops let the user assign resources, request limits, etc.? Can I do that for jobs defined by AssetGroup? Limiting their resources should at least stop them from requiring as many resources.
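Something along these lines is roughly what I'm imagining -- just a sketch, assuming AssetGroup.build_job accepts tags in this version and that dagster-k8s honours the dagster-k8s/config tag for asset jobs; the names, paths, and numbers below are placeholders:
from dagster import AssetGroup
from dagster_dbt import dbt_cli_resource, load_assets_from_dbt_project

# Placeholder project layout -- not our real paths or selects.
dbt_assets = load_assets_from_dbt_project(project_dir="./dbt", profiles_dir="./dbt/config")

asset_job = AssetGroup(
    dbt_assets,
    resource_defs={"dbt": dbt_cli_resource.configured({"project_dir": "./dbt", "profiles_dir": "./dbt/config"})},
).build_job(
    name="dbt_assets_job",
    # dagster-k8s reads this tag to set requests/limits on the pods that run this job.
    tags={
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"cpu": "250m", "memory": "512Mi"},
                    "limits": {"cpu": "500m", "memory": "1Gi"},
                }
            }
        }
    },
)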
o
hey @Harpal, I merged a change in for the last release which results in those logs getting emitted at the debug log level (and therefore invisible on the host machine in most cases), but I see that the slowdown you're experiencing seems to come from those logs being sent back to dagster, rather than to a different log management service. later today, I can give you a shim of code that silences those logs completely, which should work until the next release (where I can put in a more permanent change)
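in the meantime, here's a rough sketch of the kind of shim I mean -- not the final code, and the logger names below are guesses that may differ between releases:
import logging

# Sketch: raise the level of the loggers that emit the "Executing command: dbt ..."
# lines so they stop reaching your handlers. The logger names are assumptions;
# adjust them to whatever logger name actually shows up in your output.
for name in ("dagster_dbt", "dagster_dbt.utils", "dagster_dbt.cli.utils"):
    logging.getLogger(name).setLevel(logging.WARNING)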
h
You're a legend! Thanks for your hard work 😃
o
no problem 🙂 -- looking at it a bit more, I think my initial guess was actually wrong, as those logs shouldn't be getting sent to dagster (although they definitely are being emitted). are you sure these logs are the source of the slowdown? it seems strange to me that emitting a few hundred log lines would be spiking memory usage to that extent
just want to make sure that whatever change we make actually solves the root cause
and to confirm, are these logs still visible after upgrading to the most recent release? I'd definitely believe it if they were, just want to confirm
one possible other culprit might be the invocation cost of dbt itself (if your dbt project is large enough). it could potentially be worth looking into using load_assets_from_dbt_manifest (which uses a pre-compiled manifest.json file, rather than compiling the dbt project on the fly, as load_assets_from_dbt_project does)
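something along these lines, as a sketch -- it assumes you've already run dbt compile so that target/manifest.json exists, and that you keep selecting by tag the same way:
import json

from dagster_dbt import load_assets_from_dbt_manifest

# Sketch: load the manifest that `dbt compile` already produced instead of
# shelling out to `dbt ls` every time the repository is loaded.
with open("./dbt/target/manifest.json") as f:
    manifest_json = json.load(f)

dbt_assets = load_assets_from_dbt_manifest(manifest_json, select="tag:hold")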
h
Good point! I'll start by upgrading Dagster and trying again. I hadn't considered the invocation cost before but that could very well be the problem. Some of these tables are well over a gigabyte so the project is likely quite large. It's almost 10PM BST so I'll try these changes first thing in the morning. Feel free to keep dropping any ideas or suggestions in the chat! I'll be sure to give them a try dagsir
@owen After upgrading to dagster v0.15.5, these logs are still visible upon spinning up the node named dagster-dagster-user-deployments-moonfire-dagster-repo-6cdm5f8h. When running the job, it spins up a new pod after I added executor_def=k8s_job_executor, BUT now the jobs don't even run. Sometimes it hangs and the user-deployments pod dies. It gets as far as this before it stops responding 😞
2022-07-15 15:53:27,032  1       utils                dbt exited with return code 0
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - RUN_START - Started execution of run for "sector_cls_all_assets".
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - ENGINE_EVENT - Starting execution with step handler K8sStepHandler.
2022-07-15 15:53:28 +0000 - dagster - DEBUG - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - run_dbt_moonfire_dbt_cb1dc - STEP_WORKER_STARTING - Executing step "run_dbt_moonfire_dbt_cb1dc" in Kubernetes job dagster-step-1da2935a7c85a426bf5be7a94bb252d9.
2022-07-15 15:54:04 +0000 - dagster - ERROR - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - _assets - Dependencies for step _assets failed: ['run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc', 'run_dbt_moonfire_dbt_cb1dc']. Not executing.
2022-07-15 15:54:05 +0000 - dagster - ERROR - sector_cls_all_assets - 51e2b07f-d9bc-4404-8354-9ff06eec6a55 - 1 - RUN_FAILURE - Execution of run for "sector_cls_all_assets" failed. Steps failed: ['run_dbt_moonfire_dbt_cb1dc'].
{"__class__": "DagsterEvent", "event_specific_data": null, "event_type_value": "PIPELINE_START", "logging_tags": {}, "message": "Started execution of run for \"sector_cls_all_assets\".", "pid": 1, "pipeline_name": "sector_cls_all_assets", "solid_handle": null, "step_handle": null, "step_key": null, "step_kind_value": null}
Any idea why it’s not working post-update or what else could be causing these failures?
o
ah sorry that's happening... I don't see any smoking gun in those logs, but I did notice that the dbt ls command that dagster executes to create the dbt manifest.json file seems to be taking nearly 10 seconds to complete (15:34:13,002 to 15:34:22,814). This is a pretty high cost for something that gets executed every time a pod/process containing your dagster code is spun up, so I would definitely recommend trying out load_assets_from_dbt_manifest, as I could definitely imagine worlds where this delay could have bad knock-on effects.
not sure what would have changed between those updates to cause a significant change, to be honest -- did you change the executor_def at the same time that you did the upgrade, or did you also try 0.15.5 on the executor you were originally using?
h
Good point owen. I'll undo all changes to the executor def and try again. After undoing the executor def and upgrading dagster to the latest version, the spammy output still remains 😞 These are the logs displayed upon deploying the cluster (logged from dagster-user-deployments).
On top of that, when I try to run a job using software-defined assets, it now fails with the following error:
dagster_dbt.errors.DagsterDbtCliOutputsNotFoundError: Expected to find file at path ./dbt/target/run_results.json
See the full error message below.
I'm pretty sure this is the pod running out of RAM (because I removed the executor_def). It looks like the pod won't run without the executor_def: the logs still show, and the job fails with or without the executor_def (without = memory failure, with = fails otherwise).
After deploying I’m met with the same error
2022-07-18 11:07:43 +0000 - dagster - INFO - resource:dbt - edcb3e45-a10d-416f-9e95-910504e5d7d7 - run_dbt_moonfire_dbt_78d03 - dbt exited with return code 2
2022-07-18 11:07:43 +0000 - dagster - ERROR - metrics_assets_job - edcb3e45-a10d-416f-9e95-910504e5d7d7 - 28 - run_dbt_moonfire_dbt_78d03 - STEP_FAILURE - Execution of step "run_dbt_moonfire_dbt_78d03" failed.

dagster_dbt.errors.DagsterDbtCliOutputsNotFoundError: Expected to find file at path ./dbt/target/run_results.json

Stack Trace:
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_plan.py", line 224, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_step.py", line 353, in core_dagster_event_sequence_for_step
    for user_event in check.generator(
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/execute_step.py", line 69, in _step_output_error_checked_user_event_sequence
    for user_event in user_event_sequence:
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/compute.py", line 174, in execute_core_compute
    for step_output in _yield_compute_results(step_context, inputs, compute_fn):
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/core/execution/plan/compute.py", line 142, in _yield_compute_results
    for event in iterate_with_context(
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster/utils/__init__.py", line 406, in iterate_with_context
    next_output = next(iterator)
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/asset_defs.py", line 254, in dbt_op
    dbt_output = DbtOutput(result=context.resources.dbt.get_run_results_json())
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/cli/resources.py", line 286, in get_run_results_json
    return parse_run_results(project_dir, target_path)
  File "/home/ubuntu/pyenv/versions/3.9.8/lib/python3.9/site-packages/dagster_dbt/cli/utils.py", line 169, in parse_run_results
    raise DagsterDbtCliOutputsNotFoundError(path=run_results_path)
According to Levan's message from 5 months ago (on this channel), they resolved this error by adding more RAM/CPU to their pods. However, that is prohibitively expensive and not a sustainable/scalable solution. I need a way to not only spin up a new node for this job but also restrict the resources it is allowed to use. Is this possible by configuring the amount of memory I can allocate to this specific job (like what can be done with ops)?
o
so for that error in your most recent message, it does look like dbt itself failed before the dagster job failed (dbt exited with return code 2). In the longer stack trace before that, I noticed:
2022-07-18 10:40:14 +0000 - dagster - ERROR - resource:dbt - bcb26efb-1263-4974-be56-d707cc5fbc6a - run_dbt_moonfire_dbt_78d03 - Encountered an error:
_relations_cache_for_schemas() takes 2 positional arguments but 3 were given
doing a little digging (https://github.com/dbt-labs/dbt-core/issues/5278), that might indicate that you have mismatched versions of dbt-core and your dbt-... adapter library (e.g. dbt-postgres or dbt-snowflake etc.)
my guess is that this is not RAM/resource related (as I really don't think invoking dbt should use any appreciable amount of memory), so I'm hoping that upgrading the adapter library might solve your issue there
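if you want to double-check the versions inside the image, a quick stdlib sketch (nothing dagster-specific) along these lines would do it:
from importlib.metadata import version

# Both should print the same version (e.g. 1.0.1) for a matched core/adapter pair.
print("dbt-core:", version("dbt-core"))
print("dbt-postgres:", version("dbt-postgres"))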
h
Okie dokie! I'll match up my versions of dbt-core and dbt-postgres and try again 😄 In the meantime, here is the error message I get when running the existing code on my K8s cluster (with the new manifest JSON).
It doesn't appear to be a mismatch of versions (although I'm not using the latest version of either).
ubuntu@dagster-run-11183473-89c2-4f1d-8a2a-199723aec1b7-gpl25:/moonfire$ dbt --version
installed version: 1.0.1-rc1
   latest version: 1.1.1

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - postgres: 1.0.1rc1
ubuntu@dagster-run-11183473-89c2-4f1d-8a2a-199723aec1b7-gpl25:/moonfire$
Also I'm installing dbt-core and dbt-postgres in my Dockerfile as follows. Is this best practice?
# install dbt-core & dbt-postgres here in the Dockerfile as it is not used locally on mac m1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/dbt-core@v1.0.1#egg=dbt-core&subdirectory=core"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/dbt-core@v1.0.1#egg=dbt-postgres&subdirectory=plugins/postgres"
o
to be honest I'm not super familiar with docker, but any reason you're locked on that specific dbt version / installing via a git path (rather than pip install dbt-core==1.0.1)?
h
Thanks a bunch @owen. Right now the jobs are running (with no visible errors), but this specific job isn't exporting the files to gcloud as expected. It's like it finishes the first portion of the job but then doesn't start the _assets section of the job (in charge of looping over the assets and uploading specific files to Google Cloud Storage). Here is the code I am executing and the logs below. Do you have any idea why it isn't finishing the job and uploading the files to GCS? If not, any tips on how I can debug this in more detail would be awesome 🙂
This is the job I'm running (but it stops halfway through).
These are the logs that show no failure (but also no Google Cloud export commands).
o
hmm in the logs (starting line 76), it does seem to me like the _assets step is getting executed and emitting outputs -- are you viewing this run in dagit? if so, mind sharing a screenshot?
h
You’re right! It is working. Just had to fiddle with some business logic on my end. Everything is working smooth rn. Thanks so much dagster team!!! @owen is still killing it out here dabparrot
o
dabparrot
glad it's all working again 😄