Mark Fickett
06/28/2022, 3:00 PM0.15.2
, I'm seeing lots of log messages like 2022-06-28 10:55:16 -0400 - dagster - DEBUG - battery_data_job_local - 983e56f8-a69b-4edd-937b-0417012c2a37 - 296222 - STEP_WORKER_STARTED - Executing step "blah" in subprocess.
that don't seem to be respecting the dagster.yaml
setting of:
python_logs:
python_log_level: INFO
default_system_loggers
have DEBUG
level by default which is causing this. If I edit that directly it silences the log messages, but I don't see a way to configure it otherwise.0.15.0
removes it.daniel
06/28/2022, 3:22 PMMark Fickett
06/28/2022, 3:22 PMdaniel
06/28/2022, 3:27 PMMark Fickett
06/28/2022, 3:28 PMdagster.execute_pipeline
.daniel
06/28/2022, 3:32 PMimport requests
import csv
from dagster import job, op, get_dagster_logger, execute_pipeline, reconstructable, DagsterInstance
@op
def hello_cereal():
response = requests.get("<https://docs.dagster.io/assets/cereal.csv>")
lines = response.text.split("\n")
cereals = [row for row in csv.DictReader(lines)]
get_dagster_logger().info(f"Found {len(cereals)} cereals")
@job
def hello_cereal_job():
hello_cereal()
if __name__ == "__main__":
with DagsterInstance.get() as instance:
execute_pipeline(reconstructable(hello_cereal_job), instance=instance)
If I run this with a dagster.yaml
like this:
python_logs:
python_log_level: INFO
I still see a bunch of DEBUG lines in the logs like:
2022-06-28 11:53:28 -0500 - dagster - DEBUG - hello_cereal_job - 8a54c239-6ea3-487e-8786-bf45200d2593 - 13536 - RUN_START - Started execution of run for "hello_cereal_job".
Mark Fickett
06/28/2022, 5:01 PMDagsterLogHandler
to get console logging output the way I want (previous discussion on that), and maybe that's working <= 0.15.0 but not with default_system_loggers
. So maybe this is a feature request and not a regression with this specific message's handling.daniel
06/28/2022, 5:02 PMMark Fickett
06/28/2022, 5:16 PMdagster.yaml
log level setting doesn't come into play either way. But https://github.com/dagster-io/dagster/pull/8648 looks like just what I'd need!
class _PatchedDagsterLogHandler(log_manager.DagsterLogHandler):
"""Customize DagsterLogHandler to get back original messages and add formatting."""
def __init__(self, *args, **kwargs):
self._customized_handler = logging.StreamHandler()
# This handler's filter only affects console output, so the noisy messages
# remain visible in the Dagit UI. (Also even if this filter edits the
# messages' level in place, the Dagit UI and JSOn logger still sees them at
# their original level, possibly because those loggers consume them first.)
self._customized_handler.addFilter(NoisyLogsFilter())
self._customized_handler.setFormatter(coloredlogs.ColoredFormatter(_FORMAT))
self._customized_handler.setLevel(<http://logging.INFO|logging.INFO>)
super().__init__(*args, **kwargs)
def _convert_record(self, record: logging.LogRecord) -> logging.LogRecord:
"""Restores the LogRecord's original msg.
This needs to override the existing DagsterLogHandler's behavior because it
affects both our console output, and logs sent to OpenTelemetry / JSON.
<https://github.com/dagster-io/dagster/issues/7642>
"""
converted = super()._convert_record(record)
if "STEP_FAILURE" in converted.msg:
# Dagster packs exception info for step failures into
# converted.dagster_meta['dagster_event'].event_specific_data.error
# not standard exc_info, so just use the provided formatted msg.
return converted
# Dagster's log_manager.DagsterLogHandler overwrites the LogRecord's msg and
# args fields in-place. Although the original getMessage() value is saved in
# dagster_meta["orig_message"], the args and exc_info are lost.
record.msg = record.dagster_meta["orig_message"]
record.args = () # Dagster stores the result of getMessage(), not msg + args.
record.exc_text = None # Tell downstream formatters to reformat exc_info.
return record
def handle(self, record):
"""Passes records to our custom console handler.
We use this instead of the Dagster framework @logger mechanism because Dagster
re-logs the message to @loggers and they therefore lose the logger name,
filename/lineno, etc. <https://github.com/dagster-io/dagster/issues/7890> By
passing records to our handler's .handle() method we make all original
attributes available for formatting.
Dagster doesn't expose a way to add more handlers along side DagsterLogHandler,
so adding a sub-handler here is just a way to hook into the log record stream.
It also matches the pattern DagsterLogHandler uses, or dispatching log records
to various other internal handlers and log records.
"""
super().handle(record)
# The level check is usually done before the handler gets the record.
if self._customized_handler and (
record.levelno >= self._customized_handler.level
):
self._customized_handler.handle(record)
daniel
07/15/2022, 2:53 PMMark Fickett
07/15/2022, 2:58 PMAverell
08/29/2022, 12:13 PMpython_logs:
python_log_level: WARN
and could still see the INFO logs coming out from the daemon process.
I'm on 1.0.6.
ThanksMark Fickett
10/18/2022, 2:14 PMexecute_job
call and view console logs, and the default_system_loggers
DEBUG
output makes this too noisy to be usable. But I'd like to get up to date.
Would you be OK with adding another setting such as default_system_log_level
as a peer of python_log_level
? Glad to help with a PR based on yours above, but I'm not sure where to find associated tests / docs / etc that would need updating as well.
I was also trying to patch default_system_loggers
or LogManager
or LogHandler
at runtime, but I'm not finding a time when my code is loaded and can do some patching before the log machinery is already set up and issuing step-start events.loggers: console: config: level: ERROR
shouldn't fix this? https://docs.dagster.io/concepts/logging/loggers#configuring-the-built-in-loggersdaniel
10/18/2022, 2:39 PMMark Fickett
10/18/2022, 3:17 PMloggers
config and it didn't seem to have any effect. From spelunking in the code a bit, it looks like default_system_loggers
doesn't look at any configs, and the multiprocess executor just calls create_context_free_log_manager
which always uses the default_system_loggers
. So if I'm following that right, with a multiprocess executor it just always installs the default system logger at DEBUG level and does not consult other loggers or configs, but I was hoping I'd missed something.daniel
10/18/2022, 4:11 PMMark Fickett
10/18/2022, 8:38 PMdaniel
10/18/2022, 8:40 PMMark Fickett
10/24/2022, 2:49 PM