Chris Chan
09/16/2021, 5:12 PM
… console to each pipeline and have a run config set the log level to INFO

owen
09/16/2021, 6:01 PM
…
python_logs:
  python_log_level: INFO
…which it would probably make sense to have our default console logger inherit as well (although this wouldn't solve the JSON formatting part of your issue). Adding tap points for people to configure logging at the repository level is definitely something we're looking into, and this is a good motivating example.
logger_defs={"json": json_console_logger.configured({"log_level": "INFO"})}
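For context, json_console_logger here could be Dagster's built-in JSON logger (dagster.loggers.json_console_logger) or a user-defined one. A minimal sketch of such a logger, roughly following the custom-logger example in the Dagster docs; the formatter details are an assumption, not necessarily the code in use here:

import json
import logging

from dagster import Field, logger


@logger(
    {
        "log_level": Field(str, is_required=False, default_value="INFO"),
        "name": Field(str, is_required=False, default_value="dagster"),
    },
    description="A console logger that emits each record as one JSON object",
)
def json_console_logger(init_context):
    level = init_context.logger_config["log_level"]
    name = init_context.logger_config["name"]

    # Build a stdlib logger of whatever class is currently registered.
    klass = logging.getLoggerClass()
    logger_ = klass(name, level=level)

    handler = logging.StreamHandler()

    class JsonFormatter(logging.Formatter):
        def format(self, record):
            # default=str guards against non-JSON-serializable fields
            # (e.g. exc_info tuples) on the record.
            return json.dumps(record.__dict__, default=str)

    handler.setFormatter(JsonFormatter())
    logger_.addHandler(handler)
    return logger_

Serializing record.__dict__ wholesale is what makes every field machine-parseable, at the cost of fairly verbose output.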
Chris Chan
09/16/2021, 6:57 PM
this works:
my_job = my_graph.to_job(
    config={
        "loggers": {
            "console": {
                "config": {
                    "log_level": "INFO"
                }
            }
        }
    },
    logger_defs={"console": json_console_logger},
)
but this does not:
my_job = my_graph.to_job(
    logger_defs={"console": json_console_logger.configured({"log_level": "INFO"})}
)
owen
09/16/2021, 6:58 PM

Chris Chan
09/16/2021, 6:59 PM

owen
09/16/2021, 6:59 PM

Chris Chan
09/16/2021, 6:59 PM
… dagster api execute_run (I’m doing this because I wrote a custom run launcher), I’m still getting log messages in stdout of this type:
{
"__class__": "DagsterEvent",
"event_specific_data": {
"__class__": "EngineEventData",
"error": null,
"marker_end": null,
"marker_start": null,
"metadata_entries": [
{
"__class__": "EventMetadataEntry",
"description": null,
"entry_data": {
"__class__": "TextMetadataEntryData",
"text": "20"
},
"label": "pid"
}
]
},
"event_type_value": "ENGINE_EVENT",
"logging_tags": {},
"message": "Multiprocess executor: parent process exiting after 1.68s (pid: 20)",
"pid": 20,
"pipeline_name": "do_nothing",
"solid_handle": null,
"step_handle": null,
"step_key": null,
"step_kind_value": null
}
so those are still getting through. stderr logs appear to be what I would expect, though

owen
09/27/2021, 6:00 PM
… .configured() is actually not a bug, weirdly enough. The comment here indicates that only loggers referenced in the run config will actually be initialized. The reasoning makes some vague amount of sense, but it does result in some pretty strange end behavior that diverges from other configurable objects like resources. I'll look into the implications of changing that. For your other question, are all dagster events (like STEP_STARTED, OUTPUT_HANDLED, etc.) being logged this way, or just a subset of them (such as only engine events)?
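In practice, this means the .configured() settings are attached to the logger definition, but the logger is never initialized because nothing under "loggers" in the run config references it. A minimal sketch of the workaround, baking a default run config into the job so the logger is always referenced (my_graph and json_console_logger as above; 0.12-era API):

# Only loggers referenced under "loggers" in run config are initialized,
# so have the job supply that reference by default instead of relying
# on .configured() alone -- this is why the first snippet above works
# and the second one doesn't.
my_job = my_graph.to_job(
    logger_defs={"console": json_console_logger},
    config={
        "loggers": {"console": {"config": {"log_level": "INFO"}}},
    },
)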
Chris Chan
09/27/2021, 8:13 PM
STEP_START, STEP_OUTPUT, LOADED_INPUT
owen
09/27/2021, 8:29 PM
… dagster api execute_run instead of dagster pipeline launch (docs)? I don't have a clear picture of why that would make a difference in logging behavior, but I believe they do end up hitting different initialization code paths, and I'm much less familiar with the first way of doing things.

Chris Chan
09/27/2021, 8:30 PM
… dagster api execute_run
owen
09/27/2021, 8:33 PM

Chris Chan
09/27/2021, 8:37 PM
… in launch_run() it sends a request to initiate a Nomad job - the Nomad job is set to execute a shell script that runs dagster api execute_run ... where the … is whatever payload is provided by serialize_dagster_namedtuple
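For reference, that payload pattern matches what Dagster's built-in launchers (e.g. the K8s one) do: serialize an ExecuteRunArgs with serialize_dagster_namedtuple and pass the JSON string as the argument to dagster api execute_run. A rough sketch of just that piece; build_execute_run_command is a hypothetical helper name, and the launcher/Nomad wiring around it is omitted:

from dagster.grpc.types import ExecuteRunArgs
from dagster.serdes import serialize_dagster_namedtuple


def build_execute_run_command(pipeline_origin, run_id, instance_ref):
    # Serialize the run arguments the same way Dagster's built-in
    # launchers do before shelling out to `dagster api execute_run`.
    input_json = serialize_dagster_namedtuple(
        ExecuteRunArgs(
            pipeline_origin=pipeline_origin,
            pipeline_run_id=run_id,
            instance_ref=instance_ref,
        )
    )
    # The Nomad job's shell script would run this command verbatim.
    return ["dagster", "api", "execute_run", input_json]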
owen
09/27/2021, 8:42 PM

Chris Chan
09/27/2021, 9:09 PM
… 0.12.10 … RunRequest - so my run request looks like:
yield RunRequest(
    run_key=s3_key,
    run_config={
        "ops": {
            "get_key": {"config": {"s3_key": s3_key}},
        },
        "loggers": {"console": {"config": {"log_level": "INFO"}}},
    },
)
and I don’t have it baked into my job like I do above
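A RunRequest like this is yielded from a sensor evaluation function; a sketch of the surrounding sensor, where the sensor name and the get_new_s3_keys helper are hypothetical, and update_job is the job shown in Chris's later snippet:

from dagster import RunRequest, sensor


@sensor(job=update_job)  # job-targeting per the 0.12-era graph/job APIs
def s3_key_sensor(context):
    # get_new_s3_keys is a hypothetical helper returning keys that have
    # appeared since the last tick (tracked via the sensor cursor).
    for s3_key in get_new_s3_keys(cursor=context.cursor):
        yield RunRequest(
            run_key=s3_key,  # dedupes: one run per unique key
            run_config={
                "ops": {
                    "get_key": {"config": {"s3_key": s3_key}},
                },
                "loggers": {"console": {"config": {"log_level": "INFO"}}},
            },
        )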
owen
09/27/2021, 9:43 PM
is the logger_defs argument in the to_job call still {"console": json_console_logger}? also, is it possible to get a larger snippet of the output logs when you run this? finally, are you collecting these output logs via Datadog running on the Nomad machine(s), or doing something else to view the stdout/stderr logs?

Chris Chan
09/27/2021, 11:31 PMupdate_job = update.to_job(
logger_defs={
"console": json_console_logger
},
resource_defs={
...
owen
09/28/2021, 12:10 AM

Chris Chan
09/28/2021, 1:14 PM

owen
09/28/2021, 4:19 PM

Chris Chan
09/28/2021, 7:02 PM