Jori Geysen
10/29/2021, 12:46 PMdagster.yaml
file:
1. When I want to save all logs to a file:
python_logs:
managed_python_loggers:
- my_logger
dagster_handler_config:
handlers:
console:
class: logging.FileHandler
level: DEBUG
stream: "logfile.log"
This works great, both the logs from mylogger
and the dagster logs from the user code are saved to logfile.log
, sitting in the gRPC container.
2. When I want to view the logs in terminal by running docker logs
on the gRPC container (without saving to file first):
python_logs:
managed_python_loggers:
- my_logger
dagster_handler_config:
handlers:
console:
class: logging.StreamHandler
level: DEBUG
stream: <ext://sys.stdout>
This does not work however. When running docker logs
for the gRPC container, only the the dagster logs from the user code logs show up. The logs emitted by mylogger
do however show up on the dagit UI.
=> where are the 'standard' python logs direct to when the StreamHandler is used in combination with <ext://sys.stdout>
?
Thanks a lot for your help in advance, have a nice day !owen
10/29/2021, 4:28 PMpipe - 811187ff-05f1-4799-af9f-5f7ef9ae723c - 78999 - my_thing - STEP_SUCCESS - Finished execution of step "my_thing" in 66ms.
or 2021-10-29 09:18:39 - dagster - DEBUG - pipe - 811187ff-05f1-4799-af9f-5f7ef9ae723c - 78999 - my_thing - STEP_SUCCESS - Finished execution of step "my_thing" in 66ms.
(main difference is the timestamp at the front, as well as the name of the logger / log level in the second case)
Theoretically, you shouldn’t need the handler config at all for use case 2, because dagster should already send all the logs to stdout by default (and when you set my_logger to be a managed python logger, those logs should get sent to stdout in the same way). I ask about the formatting because the first format is what you’d see if those lines were coming from the handler that you defined, while the second format is what you get from the default dagster configuration.Jori Geysen
11/01/2021, 12:53 PMmylogger
when running docker logs
.
• The outputs for dagster look like this: 2021-11-01 12:16:42,156 - DEBUG - dagster - log_manager - test_hello_world - 87da9bee-1423-475b-b1ec-af582fa1a77e - 82 - hello_world - STEP_START - Started execution of step "hello_world".
I've tried running the following config, as you suggested (so leaving out the handler config):
python_logs:
managed_python_loggers:
- mylogger
• When I run a solid which executes the following:
my_logger = logging.getLogger("mylogger")
my_logger.error("Hello World!")
The dagster UI behaves as expected and a nice red error message pops up. However docker logs
looks like this:
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - ENGINE_EVENT - Executing steps in process (pid: 68)
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - ENGINE_EVENT - Finished steps in process (pid: 68) in 251ms
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
Which doesn't show any sign of the "Hello World!" error message.
However! after a lot of tinkering, the following config ends up working for me (where /proc/1/fd/1
is the linked location of stdout in a docker container, linked to by /dev/stdout
):
python_logs:
managed_python_loggers:
- mylogger
dagster_handler_config:
handlers:
myHandler:
class: logging.FileHandler
level: INFO
filename: "/proc/1/fd/1"
When running the same hello_world solid as above, the UI once again shows a nice red error message returning "Hello World!", as expected. However, docker logs
now looks like this:
2021-11-01 13:01:41,565 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 13:01:41,596 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Executing steps in process (pid: 70)
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Executing steps in process (pid: 70)
2021-11-01 13:01:41,616 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 13:01:41,629 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 13:01:41,753 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - LOGS_CAPTURED - Started capturing logs for solid: hello_world.
2021-11-01 13:01:41,772 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_START - Started execution of step "hello_world".
2021-11-01 13:01:41,791 - ERROR - mylogger - repo - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - hello_world - Hello World!
2021-11-01 13:01:41,814 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_OUTPUT - Yielded output "result" of type "Any". (Type check passed).
2021-11-01 13:01:41,841 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - HANDLED_OUTPUT - Handled output "result" using IO manager "io_manager"
2021-11-01 13:01:41,858 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_SUCCESS - Finished execution of step "hello_world" in 67ms.
2021-11-01 13:01:41,889 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Finished steps in process (pid: 70) in 281ms
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Finished steps in process (pid: 70) in 281ms
2021-11-01 13:01:41,905 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
Which clearly return the "Hello World!" error, which is what I'm looking for. However, there's now some duplication of logs.
Any idea what I'm doing wrong here? Thanks again for looking into my case, really! 🙂owen
11/01/2021, 4:18 PMdocker logs
output. Dagster has 'ENGINE_EVENT' type dagster logs are duplicated with your most recent setup because those happen to be fired from the process with PID=1 in your docker container. step events (such as STEP_STARTED), are fired from a different process, and so even though they are treated identically by dagster (meaning they get the same handlers attached to them), they are treated differently by Docker because they are in a different process. If you want all dagster logs to make it to stdout from inside the docker container, your solution of sending them to /proc/1/fd/1
with a handler is probably the best way of doing it. Overall, how big of a deal is the duplication for your workflow? Generally, this should only impact the starts/ends of the log output (because the engine events are just output when the run starts/ends, for the most part). There's not an easy way to disable this default system logging, but you could probably get the precise output that you want by attaching a Dagster Logger that outputs /projc/1/fd/1 to your pipeline/job and removing the handler config.Jori Geysen
11/16/2021, 10:33 AMowen
11/16/2021, 5:17 PM