
Jori Geysen

10/29/2021, 12:46 PM
Hi all, I'm looking to include standard python logs with the dagit logs. I've gone through the documentation for dagster logging. My overall implementation details are the following:
• dagster 0.13.1
• multi-container docker deployment as described here
  ◦ including a dagster-daemon container
  ◦ including a dagster gRPC service for user code
I came up with two configurations for the dagster.yaml file:
1. When I want to save all logs to a file:
python_logs:
  managed_python_loggers:
    - my_logger
  dagster_handler_config:
    handlers:
      console:
          class: logging.FileHandler
          level: DEBUG
          stream: "logfile.log"
This works great: both the logs from my_logger and the dagster logs from the user code are saved to logfile.log, sitting in the gRPC container.
2. When I want to view the logs in the terminal by running docker logs on the gRPC container (without saving to a file first):
python_logs:
  managed_python_loggers:
    - my_logger
  dagster_handler_config:
    handlers:
      console:
          class: logging.StreamHandler
          level: DEBUG
          stream: "ext://sys.stdout"
This does not work, however. When running docker logs for the gRPC container, only the dagster logs from the user code show up. The logs emitted by my_logger do, however, show up in the dagit UI. => Where are the 'standard' python logs directed to when the StreamHandler is used in combination with ext://sys.stdout?
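For context, my_logger is just a standard python logger used inside a solid, roughly like the sketch below (placeholder names, 0.13-style @solid/@pipeline API; only meant to illustrate the setup):
import logging

from dagster import pipeline, solid

# "my_logger" matches the name listed under managed_python_loggers above,
# so its records should also be picked up by dagster's log handlers
my_logger = logging.getLogger("my_logger")

@solid
def emit_a_log(context):
    # plain python logging call, no context.log involved
    my_logger.error("Hello World!")

@pipeline
def my_pipeline():
    emit_a_log()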
Thanks a lot in advance for your help, have a nice day!

owen

10/29/2021, 4:28 PM
hi @Jori Geysen! I wasn't immediately able to replicate this (still poking around at it). Just to confirm a few things:
• for case 2, you're missing log events from my_logger when you run docker logs (and nothing else)?
• do the log outputs that you do see look like:
pipe - 811187ff-05f1-4799-af9f-5f7ef9ae723c - 78999 - my_thing - STEP_SUCCESS - Finished execution of step "my_thing" in 66ms.
or
2021-10-29 09:18:39 - dagster - DEBUG - pipe - 811187ff-05f1-4799-af9f-5f7ef9ae723c - 78999 - my_thing - STEP_SUCCESS - Finished execution of step "my_thing" in 66ms.
(main difference is the timestamp at the front, as well as the name of the logger / log level in the second case) Theoretically, you shouldn’t need the handler config at all for use case 2, because dagster should already send all the logs to stdout by default (and when you set my_logger to be a managed python logger, those logs should get sent to stdout in the same way). I ask about the formatting because the first format is what you’d see if those lines were coming from the handler that you defined, while the second format is what you get from the default dagster configuration.
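For reference, the dagster_handler_config block follows the same conventions as the standard library's logging.config.dictConfig, which is where the ext://sys.stdout syntax comes from. A plain-stdlib sketch of that mechanism (illustration only, not dagster's internals):
import logging
import logging.config

# the same handler block fed to dictConfig; "ext://sys.stdout" is resolved
# to the actual sys.stdout object by the logging machinery
logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "stream": "ext://sys.stdout",
        },
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
})

logging.getLogger("my_logger").error("Hello World!")  # ends up on stdout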

Jori Geysen

11/01/2021, 12:53 PM
Hi @owen, thanks a lot for your efforts, this is great! 🙂
• Yes, I'm missing the log events from mylogger when running docker logs.
• The outputs for dagster look like this:
2021-11-01 12:16:42,156 - DEBUG - dagster - log_manager - test_hello_world - 87da9bee-1423-475b-b1ec-af582fa1a77e - 82 - hello_world - STEP_START - Started execution of step "hello_world".
I've tried running the following config, as you suggested (so leaving out the handler config):
python_logs:
  managed_python_loggers:
    - mylogger
• When I run a solid which executes the following:
import logging

my_logger = logging.getLogger("mylogger")
my_logger.error("Hello World!")
The dagster UI behaves as expected and a nice red error message pops up. However, docker logs looks like this:
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - ENGINE_EVENT - Executing steps in process (pid: 68)
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - ENGINE_EVENT - Finished steps in process (pid: 68) in 251ms
2021-11-01 12:55:17 - dagster - DEBUG - test_hello_world - 4ce7041c-a8b7-491e-83c1-4252f2ad9936 - 68 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
Which doesn't show any sign of the "Hello World!" error message. However, after a lot of tinkering, the following config ends up working for me (where /proc/1/fd/1 is the stdout of PID 1 inside the docker container, i.e. the same stream /dev/stdout resolves to for that process):
python_logs:
  managed_python_loggers:
    - mylogger
  dagster_handler_config:
    handlers:
      myHandler:
        class: logging.FileHandler
        level: INFO
        filename: "/proc/1/fd/1"
When running the same hello_world solid as above, the UI once again shows a nice red error message returning "Hello World!", as expected. However, docker logs now looks like this:
2021-11-01 13:01:41,565 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_START - Started execution of run for "test_hello_world".
2021-11-01 13:01:41,596 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Executing steps in process (pid: 70)
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Executing steps in process (pid: 70)
2021-11-01 13:01:41,616 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Starting initialization of resources [io_manager].
2021-11-01 13:01:41,629 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - ENGINE_EVENT - Finished initialization of resources [io_manager].
2021-11-01 13:01:41,753 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - LOGS_CAPTURED - Started capturing logs for solid: hello_world.
2021-11-01 13:01:41,772 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_START - Started execution of step "hello_world".
2021-11-01 13:01:41,791 - ERROR - mylogger - repo - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - hello_world - Hello World!
2021-11-01 13:01:41,814 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_OUTPUT - Yielded output "result" of type "Any". (Type check passed).
2021-11-01 13:01:41,841 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - HANDLED_OUTPUT - Handled output "result" using IO manager "io_manager"
2021-11-01 13:01:41,858 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - hello_world - STEP_SUCCESS - Finished execution of step "hello_world" in 67ms.
2021-11-01 13:01:41,889 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Finished steps in process (pid: 70) in 281ms
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - ENGINE_EVENT - Finished steps in process (pid: 70) in 281ms
2021-11-01 13:01:41,905 - DEBUG - dagster - log_manager - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
2021-11-01 13:01:41 - dagster - DEBUG - test_hello_world - 6fee397f-9ec2-4d13-94d6-c1bd489a35dd - 70 - RUN_SUCCESS - Finished execution of run for "test_hello_world".
This clearly returns the "Hello World!" error, which is what I'm looking for. However, there's now some duplication of logs. Any idea what I'm doing wrong here? Thanks again for looking into my case, really! 🙂
:ty-thankyou: 1

owen

11/01/2021, 4:18 PM
Hi again @Jori Geysen -- as far as I can tell, this seems to be a bug, as the expected behavior is that all the logs from managed_python_loggers will be handled essentially identically to the logs produced by the dagster framework. I'm going to try to reproduce what you're seeing to confirm. Thanks for reporting!
I was able to look into this a bit more, and I think this is more of an unfortunate weirdness of how docker interacts with dagster than a clear-cut bug. With docker, only stdout messages that come from the process with PID=1 actually show up in the docker logs output. The 'ENGINE_EVENT' type dagster logs are duplicated with your most recent setup because those happen to be fired from the process with PID=1 in your docker container. Step events (such as STEP_START) are fired from a different process, so even though they are treated identically by dagster (meaning they get the same handlers attached to them), docker treats them differently. If you want all dagster logs to make it to stdout from inside the docker container, your solution of sending them to /proc/1/fd/1 with a handler is probably the best way of doing it. Overall, how big of a deal is the duplication for your workflow? Generally, it should only affect the start and end of the log output (because the engine events are, for the most part, emitted when the run starts and ends). There's not an easy way to disable this default system logging, but you could probably get the precise output that you want by attaching a Dagster Logger that outputs to /proc/1/fd/1 to your pipeline/job and removing the handler config.
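Roughly what I have in mind, as an untested sketch (pipeline/ModeDefinition API, placeholder names and format string; note the logger still has to be selected in the run config, e.g. loggers: {pid1_console: {}}):
import logging

from dagster import ModeDefinition, logger, pipeline, solid

@logger(description="writes dagster event logs to PID 1's stdout so `docker logs` sees them")
def pid1_stdout_logger(init_context):
    # FileHandler pointed at PID 1's stdout inside the container
    handler = logging.FileHandler("/proc/1/fd/1")
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    # a fresh Logger instance so handlers don't pile up across runs
    custom_logger = logging.Logger("pid1_stdout", logging.DEBUG)
    custom_logger.addHandler(handler)
    return custom_logger

@solid
def hello_world(context):
    context.log.error("Hello World!")

@pipeline(mode_defs=[ModeDefinition(logger_defs={"pid1_console": pid1_stdout_logger})])
def test_hello_world():
    hello_world()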

Jori Geysen

11/16/2021, 10:33 AM
Hi @owen, sorry for my late response; I was off the grid the past week. Thank you for your elaborate response, that was very helpful. I must admit that the problem I was reporting here seems to have been a local glitch on my system. After doing some troubleshooting with colleagues, it turned out the missing-logs problem didn't occur on their machines, and after sufficiently nuking my docker images/containers/volumes, all was good on mine as well. Thanks again! 🙂

owen

11/16/2021, 5:17 PM
glad you got it working! 😄
:partydagster: 1