How can I stop dagster from logging all { "__class...
# ask-community
j
How can I stop dagster from logging all { "__class__": "DagsterEvent", ... } into the log? After each run I have a lot of JSON logs of the DagsterEvent type. I have set
Copy code
python_logs:
  python_log_level: INFO
but it doesn't help. Is there a possibility to get rid of those logs, as they are saved into the postgres event_log_storage?
Does anyone have the same problem, where all events are logged as JSON into the logs?
d
Hi Jackub - I wouldn't expect these lines to end up in postgres unless you have some custom logger config set up to capture certain loggers (I could imagine seeing them in pod logs or the raw compute logs, but getting stored in postgres is surprising unless there's something unique happening with your setup). Are you possibly using this feature? https://docs.dagster.io/concepts/logging/python-logging#capturing-python-logs - Would you maybe be able to pass along a debug file for a run where you're seeing this? There's a 'download debug file' option on the page for each run in the upper right:
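For reference, that capture feature is configured under python_logs in dagster.yaml - a minimal sketch along the lines of that docs page (my_logger is a placeholder name):
Copy code
python_logs:
  python_log_level: INFO
  # Any logger named here is captured and stored as Dagster events
  managed_python_loggers:
    - my_logger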
j
We are using event_log_storage as postgres, and additionally we can see logs in CloudWatch. And the problem is that we can see in CloudWatch all the JSON DagsterEvents, the same as in the debug file, which I can't share here.
I tried setting the log level to INFO, but those JSONs are still visible; only the normal logs were filtered. One strange thing is that those JSONs are written to the log after all the other logs. The last normal log is: 2023-09-13 08:01:07 +0000 - dagster - DEBUG - {job} - c70c5d36-6690-4bbc-8fca-6a4d9bbd7388 - 1 - RUN_SUCCESS - Finished execution of run for "{job}". and then all the JSONs are written to the logs, as if dagster were running in some debug mode...
d
I just want to confirm - you’re certain that these json logs are showing up as rows in the postgres db specifically? (As opposed to all the other places where events appear, like cloudwatch / pod logs / etc.)? Could you share a sample row from your postgres db that isn’t confidential?
j
They are stored in the event_logs table, here is a sample:
Copy code
8318721,e8a15a25-4614-4f83-acc6-ee246443d5b7, "{""__class__"": ""EventLogEntry"", ""dagster_event"": {""__class__"": ""DagsterEvent"", ""event_specific_data"": null, ""event_type_value"": ""PIPELINE_SUCCESS"", ""logging_tags"": {}, ""message"": ""Finished execution of run for \""{job}\""."", ""pid"": 1, ""pipeline_name"": ""{job}"", ""solid_handle"": null, ""step_handle"": null, ""step_key"": null, ""step_kind_value"": null}, ""error_info"": null, ""level"": 10, ""message"": """", ""pipeline_name"": ""{job}"", ""run_id"": ""e8a15a25-4614-4f83-acc6-ee246443d5b7"", ""step_key"": null, ""timestamp"": 1694601702.4474347, ""user_message"": ""Finished execution of run for \""{job}\"".""}",PIPELINE_SUCCESS,2023-09-13 10:41:42.447435,,,
but the main problem is that they are in CloudWatch
d
Are you able to share your dagster.yaml (or values.yaml if you're using the helm chart)?
Er wait, sorry - that row you sent me is the expected format of the event_logs table (unless you are seeing two rows for each event or something). Although I'm surprised to see the message repeated twice there in the row. I can ask about the CloudWatch part though
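If you want to sanity-check what's actually stored for a run, here is a rough sketch using the Python API (this assumes DAGSTER_HOME points at the same postgres-backed instance; the run id is the one from your sample row):
Copy code
from dagster import DagsterInstance

# Load the instance configured by $DAGSTER_HOME/dagster.yaml
instance = DagsterInstance.get()

# Fetch every event_logs row stored for a single run
records = instance.all_logs("e8a15a25-4614-4f83-acc6-ee246443d5b7")
print(f"{len(records)} rows stored for this run")

for record in records:
    event_type = record.dagster_event.event_type_value if record.dagster_event else None
    print(event_type, record.user_message)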
I think I misread your original post to say that the extra json logs are also incorrectly appearing in postgres - but re-reading it, that is not what you were saying
j
In postgres it looks fine, it's just an issue with CloudWatch
I wasn't expecting whole JSONs there
as the same information is provided by the logs:
2023-09-13 08:01:07 +0000 - dagster - DEBUG - vector_events - c70c5d36-6690-4bbc-8fca-6a4d9bbd7388 - 1 - RUN_SUCCESS - Finished execution of run for "{job}".
and then I see:
Copy code
{
  "__class__": "DagsterEvent",
  "event_specific_data": null,
  "event_type_value": "PIPELINE_SUCCESS",
  "logging_tags": {},
  "message": "Finished execution of run for \"vector_events\".",
  "pid": 1,
  "pipeline_name": "{job}",
  "solid_handle": null,
  "step_handle": null,
  "step_key": null,
  "step_kind_value": null
}
so the logs look exactly the same as in dagit/the web server, but after RUN_SUCCESS all these DagsterEvent JSONs are printed
dagster.yaml
Copy code
scheduler:
  module: dagster.core.scheduler
  class: DagsterDaemonScheduler

run_coordinator:
  module: dagster.core.run_coordinator
  class: QueuedRunCoordinator
  config:
    max_concurrent_runs: 2
    tag_concurrency_limits:
      - key: "dagster/backfill"
        limit: 1

run_launcher:
  module: dagster_aws.ecs
  class: EcsRunLauncher
  config:
    include_sidecars: true
    task_definition:
      env: JOB_TASK_DEF_ARN
    use_current_ecs_task_config: False
    run_task_kwargs:
      cluster:
        env: ECS_CLUSTER
      launchType: "EC2"


run_monitoring:
  enabled: true
  start_timeout_seconds: 300 # ECS runs can take a long time to start (~80 seconds is normal)
  max_resume_run_attempts: 0
  poll_interval_seconds: 120

run_retries:
  enabled: true
  max_retries: 2

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432

schedule_storage:
  module: dagster_postgres.schedule_storage
  class: PostgresScheduleStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432

event_log_storage:
  module: dagster_postgres.event_log
  class: PostgresEventLogStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432
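(For what it's worth, on recent Dagster versions the three storage sections above can also be collapsed into a single storage key - a sketch with the same env vars, unrelated to the logging issue:)
Copy code
storage:
  postgres:
    postgres_db:
      hostname:
        env: DAGSTER_POSTGRES_HOSTNAME
      username:
        env: DAGSTER_POSTGRES_USER
      password:
        env: DAGSTER_POSTGRES_PASSWORD
      db_name:
        env: DAGSTER_POSTGRES_DB
      port: 5432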
d
I think there is a PR already out for the CloudWatch piece that we just need to finish testing and land: https://github.com/dagster-io/dagster/pull/12672 - I can take a look at that
j
from the description it sounds like my problem
d
I'll get that fixed up so that we can land it - what version of dagster was that postgres row you posted from? I thought we fixed the issue where user_message was duplicated in the row
That user_message fix would have happened way back in 0.14.x I think
OK, that fix for the JSON logging just landed - it didn't quite make this week's release, but we'll be able to get it out next week
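In the meantime, if the JSON lines in CloudWatch are painful, one possible stopgap is a stdlib logging filter on whatever handlers feed stdout - an untested sketch, not a Dagster API, and it assumes those blobs actually pass through Python logging rather than being printed directly:
Copy code
import logging

class DropSerializedDagsterEvents(logging.Filter):
    """Drop records whose rendered message is a serialized DagsterEvent blob."""

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage().lstrip()
        # Match both single-line and pretty-printed JSON starting with "__class__"
        return not (msg.startswith("{") and '"__class__"' in msg[:40])

# Attach the filter to every handler on the root logger so the JSON
# lines never reach stdout (and therefore never reach CloudWatch)
for handler in logging.getLogger().handlers:
    handler.addFilter(DropSerializedDagsterEvents())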