Clement Emmanuel
05/24/2023, 4:01 PM
SELECT
event_logs.id,
event_logs.event
FROM
event_logs
WHERE
event_logs.dagster_event_type = $1
ORDER BY
event_logs.id DESC
Which seems to be invoked by
context: MultiAssetSensorEvaluationContext
context.latest_materialization_records_by_partition_and_asset()
This query becomes very expensive as the event_logs table grows (which I believe it does indefinitely, as it's essentially an append-only table). It is expensive even with the appropriate indexes in place and used by the query plan.
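To make the growth concern concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for event_logs. It is not Dagster's implementation. The helper names (`build_event_log`, `fetch_all`, `fetch_since`), the index, and the idea of bounding the scan with an id cursor are all assumptions for illustration. The first query has the same shape as the one above and returns every matching row, while the second only reads rows newer than a remembered id.

```python
import sqlite3


def build_event_log(n_rows: int) -> sqlite3.Connection:
    """Create a toy event_logs table (hypothetical schema) with n_rows rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_logs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event TEXT, "
        "dagster_event_type TEXT)"
    )
    # Assumed index; the real table's indexes may differ.
    conn.execute(
        "CREATE INDEX idx_event_type ON event_logs (dagster_event_type, id)"
    )
    conn.executemany(
        "INSERT INTO event_logs (event, dagster_event_type) VALUES (?, ?)",
        [(f"payload-{i}", "ASSET_MATERIALIZATION") for i in range(n_rows)],
    )
    return conn


def fetch_all(conn: sqlite3.Connection, event_type: str) -> list:
    """Same shape as the query in the message: no lower bound on id."""
    return conn.execute(
        "SELECT id, event FROM event_logs "
        "WHERE dagster_event_type = ? ORDER BY id DESC",
        (event_type,),
    ).fetchall()


def fetch_since(conn: sqlite3.Connection, event_type: str, after_id: int) -> list:
    """Hypothetical bounded variant: only rows newer than a stored cursor id."""
    return conn.execute(
        "SELECT id, event FROM event_logs "
        "WHERE dagster_event_type = ? AND id > ? ORDER BY id DESC",
        (event_type, after_id),
    ).fetchall()


conn = build_event_log(10_000)
unbounded = fetch_all(conn, "ASSET_MATERIALIZATION")
bounded = fetch_since(conn, "ASSET_MATERIALIZATION", after_id=9_990)
print(len(unbounded), len(bounded))  # prints "10000 10"
```

The unbounded result set grows with every materialization ever recorded, whereas the bounded one depends only on how many rows arrived since the cursor. Whether the sensor can actually be driven this way depends on Dagster internals that this sketch does not cover.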
Has there been any throughput testing on this pattern, or are there any ideas for how to optimize it? Unless I'm missing something, this seems to make the canonical use of multi-asset sensors non-viable even at a fairly modest scale: performance will only decay as materializations accumulate, until the sensor eventually (or immediately, if materializations already exist when the sensor is turned on) can't complete within the hard 60-second timeout.
Vitaly Markov
05/24/2023, 4:06 PM
Clement Emmanuel
05/24/2023, 4:11 PM