I hooked up OpenTelemetry trace context propagation in Dagster so that we can get traces of a job's `@op`s across multiple threads / CPUs. I'm considering putting in the effort to open-source it. Would folks be interested in that?
For example, you get traces like the one attached, where each op gets a span (a line in the waterfall), ops within a subgraph are grouped, and arbitrary traced functions internal to the computation can get spans as well. Compared with the Dagster job waterfall, you get more tracing detail plus the performance- and query-oriented UI of whichever OTel app you choose. Also attached is a heatmap from Honeycomb showing the max resident set size of ops in our production pipeline.
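For anyone unfamiliar with what "trace context propagation" means here, below is a rough, stdlib-only sketch of the general idea. It is not the actual integration and does not use Dagster or the OTel SDK. The parent span's IDs are serialized into a string carrier using the W3C `traceparent` format, and each op rebuilds its parent from that carrier wherever it happens to run. All names (`new_span`, `inject`, `extract`, `run_op`, `run_job`) are made up for illustration. In the OpenTelemetry Python API, `opentelemetry.propagate.inject` and `extract` play the corresponding roles, and spans come from a real tracer.

```python
import re
import secrets
from concurrent.futures import ThreadPoolExecutor

# W3C Trace Context header layout: version-traceid-parentid-flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span(name, parent=None):
    """Create a toy span; reuse the parent's trace id when a parent is given."""
    return {
        "name": name,
        "trace_id": parent["trace_id"] if parent else secrets.token_hex(16),
        "span_id": secrets.token_hex(8),
        "parent_id": parent["span_id"] if parent else None,
    }


def inject(span, carrier):
    """Write the span's context into a dict of strings (the carrier)."""
    carrier["traceparent"] = f"00-{span['trace_id']}-{span['span_id']}-01"


def extract(carrier):
    """Rebuild the parent context from a carrier; None if missing or malformed."""
    match = _TRACEPARENT.match(carrier.get("traceparent", ""))
    if match is None:
        return None
    return {"trace_id": match.group(1), "span_id": match.group(2)}


def run_op(op_name, carrier):
    """Hypothetical op runner: it only sees the carrier, not the parent span."""
    parent = extract(carrier)
    span = new_span(op_name, parent)
    # ... the op's compute function would run here, inside the span ...
    return span


def run_job(op_names):
    """Hypothetical job runner: one job span, one child span per op."""
    job_span = new_span("job")
    carrier = {}
    inject(job_span, carrier)
    # Threads are used to keep the sketch short. The carrier holds only
    # strings, so it could equally be passed to another process.
    with ThreadPoolExecutor(max_workers=2) as pool:
        op_spans = list(pool.map(lambda name: run_op(name, carrier), op_names))
    return job_span, op_spans


if __name__ == "__main__":
    job, ops = run_job(["extract", "transform", "load"])
    for span in ops:
        print(span["name"], span["trace_id"] == job["trace_id"])
```

Because every op span carries the job's trace ID and points at the job span as its parent, a tracing UI can stitch them into a single waterfall even when the ops ran in different workers.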