Chet Lemon

04/27/2020, 7:54 PM
Hello! Our org has decided to build around dagster for our data pipeline needs, and we are excited to be a part of the community. While creating some example pipelines for our users, we were a bit confused about the best practice for structuring our `ExpectationResult` objects. As an example, we have a composite solid that verifies that a number of values are within acceptable thresholds. Would it be preferred to do:
1. many `ExpectationResult` objects per solid, each with one `JsonMetadataEntryData` metadata entry
2. one `ExpectationResult` object per solid, with many `JsonMetadataEntryData` metadata entries
While trying to decide, we were also unsure about future uses of the different children of `EventMetadataEntry` (path, url, json, python attribute, markdown, text). Are there any set plans for using these? I think knowing that could perhaps help us better understand

prha

04/27/2020, 8:53 PM
For `EventMetadataEntry`, most of the types exist mainly to allow for slightly different UI in the run logs.
For `ExpectationResult` granularity, it’s just a matter of exposing the significance of a success/failure. We can track the number of successes/failures per solid and generate views on that, so it depends on what will be most useful to you operationally.
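The two layouts from the question can be sketched roughly as follows. This is only an illustration, not a recommended pattern: the threshold names and bounds are made up, `check_thresholds`, `many_expectations`, and `single_expectation` are hypothetical helpers, and the `dagster` calls assume the `ExpectationResult` / `EventMetadataEntry.json(data, label)` API as it existed around the time of this thread.

```python
# Hypothetical thresholds, for illustration only: name -> (low, high).
THRESHOLDS = {"row_count": (1, 1000), "null_ratio": (0.0, 0.05)}


def check_thresholds(values, thresholds=THRESHOLDS):
    """Return one plain dict per checked value, including whether it passed."""
    results = []
    for name, (low, high) in thresholds.items():
        value = values[name]
        results.append(
            {
                "name": name,
                "value": value,
                "low": low,
                "high": high,
                "passed": low <= value <= high,
            }
        )
    return results


def many_expectations(values):
    """Option 1: one ExpectationResult per check, each with one JSON entry."""
    # Imported lazily so check_thresholds can be used without dagster installed.
    from dagster import EventMetadataEntry, ExpectationResult

    for check in check_thresholds(values):
        yield ExpectationResult(
            success=check["passed"],
            label=check["name"],
            metadata_entries=[EventMetadataEntry.json(check, label="details")],
        )


def single_expectation(values):
    """Option 2: one ExpectationResult overall, with one JSON entry per check."""
    from dagster import EventMetadataEntry, ExpectationResult

    checks = check_thresholds(values)
    yield ExpectationResult(
        success=all(c["passed"] for c in checks),
        label="thresholds",
        metadata_entries=[
            EventMetadataEntry.json(c, label=c["name"]) for c in checks
        ],
    )
```

Inside a solid's compute function, either generator would be consumed with something like `yield from many_expectations(values)` before yielding the solid's `Output`. Option 1 surfaces each check as its own success or failure event, while option 2 reports a single pass/fail per solid and keeps the per-check detail in the metadata.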

Chet Lemon

04/27/2020, 10:49 PM
Ah okay, understood. Good to know there isn't something in the works that we're diverging from, thanks. Could you provide more color on this aggregate success/failure per solid feature? Is it released now?

prha

04/28/2020, 12:09 AM
We aggregate all the `ExpectationResult`s at the pipeline level in the runs view, but we’re starting to play around with per-solid aggregation in a cross-run view for partitioned pipelines. You can try this experimental feature for partitioned pipelines on the `Schedules` tab.