Hi, how can I use the solid selection syntax to qu...
# announcements
r
Hi, how can I use the solid selection syntax to query for a solid inside a composite? I’ve tried `composite_name.solid_name`, but this doesn’t seem to work. I’m trying this through the Dagit playground, but I’d like it to run through `execute_pipeline`.
s
I believe this isn’t supported in the solid selection syntax today, but is planned to be supported soon. @sandy or @yuhan can confirm
r
@sandy, @yuhan Just following up on this - any update on when this should be available? It would be really helpful to be able to query solids inside composites - at the moment I’m not really able to use composites to organise the pipeline, because I need to execute a subset of each of them at run-time. It would also be great if the ancestor dependency analysis was sensitive to the internal dependencies of composites (e.g. if a composite has multiple outputs, with fairly independent paths through the composite, and I query the first output and its ancestors, then only the necessary parts of the composite are run, rather than the whole thing). Another suggestion (although maybe a big shift in design/thinking) for the selection syntax would be for it to be possible to select Intermediates/Solid outputs/IO items. This would be akin to asking “I’d like this piece of data produced, please run all the necessary steps to create it”.
s
Apologies for missing your earlier comment @Richard Fisher. Selecting subsets inside composites is a very reasonable request. @max started some work on it here a few months ago: https://dagster.phacility.com/D2879. Unfortunately, I believe it's in a bit of a limbo state, so I don't have a date for you on when it will be completed. I filed a GitHub issue to track it: https://github.com/dagster-io/dagster/issues/3557
r
@sandy, great, thanks 👍
s
The desire to execute everything upstream of particular intermediates makes a lot of sense to me. That said, I assume the syntax would look something like `*solid_name.output_name`, and we already support `*solid_name`. Are you imagining that those two would do something different? E.g. that we'd omit saving the other outputs of that solid?
r
The only case I can think of where I’d want different behaviour when querying `*solid_name.output_name` vs `*solid_name` is if the solid is a composite solid - then I would only want the upstream dependencies of `output_name` to be run (there may be a ‘sub-graph’ in the composite which produces other outputs of the composite - I wouldn’t want those solids to run). This is essentially the same use case as being able to query solids inside composites, although with a more friendly query syntax.
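To make the difference concrete, here is a rough sketch of the behaviour being asked for. It is a toy model in plain Python, not Dagster's selection code and not an existing Dagster API: the graph `UPSTREAM`, the composite `prepare` with inner solids `load`, `clean` and `stats`, its outputs `cleaned` and `summary`, and the helpers `ancestors` and `select` are all hypothetical names made up for illustration.

```python
from collections import deque

# Hypothetical dependency graph: each key lists what it depends on.
# "prepare" stands in for a composite with two fairly independent paths:
#   load -> clean -> prepare.cleaned
#   load -> stats -> prepare.summary
UPSTREAM = {
    "load": [],
    "clean": ["load"],
    "stats": ["load"],
    "prepare.cleaned": ["clean"],
    "prepare.summary": ["stats"],
}


def ancestors(target, upstream):
    """Return the target plus everything upstream of it (breadth-first)."""
    seen = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dep in upstream[node]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def select(query, upstream):
    """Resolve a '*name' query in this toy model.

    'name' may be a single node (e.g. 'prepare.cleaned') or a composite
    prefix (e.g. 'prepare'), which matches every output of that composite.
    """
    if not query.startswith("*"):
        raise ValueError("this sketch only handles '*name' queries")
    name = query[1:]
    targets = [n for n in upstream if n == name or n.startswith(name + ".")]
    if not targets:
        raise KeyError(name)
    selected = set()
    for target in targets:
        selected |= ancestors(target, upstream)
    return selected


if __name__ == "__main__":
    # Only the path that feeds the requested output is selected.
    print(sorted(select("*prepare.cleaned", UPSTREAM)))
    # ['clean', 'load', 'prepare.cleaned']

    # Selecting the whole composite pulls in both paths.
    print(sorted(select("*prepare", UPSTREAM)))
    # ['clean', 'load', 'prepare.cleaned', 'prepare.summary', 'stats']
```

In this sketch `*prepare.cleaned` skips `stats` entirely, which is the "only run the necessary part of the composite" behaviour, while `*prepare` behaves like today's `*solid_name` and runs everything inside.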
s
Ah, that makes sense