# ask-community
d
@chris We've been continuing our memoization work. All is mostly good except for the DagsterNoStepsToExecute error. Is it intentional that an exception is raised if there is nothing to do? Am I calling the pipeline incorrectly? I end up having to wrap a bunch of our execution processes in try blocks like below: It makes me wonder how to access the output from a previously run memoized pipeline.
c
Yea, this is the (currently) correct behavior. If there are no steps to execute, then this error will be thrown. Essentially dagster isn't designed to run with no steps, and so to prevent more esoteric failures later on, we just fail fast here.
We don't currently have a super convenient API to access run results to my knowledge, and this is a known feature gap. In combination with the no step keys to execute error though, I'm seeing that it makes for a bit of an incongruency.
One potential workaround is directly instantiating your IO manager, and using `build_input_context`/`build_output_context` to build the contexts you pass to `load_input` and `handle_output` respectively.
d
hmm. That's interesting. I'm glad to hear I interpreted it correctly. Do you anticipate this making its way onto the roadmap at some point? It also seems like I could put a dummy step into the pipeline that would never get saved.
c
that seems like a good workaround for now. A better solution is definitely on the roadmap, whether that is a better output-retrieval API or being able to execute without any steps, or both
d
I realized this is a bit more frustrating than I originally thought. I've been playing around with it more and it seems like if something is loaded from a previous run, I can't access it as output at all. Is that also correct? The use case is that if I rerun a pipeline, I cannot access the previous output for a step. Is there something I can do to help make that possible?
I also looked through the instantiation docs but I'm not sure how to get that to work. It seems like I'm also missing the version information when I create that context.
I can get around the limitation by creating a dummy solid that calls that output. As long as it isn't saved, I can access the output.
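To make the reasoning behind the dummy-step workaround concrete, here is a toy model in plain Python. It is not Dagster code and does not reflect Dagster's internals; `run_memoized`, `NoStepsToExecute`, and the step tuples are all hypothetical names invented for this sketch. It only mimics the behavior described in this thread: memoized steps are skipped, skipped steps do not appear in the run result, and a run with nothing left to execute fails.

```python
class NoStepsToExecute(Exception):
    """Stand-in for the error discussed above; not Dagster's class."""


def run_memoized(steps, store):
    """Run steps in order, skipping memoized ones already in `store`.

    Each step is (name, version, fn, memoize); fn receives a dict of the
    values produced or loaded by earlier steps.
    """
    pending = [s for s in steps if not (s[3] and (s[0], s[1]) in store)]
    if not pending:
        # Fail fast, as in the thread: there is nothing to execute.
        raise NoStepsToExecute("every step is already memoized")

    results = {}    # outputs of steps that actually ran
    available = {}  # values visible to downstream steps
    for name, version, fn, memoize in steps:
        key = (name, version)
        if memoize and key in store:
            available[name] = store[key]  # loaded, so not part of results
            continue
        value = fn(available)
        available[name] = value
        results[name] = value
        if memoize:
            store[key] = value
    return results


store = {}
make_df = ("make_df", "v1", lambda inputs: [1, 2, 3], True)

first = run_memoized([make_df], store)  # computes and saves

try:
    run_memoized([make_df], store)      # nothing left to do
    rerun_failed = False
except NoStepsToExecute:
    rerun_failed = True

# Dummy step: never memoized, simply passes the loaded value through.
expose_df = ("expose_df", None, lambda inputs: inputs["make_df"], False)
second = run_memoized([make_df, expose_df], store)
```

In this model the dummy step does two jobs at once: it guarantees there is always at least one step to execute, and it surfaces the memoized value in the run result.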
c
Hmm. Are you saying that from the result object returned by `execute_pipeline`, you can't access outputs memoized from previous runs?
I'll investigate solutions around this space. I think both of these problems should be fixable: not failing if we have no steps to execute, and retrieving outputs from previously memoized runs.
d
yes. that's the conclusion I've come to. I've ended up creating a dummy step that won't be memoized. That allows us to basically point back to the memoized dataframe. Happy to share whatever I can to help.
we fully realize this is experimental. It's a fantastic start.