sean12/09/2020, 3:55 AM
) as solids. Several of my notebook solids include plotting code that allows visualization of intermediate results. However, rendering these plots can be computationally expensive.
I want to be able to execute the pipeline in two different ways: (1) with execution of plotting code, when debugging/inspecting; (2) without execution of plotting code, when I just want the final results.
The most obvious way to do this is to put the plotting code under a conditional that reads some parameter. To me, the most natural kind of parameter to use is what Dagster calls a [mode](https://docs.dagster.io/overview/modes-resources-presets/modes-resources). I prefer this to a solid configuration parameter because (a) my configuration parameters concern the values I'm computing, whereas whether I pre-compute visualizations is a kind of meta-parameter; (b) the visualization parameter should be shared across all solids. Really it is a kind of logging configuration, but I don't think that I can use Dagster's built-in logging API because I am trying to control execution of notebook cells.
So I'd like to use separate modes to toggle this visualization behavior. My problem is that, from what I can tell from the documentation, modes simply configure resource/logger/storage/executor keys; I can't figure out how to read the name of the mode itself from the context during execution. Is this possible?
More generally, is there some more appropriate Dagster abstraction I should be using to control this behavior?
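To make the behavior I'm after concrete, here is a minimal sketch in plain Python. It deliberately does not use Dagster's API: `VizSettings`, `MODES`, the mode names `debug` and `batch`, `run_step`, and `run_pipeline` are all hypothetical names invented for illustration. The idea is that each mode supplies one shared setting, and every step consults that setting instead of the mode's name.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class VizSettings:
    """Hypothetical shared setting, standing in for something a mode would supply."""
    render_plots: bool


# Hypothetical mode names, each mapped to the shared setting it provides.
MODES: Dict[str, VizSettings] = {
    "debug": VizSettings(render_plots=True),
    "batch": VizSettings(render_plots=False),
}


def run_step(values: List[int], viz: VizSettings,
             plot: Callable[[List[int]], None]) -> int:
    """Compute a result; run the (expensive) plotting hook only when enabled."""
    total = sum(values)
    if viz.render_plots:
        plot(values)
    return total


def run_pipeline(mode_name: str,
                 plot: Callable[[List[int]], None]) -> List[int]:
    """Look up the setting once per run and share it across all steps."""
    viz = MODES[mode_name]
    inputs = [[1, 2, 3], [4, 5]]
    return [run_step(values, viz, plot) for values in inputs]
```

Both modes return the same results; only the `debug` mode calls the plotting hook. The steps never need to know the mode's name, only the flag it carries.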
Finally, it occurred to me while thinking through this that it would be nice if dagstermill itself were capable of varying its notebook cell execution dependent on cell metadata. That way, the visualizations I described above could be sequestered in specially tagged cells, and the notebook execution engine could conditionally execute them depending on the mode. I'm pretty sure that Papermill supports this functionality.
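A rough sketch of the cell-filtering idea, operating directly on a notebook's JSON structure (a dict with a `cells` list, where each cell may carry `metadata.tags`). This is not dagstermill or Papermill code; the function name `drop_tagged_cells` and the tag `viz` are assumptions made up for the example.

```python
import copy
from typing import Any, Dict


def drop_tagged_cells(notebook: Dict[str, Any], tag: str) -> Dict[str, Any]:
    """Return a copy of the notebook without the cells that carry `tag`."""
    result = copy.deepcopy(notebook)
    result["cells"] = [
        cell
        for cell in result["cells"]
        # Cells without metadata or without tags are always kept.
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return result


# Tiny hand-written notebook: one compute cell, one tagged plotting cell.
example_notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {}, "source": "x = compute()"},
        {"cell_type": "code", "metadata": {"tags": ["viz"]}, "source": "plot(x)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

without_plots = drop_tagged_cells(example_notebook, "viz")
```

An execution engine could apply a filter like this before running the notebook whenever visualizations are switched off, leaving the original notebook file untouched.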
max12/09/2020, 4:14 AM
sean12/09/2020, 4:33 AM
max12/09/2020, 6:42 AM