Keval
12/08/2020, 11:31 AM
Andy H
12/08/2020, 5:42 PM
Paul Wyatt
12/08/2020, 6:06 PM
Object IPCErrorMessage(serializable_error_info=SerializableErrorInfo(message="dagster.check.CheckError: Invariant failed. Description: Can only serialize whitelisted namedtuples, received tuple ('2018-01-01', None)
but I'm having a lot of trouble finding the source of the tuple. Any thoughts on where it could be coming from?
sean
12/08/2020, 6:45 PM
dagstermill solids. In the docs, we see input defs defined like this:
k_means_iris = dm.define_dagstermill_solid(
    "k_means_iris",
    script_relative_path("iris-kmeans_2.ipynb"),
    input_defs=[InputDefinition("path", str, description="Local path to the Iris dataset")],
)
This way of doing things requires duplicating input definitions/descriptions between the notebook itself and the solid definition call. Ideally the input definitions could be parsed from the special parameters-tagged cell that you need to define anyway (some special comment formatting could maybe be used for the descriptions). A similar cell could be used for outputs.
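A rough sketch of what parsing the parameters-tagged cell could look like, using only the standard library. This is not an existing dagstermill feature: the function names, the trailing-comment convention for descriptions, and the `n_clusters` parameter are all hypothetical and exist only to illustrate the idea.

```python
import ast
import io
import json
import tokenize


def _comments_by_line(source):
    """Map line number -> trailing comment text (without the leading '#')."""
    comments = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments[tok.start[0]] = tok.string.lstrip("#").strip()
    return comments


def parse_parameters_cell(notebook_json):
    """Return {name: {"type", "default", "description"}} for the first
    code cell tagged 'parameters'. Hypothetical helper, not a dagstermill API."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") != "code" or "parameters" not in tags:
            continue
        source = cell["source"]
        if isinstance(source, list):  # notebooks may store source as a list of lines
            source = "".join(source)
        comments = _comments_by_line(source)
        params = {}
        for node in ast.parse(source).body:
            # Only simple `name = literal` assignments are considered.
            if not (isinstance(node, ast.Assign)
                    and len(node.targets) == 1
                    and isinstance(node.targets[0], ast.Name)):
                continue
            try:
                default = ast.literal_eval(node.value)
            except ValueError:
                continue  # skip non-literal defaults
            params[node.targets[0].id] = {
                "type": type(default).__name__,
                "default": default,
                # Assumed convention: a trailing comment is the description.
                "description": comments.get(node.end_lineno, ""),
            }
        return params
    return {}


# Made-up notebook with one parameters-tagged cell.
EXAMPLE_NOTEBOOK = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": [
                "path = ''  # Local path to the Iris dataset\n",
                "n_clusters = 3  # Number of clusters (hypothetical)\n",
            ],
        },
        {"cell_type": "code", "metadata": {}, "source": ["print(path)"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
})

params = parse_parameters_cell(EXAMPLE_NOTEBOOK)
```

The resulting dictionary could then be turned into input definitions by whatever code builds the solid. Mapping the inferred type names onto Dagster types is left out here, since defaults alone do not always determine the intended type.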
Is anything like this possible now or planned? I would be willing to work on this if the devs think it is a good idea and no one else is already working on it.
Noah K
12/08/2020, 7:40 PM
Noah K
12/08/2020, 7:41 PM
Noah K
12/08/2020, 7:41 PM
Noah K
12/08/2020, 7:42 PM
sean
12/09/2020, 3:55 AM
dagstermill) as solids. Several of my notebook solids include plotting code that allows visualization of intermediate results. However, rendering these plots can be computationally expensive.
I want to be able to execute the pipeline in two different ways: (1) with execution of plotting code, when debugging/inspecting; (2) without execution of plotting code, when I just want the final results.
The most obvious way to do this is to put the plotting code under a conditional that reads some parameter. To me, the most natural kind of parameter to use is what Dagster calls a [mode](https://docs.dagster.io/overview/modes-resources-presets/modes-resources). I prefer this to a solid configuration parameter because (a) my configuration parameters concern the values I'm computing, whereas whether I pre-compute visualizations is a kind of meta-parameter; (b) the visualization parameter should be shared across all solids. Really it is a kind of logging configuration, but I don't think that I can use Dagster's built-in logging API because I am trying to control execution of notebook cells.
So I'd like to use separate modes to toggle this visualization behavior. My problem is that, from what I can tell from the mode documentation, modes simply configure resource/logger/storage/executor keys; I can't figure out how to read the name of the mode itself from the context during execution. Is this possible?
More generally, is there some more appropriate Dagster abstraction I should be using to control this behavior?
Finally, it occurred to me while thinking through this that it would be nice if dagstermill itself were capable of varying its notebook cell execution depending on cell metadata. That way, the visualizations I described above could be sequestered in specially tagged cells, and the notebook execution engine could conditionally execute them depending on the mode. I'm pretty sure that Papermill supports this functionality.
Istvan Darvas
12/09/2020, 12:01 PM
max
12/09/2020, 5:59 PM
Ted Conbeer
12/09/2020, 8:03 PM
Andy H
12/09/2020, 10:40 PM
Istvan Darvas
12/10/2020, 4:57 PM
Dimitris Stafylarakis
12/10/2020, 9:32 PM
user
12/11/2020, 2:33 AM
cat
12/11/2020, 2:42 AM
cat
12/11/2020, 2:44 AM
release branch. Most of the commits are going into master and will be released with 0.10.0 (Jan 2021) 🍒
user
12/11/2020, 4:30 AM
Noah K
12/11/2020, 8:44 AM
Noah K
12/11/2020, 8:45 AM
Noah K
12/11/2020, 8:45 AM
Noah K
12/11/2020, 8:53 AM
Noah K
12/11/2020, 8:53 AM
Noah K
12/11/2020, 10:02 AM
Noah K
12/11/2020, 10:12 AM
Noah K
12/11/2020, 10:14 AM
/ in them
Noah K
12/11/2020, 10:14 AM
Istvan Darvas
12/11/2020, 7:52 PM
Ayrton Bourn
12/12/2020, 9:08 AM
DagsterInvalidDefinitionError. I tried moving the loop into a composite_solid but received the same error.
Is there a standard way to handle loops in Dagster?