wondering if there are any plans to integrate Dagit wi...
# announcements
k
wondering if there are any plans to integrate Dagit with a Notebook front-end like JupyterLab? at Twitter, most of our users’ laptops can’t reach the services they need to experiment with and iterate on DS/ML pipelines, so we tend to do most Pipeline/DAG iteration in batch Compute jobs or, more interactively, in online Jupyter Notebooks (accessed w/ JupyterLab + IDE remoting).
it looks like Dagit is already a webapp, so getting it running as a plugin on the notebook server (a jupyter “serverextension”) + an iframe-style JupyterLab extension could be trivial.
this should extend easily to other customers who use Notebooks as their DS/ML development environment (e.g. Google Cloud’s AI Platform), giving them drop-in Dagster dev support.
here’s an example of Dask’s JupyterLab integration for inspiration:

https://www.youtube.com/watch?v=EX_voquHdk0#t=1m42s
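
a minimal sketch of the “serverextension” idea above, assuming the jupyter-server-proxy package is installed on the notebook server. the function name `setup_dagit` and the launcher title are hypothetical; the `{port}` placeholder, `timeout`, and `launcher_entry` keys follow jupyter-server-proxy's server-process config. whether Dagit's UI renders correctly behind the proxy's URL prefix is an untested assumption.

```python
def setup_dagit():
    """Describe how jupyter-server-proxy could launch Dagit on demand.

    Hypothetical helper: it would be registered under the
    `jupyter_serverproxy_servers` entry-point group of a small package.
    """
    return {
        # jupyter-server-proxy substitutes {port} with a free local port.
        # Assumes the `dagit` CLI is on PATH and accepts -h/-p for host/port.
        "command": ["dagit", "-h", "127.0.0.1", "-p", "{port}"],
        # Seconds to wait for the process to start answering requests.
        "timeout": 60,
        # Adds a tile to the JupyterLab launcher that opens the proxied app.
        "launcher_entry": {"title": "Dagit"},
    }


if __name__ == "__main__":
    print(setup_dagit()["command"])
```

with this shape the notebook server owns the Dagit process, so users only ever talk to the Jupyter URL they can already reach.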

s
Hey Kris thanks for the question.
Which aspect of dagit would you find valuable to be in a notebooking context? The pipeline viewer? The execution viewer?
In general we envision the notebook as one of the nodes in our graph, rather than defining the graph itself.
Would love to hear more specifics of what you would expect.
k
yeah, we see it as both:
1) an executable node in the graph, of shape execute_notebook(notebook_in) -> notebook_out
2) a development mode for defining ad hoc (and maybe soon, productionized) Pipelines (and sub-components) for iterative development
the high-level idea there is that if your Pipeline is encoded in a notebook (one that is also parameterizable and schedulable via an external orchestrator), then loading that notebook up to iterate on it allows fluid movement between Experimentation <-> Production use cases. our Notebook environments also have full monorepo integration, so it’s just as easy to load libraries up for iterative development, all the way through to sending a Phabricator review and landing it.
prior art for ref/context:
TFX’s Iterative Notebook RFC: https://github.com/tensorflow/community/blob/master/rfcs/20190815-tfx-notebook.md
Kubeflow’s Fairing project: https://www.kubeflow.org/docs/fairing/fairing-overview/
Kubeflow’s Kale project: https://medium.com/kubeflow/automating-jupyter-notebook-deployments-to-kubeflow-pipelines-with-kale-a4ede38bea1f
re “The pipeline viewer? The execution viewer?”: yes, both.
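
the execute_notebook(notebook_in) -> notebook_out shape from point 1 can be sketched roughly as below, assuming the papermill CLI is installed (`papermill INPUT OUTPUT -p NAME VALUE`). the helper names `papermill_command` and `execute_notebook` and the example file names are hypothetical, and this is not how dagstermill itself is implemented.

```python
import subprocess
from typing import List, Mapping, Optional


def papermill_command(
    notebook_in: str,
    notebook_out: str,
    parameters: Optional[Mapping[str, object]] = None,
) -> List[str]:
    """Build the papermill CLI invocation for one parameterized run."""
    cmd = ["papermill", notebook_in, notebook_out]
    for name, value in (parameters or {}).items():
        # Each `-p name value` pair is injected into the notebook's
        # parameters cell by papermill.
        cmd += ["-p", name, str(value)]
    return cmd


def execute_notebook(
    notebook_in: str,
    notebook_out: str,
    parameters: Optional[Mapping[str, object]] = None,
) -> str:
    """Run the input notebook and return the path of the executed copy."""
    # check=True raises CalledProcessError if the notebook run fails.
    subprocess.run(
        papermill_command(notebook_in, notebook_out, parameters), check=True
    )
    return notebook_out


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(papermill_command("train.ipynb", "train_out.ipynb", {"alpha": 0.6}))
```

because the node takes a notebook path in and hands a notebook path out, an external orchestrator can schedule it like any other step while the same notebook stays open for interactive iteration.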
s
interesting. so there would potentially be multiple notebooks open. one for defining and running pipelines.
and the others (if you have them in dagstermill) would be defining compute
also kale is wild!
k
yep, that could happen for sure.