Muhammad Jarir Kanji
09/05/2023, 8:11 PM
ext architecture link is broken, as it points to an internal repo.
Additionally, if the `ext` protocol formalizes how structured and unstructured messages are passed to/from Dagster, have you thought about making it so that the remote execution context is able to modify the state of the DAG in the Dagster server? Specifically, my first thought when reading through the proposal was that it could provide the basis for Dagster to do double-duty as a metadata and logging platform (which is what most ML experiment tracking platforms are at their core), without necessarily orchestrating anything (i.e., Dagster doesn't trigger the job itself; it just receives information about it).
It'd be awesome if I could:
• create, from an external environment using `ext`, an asset that does not exist in the `definitions` in the environment where the Dagster server is running; and
• run a process (like training an ML model) in a Jupyter notebook somewhere and stream related logs back to Dagster. This would, in essence, be like manually pressing the "Materialize" button in the Dagster UI, but triggered from outside the Dagster server.
The combination of these two things would allow me to potentially use Dagster as a replacement for an experiment tracking platform and also allow me to use Dagster as the "single pane of glass" for more experimental, ad-hoc work and production jobs that are properly defined as a DAG in Dagster (and orchestrated by it).
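A minimal sketch of what "streaming structured messages back" from a notebook could look like, assuming a JSON-lines file as the channel. The message shape and every name here (`report`, `read_reports`, the `kind` and `payload` fields) are hypothetical illustrations and are not the ext protocol's actual format or API.

```python
import json
import time
from pathlib import Path


def report(channel, kind, asset_key, payload):
    """Append one structured message to a JSON-lines channel file.

    Hypothetical message envelope; not the real ext wire format.
    """
    message = {
        "kind": kind,            # e.g. "log" or "materialization"
        "asset_key": asset_key,  # asset the message refers to
        "timestamp": time.time(),
        "payload": payload,      # free-form, JSON-serializable metadata
    }
    with Path(channel).open("a", encoding="utf-8") as f:
        f.write(json.dumps(message) + "\n")
    return message


def read_reports(channel):
    """Read back every message, as the ingesting side might do."""
    with Path(channel).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

An external process would call `report(...)` once per log line or result, and whatever ingests the file would call `read_reports(...)`. Whether Dagster could accept such messages for assets it did not launch is exactly the open question raised above.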
This is definitely coloring outside the lines of "pure" orchestration, though. If this makes sense and is at least somewhat interesting/feasible, I can also post in the GH Discussion.
schrockn
09/06/2023, 2:30 PM
schrockn
09/25/2023, 11:01 PM
Zach
09/26/2023, 2:45 PM
`ext_context.get_external_process_env_vars`. It seems that these are needed on the external process side of things to construct the context and other attributes for the message writers (please correct me if I'm reading this wrong). The thing about this platform that I'm integrating with is that it doesn't provide a mechanism for passing env vars, just command-line parameters. Is this where the `ExtTempFileContextInjector` could come in? I guess what I'm confused about with using this object instead is I see `ext_context.get_external_process_env_vars()` calls in most of the examples being used to pass in context info. Is this just not needed if I'm using a file-based context injector? If it's not needed, then how do I get information to the external process about where the file is that contains the context info?
Daniel Gafni
09/27/2023, 3:09 PM
`dagster-ext` instead of `dagster-ext-process`
Daniel Gafni
09/27/2023, 3:50 PM
`extras`?
Daniel Gafni
09/27/2023, 7:14 PM
context.log.info(f"Waiting for output DataFrame at {output_df_path}...")
timeout = 10 * 60 * 60
start = time.time()
while time.time() - start < timeout:
    if UPath(output_df_path).exists():
        break
    time.sleep(60)
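A note on the loop above: if the file never appears, it exits silently once the timeout elapses. Below is a minimal sketch of the same polling idea as a reusable helper that raises on timeout. The name `wait_for_path` and its parameters are hypothetical, and it uses the standard library's `pathlib.Path` for local files rather than `UPath`.

```python
import time
from pathlib import Path


def wait_for_path(path, timeout_s=10 * 60 * 60, poll_s=60.0):
    """Block until `path` exists; raise TimeoutError if it never appears.

    Hypothetical helper, not part of Dagster.
    """
    target = Path(path)
    # Use a monotonic clock so wall-clock adjustments cannot skew the timeout.
    deadline = time.monotonic() + timeout_s
    while True:
        if target.exists():
            return target
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"{path} did not appear within {timeout_s} seconds")
        # Never sleep past the deadline.
        time.sleep(min(poll_s, remaining))
```

Any object with a pathlib-style `exists()` method could be substituted for `Path` here if the output lives on remote storage.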
Perhaps we can use Dagster events for this. It would be nice if there were a standard method of doing it.
Akshay Verma
10/03/2023, 9:47 AM
`pipes`? It might be helpful for us since we are looking at things like running tasks in clusters, existing codebases, and Databricks environments (all overlapping each other to various degrees).
Radek Tomšej
10/04/2023, 3:56 AM
`MultiEnvironmentExecutor` (https://github.com/dagster-io/dagster/pull/11735) will not be implemented?
schrockn
10/06/2023, 4:57 PM
Ignas Kizelevičius
10/10/2023, 11:29 AM
dwall
10/13/2023, 4:19 PM
Qwame
10/13/2023, 8:04 PM
dwall
10/13/2023, 8:15 PM
schrockn
10/16/2023, 9:21 AM
Thomas Aubry
10/17/2023, 11:49 AM
`RunLauncher` to the new Dagster Pipes?
Karsten Gebbert
10/26/2023, 8:04 AM
Quentin Gaborit
11/03/2023, 6:09 PM
`dagster-pipes`. Looking forward to seeing the library available for other languages; it's a nice addition to the many existing features.
In the meantime, it seems to potentially unlock something I wanted to implement with the Pants build system, where each assets module would be packaged as a PEX file in a Docker image. It seems that passing an executable instead of a Python entrypoint to the `cmd` parameter does not work, though. Am I missing something?
Craig Austin
11/16/2023, 6:33 AM
`@op` to use `PipesSubprocessClient` successfully.