# announcements
w
Hi again and thanks a ton for Dagster 😄 I'm wondering if there's an easy way to make dynamic pipelines (whose shape depends on some parameters) reconstructable, since `reconstructable` can only be applied to module-level functions with no arguments. Writing/reading a temp file works but I'd rather not if there's another way (and definitely looking forward to 0.9.0 if it addresses dynamic DAGs!)
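For reference, the temp-file workaround mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the environment variable name, the `num_inputs` parameter, and the dict returned by `build_pipeline_shape` are all hypothetical stand-ins, and no Dagster calls are shown. It also assumes the process that rebuilds the pipeline can see both the environment variable and the file (for example via a shared filesystem).

```python
import json
import os
import tempfile

# Hypothetical environment variable used to locate the parameter file.
PARAMS_ENV_VAR = "PIPELINE_PARAMS_PATH"


def write_params(params):
    """Serialize the parameters to a temp file and publish its path via the environment."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(params, f)
    os.environ[PARAMS_ENV_VAR] = path
    return path


def build_pipeline_shape(num_inputs):
    """Stand-in for real pipeline construction: one processing step per input."""
    return {
        "name": "dynamic_pipeline",
        "steps": [f"process_{i}" for i in range(num_inputs)],
    }


def make_pipeline():
    """Module-level, zero-argument factory: the shape comes from the parameter file."""
    with open(os.environ[PARAMS_ENV_VAR]) as f:
        params = json.load(f)
    return build_pipeline_shape(params["num_inputs"])
```

In a real setup, `make_pipeline` would return an actual pipeline definition and would be the module-level function handed to `reconstructable`; the arguments travel through the file instead of through the function signature.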
s
Unfortunately there’s no straightforward way to do that (unless I am missing something @alex). It would be interesting to do, though: we could probably concoct a scheme to capture the arguments and serialize them, but that would be a substantial task. Can you give exact context on how you are parameterizing pipeline creation and why?
Re: the task you mentioned, I’m also interested in hearing about your exact use case. We want to support some aspect of what I refer to as dynamic orchestration. However, we also explicitly don’t want to repurpose Dagster as a map-reduce engine, which we feel should be handled by computational runtimes such as Spark or Dask.
w
Ah ok, thanks
That makes sense re: not wanting to reinvent map-reduce
👍 1
Basically the use case is dynamic parallelization, where solids are spawned to process some number of inputs which isn't known until runtime
the "processing" is some arbitrary Singularity-containerized workflow in an HPC environment where Dagster deploys to Dask, which deploys to some resource manager like PBS/SLURM
it could just be done in Dask, but would like to use Dagster if possible 🙂
Do you have any subset of dynamic orchestration in mind that Dagster may end up supporting, out of curiosity?
s
Fire-and-forget parallel runs would be the obvious thing. The “map” component, but not the “reduce” component, which requires fine-grained coordination.
🎉 1
👍 1
w
that would be awesome. just the ability to "map" some arbitrary number of solids over a collection would cover our use case
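To make the "map without reduce" idea concrete, here is a minimal sketch using only the Python standard library. It is not Dagster's API and not a proposal for it: `process` and `map_over` are hypothetical names, and a thread pool stands in for whatever executor (Dask, PBS/SLURM) would actually run the work.

```python
from concurrent.futures import ThreadPoolExecutor


def process(item):
    """Hypothetical per-input work; a real step might launch a containerized workflow."""
    return item * 2


def map_over(inputs, max_workers=4):
    """Run `process` once per input, in parallel, with no cross-task coordination."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The number of tasks is only known here, at runtime.
        return list(pool.map(process, inputs))
```

Each task is independent, so nothing here needs the fine-grained coordination a "reduce" step would require; the results are gathered into a list only to show that every task completed.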
s
Great to know