# announcements
a
Hi all. Can somebody tell me whether there is a way to implement a pipeline similar to map-reduce?
t
Can you elaborate on the question a bit? Are you looking for something like a map > shuffle > reduce that Dagster manages? I'd think you could do this by handing the work to Spark and letting the Spark framework manage it, but managing it through Solids would be difficult if you don't know the data splits beforehand. Maybe Dagster parallelism has some possibilities, but I haven't explored it.
👍 2
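For anyone unfamiliar with the map > shuffle > reduce pattern mentioned above, here is a minimal sketch in plain Python using only the standard library. It is not Dagster or Spark code. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the word-count task are hypothetical, picked only for illustration.

```python
from collections import defaultdict


def map_phase(split):
    # Emit (key, value) pairs for one input split: here, (word, 1).
    return [(word.lower(), 1) for word in split.split()]


def shuffle_phase(mapped_splits):
    # Group the values by key across all mapped splits.
    groups = defaultdict(list)
    for pairs in mapped_splits:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Combine each key's values into a single result.
    return {key: sum(values) for key, values in groups.items()}


def word_count(splits):
    # Each split is mapped independently, so this step could run in parallel.
    mapped = [map_phase(split) for split in splits]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

Note that the number of splits is only known once the input arrives, which is the part that is hard to express as a fixed graph of Solids.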
s
It’s a non-goal (currently) for us to support generalized map reduce. We consider that a feature of a compute substrate (like Spark, distributed Dask, Hive, etc.) rather than of the orchestration substrate. We may look at supporting some very coarse-grained parallelism (e.g. firing N runs off simultaneously that don’t need to coordinate in any real way), but I would suggest pushing any sort of map reduce operation down to a layer like Spark, as @tseader suggests.
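To make "firing N runs off simultaneously that don't need to coordinate" concrete, here is a rough sketch using Python's standard `concurrent.futures`. This is an assumption about what such coarse-grained parallelism could look like, not a Dagster API: `run_pipeline` is a hypothetical stand-in for launching one independent run, and `fire_runs` and the partition names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(partition):
    # Hypothetical stand-in for launching one independent pipeline run.
    return f"run for {partition} finished"


def fire_runs(partitions, max_workers=4):
    # Submit every run at once; the runs share nothing and never coordinate.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(run_pipeline, partitions))


if __name__ == "__main__":
    for line in fire_runs(["2020-01-01", "2020-01-02", "2020-01-03"]):
        print(line)
```

There is no shuffle step here: nothing moves between runs, which is what separates this from a real map reduce.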
a
If I have to write the logic using Spark or Dask, then what do I need Dagster for in this case? I hoped I would be able to express my logic in terms of Dagster while the execution is performed on a Dask cluster. For example, a framework such as prefect.io supports this.