# dagster-feedback
n
Hi! I'm on a team that uses both Dagster and Datadog, and I recently worked on integrating Dagster with Datadog APM tracing to get better visibility into op execution speeds (including calls to third-party libraries). To make this work, I had to build a custom multiprocess executor that inherits from your MultiprocessExecutor class but also handles tracer config, span initialization, and trace context inheritance across processes. We are mildly concerned about issues coming up if the Executor evolves -- are there any anticipated changes to Executor method definitions coming up? Additionally, this could be avoided if there were a way to share one instance of a resource across multiple processes when using the MultiprocessExecutor. Is this a possibility or something that's been considered?
m
Hey folks! Reviving this thread: my team is hitting the same use case, where we’d like to use Datadog APM to ship traces for multiprocessed pipelines so we can view analytics for Dagster runs in the same place that we do for our other services. The trouble here is that, as far as I know, the `child_process_executor` used for `MultiprocessExecutor` doesn’t make it easy to transmit context from the pipeline-orchestrating parent process to the step-running child processes at execution time. Looking at the ddtrace docs for cross-process tracing and comparing with the process boundary for the `child_process_executor`, it seems like we could satisfy this use case pretty easily if Dagster added support for “hooks” at this process boundary: one in the pipeline execution/parent process to extract context and pass it through the `args` to `Process`, and one in the step execution/child process to extract that context and install it before the step runs. Would it be a reasonable ask for Dagster to build in this support? Or, as Natalie suggested, would it be reasonable to add support for resources that are instantiated once for the pipeline and are in some way shared between child processes, rather than instantiated once per child? Thanks!
👍 2