#ask-community

Johannes Müller

07/17/2023, 9:36 AM
Hi, I basically just used a job to execute an entire legacy pipeline, which ran fine in less than 10 minutes. Now I refactored my job to use `op` operations for each sub-function to have proper logging in Dagster. However, it now takes several minutes before each function is executed, and I need some ideas on how to solve this 😄 Between each `Launching subprocess` and `Executing step "X" in subprocess.` it takes 5 minutes, and I have no visibility into what is happening here -.- Starting a subprocess can't take that long, can it? I tailed all the logs while the job was running and didn't see anything related.

alex

07/19/2023, 7:43 PM
> Starting a subprocess can’t take that long, or?
this time also includes importing dependencies and creating the `Definitions` / `repository`. To go back to the `pipeline` default of in-process execution, you can set run config in the `execution:` section to select `in_process`, or bind the `in_process_executor` directly to your `job` definition

Johannes Müller

08/16/2023, 6:39 AM
Thanks Alex, I'll give it a try!
5 min overhead per operation still seems excessive, though :D

alex

08/16/2023, 1:42 PM
> 5 min overhead per operation
yea this is not an expected amount unless you have some very heavy lifting happening at process init / import time. If you are not sure what to attribute the cost to, you can use a profiler like https://github.com/benfred/py-spy, or specifically profile import time using https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPROFILEIMPORTTIME