# ask-community

louis bertolotti

08/11/2023, 10:58 AM
Hello! We currently have 4 jobs that use graphs in order to reuse some ops and avoid code duplication. One of those ops is called `raw_cleaning`. However, when those jobs run in parallel, they fail because they appear to receive the wrong op output: for instance, `job_a` executes `raw_cleaning` with a dataframe from `job_b`, and `job_b` receives a dataframe from `job_a`, `job_c`, or `job_d` when we log them. The behavior is completely erratic and seems to be governed only by which job finishes first. For information, we have configured 5 concurrent jobs. When we rerun those jobs alone, with no concurrent jobs, they work correctly. Can we reuse an op defined once across parallel jobs, or will this cause jobs to pick up a random value? If so, how can we avoid code duplication? Thanks for your help!

Mathieu Grosso

08/11/2023, 12:37 PM
I have the exact same issue. Sometimes the files an op produces are files being processed in another job. I haven't been able to figure out why two files get swapped between pipelines when they run at the same time.