# announcements
c
I'm noticing something mysterious. I have 4 Celery workers set up with Redis and I'm running a pipeline with about 6 serial solids in it. I wrote the solids before setting up Celery, so they read and write a JSON file to and from disk instead of passing it as an intermediate. But somehow the downstream solids are succeeding even on Celery. How is this possible? It seems kind of like all of the solids in a pipeline are always getting run on the same worker
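[The handoff pattern being described is roughly the following sketch. The function names and file path are made up for illustration, and plain functions stand in for the solid bodies; the key point is that this only works across steps if every step runs on a machine that sees the same filesystem.]

```python
import json
import tempfile
from pathlib import Path

# Hypothetical handoff path: each solid reads/writes this file on local
# disk instead of returning a Dagster intermediate.
HANDOFF = Path(tempfile.gettempdir()) / "pipeline_handoff.json"

def extract():
    # upstream solid: write its output to local disk
    HANDOFF.write_text(json.dumps({"records": [1, 2, 3]}))

def transform():
    # downstream solid: read the upstream output back off disk.
    # If this ran on a different machine with its own /tmp, the
    # file simply wouldn't be there and this read would fail.
    data = json.loads(HANDOFF.read_text())
    data["records"] = [r * 2 for r in data["records"]]
    HANDOFF.write_text(json.dumps(data))

extract()
transform()
print(json.loads(HANDOFF.read_text())["records"])  # → [2, 4, 6]
```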
n
hmm what does your pipeline config look like? you have the
execution:
key set to configure the pipeline to run on Celery? if so, might be worth looking at flower to confirm whether this is indeed happening w/ tasks getting scheduled on the same worker
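[For context, the run config key being referred to here typically looks something like the fragment below for dagster-celery; the broker URL is a placeholder, not taken from the thread.]

```yaml
execution:
  celery:
    config:
      broker: redis://localhost:6379/0
```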
c
flower is showing a pretty even distribution across workers
n
and the read/writes of this JSON file are from local disk?
c
yup
n
that is indeed strange… and you’re able to confirm that the downstream solids are successfully reading that json file off disk, despite being run on a different worker?
c
my suspicion is that they are somehow always getting run on the same worker, but yeah it seems that the downstream solids are always able to read the file
n
yeah, either way it's surprising behavior
c
yup
it sort of reminds me of the issue I was having when I was using SQS, where it kept creating a new queue for each run... something along those lines where it's causing all solids to get run on the same worker
i also wonder if it could simply be a weird effect in how tasks are getting distributed on celery