#ask-community

Drew You

03/24/2023, 6:16 PM
I'm writing an IO manager. Is there a good guide on how to deal with parallelism/multiprocessing here? Specifically, for time-based partitions my IO writer is not parallelizable, but for static partitions it is. Ideally, I'd be able to specify this in a way that also works with MultiPartitions.

jamie

03/24/2023, 7:36 PM
hey @Drew You, are you looking for advice on how to determine whether an output is from a time partition vs. a static partition vs. a multi-partition?

Drew You

03/24/2023, 7:41 PM
So, I know that. I'm more concerned with how Dagster deals with multiprocessing. Is there a way to guarantee that a backfill will only be handled by a single thread?

jamie

03/24/2023, 7:44 PM
cool, just wanted to make sure i knew which part of the question you were wondering about. You could change the executor you use for the specific assets/jobs that use that IO manager to be the in-process executor instead of the multi-process executor:
https://docs.dagster.io/deployment/executors#specifying-executors
https://docs.dagster.io/deployment/executors#example-executors

Drew You

03/25/2023, 9:05 PM
Hmmm, I think I'm missing something here.
I have a single asset that is a very large time series, and I've defined my time-partition IO writer as an append to a parquet file (load the parquet file + add the latest partition to the end).
This is single-writer, so my intuition is that there should be an easy way to specify this. The time series is large, so I don't want to rematerialize the whole thing each time I get a new partition, but I also don't want to force everything that depends on it to run in a single-threaded executor.

jamie

03/27/2023, 2:08 PM
ok i think i understand the problem better now. I agree that changing the executor for every single pipeline doesn't make sense. We don't have a built-in way to achieve the kind of thing you want, so you could look at other solutions to ensure that two instances of the IO manager aren't writing to the file at the same time. One option is to maintain a lock file that the IO manager needs to hold in order to write to the parquet file.
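A minimal sketch of the lock-file idea, using only the Python standard library. The helper names `file_lock` and `append_rows` are hypothetical, and a plain text file stands in for the parquet file; in a real IO manager the body of the `with` block would be the load-append-write step inside `handle_output`.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, timeout=30.0, poll_interval=0.1):
    """Hold an exclusive lock file for the duration of the with-block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process at a time can create the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        # Record the owner's PID to help debug stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def append_rows(data_path, rows):
    """Append rows to data_path while holding data_path + '.lock'."""
    with file_lock(data_path + ".lock"):
        with open(data_path, "a", encoding="utf-8") as f:
            for row in rows:
                f.write(f"{row}\n")
```

One caveat with this approach: if a process dies while holding the lock, the lock file is left behind and has to be cleaned up manually or by some expiry rule. It also assumes all writers see the same filesystem.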