Xiaotian Yu
05/18/2023, 5:06 PM
(MultiPartitionsDefinition) matters so much (in the asset materialization step). After changing 1 of the 2 StaticPartitionsDefinition dimensions from size ~5000 to 3, the materialization time per partition (a 1.3 KB pickle) dropped from 3s to 0.1s. This is a huge loss considering the number of partitions I have: assuming the partition dimensions are (5000, 100) and no parallelization, the time lost is (3 - 0.1) * 5000 * 100 s ≈ 4 hours * 100 = 400 hours (otherwise it would be 0.1 * 5000 * 100 s ≈ 14 hours).
(P.S. I also tried getting the 5000-length list of strings dynamically; the time per partition went up to ~15s.)
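The totals quoted above can be double-checked with a few lines of Python. This is only a back-of-the-envelope sketch: the per-partition timings and the (5000, 100) grid are the figures from the message, nothing is measured here, and `total_hours` is a hypothetical helper name, not part of Dagster.

```python
# Rough check of the serial materialization totals quoted in the message.
# Inputs are the reported figures; total_hours is a hypothetical helper.

SECONDS_PER_HOUR = 3600


def total_hours(seconds_per_partition: float, dim_a: int, dim_b: int) -> float:
    """Hours needed to materialize every (dim_a x dim_b) partition one by one."""
    return seconds_per_partition * dim_a * dim_b / SECONDS_PER_HOUR


slow = total_hours(3.0, 5000, 100)      # 3 s per partition (5000-key dimension)
fast = total_hours(0.1, 5000, 100)      # 0.1 s per partition (3-key dimension)
dynamic = total_hours(15.0, 5000, 100)  # ~15 s per partition (dynamic key list)

print(f"slow:     {slow:.0f} h")         # slow:     417 h
print(f"fast:     {fast:.0f} h")         # fast:     14 h
print(f"overhead: {slow - fast:.0f} h")  # overhead: 403 h
print(f"dynamic:  {dynamic:.0f} h")      # dynamic:  2083 h
```

The unrounded overhead comes to about 403 hours, which agrees with the "4 hours * 100 = 400 hours" estimate once the intermediate 4.03 hours is rounded.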
The io manager is the default pickle manager, and the dagster version is 1.3.4.
It would be great to know whether I'm doing something wrong, or to have this fixed!
Xiaotian Yu
05/18/2023, 5:20 PM
Xiaotian Yu
05/19/2023, 12:15 AM
Xiaotian Yu
05/22/2023, 1:04 PM
owen
05/22/2023, 8:23 PM