Marco
03/03/2023, 5:35 PM
Spencer Nelson
03/03/2023, 5:38 PM
multiprocessing.shared_memory (https://docs.python.org/3/library/multiprocessing.shared_memory.html) to achieve this
Danny Steffy
03/03/2023, 5:46 PM
Marco
03/03/2023, 5:47 PM
Spencer Nelson
03/03/2023, 5:47 PM
from contextlib import contextmanager

from dagster import io_manager, PickledObjectFilesystemIOManager

@io_manager
@contextmanager
def fs_io_manager_with_cleanup():
    fs_io_mgr = PickledObjectFilesystemIOManager()
    try:
        yield fs_io_mgr
    finally:
        # do_cleanup is your own helper (not shown here); it runs at resource teardown
        do_cleanup(fs_io_mgr.base_dir)
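To see why the placement of the cleanup call matters, here is a minimal sketch in plain Python (no Dagster involved). It only shows how `contextlib.contextmanager` itself behaves when the body of the `with` block raises; it makes no claim about how Dagster invokes teardown. All names in it (`resource_without_finally`, `resource_with_finally`, `run_failing_step`) are made up for illustration.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def resource_without_finally():
    events.append("init")
    yield "resource"
    events.append("cleanup")  # skipped if the with-body raises


@contextmanager
def resource_with_finally():
    events.append("init")
    try:
        yield "resource"
    finally:
        events.append("cleanup")  # runs whether or not the with-body raises


def run_failing_step(resource_factory):
    """Enter the resource, fail inside the body, and return the recorded events."""
    events.clear()
    try:
        with resource_factory():
            raise RuntimeError("step failed")
    except RuntimeError:
        pass
    return list(events)


print(run_failing_step(resource_without_finally))  # ['init']
print(run_failing_step(resource_with_finally))     # ['init', 'cleanup']
```

Without the `try`/`finally`, the exception is re-raised at the `yield` and the line after it never executes, so a failing step would leave the directory behind.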
Marco
03/03/2023, 5:58 PM
Spencer Nelson
03/03/2023, 6:04 PM
An important nuance is that resources are initialized (and torn down) once per process. This means that if using the in-process executor, which runs all steps in a single process, resources will be initialized at the beginning of execution, and torn down after every single step is finished executing. In contrast, when using the multiprocess executor (or other out-of-process executors), where there is a single process for each step, at the beginning of each step execution, the resource will be initialized, and at the end of that step’s execution, the finally block will be run.
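The difference between the two lifecycles can be modeled in a few lines of plain Python. This is only a rough sketch of the ordering described above, not Dagster's executor code: no real processes are started, and `resource`, `run_single_process`, and `run_process_per_step` are hypothetical names.

```python
from contextlib import contextmanager

events = []  # records init, step, and teardown order


@contextmanager
def resource():
    events.append("init")
    try:
        yield
    finally:
        events.append("teardown")


def run_single_process(steps):
    """Model of an in-process run: one resource lifetime spans all steps."""
    events.clear()
    with resource():
        for step in steps:
            events.append(step)
    return list(events)


def run_process_per_step(steps):
    """Model of a process-per-step run: each step gets its own resource lifetime."""
    events.clear()
    for step in steps:
        with resource():
            events.append(step)
    return list(events)


print(run_single_process(["a", "b"]))
# ['init', 'a', 'b', 'teardown']
print(run_process_per_step(["a", "b"]))
# ['init', 'a', 'teardown', 'init', 'b', 'teardown']
```

In the first model the cleanup happens once, after the last step; in the second it happens after each step, so anything a later step still needs must not be deleted at teardown.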
Marco
03/03/2023, 6:07 PM
Spencer Nelson
03/03/2023, 6:08 PM
Marco
03/03/2023, 6:10 PM
Spencer Nelson
03/03/2023, 6:11 PM
Marco
03/03/2023, 6:24 PM
sandy
03/03/2023, 11:29 PM
Marco
03/10/2023, 10:47 AM