Dias Wesley

03/14/2023, 2:25 PM
Hi great community, I am looking for someone who can help me understand how to write an IO manager for PySpark.

Tim Castillo

03/14/2023, 7:43 PM
Hi Dias! Glad to help you out here. Which IO manager are you basing yours off of and what part doesn't work?

Dias Wesley

03/14/2023, 9:14 PM
Hi Tim, I'm trying to run a Spark job, and it looks like if I want to read data from various sources I have to set up an IO manager (define the file path, handle input, and handle output). I'm new to Dagster and I don't really understand the main concepts behind these pieces. I thought it would be straightforward, but apparently not for me.

class LocalParquetIOManager(IOManager):
    def _get_path(self, context):
        return os.path.join(context.run_id, context.step_key,

    def handle_output(self, context, obj):
        obj.write.parquet(self._get_path(context))

    def load_input(self, context):
        spark = SparkSession.builder.getOrCreate()
        return

def local_parquet_io_manager():
    return LocalParquetIOManager()

How should I understand the blue line above? Thanks for your help.

Regards, Dias
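To make the three pieces mentioned above (path, handle output, load input) concrete, here is a minimal sketch of the same pattern that runs without Spark or Dagster. It is not Dagster's API: `FakeOutputContext`, `FakeInputContext`, `LocalJsonIOManager`, `base_dir`, and the `name` field are hypothetical stand-ins, and JSON replaces Parquet purely so the example is self-contained. The idea it illustrates is that `handle_output` stores whatever a step returns at a path derived from the context, and `load_input` reads it back from the path of the upstream step's output.

```python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FakeOutputContext:
    """Hypothetical stand-in for the context passed to handle_output."""
    run_id: str
    step_key: str
    name: str  # assumed output name; not taken from the snippet above


@dataclass
class FakeInputContext:
    """Hypothetical stand-in for the context passed to load_input."""
    upstream_output: FakeOutputContext


class LocalJsonIOManager:
    """Same shape as LocalParquetIOManager, but writes JSON instead of Parquet."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def _get_path(self, context):
        # One file per (run, step, output name), so outputs do not collide.
        return os.path.join(
            self.base_dir, context.run_id, context.step_key, context.name + ".json"
        )

    def handle_output(self, context, obj):
        # Called after a step finishes: persist the value the step returned.
        path = self._get_path(context)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            json.dump(obj, f)

    def load_input(self, context):
        # Called before a downstream step starts: read what the upstream step wrote.
        with open(self._get_path(context.upstream_output)) as f:
            return json.load(f)


def demo():
    manager = LocalJsonIOManager(tempfile.mkdtemp())
    out_ctx = FakeOutputContext(run_id="run1", step_key="make_rows", name="result")
    manager.handle_output(out_ctx, [{"id": 1}, {"id": 2}])
    return manager.load_input(FakeInputContext(upstream_output=out_ctx))
```

In the Spark version, the write step would be `obj.write.parquet(path)` and the read step would go through a `SparkSession`, but the division of labour between the two methods stays the same.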