Hi Tim,
I'm trying to run a Spark job, and it looks like if I want to read data from various sources I have to set up an IOManager (define the file path, handle input, and handle output). I'm new to Dagster and I don't really understand the main concepts behind it.
I thought it would be straightforward, but apparently not for me.
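    import os

    from dagster import IOManager, io_manager
    from pyspark.sql import SparkSession


    class LocalParquetIOManager(IOManager):
        def _get_path(self, context):
            # Build a path of the form <run_id>/<step_key>/<output name>
            return os.path.join(context.run_id, context.step_key, context.name)

        def handle_output(self, context, obj):
            # Write the Spark DataFrame produced by a step as Parquet
            obj.write.parquet(self._get_path(context))

        def load_input(self, context):
            # Read back the Parquet data written by the upstream step
            spark = SparkSession.builder.getOrCreate()
            return spark.read.parquet(self._get_path(context.upstream_output))


    @io_manager
    def local_parquet_io_manager():
        return LocalParquetIOManager()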
How should I understand the blue line above?
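To check my understanding, here is a plain-Python sketch of what I think the class does, with no Dagster or Spark involved. FakeContext and InMemoryIOManager are names I made up for illustration, and the dict stands in for the Parquet files on disk. My guess is that load_input rebuilds the same path that handle_output used by looking at context.upstream_output; please correct me if that is wrong.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeContext:
    """Hypothetical stand-in for the context object passed to the IO manager."""
    run_id: str = ""
    step_key: str = ""
    name: str = ""
    upstream_output: Optional["FakeContext"] = None


class InMemoryIOManager:
    """Same shape as LocalParquetIOManager, but stores objects in a dict."""

    def __init__(self):
        self.storage = {}

    def _get_path(self, context):
        # Same path scheme as in the snippet: <run_id>/<step_key>/<output name>
        return os.path.join(context.run_id, context.step_key, context.name)

    def handle_output(self, context, obj):
        # Called after a step produces a value: save it under the output's path
        self.storage[self._get_path(context)] = obj

    def load_input(self, context):
        # Called before a downstream step runs: look up the upstream output's path
        return self.storage[self._get_path(context.upstream_output)]


manager = InMemoryIOManager()

# An upstream step writes its output...
out_ctx = FakeContext(run_id="run1", step_key="make_data", name="result")
manager.handle_output(out_ctx, [1, 2, 3])

# ...and a downstream step reads it back through upstream_output.
in_ctx = FakeContext(upstream_output=out_ctx)
loaded = manager.load_input(in_ctx)
```

If this is right, then the two methods never talk to each other directly; they only agree on how the path is built.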
Thanks for your help.
Regards,
Dias