Le Yang

04/16/2023, 5:30 PM
Is it best practice, or even possible, to invoke an I/O manager obtained via "required_resource_keys" inside an op body? My use case is to load all sheets from an Excel file into individual data warehouse tables, where the Excel file is specified via config params. I have obtained the DbIOManager resource inside the op body. I can call the manager's "handle_output" method after constructing the necessary config params with "build_output_context", i.e. passing host/user/database etc. My goal is to reuse the resources as they are provided to the op, w/o having to know the specific config params required to instantiate them, since the whole idea of resources is to be able to reuse them.
I ended up checking the IOManager type before reading the necessary config parameters directly from the scoped resource.

owen

04/17/2023, 8:36 PM
hi @Le Yang! I wouldn't exactly say that this is best practice, as it requires (as you note) manually building an OutputContext. However, I think you've arrived at the best solution for now, as it allows you to take advantage of pre-built functionality which is largely similar to what you need, and there aren't any better hooks to use for the DbIOManager than handle_output.

Le Yang

04/17/2023, 8:45 PM
Appreciate the confirmation. I know it is not pretty to use resources configured as I/O managers w/o using the I/O manager pattern. For now I have what I need. It would be nice for DbIOManager to call DbClient.connect() with the current configuration if no override is provided via OutputContext.
I just came across the Dynamic Output section. Is it a new feature? I don't remember seeing it while reading the docs. It looks promising. Could I perhaps wrap the fanned-out outputs in another op w/ an I/O manager?

owen

04/17/2023, 9:37 PM
ah Dynamic Outputs have been around for a while, but I think something like what you're describing would work (it'd have a similar level of hack to your existing solution, as you'd still need to manually generate OutputContexts that are in the shape that the DbIOManager expects, but might be conceptually cleaner)

Le Yang

04/18/2023, 1:47 PM
I played around with the Dynamic Output feature and I think it can work. The built-in DbIOManager gets the table name from either the asset_key or the output_name, and neither of them can be changed because the mapping function has to be defined statically. So I can have a custom class that extends DbIOManager to retrieve the table name from metadata if specified, which can be provided per mapped run. I quite like the Dynamic Output interface; the fact that you can execute each mapped slice independently of the others is pretty cool. The run diagram display is intuitive as well. May have to wait for another day though.
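
For reference, a minimal sketch of the table-name lookup described above: use a metadata entry if present, otherwise fall back to the asset key, then the output name. It deliberately does not import Dagster. The function name resolve_table_name, the "table" metadata key, and the argument names are all hypothetical, and taking the last component of the asset key is an assumption about the fallback, not a statement of how DbIOManager is implemented.

```python
from typing import Mapping, Optional, Sequence


def resolve_table_name(
    metadata: Optional[Mapping[str, object]],
    asset_key_path: Optional[Sequence[str]],
    output_name: str,
    metadata_key: str = "table",  # hypothetical key, chosen for this sketch
) -> str:
    """Pick a table name: metadata override first, then asset key, then output name."""
    # A per-run metadata entry wins, so each mapped slice can target its own table.
    override = (metadata or {}).get(metadata_key)
    if override:
        return str(override)
    # Assumed fallback: the last component of the asset key path.
    if asset_key_path:
        return asset_key_path[-1]
    # Final fallback: the statically defined output name.
    return output_name
```

A subclass of the I/O manager could call something like this from the place where it currently derives the table name, passing in the metadata, asset key, and output name taken from the OutputContext.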