# ask-community
Hi! I’ve been trying to build a Pythonic IO Manager for Delta Lake. It can currently write Spark/pandas DataFrames, but it fails to read Spark DataFrames with this error message:
Unknown resource `spark`. Specify `spark` as a required resource on the compute / config function that accessed it.
My spark resource is supplied to my root `Definitions` object, and ops/assets can interact with it directly.
import os

RESOURCES_LOCAL = dict(
    ...,
    spark=PySparkResource(...)  # My custom resource
)

DEPLOYMENTS = {"local", "staging", "prod"}
resources_by_deployment_name = {
    "prod": RESOURCES_PROD,
    "local": RESOURCES_LOCAL,
}
deployment_name = os.environ.get("DAGSTER_DEPLOYMENT", "local")
# Check against the mapping itself, since DEPLOYMENTS also lists
# "staging", which has no entry in resources_by_deployment_name
assert deployment_name in resources_by_deployment_name

defs = Definitions(
    assets=all_assets,
    resources=resources_by_deployment_name[deployment_name],
    schedules=[],
    sensors=all_sensors,
)
Is there a way to set a required resource directly on a `ConfigurableIOManager`? Or is the best practice to wrap it with the `@io_manager` decorator? I tried digging through the docs and this channel, and this is the closest issue I could find.
Hi Ian, I think you might be seeing this error because you haven't specified `spark` as a resource on your IO manager. You can do this by defining your IO manager like:
@io_manager(required_resource_keys={"spark"})
def my_io_manager(context):
    spark = context.resources.spark  # now available on the init context
    return ...  # construct and return your IO manager instance here