#ask-ai

Giacomo Colombo

03/12/2024, 8:42 PM
Is it possible to create a resource that contains a dict of dicts in the repository, and to have only one decorated function that returns a dict of dicts? What do you suggest to unify the sensors into one? For instance, I want to merge these two resources into one:
Copy code
@resource(config_schema={'one_dataset_type': str})
def dataset_type_resource(init_context):
    one_dataset_type = init_context.resource_config['one_dataset_type']
    return {'one_dataset_type': one_dataset_type}


@resource(config_schema={'one_dataset_destination': str})
def dataset_destination_resource(init_context):
    one_dataset_destination = init_context.resource_config['one_dataset_destination']
    return {'one_dataset_destination': one_dataset_destination}

defs = Definitions(
    # ...
    resources={
        "dataset_type_resource": {"one_dataset_type": curr_type},
        "dataset_destination_resource": {"one_dataset_destination": curr_destination},
    },
)

Scout

03/12/2024, 8:42 PM
Yes, it is possible to create a single resource that contains a dictionary of dictionaries in a repository. You can define a resource that encapsulates multiple configurations and returns a nested dictionary structure. Here's an example based on the documentation's guidance on creating configurable resources:
Copy code
from dagster import resource, ConfigurableResource, InitResourceContext


class CombinedDatasetResource(ConfigurableResource):
    def __init__(self, dataset_type: str, dataset_destination: str):
        self.dataset_type = dataset_type
        self.dataset_destination = dataset_destination

    def get_config(self):
        return {
            'dataset_type': self.dataset_type,
            'dataset_destination': self.dataset_destination
        }


@resource(config_schema={
    'one_dataset_type': str,
    'one_dataset_destination': str
})
def combined_dataset_resource(init_context: InitResourceContext):
    return CombinedDatasetResource(
        dataset_type=init_context.resource_config['one_dataset_type'],
        dataset_destination=init_context.resource_config['one_dataset_destination']
    )


# Usage in Definitions
defs = Definitions(
    # ...
    resources={
        "combined_dataset": combined_dataset_resource.configured({
            "one_dataset_type": curr_type,
            "one_dataset_destination": curr_destination
        })
    }
)
In this example, `CombinedDatasetResource` is a class that takes two parameters, `dataset_type` and `dataset_destination`, and provides a method `get_config` that returns a dictionary with these values. The `combined_dataset_resource` function is decorated with `@resource` and uses the `config_schema` to define the expected configuration schema. When configured, it returns an instance of `CombinedDatasetResource` with the provided configuration. This approach allows you to unify the two separate resources into one while still providing a structured configuration. The `defs` object then uses the `combined_dataset` resource with the configured values for `one_dataset_type` and `one_dataset_destination`.

Giacomo Colombo

03/13/2024, 2:16 PM
@Scout Now that my resource has changed and I have a dict of dicts, how do I retrieve the type and the destination of the combined_dataset inside the asset function? This is my old asset:
Copy code
@asset(
    required_resource_keys={
        'partition_year_resource',
        'dataset_type_resource',
        'dataset_destination_resource'
    },
)
def my_asset(context):
    dataset_type = context.resources.dataset_type_resource['one_dataset_type']
    dataset_destination = context.resources.dataset_destination_resource['one_dataset_destination']
    do_some_logic()

Scout

03/13/2024, 2:16 PM
To retrieve the type and destination from the `combined_dataset` resource inside an asset method, you would access the properties of the `CombinedDatasetResource` instance through the `context.resources` object. Here's how you can modify your asset to work with the new combined resource:
Copy code
from dagster import asset


@asset(required_resource_keys={'combined_dataset'})
def my_asset(context):
    # Access the combined resource instance
    combined_resource = context.resources.combined_dataset

    # Retrieve the type and destination from the combined resource
    dataset_type = combined_resource.dataset_type
    dataset_destination = combined_resource.dataset_destination

    # Now you can use dataset_type and dataset_destination in your logic
    do_some_logic()
In this updated asset definition, `required_resource_keys` now only includes `'combined_dataset'`, which refers to the single resource that contains both the dataset type and destination. The `my_asset` function then accesses the `combined_dataset` resource from `context.resources` and retrieves the `dataset_type` and `dataset_destination` directly from the resource instance. This approach simplifies the asset's dependencies by consolidating them into a single resource, making the asset's code cleaner and easier to manage.

Giacomo Colombo

03/13/2024, 2:22 PM
@Scout No, it does not work. I get this error:
dagster._core.errors.DagsterSubprocessError: During multiprocess execution errors occurred in child processes:
In process 20848: dagster._core.errors.DagsterInvalidInvocationError: 'CombinedDatasetResource' is a Pythonic resource and does not support item assignment, as it inherits from 'pydantic.BaseModel' with frozen=True. If trying to maintain state on this resource, consider building a separate, stateful client class, and provide a method on the resource to construct and return the stateful client.
Maybe it would be simpler to define the resource without creating the CombinedDatasetResource class and just define a dict with two keys.

Scout

03/13/2024, 2:22 PM
I apologize for the confusion. The error indicates that `CombinedDatasetResource` is being treated as a Pydantic model: according to the message, it inherits from `pydantic.BaseModel` with `frozen=True`, so attributes cannot be assigned on its instances, which is likely what the assignments in the custom `__init__` ran into. Given the error message, the `CombinedDatasetResource` class may not be necessary, and a simpler approach could be used. You can define a resource that returns a dictionary directly, without wrapping it in a class. Here's how you can define such a resource:
Copy code
from dagster import asset, resource


@resource(config_schema={
    'one_dataset_type': str,
    'one_dataset_destination': str
})
def combined_dataset_resource(init_context):
    return {
        'dataset_type': init_context.resource_config['one_dataset_type'],
        'dataset_destination': init_context.resource_config['one_dataset_destination']
    }


# Usage in asset
@asset(required_resource_keys={'combined_dataset'})
def my_asset(context):
    # Access the combined resource dictionary
    combined_resource = context.resources.combined_dataset

    # Retrieve the type and destination from the combined resource dictionary
    dataset_type = combined_resource['dataset_type']
    dataset_destination = combined_resource['dataset_destination']

    # Now you can use dataset_type and dataset_destination in your logic
    do_some_logic()
In this revised example, the `combined_dataset_resource` function returns a dictionary directly, and the asset `my_asset` accesses the dictionary using the standard dictionary key access method. This should resolve the issue with the immutability of the resource and allow you to use the resource as intended.