# ask-community
j
Hi! I've defined an asset that transforms a dataframe (renaming, deleting columns). The columns I want to rename and delete are defined in a yml file. I have the feeling that there are several ways of doing this. Which would be the best?
Option 1:
def build_asset():
    @asset
    def transform(df):
        config = load_yml_file()
        df = df[config["columns_to_keep"]]
        return df
    return transform
Option 2:
def build_asset():
    config = load_yml_file()
    @asset
    def transform(df):
        df = df[config["columns_to_keep"]]
        return df
    return transform
Option 3:
class MyConfig(ConfigurableResource):
    def load(self):
        return load_yml_file()

def build_asset():
    @asset
    def transform(df, config: MyConfig):
        df = df[config.load()["columns_to_keep"]]
        return df
    return transform
Or is there another option?
o
Hi @Jordan! This is pretty subjective, but personally I'd go with option 1, as it's the simplest and the other options don't really give you any additional benefits. Assuming you only have a single file path, you wouldn't even need to wrap it in a build_asset() function; you could just do:
@asset
def transform(df):
    config = load_yml_file()
    df = df[config["columns_to_keep"]]
    return df
j
Yes, thanks. I agree it's quite subjective. The example above is a slightly simplified version: I'm building this asset with a factory pattern, and the path of the file to load could change from one asset to the next. Would it then make more sense to use the third option and pass the path to the resource as a parameter?
o
The trouble with the resource approach is that you'll end up needing a different resource per asset (as each resource will need to be configured differently), so I'd still go with option 1.
j
Very clear, thank you.