Alexander Voishchev

10/05/2022, 3:12 PM
Hello everyone. Is there a way to get an asset's current data? I need to materialize an asset as a DataFrame so that each materialization appends to the existing DataFrame (database table) rather than replacing it (i.e. delete and insert), so the asset's data grows with every materialization. For example, I need to append data from an external source to the DataFrame (database table):
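import pandas as pd
from dagster import asset

@asset
def asset1():
    # get_external_data() is a placeholder for fetching from the external source
    df = get_external_data()
    return df

@asset
def asset2(asset1):
    # Load the current data of asset2 (state from the previous materialization);
    # get_asset2_data() is a placeholder
    df = get_asset2_data()
    # Append the data from asset1 (pd.concat takes a list of DataFrames)
    df = pd.concat([df, asset1])
    # Return the combined result
    return df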
Is there an elegant way to do that without a custom IOManager? P.S. I found an example of an asset that, as far as I understand, returns its own data:
examples/docs_snippets/docs_snippets/concepts/assets/asset_w_context.py
@asset(required_resource_keys={"api"})
def my_asset(context):
    # fetches contents of an asset
    return context.resources.api.fetch_table("my_asset")
But that is abstract code, just like what I wrote above.
sandy

10/05/2022, 4:18 PM
Hey @Alexander Voishchev - you might be able to use the functionality described here: https://docs.dagster.io/concepts/assets/software-defined-assets#loading-asset-values-outside-of-dagster-runs
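
For the append-instead-of-replace behaviour asked about above, here is a minimal sketch outside Dagster. It assumes the table lives in SQLite and relies on pandas' `DataFrame.to_sql` with `if_exists="append"`. The helper `append_to_table`, the table name, and the sample columns are hypothetical, and this is not the approach from the linked docs, only an illustration of the append semantics:

```python
import sqlite3

import pandas as pd


def append_to_table(conn: sqlite3.Connection, table: str, new_rows: pd.DataFrame) -> pd.DataFrame:
    """Append new_rows to `table` and return the full table contents."""
    # if_exists="append" adds rows to an existing table (creating it on
    # first use) instead of dropping and recreating it.
    new_rows.to_sql(table, conn, if_exists="append", index=False)
    # Read back everything stored so far, including earlier appends.
    return pd.read_sql_query(f"SELECT * FROM {table}", conn)


# Hypothetical sample data standing in for two successive materializations.
conn = sqlite3.connect(":memory:")
first = append_to_table(conn, "asset2", pd.DataFrame({"id": [1, 2], "value": ["a", "b"]}))
second = append_to_table(conn, "asset2", pd.DataFrame({"id": [3], "value": ["c"]}))
```

After the second call the table holds the rows from both calls, so the stored data grows with each run rather than being replaced.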