I am currently using an asset B that gets an input from anot dagster #ask-ai

Greg H

03/01/2024, 11:26 AM

I am currently using an asset B that gets an input from another asset A and in the asset B I'd like to call an API using input from asset A. Is the asset B that does the API call an asset or should it rather be a job with an Op? There is no real output for the asset B but I'd like to log the error code or perhaps add that as metadata.

Scout

03/01/2024, 11:26 AM

Based on the documentation, if you are working with assets and one asset depends on another, you can keep the asset-based approach for your workflow. In your case, asset B depends on asset A and calls an API using the output of asset A. Even though asset B does not produce a traditional data output, you can still model it as an asset if you want to track the API call as part of your data pipeline. In recent Dagster versions, you can attach the API call's response, such as the status or error code, as metadata on asset B by returning a `MaterializeResult`, which records the materialization and its metadata without requiring an output value. (`AssetMaterialization` events are intended for ops rather than for the body of an `@asset`.) Here's a simplified example based on the documentation:

```python
from dagster import MaterializeResult, asset


@asset
def asset_A():
    # Your logic to produce data for asset B (data_for_B is a placeholder)
    return data_for_B


@asset
def asset_B(asset_A):
    # The parameter name must match the upstream asset's name (asset_A).
    # call_api is a placeholder for your own API client code.
    response = call_api(asset_A)

    # Record the API response code as metadata; no output value is returned
    return MaterializeResult(
        metadata={"api_response_code": response.status_code}
    )
```

In this example, `asset_B` takes the output of `asset_A` as input (the parameter name has to match the upstream asset's name), makes an API call, and records the response code as metadata through `MaterializeResult`. Even though `asset_B` does not produce a data output, it is still an asset because it is part of the data pipeline and Dagster tracks its execution. If your workflow does not fit the asset model well, or if you prefer a more task-oriented approach, you could use a job with an op instead. However, based on the documentation and the scenario you described, it seems appropriate to model `asset_B` as an asset, especially if you want to leverage Dagster's asset tracking and logging capabilities.
