# ask-ai

Darryl Ng

02/23/2024, 1:45 AM

@Scout how can I get a list of all asset keys for all assets selected using dbt_assets decorator?

Scout

02/23/2024, 1:45 AM

To get a list of all asset keys for assets selected using the `dbt_assets` decorator, you can use the `selected_asset_keys` attribute of the `AssetExecutionContext` object within your `@dbt_assets` function. This attribute provides access to the asset keys that are selected for the current execution. Here's an example that demonstrates how you might use `selected_asset_keys`:

python
from pathlib import Path

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets

MANIFEST_PATH = Path("target/manifest.json")  # adjust to your dbt project

@dbt_assets(manifest=MANIFEST_PATH)
def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Asset keys from this dbt_assets definition that are selected for this run
    selected_keys = context.selected_asset_keys
    context.log.info(f"Selected asset keys: {selected_keys}")
    yield from dbt.cli(["build"], context=context).stream()

In this example, `selected_asset_keys` is read inside the `@dbt_assets` function, where it returns the keys from that definition that are selected for the current run. In a separate asset, such as a downstream one, `context.selected_asset_keys` refers to that asset's own keys rather than the dbt assets' keys. You can adapt this pattern to your specific use case to retrieve and use the list of asset keys as needed within your Dagster pipeline.

Darryl Ng

02/23/2024, 1:49 AM

@Scout would it be possible to get the list of asset keys of the selected dbt_assets after the assets have been declared, rather than waiting for execution time to get the selected_asset_keys? I would like to know which sources the selected models depend on so that I can create observable SourceAssets.

Scout

02/23/2024, 1:49 AM

To determine the asset keys of the selected `dbt_assets` and their dependencies, such as dbt sources, you would typically need to analyze the dbt manifest file. The manifest contains metadata about your dbt project, including models, sources, and their relationships. Dagster provides utilities to parse dbt's manifest file and create asset keys for dbt models and sources. However, the documentation does not explicitly provide a built-in function to retrieve all asset keys for selected dbt assets before execution time. The `selected_asset_keys` attribute is available during execution within the `AssetExecutionContext`.

If you want to create `SourceAssets` with observability based on the dbt sources that are dependencies of your selected models, you would need to parse the dbt manifest file to identify these relationships. You can then use this information to define your `SourceAssets` accordingly.

Here's a conceptual approach based on the documentation:
1. Parse the dbt manifest file to identify the sources and models.
2. Determine the dependencies between models and sources.
3. Create `SourceAssets` for the identified dbt sources.
4. Define them with the `@observable_source_asset` decorator to make these `SourceAssets` observable.

The documentation mentions helper methods like `get_asset_key_for_source` and `get_asset_keys_by_output_name_for_source` that can be used to retrieve asset keys for dbt sources. You can use these methods to define your `SourceAssets` with the correct asset keys that correspond to the dbt sources.

Please note that this approach requires custom logic to parse the dbt manifest and is not a built-in feature of Dagster's `dbt_assets` decorator. If you need further assistance with this process, or if there are updates in Dagster's capabilities that could simplify this task, I recommend checking the latest documentation or reaching out to the Dagster community for support.
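Steps 1 and 2 above can be sketched with the standard library alone. The snippet below is a minimal illustration, not Dagster's or dbt's own API: it uses a small hand-written dictionary in place of a real `manifest.json`, and the project name `shop`, the model and source names, and the helper `sources_by_model` are all hypothetical. It assumes only the manifest's `nodes`, `sources`, `resource_type`, and `depends_on.nodes` fields.

```python
from typing import Dict, List

# Hand-written stand-in for a parsed manifest.json (hypothetical project "shop").
manifest = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "depends_on": {"nodes": ["source.shop.raw.orders", "model.shop.customers"]},
        },
        "model.shop.customers": {
            "resource_type": "model",
            "name": "customers",
            "depends_on": {"nodes": ["source.shop.raw.customers"]},
        },
        "seed.shop.countries": {
            "resource_type": "seed",
            "name": "countries",
            "depends_on": {"nodes": []},
        },
    },
    "sources": {
        "source.shop.raw.orders": {"source_name": "raw", "name": "orders"},
        "source.shop.raw.customers": {"source_name": "raw", "name": "customers"},
        "source.shop.raw.unused": {"source_name": "raw", "name": "unused"},
    },
}


def sources_by_model(manifest: dict) -> Dict[str, List[str]]:
    """Map each model's unique_id to the source unique_ids it reads directly."""
    result: Dict[str, List[str]] = {}
    for unique_id, node in manifest["nodes"].items():
        if node["resource_type"] != "model":
            continue  # skip seeds, tests, snapshots, and other node types
        result[unique_id] = [
            dep for dep in node["depends_on"]["nodes"] if dep in manifest["sources"]
        ]
    return result


deps = sources_by_model(manifest)
for model_id, source_ids in deps.items():
    print(model_id, "->", source_ids)
```

Note that `source.shop.raw.unused` never appears in the result, which is the property needed to avoid defining observables for sources that no model reads.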

Darryl Ng

02/23/2024, 1:53 AM

@Scout I can already create SourceAssets for the sources found in manifest.yaml, however, there are many sources that are depended on by models that are excluded in dbt_assets, which means that we are unnecessarily creating an observable for a SourceAsset that will never be used. I would like to get a list of all selected dbt models in order to filter the sources that will be used.

Scout

02/23/2024, 1:53 AM

As of my knowledge cutoff in early 2023, Dagster does not provide a built-in way to get a list of all selected dbt models before execution time directly from the `dbt_assets` decorator. The selection of dbt models typically happens at runtime, and the `selected_asset_keys` attribute is available within the `AssetExecutionContext` during execution.

However, you can achieve the desired outcome by using dbt's selection syntax and the dbt manifest file to determine which models will be selected and, by extension, which sources those models depend on. Here's a conceptual approach to filter the sources based on selected dbt models:
1. Use dbt's selection syntax to define which models you want to include or exclude in your `dbt_assets` decorator.
2. Parse the dbt manifest file to identify all models and sources, as well as their dependencies.
3. Apply the same selection criteria used in the `dbt_assets` decorator to filter the models in the manifest file.
4. From the filtered list of models, determine which sources they depend on.
5. Create `SourceAssets` only for the sources that are dependencies of the filtered models.

This approach requires you to implement custom logic to parse the manifest file and apply the selection criteria. You can use dbt's Python API or directly parse the manifest file, which is a JSON file containing all the necessary information about models and sources. Here's a high-level example of how you might parse the manifest file and apply selection criteria:


python
import json

# Load the dbt manifest file
with open('target/manifest.json') as f:
    manifest = json.load(f)

# Define your selection criteria (this should match what you use in dbt_assets)
selected_models = set()  # Populate this set with the names of models you want to include

# Collect the sources that selected models depend on, keyed by unique_id
selected_sources = {}
for model_id, model in manifest['nodes'].items():
    if model['resource_type'] == 'model' and model['name'] in selected_models:
        # This model is selected, so process its dependencies
        for source_id in model['depends_on']['nodes']:
            if source_id.startswith('source.'):
                # This is a source dependency for a selected model
                selected_sources[source_id] = manifest['sources'][source_id]

# selected_sources now holds only the sources that selected models depend on;
# create a SourceAsset for each entry

Please note that this is a simplified example and does not include the actual selection logic, which can be complex depending on your dbt project's structure and the selection syntax you use. You may need to replicate some of the selection logic that dbt uses internally or leverage dbt's Python API to perform the selection. Keep in mind that the capabilities of Dagster and its integrations with dbt may have evolved since my last update, so I recommend checking the latest documentation or reaching out to the Dagster community for any new features or best practices that could assist with this task.
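The missing selection step can be sketched as follows, again against a hand-written manifest fragment rather than a real project. Selection here is a deliberately naive tag exclusion standing in for dbt's real selection syntax, so it is only a placeholder for whatever criteria you pass to `dbt_assets`. The helpers `select_models` and `used_source_keys` are hypothetical, and the `[source_name, table_name]` key layout is an assumption about how source asset keys are named; verify it against your own project before relying on it.

```python
from typing import List, Set

# Hand-written stand-in for a parsed manifest.json (hypothetical project "shop").
manifest = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "tags": ["daily"],
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.audit": {
            "resource_type": "model",
            "name": "audit",
            "tags": ["adhoc"],
            "depends_on": {"nodes": ["source.shop.raw.events"]},
        },
    },
    "sources": {
        "source.shop.raw.orders": {"source_name": "raw", "name": "orders"},
        "source.shop.raw.events": {"source_name": "raw", "name": "events"},
    },
}


def select_models(manifest: dict, exclude_tags: Set[str]) -> Set[str]:
    """Return unique_ids of models that carry none of the excluded tags."""
    return {
        unique_id
        for unique_id, node in manifest["nodes"].items()
        if node["resource_type"] == "model"
        and not exclude_tags.intersection(node.get("tags", []))
    }


def used_source_keys(manifest: dict, model_ids: Set[str]) -> List[List[str]]:
    """Return de-duplicated [source_name, table_name] pairs read by the given models."""
    seen = set()
    for model_id in model_ids:
        for dep in manifest["nodes"][model_id]["depends_on"]["nodes"]:
            if dep in manifest["sources"]:
                seen.add(dep)
    # Assumed key layout: [source_name, table_name]
    return [
        [manifest["sources"][s]["source_name"], manifest["sources"][s]["name"]]
        for s in sorted(seen)
    ]


selected = select_models(manifest, exclude_tags={"adhoc"})
source_keys = used_source_keys(manifest, selected)
print(source_keys)
```

Only the sources behind the models that survive the exclusion are returned, so `raw.events`, which is read solely by the excluded `audit` model, would get no observable.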
