# ask-ai

Darryl Ng

02/23/2024, 1:45 AM
@Scout how can I get a list of all asset keys for all assets selected using dbt_assets decorator?

Scout

02/23/2024, 1:45 AM
To get a list of all asset keys for assets selected using the `dbt_assets` decorator, you can use the `selected_asset_keys` attribute of the `AssetExecutionContext` object within your `@dbt_assets` function. This attribute provides access to the asset keys that are currently selected for execution. Here's an example, based on the documentation, of how you might use `selected_asset_keys`:
```python
from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


# MANIFEST_PATH is assumed to be defined elsewhere and to point at your dbt manifest
@dbt_assets(manifest=MANIFEST_PATH)
def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Asset keys from this dbt_assets definition that are selected for the current run
    selected_keys = context.selected_asset_keys
    context.log.info(f"Selected asset keys: {selected_keys}")
    # ... your dbt logic here ...
    yield from dbt.cli(["build"], context=context).stream()
```
In this example, `selected_asset_keys` is read inside the `@dbt_assets` function, where it returns the keys of the dbt assets selected for the current run. Inside a separate downstream `@asset`, the same attribute describes that asset's own selection, not the selection of the dbt assets. You can adapt this pattern to your specific use case to retrieve and use the list of asset keys as needed within your Dagster pipeline.

Darryl Ng

02/23/2024, 1:49 AM
@Scout would it be possible to get the list of asset keys of the selected dbt_assets after the assets have been declared, rather than waiting until execution time to read selected_asset_keys? I would like to know which sources the selected models depend on, so that I can create observable SourceAssets for them.

Scout

02/23/2024, 1:49 AM
To determine the asset keys of the selected `dbt_assets` and their dependencies, such as dbt sources, you would typically need to analyze the dbt manifest file. The manifest contains metadata about your dbt project, including models, sources, and their relationships. Dagster provides utilities to parse the manifest and create asset keys for dbt models and sources. However, the documentation does not explicitly describe a built-in function that retrieves all asset keys for selected dbt assets before execution time. The `selected_asset_keys` attribute is only available during execution, on the `AssetExecutionContext`.

If you want to create observable `SourceAsset`s for the dbt sources that your selected models depend on, you would need to parse the dbt manifest file to identify these relationships and then define your `SourceAsset`s accordingly. Here's a conceptual approach based on the documentation:

1. Parse the dbt manifest file to identify the sources and models.
2. Determine the dependencies between models and sources.
3. Create `SourceAsset`s for the identified dbt sources.
4. Use the `@observable_source_asset` decorator to make these source assets observable.

The documentation mentions helper methods such as `get_asset_key_for_source` and `get_asset_keys_by_output_name_for_source` that retrieve asset keys for dbt sources. You can use these methods to define your `SourceAsset`s with asset keys that match the dbt sources.

Please note that this approach requires custom logic to parse the dbt manifest and is not a built-in feature of Dagster's `dbt_assets` decorator. If you need further assistance with this process, or if Dagster has since added capabilities that simplify this task, I recommend checking the latest documentation or reaching out to the Dagster community for support.
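Steps 1 and 2 above can be sketched with only the standard library. This is a minimal sketch and not a Dagster API: `sources_by_model` is a hypothetical helper, and the inline `manifest` dict is a made-up stand-in for the parsed contents of the dbt manifest, reduced to the `nodes`, `sources`, `resource_type`, and `depends_on` fields that the steps rely on.

```python
from typing import Dict, List


def sources_by_model(manifest: dict) -> Dict[str, List[str]]:
    """Map each model's unique_id to the source unique_ids it depends on directly."""
    known_sources = manifest.get("sources", {})
    result: Dict[str, List[str]] = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        deps = node.get("depends_on", {}).get("nodes", [])
        result[unique_id] = sorted(d for d in deps if d in known_sources)
    return result


# Made-up manifest fragment, for illustration only
manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    },
    "sources": {
        "source.shop.raw.orders": {"source_name": "raw", "name": "orders"},
        "source.shop.raw.customers": {"source_name": "raw", "name": "customers"},
    },
}

deps = sources_by_model(manifest)
```

In this fragment, `raw.customers` never appears in the result because no model depends on it, which is the kind of unused source the filtering is meant to catch.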

Darryl Ng

02/23/2024, 1:53 AM
@Scout I can already create SourceAssets for the sources found in manifest.json. However, many of those sources are only depended on by models that are excluded from dbt_assets, which means we are unnecessarily creating an observable SourceAsset that will never be used. I would like to get a list of all selected dbt models in order to filter down to the sources that will actually be used.

Scout

02/23/2024, 1:53 AM
As of my knowledge cutoff in early 2023, Dagster does not provide a built-in way to get a list of all selected dbt models before execution time directly from the `dbt_assets` decorator. The selection of dbt models typically happens at runtime, and the `selected_asset_keys` attribute is available on the `AssetExecutionContext` during execution. However, you can achieve the desired outcome by using dbt's selection syntax and the dbt manifest file to determine which models will be selected and, by extension, which sources those models depend on.

Here's a conceptual approach to filter the sources based on selected dbt models:

1. Use dbt's selection syntax to define which models you want to include or exclude in your `dbt_assets` decorator.
2. Parse the dbt manifest file to identify all models and sources, as well as their dependencies.
3. Apply the same selection criteria used in the `dbt_assets` decorator to filter the models in the manifest file.
4. From the filtered list of models, determine which sources they depend on.
5. Create `SourceAsset`s only for the sources that are dependencies of the filtered models.

This approach requires you to implement custom logic to parse the manifest file and apply the selection criteria. You can use dbt's Python API or directly parse the manifest file, which is a JSON file containing all the necessary information about models and sources. Here's a high-level example of how you might parse the manifest file and apply selection criteria:
```python
import json

# Load the dbt manifest file
with open("target/manifest.json") as f:
    manifest = json.load(f)

# Define your selection criteria (this should match what you use in dbt_assets)
selected_models = set()  # Populate this set with the names of models you want to include

# Collect the sources that the selected models depend on, keyed by unique_id
used_sources = {}
for model_id, model in manifest["nodes"].items():
    if model["resource_type"] == "model" and model["name"] in selected_models:
        # This model is selected, so process its dependencies
        for dep_id in model["depends_on"]["nodes"]:
            if dep_id.startswith("source."):
                # This is a source dependency for a selected model
                used_sources[dep_id] = manifest["sources"][dep_id]

# used_sources now holds only the sources that the selected models depend on;
# create a SourceAsset for each entry
```
Please note that this is a simplified example and does not include the actual selection logic, which can be complex depending on your dbt project's structure and the selection syntax you use. You may need to replicate some of the selection logic that dbt uses internally or leverage dbt's Python API to perform the selection.

Keep in mind that the capabilities of Dagster and its integrations with dbt may have evolved since my last update, so I recommend checking the latest documentation or reaching out to the Dagster community for any new features or best practices that could assist with this task.
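For simple cases, the selection step left open in the example above could be approximated as follows. This is a minimal sketch that only understands two selector forms, a bare model name and `tag:<name>`; it does not reproduce dbt's full selection syntax (graph operators, paths, intersections). `select_models` and `used_source_ids` are hypothetical helpers, and the `manifest` dict is a made-up fragment. For anything more complex, running `dbt ls` with the same `--select` and `--exclude` arguments is likely a more reliable way to obtain the model list.

```python
from typing import Sequence, Set


def select_models(manifest: dict, select: Sequence[str], exclude: Sequence[str] = ()) -> Set[str]:
    """Return unique_ids of models matched by `select` and not matched by `exclude`."""

    def matches(node: dict, selector: str) -> bool:
        # Assumption: only "tag:<name>" and bare model names are supported here
        if selector.startswith("tag:"):
            return selector[len("tag:"):] in node.get("tags", [])
        return node.get("name") == selector

    models = {
        uid: node
        for uid, node in manifest["nodes"].items()
        if node.get("resource_type") == "model"
    }
    included = {uid for uid, n in models.items() if any(matches(n, s) for s in select)}
    excluded = {uid for uid, n in models.items() if any(matches(n, s) for s in exclude)}
    return included - excluded


def used_source_ids(manifest: dict, model_ids: Set[str]) -> Set[str]:
    """Collect the source unique_ids that the given models depend on directly."""
    return {
        dep
        for uid in model_ids
        for dep in manifest["nodes"][uid]["depends_on"]["nodes"]
        if dep.startswith("source.")
    }


# Made-up manifest fragment, for illustration only
manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model", "name": "stg_orders", "tags": ["daily"],
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.stg_customers": {
            "resource_type": "model", "name": "stg_customers", "tags": ["daily"],
            "depends_on": {"nodes": ["source.shop.raw.customers"]},
        },
        "model.shop.stg_events": {
            "resource_type": "model", "name": "stg_events", "tags": ["hourly"],
            "depends_on": {"nodes": ["source.shop.raw.events"]},
        },
    },
}

selected = select_models(manifest, select=["tag:daily"], exclude=["stg_customers"])
sources = used_source_ids(manifest, selected)
```

Keeping the selector strings in one place and passing the same values to both this filter and the `dbt_assets` decorator helps the two stay in sync.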