Darryl Ng
02/23/2024, 1:45 AM
Scout
02/23/2024, 1:45 AM
`dbt_assets` decorator, you can utilize the `selected_asset_keys` attribute available on the `AssetExecutionContext` object within your `@dbt_assets` function. This attribute provides access to the asset keys that are selected for execution in the current run. Here's an example that demonstrates how you might use `selected_asset_keys`:

    from pathlib import Path

    from dagster import AssetExecutionContext
    from dagster_dbt import DbtCliResource, dbt_assets

    # Assumed location of the compiled dbt manifest; adjust for your project
    MANIFEST_PATH = Path("target/manifest.json")

    @dbt_assets(manifest=MANIFEST_PATH)
    def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
        # Asset keys of the dbt models selected for this run
        selected_keys = context.selected_asset_keys
        context.log.info(f"Selected asset keys: {selected_keys}")
        # ... your dbt logic here ...
        yield from dbt.cli(["build"], context=context).stream()

In this example, `selected_asset_keys` is read inside the `@dbt_assets` function, where it contains the keys of the dbt assets selected for the current execution. In a separate downstream `@asset`, `context.selected_asset_keys` covers only that asset's own key, not the dbt assets. You can adapt this pattern to your specific use case to retrieve and utilize the list of asset keys as needed within your Dagster pipeline.
Darryl Ng
02/23/2024, 1:49 AM
Scout
02/23/2024, 1:49 AM
`dbt_assets` and their dependencies, such as dbt sources, you would typically need to analyze the dbt manifest file. The dbt manifest file contains metadata about your dbt project, including models, sources, and their relationships. Dagster provides utilities to parse dbt's manifest file and create asset keys for dbt models and sources. However, the documentation does not explicitly provide a built-in function to retrieve all asset keys for selected dbt assets before execution time. The `selected_asset_keys` attribute is only available during execution, within the `AssetExecutionContext`.
If you want to create `SourceAsset` objects with observability based on the dbt sources that are dependencies of your selected models, you would need to parse the dbt manifest file to identify these relationships. You can then use this information to define your `SourceAsset` objects accordingly. Here's a conceptual approach based on the documentation:
1. Parse the dbt manifest file to identify the sources and models.
2. Determine the dependencies between models and sources.
3. Create `SourceAsset` objects for the identified dbt sources.
4. Use the `observable_source_asset` decorator to add observability to these source assets.
The documentation mentions helper methods like `get_asset_key_for_source` and `get_asset_keys_by_output_name_for_source` that can be used to retrieve asset keys for dbt sources. You can use these methods to define your `SourceAsset` objects with the correct asset keys that correspond to the dbt sources.
Please note that this approach requires custom logic to parse the dbt manifest and is not a built-in feature of Dagster's `dbt_assets` decorator. If you need further assistance with this process or if there are updates in Dagster's capabilities that could simplify this task, I recommend checking the latest documentation or reaching out to the Dagster community for support.
Darryl Ng
02/23/2024, 1:53 AM
Scout
02/23/2024, 1:53 AM
`dbt_assets` decorator. The selection of dbt models typically happens at runtime, and the `selected_asset_keys` attribute is available within the `AssetExecutionContext` during execution.
However, you can achieve the desired outcome by using dbt's selection syntax and the dbt manifest file to determine which models will be selected and, by extension, which sources those models depend on. Here's a conceptual approach to filter the sources based on selected dbt models:
1. Use dbt's selection syntax to define which models you want to include or exclude in your `dbt_assets` decorator.
2. Parse the dbt manifest file to identify all models and sources, as well as their dependencies.
3. Apply the same selection criteria used in the `dbt_assets` decorator to filter the models in the manifest file.
4. From the filtered list of models, determine which sources they depend on.
5. Create `SourceAsset` objects only for the sources that are dependencies of the filtered models.
This approach requires you to implement custom logic to parse the manifest file and apply the selection criteria. You can use dbt's Python API or directly parse the manifest file, which is a JSON file containing all the necessary information about models and sources.
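As a rough illustration of what that JSON looks like, the sketch below builds a tiny stand-in for `manifest.json` in memory and lists each model's direct parents. The project, model, and source names (`my_project`, `stg_orders`, `orders`, `raw`) are made up for the example, and `direct_parents` is a hypothetical helper. A real manifest carries many more fields per entry; only `nodes`, `sources`, `resource_type`, `name`, `source_name`, and `depends_on.nodes` are assumed here.

```python
import json

# Hypothetical, heavily trimmed stand-in for target/manifest.json.
MANIFEST_JSON = """
{
  "nodes": {
    "model.my_project.stg_orders": {
      "resource_type": "model",
      "name": "stg_orders",
      "depends_on": {"nodes": ["source.my_project.raw.orders"]}
    },
    "model.my_project.orders": {
      "resource_type": "model",
      "name": "orders",
      "depends_on": {"nodes": ["model.my_project.stg_orders"]}
    }
  },
  "sources": {
    "source.my_project.raw.orders": {
      "resource_type": "source",
      "source_name": "raw",
      "name": "orders"
    }
  }
}
"""

manifest = json.loads(MANIFEST_JSON)


def direct_parents(manifest: dict) -> dict:
    """Map each model name to the unique IDs of its direct parents."""
    return {
        node["name"]: list(node["depends_on"]["nodes"])
        for node in manifest["nodes"].values()
        if node["resource_type"] == "model"
    }


parents = direct_parents(manifest)
for model_name, parent_ids in parents.items():
    print(model_name, "->", parent_ids)
```

In this made-up project, `orders` has no source among its direct parents; it reaches the `raw` source only through `stg_orders`.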
Here's a high-level example of how you might parse the manifest file and apply selection criteria:

    import json

    # Load the dbt manifest file
    with open('target/manifest.json') as f:
        manifest = json.load(f)

    # Define your selection criteria (this should match what you use in dbt_assets)
    selected_models = set()  # Populate this set with the names of models you want to include

    # Sources that selected models depend on directly, keyed by unique ID
    selected_sources = {}

    # Filter the models based on your selection criteria
    for model_id, model in manifest['nodes'].items():
        if model['resource_type'] == 'model' and model['name'] in selected_models:
            # This model is selected, so process its dependencies
            for source_id in model['depends_on']['nodes']:
                if source_id.startswith('source.'):
                    # This is a source dependency for a selected model;
                    # keep it for later processing (for example, creating a SourceAsset)
                    selected_sources[source_id] = manifest['sources'][source_id]

    # selected_sources now holds the sources that the selected models depend on directly

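The loop above looks only at the direct parents of each selected model. If you also want sources that a selected model reaches through intermediate models, one option is to walk the dependency graph upstream. The following is a minimal sketch under that assumption; `upstream_sources` and the miniature manifest are hypothetical names for illustration and are not part of dagster-dbt or dbt.

```python
def upstream_sources(manifest: dict, selected_models: set) -> set:
    """Return unique IDs of all sources reachable upstream of the selected models."""
    nodes = manifest["nodes"]
    # Start from the unique IDs of the selected models.
    stack = [
        node_id
        for node_id, node in nodes.items()
        if node["resource_type"] == "model" and node["name"] in selected_models
    ]
    seen = set(stack)
    sources = set()
    while stack:
        node_id = stack.pop()
        parent_ids = nodes[node_id].get("depends_on", {}).get("nodes", [])
        for parent_id in parent_ids:
            if parent_id.startswith("source."):
                sources.add(parent_id)
            elif parent_id in nodes and parent_id not in seen:
                # Keep walking through intermediate nodes that are not sources.
                seen.add(parent_id)
                stack.append(parent_id)
    return sources


# Hypothetical miniature manifest: orders -> stg_orders -> raw.orders
example_manifest = {
    "nodes": {
        "model.my_project.stg_orders": {
            "resource_type": "model",
            "name": "stg_orders",
            "depends_on": {"nodes": ["source.my_project.raw.orders"]},
        },
        "model.my_project.orders": {
            "resource_type": "model",
            "name": "orders",
            "depends_on": {"nodes": ["model.my_project.stg_orders"]},
        },
    },
    "sources": {
        "source.my_project.raw.orders": {"source_name": "raw", "name": "orders"},
    },
}

found = upstream_sources(example_manifest, {"orders"})
print(found)
```

Whether indirect sources belong in the result depends on how your assets are split up, so treat the transitive walk as one possible choice.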
Please note that this is a simplified example and does not include the actual selection logic, which can be complex depending on your dbt project's structure and the selection syntax you use. You may need to replicate some of the selection logic that dbt uses internally or leverage dbt's Python API to perform the selectio
Scout
02/23/2024, 1:54 AM