Josh
02/23/2024, 10:03 AM

Scout
02/23/2024, 10:03 AM
[...] `aggregated_logs` asset that depends on `processed_logs`, which is an external asset, the `aggregated_logs` asset can be defined with a dependency on `processed_logs`. However, since Dagster doesn't control scheduling or materializing external assets, it's up to you to keep their metadata updated.

Materialization for external assets is disabled in the Dagster UI, so you would need to report asset materializations another way: through the REST API, sensors, the Python API, or ops that log events. If an external upstream asset is updated, you need to make sure its metadata is updated in Dagster using one of these methods.

Once Dagster is aware of the update to the external asset, it can trigger auto-materialization policies for downstream assets that depend on it, assuming those policies are configured to respond to such updates.

Josh
02/23/2024, 12:06 PM

Scout
02/23/2024, 12:06 PM
[...] `partition_defs` for an external asset. Instead, it discusses the concept of external assets and how they can be used to represent and track assets that are not produced by Dagster ops. To define partitions for an external asset, you would typically use the `PartitionsDefinition` class or one of its subclasses to specify how the asset is partitioned. Here's a general approach based on how partitions are defined for software-defined assets:

1. Define your partition scheme using one of the `PartitionsDefinition` subclasses, such as `StaticPartitionsDefinition`, `DynamicPartitionsDefinition`, or a custom partition definition.
2. Use the `AssetSpec` class to define your external asset, including the partition definition in the asset specification.
3. Register the external asset with your Dagster repository using `external_assets_from_specs` or similar methods.

Here's a hypothetical example of how you might define an external asset with daily partitions:

```python
from dagster import AssetSpec, DailyPartitionsDefinition, external_assets_from_specs

# Define the partition scheme for the external asset
daily_partitions_def = DailyPartitionsDefinition(start_date="2023-01-01")

# Define the external asset with the partition scheme
my_external_asset = AssetSpec(
    key="my_external_asset",
    description="An external asset partitioned daily.",
    partitions_def=daily_partitions_def,
    # Additional metadata can be included here
)

# Register the external asset with the repository
external_assets = external_assets_from_specs([my_external_asset])
```

Please note that this is a general example, and the actual implementation may vary based on the specific version of Dagster you are using and the details of your external asset. Always refer to the latest Dagster documentation for the most accurate and up-to-date guidance on working with external assets and partition definitions.

Josh
02/23/2024, 12:10 PM
[...] `partitions_def` attribute

Scout
02/23/2024, 12:10 PM
[...] `AssetSpec` does not have a `partitions_def` attribute. In Dagster, the concept of partitions is typically applied to assets that are managed within the Dagster system. For external assets, which are not created or managed by Dagster ops, the partitioning information is not directly applied in the same way as it is for internal assets.

Instead, you would track the metadata about partitions for an external asset outside of Dagster and then use sensors or other mechanisms to update Dagster about changes to those external assets, including information about their partitions. Dagster's asset catalog would then reflect the state of those external assets, including any partitioning metadata you provide.

If you need to work with external assets that are partitioned, you would typically handle the partitioning logic outside of Dagster and then use Dagster's APIs to inform the asset catalog about the state of those assets. This might involve creating custom sensors or using the Python API to update asset metadata.

For more specific guidance on how to handle partitioned external assets in Dagster, I recommend checking the latest documentation or reaching out to the Dagster community for best practices, as the handling of external assets may evolve over time.
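
As a rough sketch of handling the partitioning logic outside of Dagster, the snippet below computes daily partition keys in plain Python and reports only the ones Dagster has not been told about yet. The helper names (`daily_partition_keys`, `report_new_partitions`, `make_dagster_reporter`) are hypothetical and exist only for illustration. The Dagster wiring assumes a version where `DagsterInstance.report_runless_asset_event` is available and that `DAGSTER_HOME` points at your instance, so treat that part as an assumption to verify against your installed version.

```python
from datetime import date, timedelta
from typing import Callable, Iterable, List, Set


def daily_partition_keys(start: date, end: date) -> List[str]:
    """Return ISO-formatted keys for each day in the half-open range [start, end)."""
    days = max((end - start).days, 0)
    return [(start + timedelta(days=i)).isoformat() for i in range(days)]


def report_new_partitions(
    available: Iterable[str],
    already_reported: Set[str],
    report: Callable[[str], None],
) -> List[str]:
    """Call report(key) for every available key not yet reported; return those keys."""
    new_keys = [key for key in available if key not in already_reported]
    for key in new_keys:
        report(key)
    return new_keys


def make_dagster_reporter(asset_key: str) -> Callable[[str], None]:
    """Hypothetical helper: build a reporter that records a materialization in Dagster.

    Assumes dagster is installed and DAGSTER_HOME is configured. The import is
    deferred so the pure-Python helpers above work without Dagster.
    """
    from dagster import AssetMaterialization, DagsterInstance

    instance = DagsterInstance.get()

    def _report(partition_key: str) -> None:
        # Record that the external system produced this partition.
        instance.report_runless_asset_event(
            AssetMaterialization(asset_key=asset_key, partition=partition_key)
        )

    return _report


# Example with a stand-in reporter that only prints the keys.
keys = daily_partition_keys(date(2024, 2, 20), date(2024, 2, 23))
report_new_partitions(keys, already_reported={"2024-02-20"}, report=print)
```

To send the events to Dagster instead of printing them, you could pass `make_dagster_reporter("processed_logs")` as the `report` argument, for example from a sensor or a script run by the external system.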