Terry Lines
02/13/2024, 7:06 PM
Scout
02/13/2024, 7:06 PM
`MultiPartitionsDefinition` class to specify how the asset is partitioned along multiple dimensions. Here's a step-by-step guide based on the documentation:
1. Import the necessary classes: You'll need to import `MultiPartitionsDefinition` along with other classes such as `DailyPartitionsDefinition`, `StaticPartitionsDefinition`, and `asset` from the `dagster` module.
2. Define the partitions: Create a `PartitionsDefinition` instance for each dimension you want to partition by. For example, you might use `DailyPartitionsDefinition` for a time dimension and `StaticPartitionsDefinition` for a categorical dimension.
3. Create a `MultiPartitionsDefinition`: Combine your partition definitions into a `MultiPartitionsDefinition` by passing a dictionary where keys are dimension names and values are the corresponding `PartitionsDefinition` instances.
4. Define your asset: Use the `@asset` decorator to define your asset function. Pass the `MultiPartitionsDefinition` instance to the `partitions_def` parameter of the `@asset` decorator.
5. Access partition information: Inside your asset function, you can access the partition information through the `AssetExecutionContext` context parameter. Specifically, you can use `context.partition_key.keys_by_dimension` to get the keys for each dimension.
Here's an example code snippet from the documentation that illustrates these steps:
```python
from dagster import (
    AssetExecutionContext,
    DailyPartitionsDefinition,
    MultiPartitionKey,
    MultiPartitionsDefinition,
    StaticPartitionsDefinition,
    asset,
)


@asset(
    partitions_def=MultiPartitionsDefinition(
        {
            "date": DailyPartitionsDefinition(start_date="2022-01-01"),
            "color": StaticPartitionsDefinition(["red", "yellow", "blue"]),
        }
    )
)
def multi_partitions_asset(context: AssetExecutionContext):
    # For a multi-partitioned asset the key is a MultiPartitionKey,
    # which exposes the per-dimension keys as a mapping.
    if isinstance(context.partition_key, MultiPartitionKey):
        context.log.info(context.partition_key.keys_by_dimension)
```
In this example, the asset `multi_partitions_asset` is partitioned by two dimensions: date and color. It will have a partition for each combination of color and date, such as `red|2022-01-01`, `yellow|2022-01-01`, `blue|2022-01-01`, and so on.
Please note that as of the latest information in the documentation, multipartitions definitions are currently limited to two dimensions.
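To make the combinations concrete, here is a small sketch in plain Python. It does not use Dagster and is not how Dagster generates keys internally; it only enumerates keys in the `color|date` form quoted above and splits one back into its dimensions. The two-day window and the `split_key` helper are assumptions made for illustration.

```python
from itertools import product

# Illustrative only: plain Python, not Dagster's own key generation.
colors = ["red", "yellow", "blue"]
dates = ["2022-01-01", "2022-01-02"]  # hypothetical two-day window

# One key per (color, date) pair, in the "color|date" form quoted above.
keys = [f"{color}|{day}" for day, color in product(dates, colors)]


def split_key(key: str) -> dict:
    """Split a "color|date" key back into its two dimensions (hypothetical helper)."""
    color, day = key.split("|")
    return {"color": color, "date": day}


print(keys[:3])
print(split_key(keys[0]))
```

Three colors over two days give six keys; each additional day adds three more.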
If you have further questions or need assistance with a specific use case, feel free to ask!
Terry Lines
02/13/2024, 7:07 PM
Terry Lines
02/13/2024, 7:18 PM
Scout
02/13/2024, 7:18 PM
`@multi_asset` decorator, which allows you to define multiple assets produced by a single computation. As far as the documented API goes, partitioning is set once, through the decorator's `partitions_def` parameter, so the assets it produces share one partitioning scheme; `AssetOut` does not take a `partitions_def` argument.
Here's how you can define a multi-asset with partitioned underlying assets:
1. Import the necessary classes: Import `multi_asset`, `AssetOut`, and any classes needed for defining partitions, such as `DailyPartitionsDefinition`, `StaticPartitionsDefinition`, etc.
2. Define the multi-asset: Use the `@multi_asset` decorator to define your multi-asset. Specify the outputs using the `outs` parameter, where each output is an instance of `AssetOut`, and pass a `PartitionsDefinition` to the decorator's `partitions_def` parameter to define how the assets are partitioned.
3. Implement the computation: In the body of the multi-asset function, implement the logic that computes the outputs. Return a tuple with one value per output, in the same order as the entries in `outs` (or yield an `Output` for each one).
Here's an example based on the documentation that shows how to define a basic multi-asset:
```python
from dagster import AssetOut, DailyPartitionsDefinition, multi_asset


@multi_asset(
    outs={
        "my_daily_asset": AssetOut(),
        "my_other_daily_asset": AssetOut(),
    },
    # One partitions definition, shared by every asset in outs.
    partitions_def=DailyPartitionsDefinition(start_date="2022-01-01"),
)
def my_multi_asset():
    # Compute the data for each asset (placeholder helpers, not defined here)
    daily_data = compute_daily_data()
    other_daily_data = compute_other_daily_data()
    # Return one value per output, in the order the outputs are declared
    return daily_data, other_daily_data
```
In this example, `my_multi_asset` is a multi-asset that produces two partitioned assets: `my_daily_asset` and `my_other_daily_asset`. Both are partitioned daily starting from 2022-01-01.
Please note that the actual computation logic (`compute_daily_data` and `compute_other_daily_data`) would need to be implemented according to your specific use case.
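For a sense of what daily partitioning from a start date produces, here is a rough plain-Python sketch. It does not call Dagster; `daily_keys` and the cut-off date are hypothetical, and it assumes that only fully elapsed days get a key, which is my understanding of the default behaviour rather than something confirmed here.

```python
from datetime import date, timedelta


def daily_keys(start: date, today: date) -> list:
    """List one ISO-date key per day from start up to, but not including, today."""
    days = (today - start).days
    # max(..., 0) yields an empty list when today is on or before start.
    return [(start + timedelta(days=i)).isoformat() for i in range(max(days, 0))]


# Hypothetical cut-off: pretend "today" is 2022-01-04.
keys = daily_keys(date(2022, 1, 1), date(2022, 1, 4))
print(keys)
```

Each key in the list corresponds to one run of the asset function for that day.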
Keep in mind that the example provided is a simplified illustration. Depending on your requirements, you may need to handle more complex partitioning logic or interactions between the assets. If you have a more complex scenario or need further assistance, please provide additional details, and I'll be happy to help!