Philip Shurpik
02/14/2024, 6:15 AM
RunRequest(partition_key=client_name) processing when new files are found.
But for the asset, the new partition is added in a missing state and is not materialized.
My code is fully based on:
https://github.com/dagster-io/dagster/blob/master/examples/docs_snippets/docs_snip[…]cepts/partitions_schedules_sensors/dynamic_partitioned_asset.py
So I have two questions:
1) Is Dagster a good fit for this type of pipeline? Or maybe I missed something and should look for another solution? :)
2) How do I set up the pipeline so that a new partition is not added in a missing state, but is automatically processed / materialized?
Scout
02/14/2024, 6:15 AM
DynamicPartitionsDefinition for cases where you don't know the set of partitions ahead of time, and DynamicOutput for dynamic orchestration within your pipeline.
To ensure that new partitions are automatically processed and materialized, you can use a combination of sensors and partitioned assets. The sensor can be configured to trigger a pipeline run when new client data is detected, and within your pipeline, you can define assets with partitioning logic that processes the data for that specific client.
Regarding the issue of partitions being added in a missing state, you might want to look into the DynamicPartitionsDefinition feature, which allows for the dynamic addition of partitions based on runtime data. This feature is experimental, so you should be aware that some APIs may change in the future.
If you're following an example similar to the one in the Dagster repository and encountering issues with partitions being in a missing state, it could be related to how the sensor and partitioned assets are set up. It's important to ensure that the sensor correctly triggers the pipeline run for the new partition and that the pipeline logic is set up to materialize the asset for that partition.
In summary, Dagster is well-suited for the type of pipeline you're building,
Scout
02/14/2024, 6:16 AM