KekronBekron
02/05/2023, 6:39 AM
Sean Davis
02/05/2023, 11:06 PM
KekronBekron
02/06/2023, 2:14 AM
sean
02/06/2023, 4:13 PM
If you define a PartitionsDefinition where each partition corresponds to one of your domains, then you can process only the files for a specific domain in each run of the job. I highly recommend reading the full “Partitions assets and jobs” concept page to get a sense of what is possible:
https://docs.dagster.io/concepts/partitions-schedules-sensors/partitions
KekronBekron
02/07/2023, 10:29 AM
KekronBekron
03/11/2023, 6:24 AM
KekronBekron
03/11/2023, 6:27 AM
if not context.instance.has_dynamic_partition(images_partitions_def.name, img_filename)
As someone new to the data orchestration world (no prior Airflow experience), at least I feel that there are way too many keywords to remember.
sean
03/13/2023, 4:47 PM
• context is the SensorEvaluationContext
• it gives you access to the Dagster Instance via context.instance. The instance provides APIs for all persisted data in Dagster
• there is a method called has_dynamic_partition on the instance. You can use this to check if a partition already exists for a particular partitions definition (in this case, images_partitions_def).
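To make that chain of lookups concrete, here is a minimal sketch in plain Python. ToyInstance and ToyContext are hypothetical stand-ins written only for this illustration, not Dagster classes, and register is a made-up helper for filling the toy store. Only the shape of the has_dynamic_partition call (partitions definition name first, then the partition key) mirrors the line quoted above.

```python
class ToyInstance:
    """Hypothetical stand-in for the Dagster instance (not the real class)."""

    def __init__(self):
        # Maps a partitions definition name to the set of keys stored for it.
        self._keys_by_def_name = {}

    def register(self, partitions_def_name, partition_key):
        # Hypothetical helper, used only to populate the toy store.
        self._keys_by_def_name.setdefault(partitions_def_name, set()).add(partition_key)

    def has_dynamic_partition(self, partitions_def_name, partition_key):
        # True only if this key was already stored under this definition name.
        return partition_key in self._keys_by_def_name.get(partitions_def_name, set())


class ToyContext:
    """Hypothetical stand-in for the sensor context: it just exposes an instance."""

    def __init__(self, instance):
        self.instance = instance


context = ToyContext(ToyInstance())
context.instance.register("images", "cat.png")

# Same call shape as in the sensor: definition name, then candidate key.
already_known = context.instance.has_dynamic_partition("images", "cat.png")
not_yet_known = context.instance.has_dynamic_partition("images", "dog.png")
```

In this toy version, already_known is True and not_yet_known is False, which is why the sensor can use "if not ..." to keep only filenames it has not seen before.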
This feature is still under active development, so your feedback is appreciated.
KekronBekron
03/14/2023, 2:29 AM
KekronBekron
03/14/2023, 2:30 AM
KekronBekron
03/14/2023, 2:33 AM
KekronBekron
03/14/2023, 2:37 AM
sean
03/14/2023, 8:20 PM
“But then what decides what the dynamic partitioning is based on? I see the answer is img_filename, but where is that defined/decided?”
img_filename is just a local variable in the list comprehension that defines new_images; images_partitions_def has no reference to it.
In fact, the definition of images_partitions_def is dead simple:
images_partitions_def = DynamicPartitionsDefinition(name="images")
Basically it’s just a container for arbitrary string partition keys. In this case, we happen to be filling it with keys derived from the filenames found inside the MY_DIRECTORY directory, but images_partitions_def knows nothing about this. It’s more or less just a set of strings.
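As a rough illustration of the "set of strings" idea, here is a sketch in plain Python that does not use Dagster at all. The set images_keys is a toy model of the keys stored for the "images" partitions definition, keys_from_directory is a hypothetical helper, and the temporary directory stands in for MY_DIRECTORY. These names are assumptions for the example, not part of Dagster.

```python
import os
import tempfile


def keys_from_directory(directory, known_keys):
    # img_filename exists only inside this comprehension; the key set itself
    # never learns where its strings came from.
    new_images = [
        img_filename
        for img_filename in sorted(os.listdir(directory))
        if img_filename not in known_keys
    ]
    return new_images


# Toy model of the "images" partition keys: just a set of strings.
images_keys = set()

with tempfile.TemporaryDirectory() as my_directory:
    # Create two empty files to play the role of images.
    for name in ("a.png", "b.png"):
        open(os.path.join(my_directory, name), "w").close()

    first_pass = keys_from_directory(my_directory, images_keys)
    images_keys.update(first_pass)

    # Nothing new the second time, because both names are already stored.
    second_pass = keys_from_directory(my_directory, images_keys)

# The container accepts a string from any other source just as readily.
images_keys.add("not-a-file-at-all")
```

The point of the last line is that the choice to use filenames as keys lives entirely in the code that fills the container, which in the real example is the sensor.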