# ask-community

Jordan

04/26/2023, 5:56 PM
Hi! Is there any particular reason why the `context.instance.add_dynamic_partitions` method can add multiple keys, while the `context.instance.delete_dynamic_partition` method deletes a single key? Maybe a `context.instance.delete_dynamic_partitions` method could be added?

claire

04/26/2023, 6:12 PM
Hi Jordan, yep, we've talked about this too and I agree it would be good to add. Curious: where are you adding/deleting partitions from? Is it within a sensor or an asset/op?
If it's within a sensor, you could do something like:
```python
return SensorResult(
    dynamic_partitions_requests=[
        dynamic_partitions_def.build_delete_request(list_of_partition_keys)
    ]
)
```
which accepts a list of partition keys.

Jordan

04/27/2023, 8:28 PM
Thanks! Yes, I use these functions in an asset because I build my partitions by querying a DB (the query sometimes takes a long time). This DB changes very little, so I want to refresh the partitions every 24 hours, while keeping the option to update them manually from an asset for more flexibility. Besides updating the partitions, I'd like the asset to take advantage of the DB call to produce a resource I could use in any asset (even non-partitioned assets, for example). Currently I am using a CSV file as an intermediary. Do you see another way to do this that makes better use of Dagster concepts? I have tested Pythonic resources, but it seems a DB call is made for each run. My current solution with a CSV file:
```python
import pandas as pd
from dagster import asset


@asset
def synchronyze(context):
    df = get_df_with_query()

    # Update partitioning with df
    ...

    df.to_csv(path)


@asset
def other_asset(context):
    ...
    df = pd.read_csv(path)
    ...
```

claire

04/28/2023, 3:06 PM
Hey Jordan, unfortunately resources are constructed once per process. This means they are constructed in each asset/op step (assuming you are using the multiprocess executor), so a resource can't be initialized once and used globally. Another option would be to yield the dataframe as an output of `synchronyze`, and have `other_asset` be downstream of `synchronyze`. This would allow all downstream assets to load the latest output of `synchronyze` as an input instead of re-querying the database.

sandy

04/28/2023, 10:18 PM
@Jordan - looking at your code example, neither of those assets is dynamically partitioned. Is there a third asset that's dynamically partitioned? What's the relationship between the asset that you're calling `add_dynamic_partitions` from and the asset that's dynamically partitioned?