Thibault
01/26/2023, 4:03 PM
yuhan
01/27/2023, 5:40 PM
sandy
01/31/2023, 12:26 AM
Thibault
01/31/2023, 8:18 AM
> Generally, if you can model your use case using N partitions or N assets, we'd recommend N partitions.
I'm a bit surprised by that, could you please elaborate? I thought that Dagster was moving towards software-defined assets, using partitions only for scheduling.
sandy
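01/31/2023, 11:09 PM
Why have a separate partition per meter? Is it that, when you launch a run, you want to be able to target a particular meter? Might it make sense to bucket a set of meters together? E.g. have a partition that includes all the meters whose meter_id % 100 == 0 and so forth?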
> I'm a bit surprised by that, could you please elaborate? I thought that Dagster was moving towards software-defined assets, using partitions only for scheduling.
When I say N partitions, I mean a software-defined asset with N partitions (or a set of software-defined assets that all share the same N partitions).
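The bucketing idea (`meter_id % 100`) can be sketched in plain Python as follows. This is only an illustration, not Dagster code: it assumes integer meter IDs, and the names `NUM_BUCKETS`, `bucket_key` and `group_meters` are hypothetical.

```python
from collections import defaultdict

# Assumption: 100 buckets, matching the "meter_id % 100" example in the thread.
NUM_BUCKETS = 100


def bucket_key(meter_id: int, num_buckets: int = NUM_BUCKETS) -> str:
    """Return the partition key of the bucket a meter falls into."""
    return f"bucket_{meter_id % num_buckets:02d}"


def group_meters(meter_ids, num_buckets: int = NUM_BUCKETS) -> dict:
    """Group meter IDs by bucket key, preserving input order within a bucket."""
    buckets = defaultdict(list)
    for meter_id in meter_ids:
        buckets[bucket_key(meter_id, num_buckets)].append(meter_id)
    return dict(buckets)
```

With ~100k meters this would give roughly 1,000 meters per partition, which is why a single unreliable meter in a bucket matters so much for run failures.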
Thibault
02/01/2023, 10:14 AM
> ~100k is around the upper limit of what we support. We'd eventually like to support more than this, but I can't make promises about how soon this will be a good experience.
Yes of course, I understand. That's why I'm trying to find a way to reduce the number of partitions we are working with 🙂
> Why have a separate partition per meter? Is it that, when you launch a run, you want to be able to target a particular meter? Might it make sense to bucket a set of meters together? E.g. have a partition that includes all the meters whose meter_id % 100 == 0 and so forth?
Although we could group some of them together, each meter typically belongs to a different functional scope, so a run failing for one meter shouldn't impact runs for other meters. That's why each meter has its own partition for every day. Besides, the API we're using for this is known to be rather unreliable at times for specific meters, so we can take for granted that it will fail for some meters once in a while. Bucketing a set of meters together would most likely often result in run failures caused by a single meter in the bucket. Can you think of a way to avoid leaking one run failure onto other runs when bucketing a set of meters together, while also allowing visualization, retries, etc.?
> When I say N partitions, I mean a software-defined asset with N partitions (or a set of software-defined assets that all share the same N partitions)
Ok, I think I get it. Ideally I would like to have one asset per meter over a partitioned schedule. But I haven't found a way to define assets dynamically.
sandy
02/08/2023, 1:53 AM
> Can you think of a way to avoid leaking one run failure onto other runs when bucketing a set of meters together, while also allowing visualization, retries, etc.?
Alas, there is not. Based on what you've described, having a partition per meter sounds like the best option.
> Ok, I think I get it. Ideally I would like to have one asset per meter over a partitioned schedule. But I haven't found a way to define assets dynamically
In this week's release or next week's release, we're going to introduce dynamically partitioned assets. You'll be able to add (and remove) partitions dynamically and then launch runs that target those partitions.
Thibault
02/09/2023, 8:41 AM
sandy
02/10/2023, 12:57 AM
Thibault
03/09/2023, 1:32 PM