Chris Nogradi

03/04/2022, 12:48 AM
Is using a resource the best way to maintain state between each invocation of an op within a job?

prha

03/04/2022, 1:01 AM
Can you describe a little more about your use case? The answer might depend on the resource and on the executor. For example, the multiprocess executor executes each op in its own process, so any given resource might be instantiated for every invocation of an op.

Chris Nogradi

03/04/2022, 1:13 AM
Ah ok. This would be an op that looks for a pattern in the data over a period longer than the time span covered by the dataframe it operates on, so it would need to retain state between invocations.

prha

03/04/2022, 2:21 AM
In that case, the resource would need to have access to some persistent store, like a DB or a filesystem. Alternatively, you could use AssetMaterialization events to stash the metadata you are aggregating.
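
A rough sketch of the persistent-store option, in plain Python rather than Dagster-specific code. `FileStateStore`, `detect_pattern`, the JSON file layout, and the "strictly increasing window" check are all hypothetical names and placeholders, not anything from the Dagster API. The idea is only that each invocation loads the previous state from disk, updates it, and writes it back, so it works even when every op run gets a fresh process and a fresh resource instance.

```python
import json
import os
import tempfile


class FileStateStore:
    """Hypothetical helper: keeps a small JSON-serializable state dict on disk."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # Return the last saved state, or an empty dict on the first run.
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def save(self, state):
        # Write to a temp file and rename it, so a crash cannot leave a
        # half-written state file behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, self.path)


def detect_pattern(store, new_values, window=5):
    """Hypothetical op body: carry a rolling window of values across invocations."""
    state = store.load()
    history = (state.get("history", []) + list(new_values))[-window:]
    store.save({"history": history})
    # Placeholder "pattern": the retained window is full and strictly increasing.
    return len(history) == window and all(a < b for a, b in zip(history, history[1:]))


# Two separate invocations, each building its own store object, the way a
# per-process resource would be rebuilt under the multiprocess executor.
state_path = os.path.join(tempfile.mkdtemp(), "pattern_state.json")
first = detect_pattern(FileStateStore(state_path), [1, 2, 3])
second = detect_pattern(FileStateStore(state_path), [4, 5])
```

In a real job, something like `FileStateStore` would presumably be returned from a function decorated with Dagster's `@resource`, with the path pointing at storage that every op process can reach. A database table would follow the same load/update/save shape.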

Chris Nogradi

03/04/2022, 3:43 PM
@prha thanks for the info! I am guessing that a resource with persistent store access is better than AssetMaterialization, since I can't find an API for reading the latter back.