Mark Fickett03/30/2022, 2:58 PM
Is there a way to compute a value as a `@resource` once per job run and then re-use it? When I do something like this:
```python
@resource
def some_singleton_for_this_job(init_context):
    return uuid.uuid4().hex
```
I seem to be getting a new value wherever the resource is used, whereas I would like one value shared among all `@op`s. And I don't think simple memoization would work (especially on Kubernetes). I could produce the value from one `@op` and pass it into others, but that would add a lot of extra parameter passing. My actual use case is getting a trace context ID to use for all instrumentation in one job, so I can link together spans/events/etc.
rex03/30/2022, 3:12 PM
Mark Fickett03/30/2022, 3:13 PM
alex03/30/2022, 3:16 PM
> compute a value as a @resource once per job run and then re-use it?
It depends on which executor you are using. For the in-process executor, the resource will only be initialized once per run and shared in memory. For the multiprocess executor, you will need to use the filesystem or another scheme to share the value between processes. For something like the k8s/celery executors, you will need to share the value via a database or service that all of the participating machines can communicate with.
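(One storage-free sketch for the distributed case, which is my own suggestion rather than from the thread: since every process in a run knows the run ID, derive the shared value deterministically from it. The function name is illustrative, and the idea of calling it with `init_context.run_id` inside the `@resource` is an assumption I have not verified against the Dagster API.)

```python
import uuid

def trace_id_for_run(run_id: str) -> str:
    """Derive a stable trace ID from a Dagster run ID.

    Deterministic: every process that knows the run ID computes the
    same 32-character hex value, so nothing needs to be shared at
    runtime. Inside a @resource this could be called with
    init_context.run_id (assuming that attribute is populated).
    """
    # uuid5 is a deterministic, namespaced hash-based UUID.
    return uuid.uuid5(uuid.NAMESPACE_OID, run_id).hex
```

Calling `trace_id_for_run` with the same run ID from any process yields the same value, which fits the trace-context use case without a database.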
Mark Fickett03/30/2022, 3:18 PM
and passing it as a parameter may be simplest. Dagster's I/O manager is easier to use than setting up my own storage (:
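(For completeness, the "set up my own storage" route alex described for the multiprocess executor could look roughly like this. It is an illustrative sketch: the function name and path scheme are my own, and it assumes all processes share one filesystem, which does not hold for the k8s/celery executors.)

```python
import os
import tempfile
import uuid

def get_or_create_run_value(run_id: str,
                            scratch_dir: str = tempfile.gettempdir()) -> str:
    # First process to get here creates the value; all others read it back.
    path = os.path.join(scratch_dir, f"dagster-run-value-{run_id}")
    if not os.path.exists(path):
        # Write to a private temp file, then hard-link it into place.
        # os.link fails if the destination already exists, so exactly
        # one writer wins the race; losers fall through and read.
        tmp = f"{path}.{os.getpid()}.tmp"
        with open(tmp, "w") as f:
            f.write(uuid.uuid4().hex)
        try:
            os.link(tmp, path)
        except FileExistsError:
            pass  # another process won; use its value
        finally:
            os.remove(tmp)
    with open(path) as f:
        return f.read()
```

Every process in the run that calls this with the same run ID gets the same value, but as noted above, passing the value between `@op`s via outputs/inputs avoids maintaining this kind of scratch file.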