Hello! We have a pipeline that runs every hour. This pipeline needs to call an API, and, to do that, it needs to retrieve an access token from our auth provider.
The access token is valid for 24 hours. So far we retrieve a new token on every run, but it would be great to fetch the token once a day and reuse it while it's still valid.
Is there any way to store the token using one of Dagster's features? Or maybe the best solution is to deploy a "shared" Redis and have the job retrieve the token from Redis, making the request only if it's not in Redis yet?
03/26/2021, 11:13 AM
my dagster experience is very limited, but have you considered using an IO manager? i guess you could store the expected expiry time with the token, then lazily re-generate it when needed, or proactively regenerate it with a schedule
03/26/2021, 11:11 PM
I would personally try to interact with the API via a resource. I know a resource is pipeline-scoped, but if you combine it with something like lru_cache or cachetools, you can implement it so it reuses the same token for 24 hours.
One important caveat, though: whenever you use the non-default executors (e.g. the Celery or k8s executor), solids or pipelines are executed in separate processes (and even in separate containers). In that case in-process memoization won't do much good, and you will need to maintain a centralized copy of the access token (e.g. in a DB or a shared file). Not sure it's worth the hassle in that case 🤔