Hello, I have two questions related to the dagster...
# ask-community
g
Hello, I have two questions related to the dagster-k8s retry code https://github.com/dagster-io/dagster/blob/754e4aabb2fe5ac1969d46591ff7e5ff0d9a7b43/python_modules/libraries/dagster-k8s/dagster_k8s/client.py#L81:
- It seems the choice was made not to retry all k8s API calls. Is there a specific reason for this? It seems strange to me, since any of these calls can run into network issues.
- Did the devs consider using a retry library such as tenacity, or expanding the current retry functionality? Currently you can only retry at a fixed interval, and there is no way to, for example, increase the wait interval with each retry attempt.
d
Hi Gerben - are you using this client class yourself directly or running into issues because of a dagster feature that uses the client? Only asking because the points here are very reasonable but I don't think that class is currently considered part of the 'public' dagster API or intended for general usage
g
We are using our own implementation for running pods/jobs in k8s atm, so I was taking a look at the Dagster implementation to see how you handle retries in case of network issues. We usually implement something like tenacity retries with wait_exponential to space out the retry attempts, but for Dagster we were thinking of reusing the k8s_api_retry provided by the dagster-k8s lib. Is your recommendation to not use the k8s_api_retry functionality and to implement our own retry for k8s API calls? If k8s_api_retry is not meant for public use, as you mentioned, does that mean this function might no longer exist in future versions?
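To make the exponential-wait idea concrete, here is a minimal sketch written with only the standard library rather than tenacity. Everything in it is an assumption for illustration: `retry_with_backoff`, `TransientApiError`, and `read_pod_status` are hypothetical names, and none of this is dagster-k8s code or its `k8s_api_retry` behavior.

```python
import functools
import time


class TransientApiError(Exception):
    """Hypothetical stand-in for a retryable failure such as a network error."""


def retry_with_backoff(max_attempts=4, base_wait=1.0, multiplier=2.0,
                       max_wait=30.0, retry_on=(TransientApiError,),
                       sleep=time.sleep):
    """Retry the wrapped call, waiting longer after each failed attempt."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Wait base_wait * multiplier^(attempt - 1), capped at max_wait.
                    wait = min(base_wait * multiplier ** (attempt - 1), max_wait)
                    sleep(wait)
        return wrapper
    return decorator


# Demo with a fake sleep that records the waits instead of blocking.
waits = []
calls = {"n": 0}


@retry_with_backoff(max_attempts=4, base_wait=1.0, sleep=waits.append)
def read_pod_status():
    """Hypothetical API call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientApiError("connection reset")
    return "Running"


result = read_pod_status()
```

With tenacity, the equivalent would be its `retry` decorator combined with `wait_exponential` and `stop_after_attempt`. Passing `sleep` as a parameter is only there so the backoff schedule can be checked without actually waiting.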
d
That's right - if it's not directly exported by the top-level dagster-k8s module, it's not considered part of the public API and could potentially change in future versions
g
Ok, thanks for the quick responses! We are planning on switching to dagster-k8s in the future, so I was still wondering if you might have answers to the questions I raised earlier?
d
I think those both sound like good improvements - is that something you’d be able to file a GitHub issue for?
👍 1
g