# announcements
n
It seems like with the new auto-reload, once a gRPC connection is marked as Failed, it will never reconnect on its own?
s
It tries to reconnect on its own in the background before marking it as failed, but once it’s exhausted its reconnect attempts it marks the location as failed
n
That sounds like a very bad thing
Why does it not keep retrying?
s
The thinking here was that if we’ve exhausted reconnect attempts, then it’s not an intermittent failure and something bad happened on the gRPC server, so we just surface the underlying error. Is there a failure case that you’re seeing where automatic reconnects would help?
n
Yes, once a fix is deployed it should reconnect automatically.
Requiring I go and click a button is kind of the opposite of the goal of the original automatic reload feature 🙂
Should I file a ticket about this? This is a major regression in 0.10 AFAICT. It means that any deployment failure on a gRPC daemon requires manual intervention to resolve.
d
It's a reasonable request but I don't follow how it's a regression - we didn't have any auto-reconnect functionality before
m
I presume we can just back off to a reasonable cadence of reconnect attempts?
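(A minimal sketch of the "back off to a reasonable cadence" idea, not dagit's actual reconnect logic. `backoff_delays`, `reconnect_forever`, and `try_connect` are hypothetical names, and the initial delay, growth factor, and cap are assumed values.)

```python
import time
from typing import Callable, Iterator


def backoff_delays(
    initial: float = 1.0, factor: float = 2.0, cap: float = 60.0
) -> Iterator[float]:
    """Yield delays that grow geometrically, then stay at `cap` forever."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect_forever(
    try_connect: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
    **backoff_kwargs: float,
) -> int:
    """Call `try_connect` until it returns True; return the attempt count.

    `try_connect` is a stand-in for whatever health check or reconnect call
    the client uses (an assumption, not a real dagit API). The loop never
    gives up; it only slows down to the capped cadence.
    """
    attempts = 0
    for delay in backoff_delays(**backoff_kwargs):
        attempts += 1
        if try_connect():
            return attempts
        sleep(delay)
    raise AssertionError("unreachable: backoff_delays never terminates")
```

With this shape the server can still be surfaced as failed in the UI after the first few attempts, while the loop keeps polling at the capped interval so a redeployed fix gets picked up without a manual reload.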
n
@daniel Before it would never need to since it wouldn't notice anything was down?
And yeah, whatever it was doing before to pick up changes, just keep polling?
d
Ah, got it. We're actually holding off on the server monitoring feature for the release later today (I assume you're working off of master, it should be disabled now if you rebase). But this is good feedback that we can incorporate when it goes out in 0.10.1
n
👍
I'm still running my fork for that alter_sys_path change that I need to clean up
On the plus side, the auto-refresh did mostly work, though it confused the fancy definition DAG view a bit 😄
d
I think what would have happened before if the server updated under the hood (and you didn't refresh in dagit) is that dagit and what's actually on the server would fall out of sync. Which in many cases would be fine (but in some cases would break in confusing ways until you refreshed). Nearly certain dagit wasn't doing any polling. Your point about it showing an error and requiring intervention where it didn't always before is totally valid though
n
Yep, I mean for cases where the definition didn't change (i.e. just a code change in a solid) so before no reload would have been needed.
If one of those introduced a syntax error before, fix it and redeploy, no harm done
Now that would require manual poking
s
This makes sense, this is great feedback. I don’t see any downsides to this