# ask-community
v
Hi there! Quick Dagit+Postgres question: we are hosting the Dagster UI in an environment that drops idle outbound connections after 30 minutes. Dagit, however, instantiates its DB connections in `optimize_for_dagit` with a pool size of 1, keeping a single connection open (and hoping it stays alive). If the UI is not accessed for a while, the connection to the DB is lost and the page appears to hang until refreshed. Is there a way to force Dagit to connect to the DB with different pool parameters (e.g. `pool_pre_ping` or `pool_recycle`) to try and mitigate this?
a
> the page will appear to hang until refreshed

can you be more precise? Do all web requests hang until the server is restarted, or does a specific web page appear hung until it is refreshed?
v
It depends on the page. If accessing a specific run, for instance, the logs fail to load and the bottom section spins forever.
I was able to work around the issue by writing a small wrapper around dagit’s `main` and monkey-patching the `create_engine` function to inject a `pool_recycle` argument set to 25 minutes.
r
I believe I have a similar issue with timeouts
Whenever I connect to the Dagit UI in the morning I see connection errors. I have increased the memory and CPU for all components (database and Dagit), but that hasn't helped. I have a feeling (but don't know how to test it) that the disconnect happens because the database connection is dropped after 4 hours (I believe the Postgres default), but Dagit doesn't realise this. It would be nice if we could help Dagit by putting timeouts on its side of the connection too, so that Dagit first tries to re-establish a DB connection before it sends a query.
It seems the timeouts on my side were caused by Istio: it encrypts all connections between pods in my cluster, and its default timeout for all TCP traffic is 15 seconds, which is what was causing all the timeouts. I can't pass the timeout settings to the Postgres connection (with pool) because that is not yet possible with Dagster. Alternatives I'm now pursuing are: (1) changing the `net.ipv4.tcp_keepalive_time` setting in sysctl.conf in the Dagit pod's Dockerfile; (2) changing the settings on the dedicated Dagster Postgres database pod.
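Option (1) amounts to shortening the kernel's TCP keepalive timers so idle database connections are probed (and kept alive through the proxy) before they are dropped. A sketch of the sysctl.conf fragment involved; the values are illustrative, not recommendations:

```
# Illustrative keepalive tuning for the Dagit pod:
net.ipv4.tcp_keepalive_time = 600    # send first probe after 10 min idle
net.ipv4.tcp_keepalive_intvl = 60    # then probe every 60 s
net.ipv4.tcp_keepalive_probes = 5    # drop the connection after 5 failed probes
```

Note these are node-level Linux settings; in Kubernetes they typically need to be applied via a privileged init container or the pod's `securityContext.sysctls`, since a plain Dockerfile change may not take effect at runtime.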
v
It is possible to work around the issue client side if you don't mind hacking around a bit.
I wrote a small Python executable that imports dagit, patches the `create_engine` function to inject the `pool_recycle` parameter, then calls dagit’s `main`. Dirty, but it does the trick.
r
could you send me an example?
v
Sorry, that would be in violation of the OSS policy of my company.
r
alright, that is understandable