# ask-community
v
Hi there! Quick Dagit+Postgres question: we are hosting the Dagster UI in an environment that drops idle outbound connections after 30 minutes. Dagit, however, instantiates its DB connections in `optimize_for_dagit` with a pool size of 1, keeping a single connection open (and hoping it stays alive). If the UI is not accessed for a while, the connection to the DB is lost and the page appears to hang until refreshed. Is there a way to force Dagit to connect to the DB with different pool parameters (e.g. `pool_pre_ping` or `pool_recycle`) to try and mitigate this?
a
> the page will appear to hang until refreshed

can you be more precise? Do all web requests hang until the server is restarted, or does a specific web page appear hung until it is refreshed?
v
It depends on the page. If accessing a specific run, for instance, the logs fail to load and the bottom section spins forever.
I was able to work around the issue by writing a small wrapper around dagit’s `main` and monkey-patching the `create_engine` function to inject a `pool_recycle` argument set to 25 minutes.
r
I believe I have a similar issue with timeouts
Whenever I connect to the Dagit UI in the morning I see connection errors. I have increased the memory and CPU for all components (database and Dagit), but that hasn't helped. I have a feeling (but don't know how to test it) that the disconnect happens because the database connection is dropped after 4 hours (I believe the Postgres default), but Dagit doesn't realise this. It would be nice if we could help Dagit by putting timeouts on its side of the connection too, so that Dagit first tries to re-establish a DB connection before it sends a query.
It seems the timeouts on my side were caused by Istio: it encrypts all connections between pods in my cluster, and its default timeout for all TCP traffic is 15 seconds, which is what was causing all the timeouts. I can't pass the timeout settings to the Postgres connection (with pool) because that is not yet possible with Dagster. Alternatives I'm now pursuing are: (1) changing the `net.ipv4.tcp_keepalive_time` setting in sysctl.conf in the Dagit pod's Dockerfile; (2) changing the settings on the dedicated Dagster Postgres database pod.
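Option (1) amounts to shortening the kernel's TCP keepalive timers so idle database connections are probed (and kept alive through the proxy) before they are dropped. A sketch of the sysctl.conf fragment involved; the values are illustrative, not recommendations:

```
# Illustrative keepalive tuning for the Dagit pod:
net.ipv4.tcp_keepalive_time = 600    # send first probe after 10 min idle
net.ipv4.tcp_keepalive_intvl = 60    # then probe every 60 s
net.ipv4.tcp_keepalive_probes = 5    # drop the connection after 5 failed probes
```

Note these are node-level Linux settings; in Kubernetes they typically need to be applied via a privileged init container or the pod's `securityContext.sysctls`, since a plain Dockerfile change may not take effect at runtime.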
v
It is possible to work around the issue client side if you don't mind hacking around a bit.
I wrote a small Python executable that imports dagit, patches the `create_engine` function to inject the `pool_recycle` parameter, then calls dagit’s `main`. Dirty, but it does the trick.
r
could you send me an example?
v
Sorry, that would be in violation of the OSS policy of my company.
r
alright, that is understandable