Jean-Pierre M
05/20/2021, 1:32 PM
daniel
05/20/2021, 2:13 PM
Jean-Pierre M
05/20/2021, 2:23 PM
daniel
05/20/2021, 2:25 PM
Jean-Pierre M
05/20/2021, 2:26 PM
alex
05/20/2021, 2:31 PM
`dagit` problems is web requests - are there active users of the tool during this time? The background activity of the webserver when idle shouldn’t change as a function of activity (unless we are doing something we don’t mean to)
Jean-Pierre M
05/20/2021, 2:37 PM
alex
05/20/2021, 2:42 PM
"dagit pods restart themselves on K8S but continue to be unresponsive and never recover"
Can you be more precise when you say unresponsive? Do the static resources load and the data never shows up? Do you just get nothing? Does the web request time out?
Jean-Pierre M
05/20/2021, 3:58 PM
daniel
05/20/2021, 4:09 PM
Jean-Pierre M
05/20/2021, 5:43 PM
`describe pod` for the dagit pod and the logs (note that the logs were empty, but using the `--previous` flag in K8S gave me something)
alex
05/20/2021, 5:50 PM
Jean-Pierre M
05/20/2021, 6:02 PM
`kubectl top pod` and I noticed it reaches cpu>4000 and memory>4000 as it gets through the runs
alex
05/20/2021, 6:06 PM
`queuedRunCoordinator` to limit the max simultaneous runs?
Jean-Pierre M
05/20/2021, 6:10 PM
alex
05/20/2021, 6:12 PM
Jean-Pierre M
05/20/2021, 6:18 PM
alex
05/20/2021, 6:19 PM
Jean-Pierre M
05/20/2021, 6:20 PM
alex
05/20/2021, 6:25 PM
`StatefulSet` is involved. It can probably take over more resources on the `Node`, but past that, moving to a `Node` with more resources takes time. So an upfront request of more resources might get you onto a `Node` that can handle it. I guess it depends on what's available in your `NodePool`.
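The point about upfront requests can be illustrated with a toy model. This is only a rough sketch, not how the Kubernetes scheduler is actually implemented: `Node`, `fits`, `pick_node`, the pool, and all numbers (including the assumed units of millicores and MiB) are hypothetical. It shows the idea that a Pod lands on a Node whose free capacity covers its requests, so a larger request rules out small Nodes from the start.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node: capacity still free for new Pods (millicores / MiB assumed).
    name: str
    free_cpu_m: int
    free_mem_mi: int


def fits(node: Node, cpu_request_m: int, mem_request_mi: int) -> bool:
    """True if the node's free capacity covers both of the Pod's requests."""
    return node.free_cpu_m >= cpu_request_m and node.free_mem_mi >= mem_request_mi


def pick_node(nodes: List[Node], cpu_request_m: int, mem_request_mi: int) -> Optional[Node]:
    """Return the first node that can hold the requests, or None if none can."""
    for node in nodes:
        if fits(node, cpu_request_m, mem_request_mi):
            return node
    return None


# Illustrative pool; the numbers are made up.
pool = [Node("small", 2000, 4096), Node("large", 8000, 16384)]

# A small request can land on the small node, where a later spike may not fit.
low = pick_node(pool, 500, 512)
# Requesting the observed peak upfront skips nodes that cannot handle it.
high = pick_node(pool, 4000, 4000)
print(low.name, high.name)
```

With these made-up numbers, the low request is placed on the small node while the peak-sized request goes straight to the large one.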
Jean-Pierre M
05/20/2021, 6:26 PM