[16:47:51] I can't connect to yarn
[17:28:17] groceryheist: I think it happened again :(
[17:29:13] 2018-10-28 11:26:45,938 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 11085ms for sessionid 0x50664009c3b1017d, closing so
[17:29:17] cket connection and attempting reconnect
[17:29:20] 2018-10-28 11:26:45,939 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8072ms
[17:29:23] No GCs detected
[17:29:34] yeah same thing
[17:30:21] !log restart yarn resource manager on an-master1002 to force failover to an-master1001
[17:30:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:31:11] groceryheist: now it works, thanks a lot for the ping
[17:32:49] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: JVM pauses cause Yarn master to failover - https://phabricator.wikimedia.org/T206943 (elukey) Happened again today: ``` 2018-10-28 11:26:45,938 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server i...
[18:00:56] I can't find a metric that would justify 8s of hanging
[18:01:03] even the kernel metrics looks fine
[18:02:10] but it must be surely related to the new hardware
[18:03:55] elukey: thanks!
[18:24:29] Will investigate more tomorro
[18:24:31] *tomorro
[18:24:35] anyhow :)
[20:01:04] Quarry, Google-Code-in-2018: Show execution time in the Recent Queries page - https://phabricator.wikimedia.org/T71264 (Framawiki)
[20:04:31] Quarry, Google-Code-in-2018: Show execution time in the Recent Queries page - https://phabricator.wikimedia.org/T71264 (Framawiki)
[20:05:38] Quarry, Google-Code-in-2018: Show execution time in the Recent Queries page - https://phabricator.wikimedia.org/T71264 (Framawiki) Imported as https://codein.withgoogle.com/tasks/4789965222313984/. I'll mentor this task, with the help of @zhuyifei1999 .
[22:41:37] (CR) Nuria: [C: 1] "I think this a major improvement from original patch." (2 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/468964 (https://phabricator.wikimedia.org/T206968) (owner: Fdans)
[23:22:50] (PS1) John Erling Blad: Linted README.md [analytics/wikistats2] - https://gerrit.wikimedia.org/r/470294
[23:38:12] (CR) Nuria: [V: 2 C: 2] "Thanks for doing changes." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/470294 (owner: John Erling Blad)
[23:44:59] (Merged) jenkins-bot: Linted README.md [analytics/wikistats2] - https://gerrit.wikimedia.org/r/470294 (owner: John Erling Blad)