[05:36:29] something about analytics1034 isn't happy ... running a spark job, all executors finish in <10s, except 1034, on which all jobs take 1.3 to 1.7 minutes. Nothing seems particularly out of whack on the dashboards though. data count/record size is very consistent, I don't have any particular explanation, but it's odd that with a couple hundred executors only 1034 is slow
[05:39:08] ebernhardson: thanks for the heads up! we had an issue with 1034 and its eth0 negotiated speed flapping, it might be that it needs to be replaced
[05:42:41] ethtool eth0 on 1034: Speed: 100Mb/s
[05:42:48] that should be 1Gb/s
[05:44:04] later on during the day I'll drain the node, put it in maintenance so it will not be picked up by spark
[05:52:20] elukey: thanks!
[10:58:38] Analytics, Operations, ops-eqiad: Analytics1034 eth0 negotiated speed to 100Mb/s instead of 1000Mb/s - https://phabricator.wikimedia.org/T172633#3504238 (elukey)
[10:58:49] there you go --^
[11:02:59] !log stop yarn on analytics1034 to reload the tg3 driver - T172633
[11:03:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:03:01] T172633: Analytics1034 eth0 negotiated speed to 100Mb/s instead of 1000Mb/s - https://phabricator.wikimedia.org/T172633
[12:52:34] ebernhardson: o/ - I stopped yarn on an1034 but you still have some spark containers running, so you'll likely keep seeing delays
[12:52:37] :)
[17:31:22] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3504656 (fdans)
[17:39:08] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3504659 (fdans)
[18:17:05] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3504669 (fdans)
[18:24:34] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3504670 (fdans)
[18:25:18] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3447985 (fdans)
[18:27:59] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (3/4) - Data issues - https://phabricator.wikimedia.org/T170937#3448129 (fdans) This bug is solved by D730
[18:30:06] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (3/4) - Data issues - https://phabricator.wikimedia.org/T170937#3504688 (fdans) a:fdans
[23:40:17] Analytics, Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#3504949 (Ottomata) > analytics/statsv is configured contrary to your recommendation of not needing a consumer group if auto commit is disabled...
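
Note on the 05:42 ethtool check: a minimal Python sketch of how that negotiated-speed test could be automated, assuming the "Speed: <N>Mb/s" line format that ethtool printed for eth0 on 1034. The function names and the 1000 Mb/s default are illustrative assumptions, not part of any tooling mentioned in the log.

```python
import re
import subprocess
from typing import Optional

# Matches a line such as "Speed: 100Mb/s", the form quoted in the log.
_SPEED_RE = re.compile(r"^\s*Speed:\s*(\d+)Mb/s\s*$", re.MULTILINE)


def parse_speed_mbps(ethtool_output: str) -> Optional[int]:
    """Return the negotiated speed in Mb/s, or None if no numeric speed is reported."""
    match = _SPEED_RE.search(ethtool_output)
    return int(match.group(1)) if match else None


def is_degraded(ethtool_output: str, expected_mbps: int = 1000) -> bool:
    """True when the link negotiated below the expected speed (hypothetical helper).

    A missing or non-numeric speed is treated as degraded, on the assumption
    that an unreadable link state deserves a look as well.
    """
    speed = parse_speed_mbps(ethtool_output)
    return speed is None or speed < expected_mbps


def check_interface(iface: str = "eth0", expected_mbps: int = 1000) -> bool:
    """Run `ethtool <iface>` on the local host and report whether the link is degraded."""
    result = subprocess.run(
        ["ethtool", iface], capture_output=True, text=True, check=True
    )
    return is_degraded(result.stdout, expected_mbps)
```

Run on each worker, check_interface("eth0") would return True for a host in the state 1034 was in (100Mb/s negotiated where 1Gb/s is expected).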