[07:10:38] Analytics, Analytics-Kanban, Analytics-Wikistats: Improvements to Wikistats2 chart popups - https://phabricator.wikimedia.org/T192416 (sahil505)
[07:44:22] Analytics, Analytics-Kanban, Analytics-Wikistats: Improvements to Wikistats2 chart popups - https://phabricator.wikimedia.org/T192416 (sahil505)
[08:06:47] (PS1) Sahil505: Hide overlay when the cursor is no longer on top of graph [analytics/wikistats2] - https://gerrit.wikimedia.org/r/448852 (https://phabricator.wikimedia.org/T192416)
[08:08:01] (CR) Sahil505: [C: -1] "WIP" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/448852 (https://phabricator.wikimedia.org/T192416) (owner: Sahil505)
[14:58:20] Analytics, Analytics-Cluster, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Can't write from Spark to local FS - https://phabricator.wikimedia.org/T200609 (GoranSMilovanovic)
[16:04:57] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[16:47:07] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.657 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[17:06:28] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[17:29:36] !log restart eventlogging on eventlog1002 after tons of disconnects (still not clear what happened)
[17:29:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:30:37] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.812 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[17:36:49] really weird
[17:37:24] Jul 28 16:18:09 eventlog1002 eventlogging-processor@client-side-00[11972]: %3|1532794689.071|ERROR|eventlogging-4ff3cbb0-6d7f-11e8-ac94-1418775b0d42-eventlog1002.eqiad.wmnet.11972#producer-1| 6/6 brokers are down
[17:37:33] for some reason this was seen by EL
[17:42:13] and it keeps crashing
[17:42:15] lovely
[18:02:37] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[18:17:28] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 3.187 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[18:36:57] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[19:19:17] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.203 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[19:39:38] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[19:45:48] PROBLEM - EventLogging overall insertion rate from MySQL consumer on graphite1001 is CRITICAL: CRITICAL: 20.00% of data under the critical threshold [10.0] https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=12&fullscreen&orgId=1
[19:59:26] (I have people at dinner, trying to solve this but I'll be able to work on it only later on )
[20:13:48] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[20:19:38] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.194 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[20:41:17] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[20:52:57] RECOVERY - EventLogging overall insertion rate from MySQL consumer on graphite1001 is OK: OK: Less than 20.00% under the threshold [50.0] https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=12&fullscreen&orgId=1
[21:23:27] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.627 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[21:47:37] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[22:26:27] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.267 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[22:51:28] PROBLEM - Throughput of EventLogging NavigationTiming events on einsteinium is CRITICAL: 0 le 0 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[23:27:58] RECOVERY - Throughput of EventLogging NavigationTiming events on einsteinium is OK: (C)0 le (W)1 le 1.378 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[23:38:58] PROBLEM - Check status of defined EventLogging jobs on eventlog1002 is CRITICAL: CRITICAL: Stopped EventLogging jobs: eventlogging-processor@client-side-10 eventlogging-processor@client-side-09 eventlogging-processor@client-side-08 eventlogging-processor@client-side-07 eventlogging-processor@client-side-06 eventlogging-processor@client-side-05 eventlogging-processor@client-side-04 eventlogging-processor@client-side-03 eventloggi
[23:38:58] t-side-02 eventlogging-processor@client-side-01 eventlogging-processor@client-side-00