[00:00:06] urandom try sshing now?
[00:00:08] RECOVERY - Host xenon is UP: PING OK - Packet loss = 0%, RTA = 1.10 ms
[00:00:27] RECOVERY - MegaRAID on xenon is OK: OK: no disks configured for RAID
[00:00:37] YuviPanda: yeah, but it's the same thing as before
[00:00:38] RECOVERY - dhclient process on xenon is OK: PROCS OK: 0 processes with command name dhclient
[00:00:38] RECOVERY - salt-minion processes on xenon is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[00:00:46] acpi_pad is using a lot of cpu
[00:00:58] RECOVERY - DPKG on xenon is OK: All packages OK
[00:01:05] YuviPanda: do you think it's OK if i try unloading that kernel module?
[00:01:07] RECOVERY - Disk space on xenon is OK: DISK OK
[00:01:15] YuviPanda: https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5098951
[00:02:07] I'm going to ask forgiveness rather than permission on this one
[00:02:58] cool; it worked
[00:02:59] urandom yup
[00:03:19] urandom and open a ticket?
[00:03:27] YuviPanda: will do.
[00:03:35] thanks
[00:03:40] YuviPanda: thank you
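A minimal sketch of how one might confirm whether the acpi_pad module mentioned above is still loaded. It assumes the standard Linux /proc/modules layout, where the module name is the first whitespace-separated field on each line. The function names and the SAMPLE text are hypothetical illustrations, not anything taken from this log.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first field of each line is the module name.
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """True if the named module appears in the given /proc/modules text."""
    return name in loaded_modules(proc_modules_text)


# Hypothetical sample in the /proc/modules layout:
# name, size, reference count, dependents, state, load address.
SAMPLE = (
    "acpi_pad 20480 0 - Live 0xffffffffa0000000\n"
    "ipmi_si 57344 0 - Live 0xffffffffa0010000\n"
)
```

On a real host the text would come from reading /proc/modules. The log does not say which command was used to unload the module; `rmmod acpi_pad` or `modprobe -r acpi_pad` are the usual candidates.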
[00:10:18] RECOVERY - cassandra-a service on xenon is OK: OK - cassandra-a is active
[00:11:17] RECOVERY - Restbase root url on xenon is OK: HTTP OK: HTTP/1.1 200 - 15273 bytes in 1.297 second response time
[00:13:18] RECOVERY - restbase endpoints health on xenon is OK: All endpoints are healthy
[00:14:40] urandom: just got back. i see it recovered. that bug talks about "if HT is disabled" so i checked if it is on xenon. enabled there though
[00:14:54] yeah
[00:15:56] mutante: it's weird
[00:15:58] Operations, Cassandra: xenon.eqiad.wmnet: very high cpu utilization - https://phabricator.wikimedia.org/T141675#2507314 (Eevans)
[00:16:02] Operations, Cassandra: xenon.eqiad.wmnet: very high cpu utilization - https://phabricator.wikimedia.org/T141675#2507327 (Eevans) p:Triage>High
[00:16:09] RECOVERY - puppet last run on xenon is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[00:16:11] mutante, YuviPanda: ^^^
[00:16:39] should be good enough going into the weekend though
[00:16:51] thanks urandom
[00:17:01] yea, thanks, ack
[00:17:03] it's not production, but i don't want it creating pager fatigue
[00:17:21] :) es
[00:17:23] yes
[00:19:38] RECOVERY - cassandra-a CQL 10.64.0.202:9042 on xenon is OK: TCP OK - 0.004 second response time on port 9042
[00:23:53] Operations, Cassandra: xenon.eqiad.wmnet: very high cpu utilization - https://phabricator.wikimedia.org/T141675#2507314 (Dzahn) yea, HT is enabled on xenon.. it seems to start here, when RT throttling gets activated 2680 Jul 29 22:23:40 xenon kernel: [10997327.180547] sched: RT throttling activated 268...
[00:32:08] PROBLEM - puppet last run on mw1177 is CRITICAL: CRITICAL: Puppet has 1 failures
[00:56:37] Operations, Cassandra: xenon.eqiad.wmnet: very high cpu utilization - https://phabricator.wikimedia.org/T141675#2507368 (Eevans)
[00:57:47] RECOVERY - puppet last run on mw1177 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[02:20:18] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.12) (duration: 08m 10s)
[02:20:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:25:55] !log l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 30 02:25:55 UTC 2016 (duration 5m 37s)
[02:26:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:37:48] PROBLEM - puppet last run on dbstore2001 is CRITICAL: CRITICAL: puppet fail
[02:47:38] RECOVERY - cassandra-c CQL 10.192.48.56:9042 on restbase2009 is OK: TCP OK - 0.036 second response time on port 9042
[02:48:58] LALAALLLAALALLA
[02:49:38] CHAU
[03:03:48] RECOVERY - puppet last run on dbstore2001 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[05:44:37] PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 222, down: 1, dormant: 0, excluded: 0, unused: 0; xe-4/2/0: down - Core: cr1-codfw:xe-5/2/1 (Telia, IC-307235, 34ms) {#2648} [10Gbps wave]
[06:29:57] PROBLEM - puppet last run on pc1006 is CRITICAL: CRITICAL: puppet fail
[06:31:08] PROBLEM - puppet last run on mw2228 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:09] PROBLEM - puppet last run on wtp2008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:18] PROBLEM - puppet last run on ms-be2022 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:31:18] PROBLEM - puppet last run on ms-be2026 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:28] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:08] PROBLEM - puppet last run on restbase2006 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:58] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:36:58] PROBLEM - puppet last run on analytics1042 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:28] RECOVERY - puppet last run on wtp2008 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[06:56:29] RECOVERY - puppet last run on ms-be2022 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:56:29] RECOVERY - puppet last run on ms-be2026 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[06:56:37] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:56:48] RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 224, down: 0, dormant: 0, excluded: 0, unused: 0
[06:57:08] RECOVERY - puppet last run on pc1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:18] RECOVERY - puppet last run on restbase2006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:08] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:19] RECOVERY - puppet last run on mw2228 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:02:27] RECOVERY - puppet last run on analytics1042 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:15:27] PROBLEM - Juniper alarms on asw-d-eqiad.mgmt.eqiad.wmnet is CRITICAL: JNX_ALARMS CRITICAL - No response from remote host 10.65.0.24
[07:17:09] RECOVERY - Juniper alarms on asw-d-eqiad.mgmt.eqiad.wmnet is OK: JNX_ALARMS OK - 0 red alarms, 0 yellow alarms
[08:52:49] RECOVERY - cassandra-c CQL 10.64.48.137:9042 on restbase1014 is OK: TCP OK - 0.006 second response time on port 9042
[09:05:49] Operations: reinstall snapshot1001.eqiad.wmnet with RAID, decomm snapshot1002,3,4 - https://phabricator.wikimedia.org/T140439#2507574 (ArielGlenn)
[09:10:47] Operations, Dumps-Generation, HHVM, Patch-For-Review: Convert snapshot hosts to use HHVM and trusty - https://phabricator.wikimedia.org/T94277#2507578 (ArielGlenn)
[09:14:53] Operations, Datasets-General-or-Unknown: reinstall snapshot1001.eqiad.wmnet with RAID, decomm snapshot1002,3,4 - https://phabricator.wikimedia.org/T140439#2507594 (ArielGlenn)
[12:49:39] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: Puppet has 1 failures
[12:53:59] (CR) Nemo bis: [C: 1] "Certainly ok as first step." [mediawiki-config] - https://gerrit.wikimedia.org/r/301893 (https://phabricator.wikimedia.org/T131340) (owner: Jforrester)
[13:00:58] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:08:58] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:08:58] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:12:48] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:15:27] RECOVERY - puppet last run on cp3010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[13:22:38] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:24:38] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:30:28] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:38:37] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:40:28] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:42:27] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:44:27] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:44:27] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:46:18] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:50:17] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[13:54:09] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[13:55:30] PROBLEM - puppet last run on es2012 is CRITICAL: CRITICAL: puppet fail
[13:56:07] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:02:09] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:04:17] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:13:59] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:19:48] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:19:57] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:21:08] RECOVERY - puppet last run on es2012 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[14:21:48] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:27:35] (PS4) MarcoAurelio: Expanding throttle limits for enwiki Edit-a-thon [mediawiki-config] - https://gerrit.wikimedia.org/r/301761 (https://phabricator.wikimedia.org/T141421)
[14:27:39] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:27:39] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:27:40] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:28:18] (CR) MarcoAurelio: [C: -1] "Per task." [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550) (owner: Kharkiv07)
[14:29:37] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:31:48] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:31:57] PROBLEM - HP RAID on ms-be1025 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:33:57] RECOVERY - HP RAID on ms-be1025 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:35:47] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[14:37:17] PROBLEM - dhclient process on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:37:27] PROBLEM - configured eth on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:37:47] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[14:37:47] PROBLEM - Disk space on Hadoop worker on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:37:50] PROBLEM - Check size of conntrack table on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:37:50] PROBLEM - salt-minion processes on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:09] PROBLEM - puppet last run on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:09] PROBLEM - Hadoop DataNode on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:12] PROBLEM - Disk space on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:37] PROBLEM - YARN NodeManager Node-State on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:41] PROBLEM - DPKG on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:48] PROBLEM - MegaRAID on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:38:58] PROBLEM - Hadoop NodeManager on analytics1045 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:40:27] RECOVERY - YARN NodeManager Node-State on analytics1045 is OK: OK: YARN NodeManager analytics1045.eqiad.wmnet:8041 Node-State: RUNNING
[14:40:31] RECOVERY - DPKG on analytics1045 is OK: All packages OK
[14:40:48] RECOVERY - MegaRAID on analytics1045 is OK: OK: optimal, 13 logical, 14 physical
[14:40:48] RECOVERY - Hadoop NodeManager on analytics1045 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[14:41:09] PROBLEM - puppet last run on mw2224 is CRITICAL: CRITICAL: puppet fail
[14:41:17] RECOVERY - dhclient process on analytics1045 is OK: PROCS OK: 0 processes with command name dhclient
[14:41:19] RECOVERY - configured eth on analytics1045 is OK: OK - interfaces up
[14:41:49] RECOVERY - Disk space on Hadoop worker on analytics1045 is OK: DISK OK
[14:41:51] RECOVERY - Check size of conntrack table on analytics1045 is OK: OK: nf_conntrack is 0 % full
[14:41:51] RECOVERY - salt-minion processes on analytics1045 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[14:41:58] RECOVERY - puppet last run on analytics1045 is OK: OK: Puppet is currently enabled, last run 13 minutes ago with 0 failures
[14:41:58] RECOVERY - Hadoop DataNode on analytics1045 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.hdfs.server.datanode.DataNode
[14:42:08] RECOVERY - Disk space on analytics1045 is OK: DISK OK
[14:45:14] (CR) MarcoAurelio: [C: 1] "Looks good to me. Needs to be rebased though." [mediawiki-config] - https://gerrit.wikimedia.org/r/301807 (https://phabricator.wikimedia.org/T140566) (owner: Dereckson)
[14:55:37] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:01:38] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:06:17] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479
[15:07:29] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:08:09] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 4933024 keys - replication_delay is 0
[15:08:47] RECOVERY - puppet last run on mw2224 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[15:09:28] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:09:37] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:13:27] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:13:33] (PS1) BBlack: openssl (1.0.2h-1~wmf3) jessie-wikimedia; urgency=medium [debs/openssl] - https://gerrit.wikimedia.org/r/301920 (https://phabricator.wikimedia.org/T131908)
[15:15:19] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:17:18] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:23:08] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:25:07] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:31:17] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:35:07] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:35:08] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:41:08] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:42:58] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:42:58] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:44:57] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:47:56] (CR) Muehlenhoff: [C: 1] "Looks good to me!" [debs/openssl] - https://gerrit.wikimedia.org/r/301903 (owner: BBlack)
[15:50:48] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[15:52:47] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[15:56:48] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:02:47] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:02:48] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:04:48] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:16:37] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:20:29] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:26:27] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:30:27] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:32:37] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:34:13] (PS2) BBlack: openssl (1.0.2h-1~wmf3) jessie-wikimedia; urgency=medium [debs/openssl] - https://gerrit.wikimedia.org/r/301920 (https://phabricator.wikimedia.org/T131908)
[16:34:28] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:44:18] PROBLEM - HP RAID on ms-be1025 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:46:17] RECOVERY - HP RAID on ms-be1025 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:46:18] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:47:06] Operations, MediaWiki-Cache, Traffic: Possible increase in logged-out users being served cached outdated revisions - https://phabricator.wikimedia.org/T141693#2508537 (Glaisher)
[16:47:19] Operations, MediaWiki-Cache, Traffic: Possible increase in logged-out users being served cached outdated revisions - https://phabricator.wikimedia.org/T141693#2508549 (Glaisher) p:Triage>High
[16:49:27] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508550 (Aklapper)
[16:49:42] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2507607 (Aklapper)
[16:49:45] Operations, MediaWiki-Cache, Traffic: Possible increase in logged-out users being served cached outdated revisions - https://phabricator.wikimedia.org/T141693#2508558 (Aklapper)
[16:52:50] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508560 (Aklapper) Quoting Glaisher from T141693: > Multiple reports at enwiki and OTRS. > * https://en.wikipedia.org/wiki/Wikipedia:Help_desk#Why_No_Text_In_Article....
[16:53:22] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508563 (Aklapper) (Wondering if belated syncing of ProofRead status on Wikisource reported in T141692 might be related to caching issues.)
[16:54:49] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508566 (Boshomi)
[16:56:32] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508567 (Glaisher) >>! In T141687#2508563, @Aklapper wrote: > (Wondering if belated syncing of ProofRead status on Wikisource reported in T141692 might be related to...
[16:59:58] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[16:59:58] PROBLEM - HP RAID on ms-be1025 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[16:59:58] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:02:08] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:06:07] RECOVERY - HP RAID on ms-be1025 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:11:00] Operations, Phabricator, Project-Admins, Triagers: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706#2508576 (Danny_B)
[17:11:59] Operations, Phabricator, Project-Admins, Triagers: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706#1722432 (Danny_B)
[17:15:58] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:30:59] Operations, WMF-NDA-Requests: Please add me to #WMF-NDA - https://phabricator.wikimedia.org/T94238#2508587 (Danny_B)
[17:31:48] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:31:48] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:31:48] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:33:48] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:35:49] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:39:48] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:41:47] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:45:38] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:53:28] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:55:27] PROBLEM - HP RAID on ms-be1024 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[17:57:18] RECOVERY - HP RAID on ms-be1024 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[17:59:18] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:04:48] PROBLEM - puppet last run on mw1246 is CRITICAL: CRITICAL: Puppet has 1 failures
[18:07:27] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:09:18] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:13:17] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:15:17] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:20:58] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:24:58] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:26:57] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:26:58] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:28:57] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:32:18] RECOVERY - puppet last run on mw1246 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:34:59] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:36:12] (Abandoned) Tpt: Deploy the Kartographer extension to meta [mediawiki-config] - https://gerrit.wikimedia.org/r/298042 (https://phabricator.wikimedia.org/T139787) (owner: Tpt)
[18:36:53] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508701 (Boshomi) T141695 is also the same
[18:38:57] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:38:58] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:40:06] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2508709 (Boshomi)
[18:41:45] Operations, MediaWiki-Cache, Traffic: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2507607 (Boshomi) in T141695 @Gestrid wrote: >I work in the English Wikipedia's Teahouse (a place for new users to ask questions and get answers), where we have recen...
[18:42:49] PROBLEM - HP RAID on ms-be1025 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[18:44:47] RECOVERY - HP RAID on ms-be1025 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[18:46:38] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[19:06:28] PROBLEM - HP RAID on ms-be1023 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[19:10:28] PROBLEM - HP RAID on ms-be1026 is CRITICAL: CHECK_NRPE: Socket timeout after 40 seconds.
[19:12:18] RECOVERY - HP RAID on ms-be1023 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[19:16:21] RECOVERY - HP RAID on ms-be1026 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor
[19:18:17]