[02:23:04] netops, Operations, fundraising-tech-ops, Patch-For-Review: Move codfw frack to new infra - https://phabricator.wikimedia.org/T171970#3526886 (ayounsi) some answers from Juniper about the other issues noticed: - Presence of core dumps ``` /var/crash/corefiles: total blocks: 70484 -rw-r--r-- 1 r...
[10:30:44] netops, Operations, monitoring, Patch-For-Review, User-fgiunchedi: Evaluate LibreNMS' Graphite backend - https://phabricator.wikimedia.org/T171167#3527271 (fgiunchedi) Indeed it looks like librenms sends both metrics with whitespace in the name and metrics without values: ``` librenms.asw-b-...
[11:02:48] netops, Operations, monitoring, Patch-For-Review, User-fgiunchedi: Evaluate LibreNMS' Graphite backend - https://phabricator.wikimedia.org/T171167#3527329 (fgiunchedi) Resolved>Open Reported upstream at https://github.com/librenms/librenms/issues/7167 and https://github.com/librenms/l...
[12:31:19] bblack: cp3036 went down at around 12:10 (last entries in syslog), when icinga alerted I checked the mgmt (which didn't show any output at all), so I depooled it and powercycled the server
[12:32:36] this brought it back up, but there's nothing kernel errors logged
[12:33:25] I'm leaving it depooled for now
[12:39:45] not really sure what other hardware debug options we have without a local dc-ops
[12:40:00] or we simply repool and see whether it reoccurs
[15:28:01] Traffic, Operations, Page-Previews: [Spike] Investigate the increase in the number of requests to Swift after the Page Previews deploy - https://phabricator.wikimedia.org/T173422#3527818 (phuedx)
[15:28:26] Traffic, Operations, Page-Previews: [Spike] Investigate the increase in the number of requests to Swift after the Page Previews deploy - https://phabricator.wikimedia.org/T173422#3527831 (phuedx)
[15:35:16] Traffic, Operations, Page-Previews: Investigate the increase in the number of requests to Swift after the Page Previews deploy - https://phabricator.wikimedia.org/T173422#3527879 (phuedx)
[15:55:10] Traffic, Operations, Page-Previews, Readers-Web-Backlog (Tracking): Investigate the increase in the number of requests to Swift after the Page Previews deploy - https://phabricator.wikimedia.org/T173422#3527965 (Jdlrobson)
[17:12:46] Traffic, Operations, Page-Previews, Readers-Web-Backlog (Tracking): Investigate the increase in the number of requests to Swift after the Page Previews deploy - https://phabricator.wikimedia.org/T173422#3528243 (fgiunchedi) So the increase in swift requests seem to be cyclic (daily) and correspon...
[17:38:52] netops, Operations, ops-eqiad: eqiad: rack frack refresh equipment - https://phabricator.wikimedia.org/T169644#3528372 (Cmjohnson) this has been slow progress...During the initial racking, all the screws were tightened too tight and now have to be drilled off.
[19:49:06] Traffic, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Operations, and 15 others: Purge Varnish cache when a banner is saved - https://phabricator.wikimedia.org/T154954#3528888 (DStrine) @Pcoombe can you verify if this is working?
[20:06:12] Traffic, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Operations, and 15 others: Purge Varnish cache when a banner is saved - https://phabricator.wikimedia.org/T154954#3528988 (Pcoombe) Open>Resolved Yes, and life is much easier. Thanks!
[20:21:29] Traffic, Operations: Degraded RAID on cp1008 - https://phabricator.wikimedia.org/T171028#3529010 (Cmjohnson) a:ema @ema I replaced the ssd and reinstalled. All yours! resolve once you confirmed everything is okay