[07:43:20] Traffic, Operations: cp4024 kernel errors - https://phabricator.wikimedia.org/T174891#3576445 (ema) p:Triage>Normal
[07:46:27] Traffic, Operations: cp4024 kernel errors - https://phabricator.wikimedia.org/T174891#3576064 (ema) Thanks @elukey! Yeah cp4024 might be having hardware issues. The system was down yesterday at 9ish AM UTC. I've power-cycled it and it came back online fine, but then after some hours it started with the l...
[11:53:23] gehel: around? Time to merge https://gerrit.wikimedia.org/r/#/c/375354/?
[11:54:46] ema: give me 5' to get a coffee and I'm all yours
[11:55:08] nice
[11:59:38] ema: I'm all yours!
[12:01:45] gehel: OK, merging
[12:05:50] gehel: I've forced a puppet run on cp4005 and maps requests are being served fine, running puppet elsewhere now
[12:06:05] ok
[12:13:56] gehel: all good as far as I can tell, let me know if you notice anything strange!
[12:14:20] ema: nothing strange yet... I'll keep an eye on it and let you know
[12:14:22] thanks!
[12:16:25] gehel: I'm now looking for rate limited requests, this is the command FTR:
[12:16:26] netops, Operations, ops-esams: Setup esams atlas anchor - https://phabricator.wikimedia.org/T174637#3568685 (faidon) @mark assigned asset tag `WMF4203` to this device. The image has also been generated (for AS43821) and can be found on install1002.
[12:16:28] varnishncsa -n frontend -q 'ReqHeader eq "Host: maps.wikimedia.org" and RespStatus eq 429'
[12:16:52] none so far in esams
[12:20:57] looking good! (not that I expected anything else :)
[12:22:39] Traffic, Discovery, Discovery-Analysis, Maps, and 3 others: What is a reasonable per-IP ratelimit for maps - https://phabricator.wikimedia.org/T169175#3389649 (Gehel) Rate limiting has been enabled by @ema. Everything is looking good so far. This task can be closed and we'll open up follow up tas...
[13:03:13] Traffic, Operations: Recurrent 'mailbox lag' critical alerts and 500s - https://phabricator.wikimedia.org/T174932#3577349 (fgiunchedi)
[13:26:19] LMK what you think of ^, I noticed mailbox lag-related restarts got worse, maybe just lately? probably once a day or so
[13:36:12] Traffic, DBA, Operations: Substantive HTTP and mediawiki/database traffic coming from a single ip - https://phabricator.wikimedia.org/T166695#3577497 (Marostegui) stalled>Resolved Closing this as this has not happened again in months. If it happens again, let's reopen and follow up
[15:15:30] netops, Operations, monitoring, User-fgiunchedi: Grafana dashboards for librenms graphite data - https://phabricator.wikimedia.org/T171823#3577956 (fgiunchedi) I checked librenms' readings for current/voltage and indeed match what's being pushed to graphite. The default aggregation in graphite we...
[16:33:50] Traffic, Operations, Wikimedia-Logstash: Varnish does not vary elasticsearch query by request body - https://phabricator.wikimedia.org/T174960#3578104 (mobrovac)