[07:29:38] Traffic, Cloud-VPS, Operations, Toolforge, Patch-For-Review: Wikimedia varnish rules no longer exempt all Cloud VPS/Toolforge IPs from rate limits (HTTP 429 response) - https://phabricator.wikimedia.org/T213475 (akosiaris) I 've added the capacity to varnish puppet code to augment the wikimed...
[09:24:25] Traffic, Operations, Proton, Reading-Infrastructure-Team-Backlog, and 3 others: Document and possibly fine-tune how Proton interacts with Varnish - https://phabricator.wikimedia.org/T213371 (akosiaris) >>! In T213371#4932956, @pmiazga wrote: > @Tgr I assume you're still waiting for answers from @...
[09:37:09] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (fgiunchedi) >>! In T211661#4931840, @ori wrote: >>>! In T211661#4931056, @fgiunchedi wrote: >> And indeed I share the concerns already mentioned, na...
[10:15:00] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (Gilles) I thought we would start with a very low percentage and ramp it up gradually. And yes, I thought our beloved swift proxy is where it would l...
[10:16:39] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (jijiki) Is it possible to hold this a bit for until after we upgrade all Thumbor servers to stretch? Two birds with one stone :)
[10:23:37] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (Gilles) I'd argue that we don't want both changes to happen around the same time. And this is probably less prone to emergency bugfixes than the Str...
[11:02:24] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (jijiki) >>! In T211661#4934183, @Gilles wrote: > I'd argue that we don't want both changes to happen around the same time. And this is probably less...
[12:03:53] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (jijiki) p:Triage→Normal
[12:04:09] Traffic, Operations: cp nodes still try to OCSP staple the already expired digicert-2017 certificate - https://phabricator.wikimedia.org/T215103 (Vgutierrez) Open→Resolved a:Vgutierrez After merging the change, the following commands have been issued over cumin: ` rm -f /etc/update-ocsp.d/dig...
[12:10:11] HTTPS, Traffic, Operations: en.wikipedia.com [sic] serves an invalid certificate - https://phabricator.wikimedia.org/T214253 (Joe) p:Triage→Low
[12:37:48] Traffic, MobileFrontend, Operations, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (Joe) p:Triage→Normal
[12:48:47] Domains, HTTPS, Traffic, DNS, Operations: Merge Wikipedia subdomains into one, to discourage censorship - https://phabricator.wikimedia.org/T215071 (jijiki) p:Triage→Normal
[12:50:54] Traffic, MobileFrontend, Operations, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (Krenair) (People interested in merging subdomains may also be interested in {T215071} which is about mergi...
[13:02:47] Traffic, DNS, Operations, fundraising-tech-ops: remove IBM/Silverpop 1024-bit domain key - https://phabricator.wikimedia.org/T214525 (jijiki) p:Triage→Normal
[13:10:44] Domains, HTTPS, Traffic, DNS, Operations: Merge Wikipedia subdomains into one, to discourage censorship - https://phabricator.wikimedia.org/T215071 (BBlack) The linked ESNI ticket is kind of a random user question ticket, and not actually one created for working on it (which still off in the...
[13:41:25] Domains, HTTPS, Traffic, DNS, Operations: Merge Wikipedia subdomains into one, to discourage censorship - https://phabricator.wikimedia.org/T215071 (BBlack) p:Normal→Low Expounding on the lamentations above in a more realistic triage sort of sense: * It's a very complex project which...
[15:49:27] https://blog.powerdns.com/2019/02/07/the-big-dns-privacy-debate-at-fosdem/
[15:57:21] damn! where were we while that was happening?
[15:57:23] :_(
[15:59:43] fetching the stickers!
[16:06:54] Certcentral: Rename the Certcentral project - https://phabricator.wikimedia.org/T207389 (BBlack) Sounds good to me!
[16:07:48] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (ori) I don't understand the preference for sampling Swift requests rather than Varnish requests. You'd have greater resilience to overload (for the...
[16:10:53] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (Vgutierrez) since @ayounsi is going to eqsin datacenter later this month maybe we could join efforts and replace sdb. ^^ @RobH
[20:33:25] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (RobH) Ok, I opened a support request with dell to ship a replacement SSD to eqsin: Confirmed: Request 986142470 was successfully submitted.
[20:38:00] Traffic, Operations, Performance-Team, media-storage: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (Gilles) The "initial ramp up" might not ever be done, if we reach a point where the writes and deletes introduced are creating too much overhead, we...
[21:05:20] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (RobH) Oh, just the output from troubleshooting on the system. The system should show TWO SSDs and only sees one now: ` robh@cp5010:~$ cat /proc/mdstat Personalities : [raid1] [linear] [multipath]...
[21:12:48] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (Vgutierrez) that's right, the kernel shutdown sdb due to the errors, that's why is not even listed on lshw
[21:19:42] Traffic, Operations, ops-eqsin: Degraded RAID on cp5010 - https://phabricator.wikimedia.org/T214274 (Vgutierrez) here is the log line: `Jan 21 01:39:21 cp5010 kernel: [7472184.163052] sd 1:0:0:0: [sdb] Stopping disk`