[02:50:57] Traffic, Operations, Wikimedia-Stream: stream.wikimedia.org - redirect http(s) to docs - https://phabricator.wikimedia.org/T70528#3381109 (Krinkle) Resolved>Open While the certificate issue has been resolved, and RCStream (at `stream.wikimedia.org/rc/`) is indeed being deprecated and replaced...
[03:00:09] HTTPS, Traffic, Commons, MediaWiki-File-management, and 3 others: ForeignAPIRepo wrongly returns non-protocol-relative URLs for original "thumbs" - https://phabricator.wikimedia.org/T50133#3381113 (Krinkle) stalled>Open
[04:11:52] HTTPS, Traffic, MediaWiki-General-or-Unknown, Operations: Make default interwiki map links protocol-relative - https://phabricator.wikimedia.org/T33327#3381133 (demon) Open>declined
[07:43:10] Traffic, Commons, Operations, Wikimedia-Site-requests, and 2 others: Allow anonymous users to change interface language on Commons with ULS - https://phabricator.wikimedia.org/T161517#3381283 (Nikerabbit) >>! In T161517#3380185, @BBlack wrote: > This task has gotten a bit confusing. I wrote a pr...
[10:21:24] Traffic, MobileFrontend, Operations, Reading-Web-Backlog, and 3 others: Remove disableImages handling from VCL - https://phabricator.wikimedia.org/T168013#3381692 (phuedx) @ema: Last week's train ran, so you should be able to merge and deploy {332d2ca8abac7393494b11a96e3b02834f209b4b} now.
[13:36:30] HTTPS, Traffic, Commons, MediaWiki-File-management, and 3 others: ForeignAPIRepo wrongly returns non-protocol-relative URLs for original "thumbs" - https://phabricator.wikimedia.org/T50133#534558 (Anomie) >>! In T50133#1614931, @Tgr wrote: > The relevant part is `ApiQueryImageInfo.php` [[ https:/...
[14:23:02] netops, Operations, fundraising-tech-ops: BGP session between pfw clusters flapping - https://phabricator.wikimedia.org/T164777#3382359 (ayounsi) Open>Invalid Going to replace the pfw soon, not worth investigating more, unless it's causing visible issues.
[14:24:24] Traffic, DNS, Operations, Services (watching): icinga alerts on nodejs services when a recdns server is depooled - https://phabricator.wikimedia.org/T162818#3382365 (GWicke)
[14:26:33] netops, Operations, monitoring: Setup flow monitoring of *internal* network traffic - https://phabricator.wikimedia.org/T79755#3382373 (ayounsi) Open>Resolved Alerts added to the dashboard (not tied to nagios, but shows up in the "single pane of glass" dashboard in LibreNMS. I think that sati...
[14:30:42] Traffic, Commons, Operations, Wikimedia-Site-requests, and 2 others: Allow anonymous users to change interface language on Commons with ULS - https://phabricator.wikimedia.org/T161517#3382391 (BBlack) Ok I think I was confused as to the state of the uselang hack. It looks like it already works,...
[14:46:27] netops, Operations: Filter outgoing BGP announcements on AS regex - https://phabricator.wikimedia.org/T83037#3382464 (ayounsi) Open>Resolved a:ayounsi I believe the previous comment fixes the initial request, other improvements (like community no-export or as-path will be investigated in larg...
[14:56:47] Traffic, Commons, Operations, Wikimedia-Site-requests, and 2 others: Allow anonymous users to change interface language on Commons with ULS - https://phabricator.wikimedia.org/T161517#3382537 (Nikerabbit) Thanks for the detailed reply which explains why it isn't so simple as I thought. I need to...
[15:30:02] Traffic, Operations, Phabricator, Patch-For-Review, User-fgiunchedi: phab.wmfusercontent.org "homepage" yields a 500 - https://phabricator.wikimedia.org/T166120#3382702 (mmodell) Open>Resolved a:fgiunchedi
[16:39:16] we've just reverted image scaling back to mw due to T168949 btw, I was looking into how to get a list of affected urls and then ban
[16:39:16] T168949: Proper thumbnails of portrait photos not being generated; serious display issues - https://phabricator.wikimedia.org/T168949
[16:45:42] bblack: re google API usage, is there a task with details?
[16:50:47] gwicke: what I was thinking of specifically in that case was a heavy volume of reqs coming from *.googleusercontent.com to the deprecated RCStream entrypoint, using plain HTTP instead of HTTPS (we'd like them to use HTTPS, preferably also switch to EventStreams)
[16:52:21] https://phabricator.wikimedia.org/T168919#3380934 is the only task ref I'm aware of
[16:52:30] (from me last night heh)
[16:56:00] I *think* with the rcstream case it may not really be a google project originating the traffic, more likely a 3rd party user of GCE computing resources?
[16:56:12] I really don't know
[17:20:37] Traffic, MobileFrontend, Operations, Reading-Web-Backlog, Patch-For-Review: Remove disableImages handling from VCL - https://phabricator.wikimedia.org/T168013#3383537 (Jdlrobson)
[17:21:06] Traffic, MobileFrontend, Operations, Reading-Web-Backlog, Patch-For-Review: Remove disableImages handling from VCL - https://phabricator.wikimedia.org/T168013#3352978 (Jdlrobson)
[17:28:23] bblack: yeah, that doesn't sound like anything official from google; they tend to be really good about setting proper UAs
[17:28:43] and definitely don't use googleusercontent.com
[17:35:49] bblack: I've collected a list of file names to give to purgeList, that's about 700k files (in my home on terbium). is there a recommended batch size for this type of thing?
[17:36:06] the relevant task is T168949
[17:36:06] T168949: Proper thumbnails of portrait photos not being generated; serious display issues - https://phabricator.wikimedia.org/T168949
[17:59:50] Traffic, Operations: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches - https://phabricator.wikimedia.org/T168919#3380816 (BBlack)
[18:01:18] Traffic, Operations: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches - https://phabricator.wikimedia.org/T168919#3380816 (BBlack) Answering my own timeline question, it looks like it was announced that RCStream goes away July 7th!
[18:02:41] godog: eep I don't know? :)
[18:02:44] thinking
[18:03:15] bblack: heh turns out purgeList isn't even what I was looking for :( thanks tho!
[18:03:24] I can probably do something with it
[18:03:27] it isn't going to purge files from swift, just issue htcp
[18:03:45] oh right, yeah I need to wait for swift before htcp
[18:03:59] but regardless, there might be more-efficient ways for us to do the purge on the varnish side than via purgeList
[18:04:47] ah! I'm assuming you'd prefer a list of urls in that case?
[18:05:14] maybe, or we might just roll through some cache wipes in such a large case. I don't know for sure. I'm peeking at your lists and thinking about things so far.
[18:14:26] thanks! I'm trying to figure out how to purge that list from swift
[18:14:52] note that the issue affects only jpegs with soft rotation, so the list needs further trimming
[18:16:21] I guess the initial ~700k list is all jpegs generated by thumbor in the right timeframe?
[18:17:33] yeah, filtering down for jpegs is ~500k but still
[18:18:01] "soft rotation" means some jpeg metadata that tells the rendering client to rotate for itself, or?
[18:18:45] yeah, in exif metadata there's the orientation degrees value
[18:18:57] ok
[18:21:25] godog: would filtering on a range of Last-Modified datestamps work for the varnish level stuff maybe?
I don't recall if those are actually set correctly on thumb generation
[18:22:46] or we can just shove the list through htcp, that's ok too :)
[18:23:14] I don't know if purgeList takes a custom multicast address or not
[18:23:43] bblack: hehe last-modified would be set for files in swift and it'd be correct yeah, though answers straight from thumbor wouldn't have l-m I believe
[18:23:47] 239.128.0.112 is the standard global multicast for HTCP that most MW things end up using, probably purgeList too
[18:24:09] but 239.128.0.113 only targets the upload cluster, avoiding the purge spike towards the text caches
[18:24:52] I'd guess it'd use the general one yeah, purgeList doesn't seem to take a multicast address no
[18:26:26] so on the general address (.112), our "normal" rate these days averages about 4800 purge/sec
[18:27:33] so if you, say, managed to limit this new purgeList run to ~10% of that at ~480/sec, it would take about 18 minutes to run through them at that speed
[18:28:17] I imagine purgeList has its own perf issues, it may not even be able to run that fast. But I guess you could call it with batches and check the speed it runs at, possibly adding some inter-batch delays if necc
[18:28:40] (the ~18 mins was based on ~500K URLs)
[18:29:50] nice, yeah that's faster than I imagined actually!
there would be some fan out too due to mw asking swift for the list of thumbs and purge those individually
[18:30:13] so ok anything around 40-50/s sounds definitely ok
[18:34:16] also yeah I suspect purgeList for files isn't going to be that fast, including the roundtrip to swift
[18:43:57] right, ok
[18:45:51] anyways gilles has fixed the bug and we've turned thumbor back on
[22:17:02] Traffic, Operations, Patch-For-Review: Unprovision cache_misc @ ulsfo - https://phabricator.wikimedia.org/T164610#3239748 (ops-monitoring-bot) Script wmf_auto_reimage was launched by bblack on neodymium.eqiad.wmnet for hosts: ``` ['cp4001.ulsfo.wmnet', 'cp4002.ulsfo.wmnet', 'cp4003.ulsfo.wmnet', 'cp4...
[23:04:38] Traffic, Operations, Patch-For-Review: Unprovision cache_misc @ ulsfo - https://phabricator.wikimedia.org/T164610#3384666 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['cp4001.ulsfo.wmnet', 'cp4002.ulsfo.wmnet', 'cp4003.ulsfo.wmnet', 'cp4004.ulsfo.wmnet'] ``` Of which those **FAILED**:...
[23:06:05] Wikimedia-Apache-configuration, Wikimedia-Site-requests, Patch-For-Review, Wikidata-Sprint: Create a URL rewrite to handle the /data/ path for canonical URLs for machine readable page content - https://phabricator.wikimedia.org/T163922#3384669 (Ladsgroup) I cherry-picked the patch in beta cluster...
[23:22:41] Traffic, Operations, ops-ulsfo, Patch-For-Review: replace ulsfo aging servers - https://phabricator.wikimedia.org/T164327#3384704 (BBlack)
[23:22:43] Traffic, Operations, Patch-For-Review: Unprovision cache_misc @ ulsfo - https://phabricator.wikimedia.org/T164610#3384701 (BBlack) Open>Resolved a:BBlack I had to manually fix up salt keys and do final reboots on 4001+4003, all should be sane and consistent now (except for a couple of IPM...
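
Editorial note on the purge-rate estimate discussed between 18:26 and 18:34: the arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the chat (about 4800 purges/sec normally, a target of about 10% of that, and roughly 500K URLs). The function names and the batch size are hypothetical, and nothing here reflects how purgeList actually batches or sends HTCP.

```python
def purge_duration_minutes(url_count, rate_per_sec):
    """Minutes needed to send url_count purges at a steady rate."""
    return url_count / rate_per_sec / 60


def inter_batch_delay(batch_size, batch_runtime_sec, target_rate_per_sec):
    """Seconds to pause after a batch so the average rate stays at or
    below the target. Returns 0.0 if the batch was already slow enough."""
    min_batch_time = batch_size / target_rate_per_sec
    return max(0.0, min_batch_time - batch_runtime_sec)


def batches(items, size):
    """Yield consecutive slices of items, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    normal_rate = 4800              # purges/sec quoted for the general address
    target_rate = normal_rate / 10  # ~10% of normal, i.e. 480/sec
    urls = 500_000                  # approximate size of the jpeg-only list

    print(round(purge_duration_minutes(urls, target_rate), 1), "minutes at 480/sec")
    print(round(purge_duration_minutes(urls, 45) / 60, 1), "hours at 45/sec")
    # Hypothetical batch of 1000 URLs that took 5 s to send, capped at 50/sec:
    print(inter_batch_delay(1000, 5.0, 50), "seconds of delay before the next batch")
```

At 480/sec the run comes out a little over 17 minutes, in line with the "about 18 minutes" quoted above. At the 40-50/sec range mentioned afterwards, the same list would take on the order of three hours, before counting any fan-out from per-thumbnail purges.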