[00:00:57] Traffic, Operations: Content purges are unreliable - https://phabricator.wikimedia.org/T133821#2351728 (ori) >>! In T133821, @BBlack wrote: > Therefore, it's easy for a race condition to occur where an upper-layer cache gets purged of the item, then immediately gets a new request for the item, and then r...
[00:59:46] Traffic, Operations: Content purges are unreliable - https://phabricator.wikimedia.org/T133821#2351822 (BBlack) @ori - thanks for the links! It's good to know it's only one extra purge, I wasn't even sure of that. Technically, even if the delay is high enough, 2 purges can be insufficient to stop all r...
[01:49:15] Traffic, Mobile-Apps, Operations, Wikipedia-Android-App-Backlog: WikipediaApp for Android hits loads.php on bits.wikimedia.org - https://phabricator.wikimedia.org/T132969#2351878 (Mholloway)
[02:10:41] HTTPS, Traffic, Operations, Wiki-Loves-Monuments: configure https for www.wikilovesmonuments.org - https://phabricator.wikimedia.org/T118388#2351919 (Dzahn) You are right, it's fine with www, there it gets an A rating and looks fine to me https://www.ssllabs.com/ssltest/analyze.html?d=www.wikil...
[02:14:17] HTTPS, Traffic, Operations, Wiki-Loves-Monuments: configure https for www.wikilovesmonuments.org - https://phabricator.wikimedia.org/T118388#2351922 (Dzahn) >>! In T118388#2351919, @Dzahn wrote: > Ideally with and without www would both be on the cert as "SANs" and after looking closer i see t...
[05:25:13] HTTPS, Traffic, Operations, MW-1.27-release-notes, Patch-For-Review: Insecure POST traffic - https://phabricator.wikimedia.org/T105794#2352082 (Whatamidoing-WMF) My team [[https://de.wikipedia.org/wiki/Benutzer_Diskussion:Merlissimo#Heads-up:_your_bot_may_break_soon.21 |left a message on wiki...
[05:44:52] Traffic, Operations: Content purges are unreliable - https://phabricator.wikimedia.org/T133821#2352086 (ori) >>! In T133821#2245711, @BBlack wrote: > However, we reverted this because it seemed to make the race issues worse at the time. How did you know? Do we have a way of tracking how often we hit the...
[06:57:45] ema, bblack: the new nginx package fails to install on the to-be-decommed cp1*/cp3 hosts since the nginx config is now outdated and still references https://phabricator.wikimedia.org/P3208
[06:58:05] as workaround to resolve the puppet failures I would simply remove nginx from these systems
[07:00:01] but I think it would be best to go ahead and ditch these from site.pp/dns etc. and power them down, the actual unracking/wiping can still be done by the dc ops later on
[07:51:13] HTTPS, Traffic, Operations, MW-1.27-release-notes, Patch-For-Review: Insecure POST traffic - https://phabricator.wikimedia.org/T105794#2352261 (Qgil) @Steinsplitter is active in this task and he might have suggestions about next steps. CCing @Bmueller just in case she has other ideas.
[08:35:51] HTTPS, Traffic, Operations, MW-1.27-release-notes, Patch-For-Review: Insecure POST traffic - https://phabricator.wikimedia.org/T105794#2352372 (Bmueller) @Qgil, thanks for letting me know. @Andrew already emailed me last night. I'm going to talk to some people and we try to figure something out.
[08:52:58] moritzm: +1 for removing nginx from the spares as a temporary workaround
[08:56:03] ema: I did that for a few, but it's not really trivial, the postrm scripts of the new module packages are failing since they can't run the nginx binary
[08:56:35] and downgrading to the stock debian version won't work either since the nginx config is littered with options from post-jessie (like socketreuse etc.) :-)
[08:56:52] annoying
[09:02:45] moritzm: on cp1043 I've replaced nginx.conf with nginx.conf.dpkg-dist
[09:03:08] I guess we could get away with that
[09:03:46] oh, and rm /etc/nginx/sites-available/unified
[09:03:57] but why keep these actually running? with the outdated cfg etc they won't be usable as emergency replacements anyway
[09:04:33] yeah that's a good point
[09:04:53] let's wait for bblack, perhaps he had something in mind
[09:04:55] my proposal would be to perform the non-dc-ops tasks from https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Reclaim_or_Decommission
[09:05:18] and whenever someone is in the esams datacenter, the remaining unracking etc. can be done
[09:05:20] ema: ok
[10:30:53] Traffic, Wikimedia-Apache-configuration, DNS, Operations: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2352598 (MartinRulsch) >>! In T136557#2342785, @BBlack wrote: > Unclear from the description: Is...
[12:33:37] re: nginx, if the immediate problem is solved I'd leave them up. they still need proper disk wiping, whatever that procedure is (I assume we have one that works for SSD).
[12:40:22] disks will be wiped (or shredded in case of SSDs) when unracked, but if we shut them down (and drop from site.pp etc) we're avoiding further maintenance overhead on the OS level
[12:40:50] or are they still being used for tests or else?
[12:44:12] HTTPS, Traffic, Operations, MW-1.27-release-notes, Patch-For-Review: Insecure POST traffic - https://phabricator.wikimedia.org/T105794#2352863 (BBlack) Note T121279 is more-specific to the Merlbot/Java issues and has some recent traffic at the bottom too.
[12:45:24] moritzm: they're not being used, and no plans to re-use them as spare/replacement caches at this time, either.
[12:46:29] moritzm: we don't know their final fate technically, though. depending on warranty date and other needs, they may get re-used elsewhere. I *think* (not sure) the usual course now is to leave them up in some kind of spare/standard role and puppeting until someone makes the call on complete decom or reuse/spare?
[12:47:48] moritzm: also, I hope the disk wipe/shred isn't only when unracked. it should be on role-switch to any new usage at least for these nodes, too. We don't want the private keys that are on those SSDs subject to compromise by some less-secure application role they may land in next.
[12:48:29] moritzm: but since they're SSDs, commandline file-shred does basically nothing. we need whole-disk secure erase.
[12:49:39] https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase
[12:52:19] shred as in physically destroyed :-) it
[12:52:29] oh ok :)
[12:52:30] at least that's what I read in a previous ticket by Rob
[12:52:54] still, I wonder if our current policies cover the need to shred cp* disks on re-use in another role
[12:53:14] (afaik we haven't reused them in other roles before, but still)
[12:54:44] https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Reclaim_or_Decommission indicates the wiping is only done when decommissioning/unracking
[12:54:57] but might be out of date as other wikitech pages, not sure
[14:29:16] Traffic, Operations: Set up LVS connection sync - https://phabricator.wikimedia.org/T136944#2353061 (BBlack)
[14:29:35] Traffic, Operations: Set up LVS connection sync - https://phabricator.wikimedia.org/T136944#2353074 (BBlack)
[16:44:04] Traffic, Wikimedia-Apache-configuration, DNS, Operations: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2353461 (BBlack) I'm not so much asking about the total timeline, but about whether the intent i...
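A minimal sketch of the whole-disk secure erase mentioned at 12:48:29, assuming the hdparm-based sequence from the linked ATA Secure Erase page is what would be used. The function name `secure_erase_commands`, the placeholder device `/dev/sdX`, and the throwaway password `p` are illustrative and not from the log. The sketch only builds the command lines and never runs them, since the erase is destructive.

```python
from __future__ import annotations

import shlex


def secure_erase_commands(device: str, password: str = "p") -> list[list[str]]:
    """Build (but do not run) hdparm invocations for an ATA Secure Erase."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a /dev/... block device path, got {device!r}")
    return [
        # Inspect the drive's security state; it must not be "frozen".
        ["hdparm", "-I", device],
        # Set a temporary user password, which enables the security feature set.
        ["hdparm", "--user-master", "u", "--security-set-pass", password, device],
        # Ask the drive firmware to erase the whole disk.
        ["hdparm", "--user-master", "u", "--security-erase", password, device],
    ]


if __name__ == "__main__":
    # Dry run: print the commands for a hypothetical device for review.
    for cmd in secure_erase_commands("/dev/sdX"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; whether a given SSD's firmware actually honors the erase is outside what this shows.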
[19:02:45] HTTPS, Traffic, Operations, MW-1.27-release-notes, Patch-For-Review: Insecure POST traffic - https://phabricator.wikimedia.org/T105794#2354003 (BBlack) Graph from logstash of insecure req rate over the past 28 days in 12h increments: {F4108994}
[20:14:43] HTTPS, Traffic, Operations, Wiki-Loves-Monuments: configure https for www.wikilovesmonuments.org - https://phabricator.wikimedia.org/T118388#2354315 (Dzahn) strictly the ticket is resolved because it just says to configure it for "www". but would be nice if we can get this fixed too since the cer...
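A toy model of the purge race quoted from T133821 at 00:00:57. The quoted sentence is truncated in the log, so this assumes it continues with the upper layer refilling from a lower layer that has not yet processed the purge. The `Backend` and `Cache` classes, `run`, and the `/page` key are all hypothetical; nothing here reflects the real cache configuration.

```python
class Backend:
    """Stands in for the application layer holding the current content."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data[key]


class Cache:
    """A cache layer that fills misses from the layer below it."""

    def __init__(self, below):
        self.below = below
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.below.get(key)
        return self.store[key]

    def purge(self, key):
        self.store.pop(key, None)


def run(extra_upper_purge: bool) -> str:
    backend = Backend()
    backend.data["/page"] = "v1"
    lower = Cache(backend)
    upper = Cache(lower)
    upper.get("/page")            # warm both layers with v1
    backend.data["/page"] = "v2"  # content changes
    upper.purge("/page")          # purge reaches the upper layer first
    upper.get("/page")            # request arrives: upper refills from stale lower
    lower.purge("/page")          # purge reaches the lower layer too late
    if extra_upper_purge:
        upper.purge("/page")      # assumed delayed second purge of the upper layer
    return upper.get("/page")


if __name__ == "__main__":
    print(run(extra_upper_purge=False))
    print(run(extra_upper_purge=True))
```

In this single ordering a delayed extra purge clears the stale copy; it does not show that one extra purge is enough in general, which is the doubt raised at 00:59:46.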