[06:37:05] I am in the ganeti5002's mgmt console, nothing from racadm getsel but the tty is frozen
[06:39:08] but from gnt-instance list I don't see anything running on it
[06:40:16] powercycling
[06:48:16] of course it was expired downtime https://phabricator.wikimedia.org/T261130
[06:50:13] ok so from https://wikitech.wikimedia.org/wiki/Ganeti#Shutdown_a_node_for_a_prolonged_period_of_time it seems that I explicitly need to re-add it to join the cluster, so it is safe to shutdown now
[08:44:52] huh, that's weird. `wmf-auto-reimage` says "Still waiting for reboot after 20.0 minutes", but the machine came back up 14mins ago
[08:45:15] ohh. it didn't reboot into the installer. sigh.
[08:48:24] yeah I should add a check for that
[08:48:27] sometimes happens
[08:48:37] kormat: if you force a pxe reboot from the console
[08:48:46] the reimage will continue
[08:48:58] if you succeed before the timeout
[08:49:53] to make it reboot into d-i
[08:50:29] ahh. the motherboard was replaced, but the MAC wasn't updated in puppet
[08:50:41] :/
[08:50:49] * volans|off back off
[08:50:57] :)
[09:42:39] Back in 2005 or so, we had a bunch of dead HP/Compaq DL series servers that went dead after a power booboo (getting L1 and L2 instead of L1 and N...), so we had the MBs replaced. The HP tech went so far as to reprogram the new boards' MACs to what the old boards had, which was nice.
[12:42:07] nobody tell vo.lans, but the `sre.hosts.downtime` cookbook is _super_ useful
[12:43:25] you've pinged him three times with that sentence
[12:43:37] haha
[12:44:02] if you write "cookbook" on a public channel he gets an email too
[12:44:36] the third time someone says the c o o k b o o k word he'll get a marine radiogram over RTTY
[12:47:30] disgusting!And the fourth time there will be an ICBM?
[12:47:56] wtf?! where did that first word come from!?
[16:40:11] irc to nmea gateway
[16:50:06] I'm afraid to ask... what happens if you turn out the lights and say cookbook into a mirror 3 times
[16:52:26] please do *not* summon Mirror Universe Riccardo
[17:08:50] :D
[20:38:25] Looking for help on moving https://phabricator.wikimedia.org/T238285 forward, which is an issue with some traffic layer (Varnish? ATS?) implicitly doing a url rewrite that is in violation of an IETF spec and making some non-article content inaccessible to end users.
[20:38:44] specifically it appears to be corrupting a trialing semi-colon
[20:38:50] trailing*
[20:55:23] Krinkle: why didn't you try your repro on an appserver?
[20:55:53] oh, I see, and v.gutierrez did some similar testing at https://phabricator.wikimedia.org/T238285#5674573
[20:59:43] cdanis: aye, kind of, that's directly to ats individually I think?
[20:59:54] not all the way down to apache
[21:00:11] yeah
[21:00:26] sorry, I didn't read the full backlog and it's rather late on a Friday here
[22:30:34] Krinkle: the issue hasn't changed. there's an ATS "bug" here because ATS is ancient enough to care about ; as a delimiter and its parser is buggy with it
[22:30:49] I mean, it is a bug, it's just not a bug most people run into
[22:30:57] (people being ATS users)
[22:31:58] the bug was already pointed at in vg's old posts on that ticket
[22:33:23] [and most of the comments by all parties in that ticket make some pretty bold assertions in this context, re: standards and what is or isn't right or appropriate here, I think I could nitpick them all to death, but that's not really the point]
[22:33:56] but nothing with this is as simple as "traffic layer is violating the IETF spec, please stop!"
[22:36:04] I'm pretty sure a careful review of the IETF specs would find multiple places where they are self-contradictory :)
[22:36:43] well that too, for sure :)
[22:37:28] I'm actually not even confident the bug is in the parsing side of it, it might be in the reconstruction part, which is just failing to put the delimiter back in place when it's followed by no param characters
[22:37:41] but a more-modern http codebase probably never would've parsed for ;param in the first place
[22:40:04] ignoring the ancientness of the ";param" construct at the heart of this, it's otherwise equivalent to what would happen with a trailing ? for modern query args
[22:41:09] (which is that it should be preserved, but in this case it's not)
[22:42:23] to make matters more-confusing, our Varnish config explicitly decodes semicolon to match how wikimedia handles it in titles
[22:43:16] anyways, it's probably easy to repro this as a testcase for ATS itself, and it probably needs a bug filed at that level and/or a patch to fix
[23:16:09] bblack: ack, there's definitely some cases where a middleware muddles with the path component, such as our url encoding normalization for MW titles in VCL to avoid unpurged cache entries for browsers that encode things weirdly, and Apache e.g. has ways to rewrite in a way that appends stuff to a query string, which it would thus need to understand, parse and serialize again to some extent
[23:17:18] but at the same time, for the case where a user of the server isn't explicitly doing that, I'd expect transparent pass-through since whether query strings are in use at all is an app-level decision. E.g. for OAuth and jobrunner we use HMAC over the full path component for security, which would break even if middleware did superficial changes only.
[23:18:11] an application may use ? & or ; or / anywhere in its path for whatever purpose
[23:18:28] anyway, I missed that the issue was already pinpointed to ATS.
[23:19:06] I'd file an upstream ticket, but I'm not sure I'll be able to describe a repro in a way that makes sense to upstream.
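
Editor's note: the failure mode discussed above (a proxy splitting the path on `;`, then dropping the delimiter on reconstruction when no param characters follow) can be sketched in a few lines. This is a minimal illustration only, not ATS's actual parser: `split_params`, `rebuild_lossy`, `sign`, the key, and the example path are all hypothetical names and values chosen for the sketch. It also shows why an HMAC computed over the full path stops matching after even this superficial change.

```python
import hashlib
import hmac


def split_params(path: str) -> tuple[str, str]:
    """Split 'path;params' at the first ';', as an old-style URL parser might."""
    base, _sep, params = path.partition(";")
    return base, params


def rebuild_lossy(base: str, params: str) -> str:
    """Hypothetical buggy reconstruction: ';' is re-added only if params is non-empty."""
    return f"{base};{params}" if params else base


def sign(key: bytes, path: str) -> str:
    """HMAC-SHA256 over the full path, as an application might use for request signing."""
    return hmac.new(key, path.encode("utf-8"), hashlib.sha256).hexdigest()


KEY = b"hypothetical-shared-secret"   # assumption: any shared secret
original = "/wiki/Example;"            # assumption: a path ending in a bare ';'

# Round-trip the path through the lossy parse/rebuild pair.
forwarded = rebuild_lossy(*split_params(original))

print(forwarded)
print(hmac.compare_digest(sign(KEY, original), sign(KEY, forwarded)))
```

A path with real parameters, such as `/a;b`, survives the round trip unchanged; only the bare trailing `;` is lost, which matches the "not a bug most people run into" description. A request in that shape (send a path ending in `;`, compare what the origin receives) may also be a workable outline for an upstream repro.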