[00:44:48] bd808: anomie: around? we were about to deploy https://gerrit.wikimedia.org/r/#/c/271618/ but it turned out to require a bunch of further backports and ori would like extra eyes for testing [00:45:19] nevermind, we are putting it off [00:45:21] is the trains were running on time ... [00:45:26] s/is/if/ [00:58:22] * Krinkle is looking for a brave soul to merge https://gerrit.wikimedia.org/r/270485/ [00:58:24] * Krinkle is looking for a brave soul to merge https://gerrit.wikimedia.org/r/270485 [01:06:38] * RoanKattouw looks at what that is, closes browser tab, runs away in fear [01:22:30] Krinkle: we're not cutting a new branch tomorrow right? [01:22:36] bd808: indeed. [01:22:45] bd808: wmf.14 was cut last week and per greg is going out tomorrow [01:22:52] and next week we'll cut a fresh branch that contains two weeks of changed [01:22:54] changes* [01:22:57] I'll +2 after swat is done [01:23:03] thx [01:23:11] unless legoktm beats me to it :) [01:57:38] TimStarling: It seems parser tests also default script path to / which is rather interesting (note, not empty string) [01:58:58] yeah, the wiki is totally broken if you set it to /, everything becomes protocol-relative [02:02:32] my test wikis are set up each in their own virtual host with $wgScriptPath = '', $wgArticlePath = '/index.php/$1', which is easy to set up [02:07:25] TimStarling: I see. Yeah, that's neat [02:09:04] TimStarling: Curiously enough though, both parserTest.inc and phpunit/NewParserTest use $wgScript = '/index.php'; $wgScriptPath = '/'; $wgArticlePath = '/wiki/$1'; [02:09:45] It also resets wgStylePath and wgExtensionAssetsPath, so it basically covers most if not all derivatives of wgScriptPath [02:14:53] Submitted https://gerrit.wikimedia.org/r/#/c/272659/ [02:15:02] Let's hope it passes, though it probably won't. [02:32:12] do you think https://gerrit.wikimedia.org/r/#/c/271748/ is too ugly to consider? [02:32:34] adding output parameters to half a dozen functions to propagate errors through a whole lot of layers that didn't have them before? [02:36:25] well, look at that. It passed. why I shall [02:36:29] https://gerrit.wikimedia.org/r/#/c/272659/ [02:45:02] you want me to merge that now? [02:46:25] Yeah, that'd be great :) [08:41:31] all right, we've got another misplaced session [08:44:28] MaxSem: ... -> _security? [08:45:14] nah, it was reported publicly [08:45:35] link? [08:46:04] https://phabricator.wikimedia.org/T126069#2054327 [08:47:13] oh, ffs. [08:49:03] tgr: ^ [08:51:54] <_joe_> legoktm: phone them. [08:52:03] <_joe_> also please yes, _security [17:52:28] "Undefined index: id in /srv/mediawiki/php-1.27.0-wmf.13/includes/session [17:52:43] "/CookieSessionProvider.php on line 129" [18:19:10] *grumble* mobileFE unstubs $wgUser in load.php [18:29:38] bd808: https://gerrit.wikimedia.org/r/#/c/269843/ ? [19:23:39] csteipp: are you about? [20:00:59] ori: Yeah [20:07:34] bd808: i still see the cookie provider thing int he logs [20:07:54] aude: yeah it's in .14 but not .13 [20:08:09] I made a backport patch but I'm not sure when that will get deployed [20:08:56] ok [20:08:59] aude: if you want to deploy it -- https://gerrit.wikimedia.org/r/#/c/272786/ [20:09:06] i can [20:09:28] need to run scap, because of introducing new i18n messages :/ [20:09:42] i can do your patch first [20:09:45] we've got another .13 backport too at https://gerrit.wikimedia.org/r/#/c/272733/ and a .14 at https://gerrit.wikimedia.org/r/#/c/272734/ [20:09:57] ok [20:10:18] * bd808 knows .14 isn't actually live but we've been patching the branch I think [20:10:59] yeah [20:12:04] csteipp: do you object to https://gerrit.wikimedia.org/r/#/c/272795/ / https://phabricator.wikimedia.org/T125455#2054194 ? i'd like to back it out to help isolate the redis ops/sec increase and bd808 said he's ok with it, but that it was you who requested it [20:12:47] ori: I think we can pull it and take another run at the logging later [20:12:59] +1 [20:13:09] ori: Yeah, I'm fine with that [20:13:19] thanks [20:13:24] bd808: can you cr? [20:14:05] thanks [20:18:35] * aude waits for jenkins... [20:20:54] bd808: is it ok to deploy the cache backend patches as part of scap? [20:21:07] aude: yeah that should be fine [20:21:13] the patches look good to me [20:21:14] ok [20:22:24] doing the cookie patch now... [20:24:46] anomie: from https://gerrit.wikimedia.org/r/#/c/269066/ https://gerrit.wikimedia.org/r/#/c/269654/ https://gerrit.wikimedia.org/r/#/c/269649/ https://gerrit.wikimedia.org/r/#/c/269646/ which ones do you think need backport to make sure your MW_NO_SESSION_HANDLER patch works as expected? [20:25:14] all four? [20:26:26] tgr: https://gerrit.wikimedia.org/r/#/c/269066/ and https://gerrit.wikimedia.org/r/#/c/269646/ should be sufficient. [20:27:46] anomie: what should be called instead of wfMessage() to avoid unstubbing wgUser? [20:28:39] bd808: wfMessage() is ok as long as you make sure to set a language on the Message before trying to use it. ->inContentLanguage() works, or you can call ->getLanguage() on the ResourceLoaderContext. [20:29:12] (and pass that language to $message->inLanguage(), of course) [20:29:37] hmm ok I'm looking at one that calls ->plain() [20:29:44] https://phabricator.wikimedia.org/P2657 [20:30:47] so just ->inContentLanguage()->plain() I guess should fix [20:31:45] bd808: Probably that's another one that shouldn't be in MobileFrontendHooks::onResourceLoaderGetConfigVars()... [20:32:25] I don't think this actually varies on user. It is just a side effect [20:32:53] it varies on wiki [20:33:15] wfMessage() by default uses $wgLang, which depends on $wgUser->getOptions(). [20:33:32] ori: tgr you want all your things in when i do scap? [20:34:14] aude: yes please [20:34:29] or just https://gerrit.wikimedia.org/r/#/c/272814/ and https://gerrit.wikimedia.org/r/#/c/272812/ ? [20:35:16] AaronSchulz: have you seen ? [20:35:19] are you ok with it? [20:39:33] ori: tgr jenkins doesn't like https://gerrit.wikimedia.org/r/#/c/272641/ :/ [20:40:11] think i'll just take the revert patches https://gerrit.wikimedia.org/r/#/c/272814/ and https://gerrit.wikimedia.org/r/#/c/272812/ and proceed for now [20:40:26] and let you handle the other ones when i am done, if they are not super urgent [20:40:46] seems OKish [20:44:25] anomie: ugh this Message + wgUser rabbit hole is deep [20:44:46] can we replace $wgUser with an anon when the sessions are disabled? [20:45:15] bd808: I know. [20:46:53] bd808: That would work, although whether it's really the "right" thing to do is debatable. For just the messages, we could probably as well replace $wgLang with $wgContLang unless things are actually *parsing* messages instead of just using ->plain(). [20:47:20] anomie: I found one that is parsing :/ [20:48:19] in a method with the comment "FIXME: This hack shouldn't be needed anymore after fixing T111833" [20:57:17] bd808: BTW, at the moment I've got mw1017 hacked so if you add "&anomie=1" to a load.php URL it'll show you a stacktrace if MW_NO_SESSIONS would blow things up. e.g. https://www.wikidata.org/w/load.php?debug=false&lang=en&modules=mw.config.values.wbSiteDetails&skin=monobook&anomie=1 [20:57:50] :( [20:58:58] anomie: nice. I noticed that there were other blocker showing up [21:27:27] Krinkle: Should we support parsing (e.g. $message->text(), $message->parse()) from load.php? [21:35:52] anomie: how do I fix this one? https://phabricator.wikimedia.org/P2659 [21:36:14] the message object there is using inContentLanguage() already [21:37:05] bd808: That's the question I just asked Krinkle. If the answer is "yes", then we'll have to set $wgUser to an anon. If "no", then I don't know what VE should do there. [21:37:26] did you file a task on this one yet? [21:37:33] I was about to if not [21:38:08] anomie: resourceloader as a framework does not (yet) provide parsed messages. However generated-data modules can provide html of messages they parse wihtin their clas [21:38:19] that is a supported use case that I don't think we have a workaround for at the moment [21:38:24] Lots of extensions do and need that [21:38:36] However they must not use a user context however [21:38:40] which means that load.php needs sessions :/ [21:38:47] No, because it doesn't get cookies [21:38:53] or that we always give an anon [21:38:56] Indeed [21:39:13] "doesn't get cookies"? [21:39:16] Parser should only need user id/name, and not a whole user object [21:39:24] bd808: It used to be stripped at Varnish layer [21:39:28] but that got lost along the way [21:39:33] I haven't re-introduced it yet [21:39:54] ah. a casualty of the bits to text migration? [21:39:58] but parsing a message is not uncommon for RL modules [21:40:07] bd808: shortly after that actually, but yeah. [21:40:26] I think that means that RL should clobber $wgUser with an anon user object [21:41:08] because otherwise parser may unstub wgUser which would trigger session state [21:41:12] bd808: We do that already [21:41:15] see REsourceLoaderContext [21:41:21] it's new User; [21:41:28] in 99% of requests (which don't have &user=..) [21:41:39] Just needs to be fiddled around to be passed there somehow [21:41:42] that doesn't change $wgUser though [21:41:45] Yeah [21:41:47] see https://phabricator.wikimedia.org/P2659 [21:41:50] well, stuff shouldn't use $wgUser [21:41:57] parser does [21:42:10] and Message fires up Parser [21:42:14] and then boom! [21:43:21] bd808: * @warning $wgUser or $wgTitle or $wgRequest or $wgLang. Keep them away! [21:43:23] -- Parser.php [21:43:25] no matches [21:43:33] ParserOptions [21:43:34] aha [21:43:39] that message that is blowing up already sets inContentLanguage() [21:43:56] It's not even for the language [21:44:04] whcih is also passed by url &lang= [21:44:06] probably for gender [21:45:23] It's MessageCache's fault [21:45:28] it constructs ParserOptions without arguments [21:45:41] also $wgUser->isSafeToLoad() seems like a like [21:45:44] like a lie* [21:45:55] DO we still need that method? [21:46:01] I think it's obsolete now, right? [21:46:20] It's needed for anything that's called from a hook before the end of Setup.php. Like MessageCache. [21:46:43] User:isSafeToLoad is brand new [21:46:53] Yeah, I saw it being added [21:47:08] but... anomie should it look at the new define? [21:47:22] It's just that anything that uses is being fixed because we don't want it. so maybe we want to just log things in User.php if called to early instead of patching code to check it first. [21:47:31] In what scenario do we want code to check it and use it anyway? [21:47:48] use something* else that is [21:47:57] Anyway, load.php [21:47:59] So one possible fix is to have User::isSafeToLoad() always return false when MW_NO_SESSIONS is defined. Then Message parsing blows up on Scribunto → SpecialVersion::getVersion() → wfMessage( 'parentheses' ). There are a few ways to fix that one, but I wonder how many other moles there are to whack along those lines. [21:48:03] Krinkle: e.g. https://phabricator.wikimedia.org/T56193 [21:48:24] but those moles currently don't check isSafeToLoad either [21:48:32] so I'd rather we just fix it properly when we do whack them [21:48:40] I think MW_NO_SESSIONS should just set an anon $wgUser [21:48:55] Yeah [21:48:57] That would make sense. [21:49:06] We can still make it throw if it tries to make the anon a session [21:49:09] (and should) [21:49:27] but for plain access I guess we can allow it. [21:50:04] Depends on whether anticipate incorrect use of $wgUser. [21:50:11] otherwise the problem is that $wgUser is evil but a reality. killing off all use of it will take a year of solid work [21:50:18] If static.php uses wgUser, that would be bad, and we'd want to detect that [21:51:00] load.php can parse things, so we allow it for now. Though MessageCache is probably the only exception and we know Message/Parser won't do bad things with the user. [21:51:10] using an anon user makes sense as a fallback, but it should still log warnings IMO because accessing a user object after MW_NO_SESSION very likely means that the code will do unexpected things [21:51:22] so maybe fixing MessageCache will do. I'd like to keep wgUser = anon out of load.php if possble as it would make it harder to detect incorrect use of it. [21:51:31] Agreed. [21:51:32] such as using the wrong language [21:51:35] tgr: message->text() is going to do it like every time [21:52:02] well, message->text() means someone expected some message to be localizable [21:52:07] and it won't be [21:52:25] I'm not sure I follow [21:52:52] message->text() looks at the user object to see what translation to use [21:53:24] if it instead uses site language, which is the effect of making wgUser an anon, that is probably unexpected [21:53:53] less so for WMF wikis I guess which did that in the past anyway, due to bits [21:54:04] even setting inContentLanguage() won't stop wgUser from unstubbing via MessageCache [21:54:07] tgr: It's more that message->text() goes into the parser if the message contains "{{", and once you go into the parser there are a million ways something could be trying to use $wgUser (or RequestContext::getMain()->getUser(), same thing as far as this goes) [21:54:13] Message doesn't access $wgUser. It does access $wgLang though, as default language. [21:54:40] If we change that to mainContext->getLanguage() instead, we can have load.php set that. [21:54:48] if the message contains wikitext it goes to the parser [21:54:49] which it can because it's one of the url parameters. [21:55:29] for the parser, setting an anonymous user should be fine [21:55:48] hard to imagine any use case where stuff like stub thresholds would be relevant in load.php [21:55:58] bd808: What's staged on tin and/or mw1017 right now? [21:56:03] https://etherpad.wikimedia.org/p/perf-20160223 [21:56:17] Not clear to me what's there at the moment [21:57:01] Krinkle: I don't know. aude can tell us what she merged and pulled there [21:57:18] first I'm seeing that etherpad [21:57:19] i am doing scap [21:59:31] bd808: OK. once aude is done we'll just clean-sync mw1017 and apply all the wmf13 patches on tin / sync on mw1017? [21:59:38] Then we can test there with debug header and stuff [21:59:42] Krinkle, bd808, tgr: So is the verdict that MessageCache should be creating the ParserOptions with an anonymous user for MW_NO_SESSIONS, and anything called from the parser that's using $wgUser or RequestContext::getMain()->getUser() instead of ParserOptions->getUser() is a bug? Or something else? [22:00:23] Yeah, if something beyond the MessageCache call is indirectly reaching into wgUser/getUser that should probably yield a warning/exception [22:00:41] It's worth a shot [22:00:41] Since we can't guruantee that use will be safe [22:00:51] Right now it will be an exception. [22:00:57] that sounds reasonable to me, but not familiar enough to tell if there are places where getting parseroptions would be really difficult [22:01:03] well, not really, we still have it se to 'warn', right? [22:01:04] hooks or whatnot [22:01:13] we don't hae NO_SESSION=1 or disable anywhere yet [22:01:25] (and shouldn't for al least a good week or so) [22:01:28] True, if MW_NO_SESSION = 'warn' it'll warn. [22:01:37] * Krinkle hopes so [22:01:38] I was considering the end state. [22:01:42] unless the installer uses the parser [22:01:47] sssh [22:02:00] or is MW_NO_SESSION_HANDLER unaffected? [22:02:05] The installer doesn't use MW_NO_SESSION anymore [22:02:21] If the installer uses the parser and ends up using User and it was working before, it'll work now. [22:02:33] if not, no, but then it would've been already broken I suppose. [22:03:53] Should we have MessageCache test MW_NO_SESSION itself, or should we have $wgUser->isSafeToLoad() return false which will also avoid exceptions from other things that are checking that? [22:04:16] I would change isSafeToLoad [22:04:16] * anomie is tired and doesn't want to make decisions [22:06:25] isSafeToLoad sounds neater [22:06:36] doesn't that logic exist already? [22:06:38] the installer uses the parser extensively, via Installer::parse() [22:07:17] but it is meant to throw an exception if DB access is attempted [22:07:38] as long as we backport the MW_NO_SESSION_HANDLER patch before anything that would make wgUser access in the parser throw exceptions when MW_NO_SESSION is enabled, we should be fine [22:07:43] which limits the number of parser functions you can use [22:07:49] and since it is only parsing messages and release notes docs, that's not a big problem [22:35:03] tgr: bd808: Do we wanna merge/test the 4 wmf13 patches at once or separately? [22:35:28] * bd808 defers to tgr [22:47:05] tgr: [22:49:35] Krinkle: no [22:49:50] they only affect the installer save one [22:50:22] so alltogether is fine, we can test it on mw1017 all at once (the wmf13 patches) [22:50:56] we should check that nothing blows up, but there is no way to test the patches specifically [22:51:07] unless you live-hack MW_NO_SESSION somewhere [22:51:32] k [22:57:16] merging the wmf13 patches [23:07:03] indeed. [23:07:03] * bd808 contemplates restarting his bouncer [23:07:05] "[Global Notice] Hi all, I need to restart one of the network hubs. This will break the network for a short while but it should only last a few minutes." [23:08:07] bd808: Thanks for the heads up :P [23:08:51] hoo: heh. not sure if all clients post global notices; it was a bit hidden for me [23:09:05] I think I didn't get it [23:09:21] After a few seconds I was just almost alone on kornbluth [23:13:41] wmf13 session manager changes are now live on mw1017 [23:13:43] tgr: [23:13:46] staged on tin [23:16:46] Krinkle_: looking [23:17:19] https://en.wikipedia.org/w/resources/src/startup.js will hit static.php on wmf13, enable WikimediaDebug to hit mw1017 [23:19:32] Interesting that ast vast majority of channel=session warnings come from fa.wikipedia. [23:20:20] did we have autologin on central wikisource? [23:22:28] cookie domain .wikisource.org covers it, right? [23:22:46] looks like that's broken, might not be related to the patches though [23:23:43] I haven't synced outside mw1017 so that's easy to verify [23:24:00] wmfstatic seems to be fine, not getting any warnings though [23:24:08] They'd be in logstash, right? [23:25:56] Krinkle_: The last time I looked the fawiki message were all from one broken bot (that we have poked the owner of a couple of times) [23:26:59] bd808: Why is that a warning in our code though? [23:27:28] autologin is super brittle, both with and without the patch, can't tell if it got more brittle (so probably no) [23:27:41] rezabot is sending busted cookies that make sessionmanager sad [23:27:44] I guess that's the CA autocreation race condition issue [23:29:44] adding define( 'MW_NO_SESSION', 'warn' ); to load.php in wmf13 on mw1017. [23:29:58] I had high hopes that the full grid restart in tool labs would fix the bot last week, but no such luck [23:30:17] bd808: rezabot was stopped [23:30:24] at least that's what the owner said [23:31:11] https://logstash.wikimedia.org/#dashboard/temp/AVMQeLecptxhN1Xael9V [23:31:13] tgr: well something is using that account name -- https://logstash.wikimedia.org/#dashboard/temp/AVMQeLF0ptxhN1Xael7b [23:31:21] https://logstash.wikimedia.org/#dashboard/temp/AVMQeLecptxhN1Xael9V - sessions + mw1017 filter [23:32:07] no stack traces? [23:32:32] nothing from static.php, but load.php is obviously emitting a few [23:32:46] bd808: is there a way to enable that? [23:32:50] I see some entries with ":SessionManager-backport-3" [23:32:54] I guess that was a live hack? [23:33:10] that's just me testing things [23:33:19] it's a username [23:33:22] Krinkle_: to get nice stacktraces in logstash we need to stick an exception in the log context [23:33:33] ah, ok [23:33:44] 'exception' => new Exception() [23:33:47] that would do it [23:34:06] might be worth live-hacking [23:34:28] We know about several load.php problems right now [23:34:44] https://phabricator.wikimedia.org/T127233 [23:35:17] plus the VE parseroptions one [23:35:54] the "Key "CentralAuthSource" changed" message is normal, it's some kind of autologin failure I think [23:36:04] well, maybe not normal, but not new [23:36:55] not sure about the other messages [23:38:01] , array( 'exception' => new Exception() ) ); + SessionManager.php [23:38:11] (but none are for a MW_NO_SESSION endpoint so no relation I suppose) [23:42:12] https://logstash.wikimedia.org/#/dashboard/temp/AVMQeLecptxhN1Xael9V has traces now [23:42:16] Krinkle_: https://www.mediawiki.org/w/load.php?debug=false&lang=en&modules=startup&only=scripts&skin=vector is an 503 (on mw1017) [23:42:21] yeah, fixed [23:42:29] indeed [23:42:37] forgot \ in fron of new Exception, courtesy of namespaces [23:44:08] anyway, we can play with load.php further. The infra is there now in wmf.13 and we can enable warnigns on mw1017 at any time this week. [23:44:13] tgr: Let's go on with wmf.14? [23:44:26] static.php is unaffected so I'll sync it to all servers now? [23:44:33] and we didn't find other side effects either [23:45:52] Krinkle_: wmf.14 where? [23:45:59] yeah, if static.php does not send warnings then it should be fine [23:45:59] https://etherpad.wikimedia.org/p/perf-20160223 [23:46:07] the wmf.13 patches are staged, Im syncing them now [23:46:18] we can then go forth with staging wmf14 patches which we can test on testwiki [23:46:28] testwiki is the only wiki runnig wmf14 right now [23:50:31] bd808: sync-dir has been stuck on the logo for about 4 minutes now. Showing no meaningful text at all. [23:51:01] Not even a message saying it started sync-masters [23:51:04] I guess it's linting? [23:51:34] probably. what did you sync-dir? [23:51:47] bd808: the whole branch dir [23:51:52] oh fuck [23:51:57] that will take hours [23:51:59] what? [23:52:09] the linter [23:52:12] Doesn't scap need to do the same thing? [23:52:15] it will lint every php file [23:52:17] no [23:52:23] scap does not lint [23:52:25] Ok, that's counter intuitive. [23:52:30] and that just made it worse [23:52:33] Well, jenkins should've linted [23:52:38] uhuh [23:52:44] sync-dir/sync-file is more likely to be a live hack or similar [23:52:47] And we never live commit and then push to gerrit [23:52:50] * Krinkle_ never does [23:53:00] Hm.. [23:53:02] ok [23:53:12] so how do I sync-dir two things in a branch dir [23:53:13] Fixing the site is more pressing than getting it in gerrit [23:53:16] in this case autoload.php and includes/ [23:53:19] sync-dir for anything but mw-config is almost always wrong [23:53:43] sync-file autoload first [23:53:46] and then the dir [23:54:06] I think sync-dir is used quite often for subdirs of extensions and even entire extensions [23:54:13] for e.g. extension.json and resources changes [23:54:20] Yeah, I keep using it for Wikidata [23:54:21] sure but that's a lot less than the whole branch [23:54:26] and it's not incredibly slow there [23:54:28] the linter actually stopped [23:54:30] exit non-zero [23:54:32] and we have quite a few loc of PHP [23:54:43] 23:52:39 sync-dir failed: Command 'find '/srv/mediawiki-staging/php-1.27.0-wmf.13' -name '*.php' -or -name '*.inc' -or -name '*.phtml' -or -name '*.php5' | xargs -n1 -P6 -exec php -l >/dev/null' returned non-zero exit status 123 [23:54:46] no useful error message though [23:55:45] ">/dev/null" so not useful messages to give [23:55:54] s/not/no/ [23:56:03] But 2>, right? [23:56:08] They go to stderr surely [23:56:22] I don't think so [23:56:25] .. [23:56:32] I wanna go back to how things were in 2011 [23:56:38] lol [23:56:44] not me [23:56:53] :P [23:57:13] confirmed, it writes to stdout [23:57:21] php -l, that is. [23:57:23] .php5? Why is that still in there? [23:58:16] labile, dried grapes [23:58:18] because we never took it out of the copy pasta that is scap [23:58:30] * Reedy shuns bd808 [23:58:44] or hysterical raisons [23:58:54] weren't the php5 entry points just taken out of master a few months ago? [23:59:01] More than that I think [23:59:07] I've a patch to remove that stupid wiki.phtml [23:59:32] Reedy: Think there are still incoming links to that [23:59:36] Reedy: Dont get too excited now. We only just finished removing UseMod instructions from our documentation. [23:59:44] hysterical raisons are the tastiest [23:59:48] hoo: behave [23:59:54] :D