[05:58:42] Fundraising Sprint N*E*R*D, Fundraising Sprint ODB, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 2 others: Publishing translations for central notice banners fails - https://phabricator.wikimedia.org/T104774#1443975 (AndyRussG) I edited both translated messages (added and removed a...
[08:28:18] Fundraising Sprint N*E*R*D, Fundraising Sprint ODB, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 2 others: Publishing translations for central notice banners fails - https://phabricator.wikimedia.org/T104774#1444175 (Nikerabbit) I don't have permissions to set the state to published...
[14:11:17] (CR) Awight: [C: 2] "Looks right, needs test!" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/223833 (owner: Cdentinger)
[14:12:36] (Merged) jenkins-bot: import phone in wmf_civicrm [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/223833 (owner: Cdentinger)
[16:16:33] Hi Jeff_Green, yt? Do you know much about i18n message caching? There's a banner that isn't showing the correct translation in Japanese even though the Mediawiki namespace page has the right info... https://phabricator.wikimedia.org/T104774
[16:21:28] AndyRussG|mob: unfortunately I don't
[16:23:02] Jeff_Green ah ok thx neway! I'll check on ops
[17:49:53] fundraising-tech-ops: overhaul fundraising cluster monitoring - https://phabricator.wikimedia.org/T91508#1445916 (Jgreen)
[18:01:25] Fundraising Tech Backlog: [BUG] GC Japan donation from 7/9 has no donor details - https://phabricator.wikimedia.org/T105537#1445998 (MBeat33) NEW
[20:43:23] Hi ejegg cwdent|afk XenoRyet... Sorry I had to be AFK... I guess no news on the banner translation issue...? Digging in some more now...
[20:43:50] sorry, no ideas here!
[20:44:09] ejegg: ;) thanks neway!
[21:07:36] (CR) XenoRyet: "Change AstroPayAuditTest so we're not testing both things at the same time." [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/220944 (https://phabricator.wikimedia.org/T104718) (owner: Awight)
[21:46:28] (PS23) Ejegg: AstroPay audit glue module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/220944 (https://phabricator.wikimedia.org/T104718) (owner: Awight)
[21:46:40] XenoRyet: ^^
[21:47:00] (PS24) Ejegg: AstroPay audit glue module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/220944 (https://phabricator.wikimedia.org/T104718) (owner: Awight)
[22:01:56] ejegg: Cool, looks good to me
[22:02:04] AndyRussG: I'm taking a quick look at the CN translation message group failure, sounds bad. I remember seeing some extra-Byzantine code in there.
[22:02:21] awight: hi! cool, thx!
[22:02:41] awight: I don't understand why the translation message is good on wiki, but doesn't show up in the banner
[22:03:08] awight: I haven't yet tried to do the same translate action with debug on to trace the error...
[22:03:47] Is the message group published? (looking)
[22:03:51] I was about to try asking again on #wikimedia-operations...
[22:03:53] Yes!
[22:03:56] The translation workflow thing is the scariest part
[22:04:12] Heh I had ignored it until now 8p
[22:04:29] It was published, and then I edited it, re-published it, and got the same error (apparently)
[22:04:38] I might have mentioned, I had to write a subclass that used magic methods to delegate everything, to accomplish what should have been a trivial workflow thing
[22:04:48] (not for CN, elsewhere)
[22:05:04] Have you found the mw error log?
[22:05:18] awight: no...
[22:05:19] Sorry to hand-wave, it takes me 10 minutes to find as well
[22:05:21] The CentralNotice code doesn't seem too byzantine - just calling Title::newFromText with the key, then Revision::newFromTitle and grabbing the contents
[22:05:39] ejegg: hi!
[22:05:46] hi awight!
[22:05:50] awight: np! I don't actually know how to look...
[22:05:57] ejegg: includes/BannerMessageGroup.php
[22:05:59] or maybe I did and forgot? arg...
[22:06:26] AndyRussG: https://meta.wikimedia.org/w/index.php?title=Special:Translate&action=proofread&group=Centralnotice-tgroup-wm2015register&language=ja&filter=
[22:06:32] these are all "unreviewed"
[22:07:19] ooh, updateBannerGroupStateHook sounds relevant
[22:07:31] AndyRussG: I don't think you have permissions to review? https://meta.wikimedia.org/w/index.php?title=Special:ListUsers&group=translationadmin
[22:07:42] awight: Yes, that's what I did last night
[22:07:49] And that's when I got the error
[22:08:01] And it updated the Mediawiki: namespace messages, but not the banner!
[22:08:27] Here's what the banner looks like (you may have to clear cookies to see): https://meta.wikimedia.org/w/index.php?title=Wikinews/Licensure_Poll/GFDL_CC-BY-SA/O/For&banner=wm2015register&uselang=ja&force=1
[22:08:42] Ooooh, right. ejegg it's possible we make an additional restriction on who can publish translations for CN?
[22:08:47] awight: ejegg: but the Mediawiki namespace messages are different: https://meta.wikimedia.org/wiki/MediaWiki:Centralnotice-wm2015register-text1/ja
[22:08:52] https://meta.wikimedia.org/wiki/MediaWiki:Centralnotice-wm2015register-text2/ja
[22:08:55] awight: BannerMessageGroup sets published right to centralnotice-admin
[22:09:04] in getMessageGroupStates
[22:09:12] AndyRussG: those both show up in ja for me
[22:09:35] Are you sure about the banner? It's hard to tell if you don't know the script and look closely
[22:09:47] There should be two messages in the banner, both ending in an exclamation point
[22:11:31] AndyRussG: aha, sorry, you're saying it displays the older message
[22:11:38] awight: correct
[22:12:02] I'm guessing some message caching thing
[22:12:17] That's why I was about to go ask on #-operations
[22:12:23] AndyRussG: That sounds more like simply failing to set the message group state to published
[22:12:48] awight: but doesn't it not update the Mediawiki: namespace page until it's published?
[22:13:02] Do you have Extension:Translate and $wgNoticeUseTranslateExtension=true on your dev box?
[22:13:16] Just setting it up...
[22:13:27] k
[22:13:50] awight: Let's fix first the banner then after that find the bug that caused the banner to weird out, make sense?
[22:14:14] AndyRussG: https://wikitech.wikimedia.org/wiki/Logs
[22:14:15] Am I wrong to think that if the Mediawiki: pages show one thing, and the banner shows another there's something wrong?
[22:14:16] sure
[22:14:31] No, I think that's to be expected
[22:15:26] awight: why? I don't understand how that is, then
[22:15:46] Thanks for the logslink!!!
[22:16:02] Actually, I think the right path is under something crazy like CNBannerMessage:
[22:16:05] looking now...
[22:16:27] yeesh
[22:16:55] I did see that namespace in the code, but I thought it was for the still-in-review messages
[22:18:16] BannerMessageGroup line 167 + copies from CNBannerMessage namespace to Mediawiki ns
[22:18:18] yah, comment in CentralNotice.php says the CN_BANNER namespace is for staging and is world editable, and that they're moved to MW namespace on published
[22:18:26] AndyRussG: I'm catching up w/ you now. just found that copy code, too
[22:18:34] oh yeah, and that's what the code actually does...
[22:19:11] awight: ejegg: Basically, there have to be two versions of the messages, one that the translators are working on and reviewing, and the other, the published copy (once the message has been published)
[22:20:08] Cool, the three of us do have rights to promote these msgs, V
[22:20:09] https://meta.wikimedia.org/w/index.php?title=Special:ListUsers&group=centralnoticeadmin
[22:21:01] AndyRussG: I think you're right that the MediaWiki: namespace message should be the one that shows up, I'm trying to prove that to myself though
[22:21:23] awight: well, Mediawiki: is where all i18n translations live in general, no?
[22:21:48] Or at least, customized versions of messages, I think?
[22:22:27] yeah I'm pretty certain
[22:22:47] Is anyone else creeped out by the Special:Translate interface not showing the previous version of the message anywhere?
[22:23:04] https://meta.wikimedia.org/wiki/MediaWiki:Edit/es
[22:23:06] https://meta.wikimedia.org/wiki/MediaWiki:Edit/en
[22:23:11] https://meta.wikimedia.org/wiki/MediaWiki:Edit/fr
[22:24:44] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCentralNotice/a9aadc667713754c2f4cc038e85311fdd8cf61e4/includes%2FBannerRenderer.php#L134
[22:25:05] awight: BannerMessage->getContents is definitely using MW namespace
[22:25:18] ^ that's where the bannerrenderer gets the banner HTML from, no? As if it were a i18n message
[22:25:36] And I guess when that contains messages, they also get substituted, recursively
[22:25:37] yes
[22:25:38] yeah
[22:25:59] Sorry to attempt to derail you with the other namespace :)
[22:26:24] heheh np, important to check out that route!
[22:26:59] You see what I mean about the banner having different content?
[22:27:50] argh, force=1 doesn't work
[22:27:56] Yes, I definitely see the old message
[22:28:18] Needs more emphasis!!
[22:28:38] Yeah gotta clear cookies each time... They grabbed some JS from I dunno where...
[22:28:43] I'm so glad there's punctuation I can cling to, in unknown script ;)
[22:29:02] I think the hiding is homebrewed, https://meta.wikimedia.org/wiki/Talk:CentralNotice/Calendar#Limiting_impressions_on_Wikimania_2015_registration_announcement maybe
[22:30:36] Heh interesting, homebrewing the homebrewers
[22:30:59] I was remembering that when you deploy new i18n messages, you have to do some thing to kick the cache
[22:31:39] So I was thinking, maybe whatever error happened when I published the messages (in the translate interface) updated them (in the MW: namespace) but didn't kick the cache as it should, somehow
[22:32:05] I'm tempted to just edit those MW: pages manually (again adding and removing a space or something) just to see what happens
[22:32:11] ejegg: awight ^
[22:32:18] Fundraising Tech Backlog, MediaWiki-extensions-CentralNotice: CentralNotice: "Preview all approved translations" is dead - https://phabricator.wikimedia.org/T105558#1446538 (awight) NEW
[22:33:05] awight: heh I was wondering about that link... Did it used to work?
[22:33:06] AndyRussG: oooh, tempting
[22:33:23] should I click or should I not?
[22:33:24] AndyRussG: yeah, at one point I had the banner previews in iframes
[22:33:30] hehe no harm in trying
[22:33:40] K I'm gonna do that
[22:33:42] oooh fancy!
[22:33:47] Fundraising Tech Backlog, MediaWiki-extensions-CentralNotice: CentralNotice: "Preview all approved translations" is dead - https://phabricator.wikimedia.org/T105558#1446546 (Ejegg)
[22:34:01] That'd be fun!
[22:34:41] ejegg: Thanks for the archaeology!
[22:35:09] fwiw, I think I made the duplicate, not (other) Katie ...
[22:35:16] heh, i was about to re- file that bug yesterday
[22:35:41] Embarrassment to go around ;)
[22:35:55] So much interest in multilingual translations!
[22:36:13] I got real feisty when I thought we could have banner previews in iframes. It wreaked all kinds of havoc with people's browsers, unfortunately.
[22:38:45] heheh that's what iframes are for?
[22:38:59] awight: ejegg: looks like editing the MW: namespace pages directly fixed the baner!
[22:39:06] s/baner/banner/
[22:39:18] baner = something that banes
[22:39:22] AndyRussG: Rad! Vindication for your cache theory
[22:39:37] better than batting on the banee
[22:39:41] Just double checking the messages are the same...
[22:42:20] ejegg: awight: confirmed, the banner messages are now identical to the ones in the Translate interface
[22:42:51] huh, good to have it fixed, but I'm still clueless how that could have helped
[22:43:23] There must be some cache jiggling that happens when you update a message, that somehow didn't happen when I re-published the messages from the Translate interface last night... (?)
[22:43:49] Yah, the l10n cache has some kind of weirdly long expiry, I think
[22:44:27] I'm curious about the EDIT_FORCE_BOT flag we pass to doEditContent...
[22:45:07] oh nvm, docstring is just 'Mark the edit a "bot" edit regardless of user rights'
[22:45:27] Ah yeah I remember having seen that lovely text elsewhere :)
[22:45:49] Maybe we can find an error that happened at the same time as my edit yesterday? https://meta.wikimedia.org/w/index.php?title=MediaWiki:Centralnotice-wm2015register-text1/ja&action=history
[22:48:01] AndyRussG: I'm reading that exception.log includes a short hex fingerprint you can grep for, that should have showed up in the onwiki error, fwiw
[22:49:07] awight: Hmmm... where in the onwiki error? When I published from the translate interface, it showed a short text (the one quoted) in a JS alert box. So I didn't catch any more details, that's why I was gonna try it again w/ dev browser tools on
[22:50:39] "Change of state failed" was all it said? alert box? whoa
[22:52:25] Fundraising Sprint N*E*R*D, Fundraising Sprint ODB, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 2 others: Publishing translations for central notice banners fails - https://phabricator.wikimedia.org/T104774#1446587 (AndyRussG) The banner is fixed now! It started to appear correctly...
[22:52:42] awight: I'm pretty sure that was about it!
[22:53:36] Looks like nikerabbit and ejegg are already suspecting an exception in the CN TranslateEventMessageGroupStateChange hook
[22:54:09] that's all i can think of...
[22:54:49] looks like wfRunHooks allows the exception to bubble up...
[22:55:00] If there are no reports it sounds like a pretty edgy case
[22:55:30] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FTranslate/882d197ed13757d44387cf669956fa302bcf750d/resources%2Fjs%2Fext.translate.workflowselector.js#L142
[22:55:58] It could be ops's fault ;p
[22:56:10] hehe, well at least it's repeatable
[22:56:49] well not quite! The time it happened and the reporter reported it, the Mediawiki: namespace message _didn't_ update--and when I got the message, it did!
[22:57:03] I mean, when I got the error
[22:57:41] yikes!
[22:58:28] 'tux-workflow-status-triangle' ?
[22:58:54] I'm looking for where to put breakpoints before I try it again...
[22:59:09] Maybe something weird about this specific message?
[22:59:15] A bug in PHP UTF8 code?
[23:00:28] Rats, this ssh key doesn't work on the WMF bastion yet
[23:00:39] Sorry I can't check the logs...
[23:01:44] One sec, i'll take a look
[23:01:45] hrm, https://logstash.wikimedia.org/#/dashboard/elasticsearch/hhvm
[23:02:32] I'll get the time of Andrew's encounter of the bug kind
[23:02:49] 22:37, 10 July 2015
[23:03:15] probably a few minutes after that ^
[23:05:24] awight: ejegg: from the JS code, it looks like the error happens when an API call comes back with an error. I'm gonna try again just w/ the network tab open...
[23:06:59] AndyRussG: if it helps narrow it down, NikeRabbit was saying that it's probably coming from BannerMessageGroup::updateBannerGroupStateHook
[23:07:14] oh wait, that was the direct edit time
[23:07:20] yep
[23:07:22] translation was 5:28
[23:07:28] wat
[23:07:50] That would be earlier than the edit...
[23:08:06] Sounds like non-atomic behavior from doEditContent or something
[23:08:07] AndyRussG: you said you got the error last night, right?
[23:08:10] awight: ejegg: K just edited and re-published the message again and got the same change-of-state alert dialog box error message thing again
[23:08:16] ejegg: yep!
[23:08:23] This one? https://meta.wikimedia.org/w/index.php?title=MediaWiki:Centralnotice-wm2015register-text1/ja&action=history
[23:08:31] Cos there's no new log there (yet)
[23:08:45] gah. nvm me
[23:08:56] I see what you're all up to now: actually working
[23:09:23] awight: ejegg: it's a post to api.php that never comes back
[23:09:28] oooh
[23:09:39] oh, hmm
[23:09:52] not seeing anything new in exception.log
[23:10:24] action:"groupreview"
[23:10:24] format:"json"
[23:10:24] group:"Centralnotice-tgroup-wm2015register"
[23:10:24] language:"ja"
[23:10:26] state:"published"
[23:10:28] token:"sekret sekret sekret"
[23:11:19] (those are the post params)
[23:12:07] Wish I could help or hinder more, gotta run!
[23:13:34] lemme check fatal.log
[23:13:48] ejegg: a thanks!
[23:14:07] ejegg: Should find one in the last 5 min or so
[23:14:36] huh, nothing in that file either
[23:17:02] AndyRussG: can you find the name of the server you're hitting?
[23:18:34] ejegg: hmm... Where would that be? It's meta.wikimedia.org, but beyond that...
[23:18:47] lemme see...
[23:20:08] look for a comment like "Parsed by mw1218"
[23:20:18] where?
[23:20:31] oh, i guess the api request might hit a different server
[23:20:43] that's in the html
[23:21:06] The response from the api.php call? But there was no response...
[23:22:11] the comment's in the html for the wiki page, but yeah, api call might be a different server
[23:23:48] so no response at all, not even a 500 error?
[23:25:57] ejegg: doesn't look like it! In the dev tools it's a grey ball with no response code and an empty response
[23:28:10] hmm
[23:28:29] looking through the apache logs now
[23:28:52] not much info in them
[23:30:16] oh, here's something.
[23:30:19] Jul 10 23:20:48 mw1212: [proxy_fcgi:error] [pid 16659:tid 139932358133504] [client 10.64.32.107:29795] AH01070: Error parsing script headers, referer: https://meta.wikimedia.org/w/index.php?title=Special:Translate&group=Centralnotice-tgroup-wm2015register&language=es&filter=%21translated&action=translate
[23:30:32] woot!
[23:30:52] Error parsing script headers, huh?
[23:31:07] strange that it's not the api endpoint, and in es rather than ja though
[23:31:16] Oh, referer. That's probably correct
[23:31:55] oh yeah, there are 3 lines just like it below
[23:31:58] lemme see the langs
[23:32:51] all es
[23:32:55] I think crazy AndyRussG probably has his browser set to "nostalgia" language
[23:33:19] "script" headers jumps out as unusual
[23:33:49] that's by far the most common error in apache2.log
[23:35:22] yuck, I don't even want to repeat my cursory search results, http://serverfault.com/questions/421398/apache-php-fpm-random-error-parsing-script-headers-seg-faults
[23:36:11] nah my browser is all boring English... (only my keyboard layout is nostalgia)
[23:36:56] ejegg: ^ above I pasted the api.php request headers... I can look 4 the main HTML request headers
[23:37:04] Might be easier to reproduce locally at this point
[23:38:10] The time was something like between 23:05 and 23:11 UTC
[23:41:08] huh, nothing with that kind of referrer in the apache error log in that timeslot
[23:41:29] ejegg: how are you searching? Maybe my ua string would help? Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0 Iceweasel/39.0
[23:42:20] just grepping for wm2015register
[23:42:36] ejegg: might not get all the way to apache, but exception.log instead?
[23:43:04] I think for apache to cry foul, the script needs to return a 5xx status
[23:43:08] looked there first and didn't see anything likely
[23:43:14] but let me look again
[23:44:43] Yeah! I mean, it wasn't an error that made apache return an http error status code... Something actually curled up and died on the farm
[23:45:02] ejegg: how about if I trigger the error again at an agreed exact time?
[23:45:55] yeah, the only api exceptions around then are from officewiki, and some notification fetches from enwiki
[23:46:11] AndyRussG: sure, give it another shot!
[23:46:43] ejegg: K, I'm gonna aim for 23:50:00 UTC
[23:46:56] * AndyRussG gets out atomic clock
[23:51:38] ejegg: arg, now it doesn't revert to non-published state after I edit it, so I can't set it to published again to trigger the error
[23:51:44] just saw an ApiUpload exception on commons, but that's it
[23:51:48] oh, weird
[23:53:20] ejegg: K just triggered it again! 23:52:42, or quite close to that!
[23:54:48] nothing yet... just some dewiki page edit api exceptions
[23:56:53] If u get tired of log-and-seek, we could also post the details to Phabricator and cc ops... Not meaning to dissuade at all! It's a fun hunt...
[23:57:20] i'm feeling pretty stumped
[23:59:06] Silly segfaulting unlogging zombie-like endpoints...
[23:59:49] fer reals