[00:02:07] (PS1) Ejegg: Add order_id to PayPal messages [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313541
[00:02:35] cwd sorry
[00:03:29] hehe, you're the fastest gun in the west
[00:03:34] actually in the west now too
[00:04:18] (CR) Cdentinger: [C: 2] Add order_id to PayPal messages [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313541 (owner: Ejegg)
[00:04:56] (Merged) jenkins-bot: Add order_id to PayPal messages [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313541 (owner: Ejegg)
[00:06:19] (PS1) Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313542
[00:06:32] (CR) Ejegg: [C: 2] Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313542 (owner: Ejegg)
[00:07:36] !log re-enabled donations queue consumer
[00:07:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:08:25] (Abandoned) Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313438 (owner: Ejegg)
[00:11:06] oops, 'donations limit must be numeric'
[00:11:09] let's see...
[00:11:14] is that just jenkins config?
[00:11:38] it's a drupal variable
[00:11:38] (Merged) jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313542 (owner: Ejegg)
[00:12:18] !log updated SmashPig // FIXME: var map can't put one thing in two places
[00:12:21] if ( isset( $new_msg->contribution_tracking_id ) ) {
[00:12:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:12:24] $new_msg->order_id = $new_msg->contribution_tracking_id;
[00:12:25] !log updated SmashPig from 3811f0f1c4bed1bd0b02264b5865ae36021cb275 to 8ff1950ccd87c649f1748f25e1a0a708c3337206
[00:12:29] }
[00:12:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:12:31] argh wrong copybuffer
[00:12:55] committed to permanence on wikitech...
[00:13:24] hehe
[00:13:27] #linux
[00:17:26] (PS1) Ejegg: 0 instead of '' for Unlimited messages [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/313543
[00:17:51] ok, let's take some baby steps forward, right up to the next queue consumer
[00:21:11] hmm, monitor hitting the old ipn listener
[00:21:42] (PS1) Ejegg: Merge commit '558b942b12872bfba33389640d63b592ca276368' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313544
[00:21:50] oh it's pointing right to a php file
[00:22:41] (CR) Ejegg: [C: 2] Merge commit '558b942b12872bfba33389640d63b592ca276368' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313544 (owner: Ejegg)
[00:23:06] (Merged) jenkins-bot: Merge commit '558b942b12872bfba33389640d63b592ca276368' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313544 (owner: Ejegg)
[00:24:33] ejegg: now i see one about globalcollect. is that obsolete?
[00:25:04] wait, what about globalcollect?
[00:25:16] !log updated civicrm from 637659ee8257562492405385d3fadaee53db998b to 18e59abac57ba85ee9d9dbd50f9f25df64522974
[00:25:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:25:46] cwd oh huh
[00:25:51] uhhh
[00:25:57] Fundraising Sprint Rocket Surgery 2016, Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 4 others: Banner not showing up on site - https://phabricator.wikimedia.org/T144952#2679065 (awight) @RobLa pointed to a deployment on the date this bug was re...
[00:26:02] no idea what that redirects to
[00:26:36] phooey, let's add monitoring sites to the things we need logins for
[00:27:22] wait, did I just actually break a thing? why did that not trigger earlier
[00:28:03] ejegg: that's what i'm wondering, we didn't shut anything off did we?
[00:28:14] just added a redirect for paypal
[00:35:41] ejegg: heads-up, I've installed your damaged UI locally but might not get the chance to try it out tonight. Don't wait up for me and feel free to take another dev to the review ;)
[00:36:04] awight: uhh, so, I think we need web servers restarted on thulium
[00:39:50] uh d'oh
[00:40:09] maybe drop that in #wikimedia-operations?
[00:40:28] I can't imagine how to short-circuit the panel
[01:37:31] AndyRussG: This is crazy, but what do you think about logging when we *do* get the message? That should only happen once per language, right?
[01:37:51] Maybe repeating at each cache expiry?
[01:38:33] awight: hmm I dunno... What is the Varnish setup for BannerLoader?
[01:38:44] I think it may not even be Varnished
[01:38:57] wat. Well what is caching it, then?
[01:39:07] I mean, we don't have any unique keys in the URL to make it change when the banner changes
[01:39:12] Maybe it's short varnished?
[01:39:19] I think maybe nothing
[01:39:35] I think we control that stuff with cache headers in SpecialBannerLoader
[01:39:54] SpecialBannerLoader::sendHeaders
[01:40:03] logged-in users get header( "Cache-Control: public, s-maxage={$wgNoticeBannerMaxAge}, max-age=0" );
[01:40:09] anons get header( "Cache-Control: private, s-maxage=0, max-age=0" );
[01:40:26] oops! reverse logged-in and anons.
[01:40:27] awight: ah cool yeah
[01:40:31] * awight facepalms
[01:41:00] drum roll...
[01:41:03] * AndyRussG sighs a sigh of cluser-relief
[01:41:06] ?
[01:41:13] 10 min?
[01:41:21] ./extension.json: "NoticeBannerMaxAge": 600,
[01:41:23] yep.
[01:41:27] ooooooooooooooooooooh
[01:41:42] * awight takes off cape and soars into night, crashes immediately
[01:42:04] robla: header( "Cache-Control: public, s-maxage={$wgNoticeBannerMaxAge}, max-age=0" );
[01:42:07] 600 seconds.
[01:42:25] awww a crash hit instead of a crash miss?
[01:42:27] hopefully you catch your bus, though :)
[01:42:44] cache crash in T-10minutes
[01:43:25] oooh, magic 600 number!
[01:43:26] mediawiki-config$ grep -ri wgNoticeBannerMaxAge
[01:43:28] wmf-config/CommonSettings.php: $wgNoticeBannerMaxAge = 0;
[01:43:49] O_O
[01:43:56] inside a conditional, I hope?
[01:44:05] looking
[01:44:18] testwiki, nbd
[01:44:53] heheh indeed
[01:45:08] K interesting
[01:45:20] well that's gotta mean something
[01:45:32] Still doesn't explain *why* the durn thing is regenerated empty, but yeah I like it so far.
[01:46:00] I guess it means that's how long it takes between requests that make it all the way to PHP
[01:46:25] How about, we accept a small element of sorely contingent reality here and invalidate the cache if BannerLoader calculates the banner to not exist at any point
[01:47:10] I'm thinking we can cauterize later but in this case a band-aid would be a good first step towards healing
[01:47:25] awight: I think we need to defer to ObjectCache's special features for just this kind of thing
[01:47:31] excellent
[01:47:48] but are we supposed to tickle that far down the pile of turtles?
[01:47:52] so we gotta find what parameters are currently used, how and when it actually is currently invalidated, and tweak
[01:47:54] yep
[01:48:48] I feel really weird about https://www.mediawiki.org/wiki/Localisation#Caching not mentioning ObjectCache.
[01:49:09] basically remove a middle turtle, poke it in the eye, spin it round, and put it back without disturbing the pile
[01:49:19] :)
[01:49:25] upside-down
[01:49:32] now which card did you pick?
[01:49:47] genesis of the shell game...
[01:49:48] um
[01:50:20] My trail ran cold at MessageCache, I haven't pried any deeper to see which backing might be failing us.
[01:51:20] there is a memcached vagrant role
[01:52:09] That still sounds expensive to me, but don't let me stop you
[01:52:29] we're probably talking about some kind of microsecond race condition
[01:52:39] hold on--
[01:52:46] that might not be true
[01:53:04] cos the CN admins see the problem in "preview"
[01:53:14] that's before any potential for a stampede
[01:53:34] maybe you will need vagrant db replication as well
[01:59:57] Fundraising-Backlog, fundraising-tech-ops: Spike: Investigate using php5-fpm on frack - https://phabricator.wikimedia.org/T147042#2679198 (awight)
[02:08:10] (PS1) Awight: Don't return a reference to object, that means nothing [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313546
[02:08:15] cwd: ^
[02:10:09] http://vignette1.wikia.nocookie.net/villains/images/c/cc/Do-the-right-thing-buggin-out.jpg/revision/latest?cb=20151107014327
[02:12:08] awight: val() going to take some cleanup before we can not return reference?
[02:12:33] I think we rely on that.
[02:12:42] Which is ridiculous, but at least it means anything
[02:13:30] awight: oh yes good point about the error pre-stampede!
[02:14:06] awight: confused about that block with the new comment, won't it automagically return ref with or without that bool?
[02:14:46] oh nm i get it
[02:14:56] local assignment
[02:15:09] sorry gotta make dinner, will check in a bit
[02:22:39] Fundraising Sprint Rocket Surgery 2016, Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 4 others: Banner not showing up on site - https://phabricator.wikimedia.org/T144952#2679226 (RobLa-WMF) Here's the chronology of possibly unrelated events: *...
[03:15:01] awight: removing the & on val() does not break any tests...
[03:15:38] <_<
[03:15:47] That QueueFactory thing might have relied on it
[03:16:03] I'd want to read through the many usages either way, is why I left it out of that patch.
[03:16:18] Not that I'm usually cautious ;)
[03:16:19] i still wonder how that thing came back from the grave like 8 hours later
[03:16:25] aaargh
[03:16:30] * awight fumbles for brains
[03:21:28] it's so confusing that the objects get memoized in the object but also optionally returned by reference
[03:21:58] 2 separate levels of semi persistence
[03:25:39] diabolic, I know
[03:26:46] I'm in favor of tearing out any caching we haven't proven a need for
[03:27:07] Make it creak with slowness before optimizing
[03:33:28] yep
[03:33:58] http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
[03:52:27] (PS1) Cdentinger: Remove Configuration::setDefaultConfig() [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313551
[03:54:34] Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Fatal exception when cloning/saving banners - https://phabricator.wikimedia.org/T146880#2679232 (AndyRussG) In logstash, I see a bunch of stuff like this: {"id":"V@0v0gpAAEIAAd5oBDcAAAEJ","type":"MWExceptio...
[03:58:31] (CR) Cdentinger: [C: 2] Don't return a reference to object, that means nothing [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313546 (owner: Awight)
[03:59:17] (Merged) jenkins-bot: Don't return a reference to object, that means nothing [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313546 (owner: Awight)
[04:05:00] (CR) Awight: [C: 2] "Thanks for hiding some knives!"
[wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313551 (owner: Cdentinger)
[04:05:42] (Merged) jenkins-bot: Remove Configuration::setDefaultConfig() [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313551 (owner: Cdentinger)
[08:32:02] Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Fatal exception when cloning/saving banners with translatable messages - https://phabricator.wikimedia.org/T146880#2679316 (Pcoombe)
[08:32:40] Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Fatal exception when cloning/saving banners with translatable messages - https://phabricator.wikimedia.org/T146880#2673576 (Pcoombe)
[08:32:43] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Using translatable messages in CentralNotice banners lead to an MWException - https://phabricator.wikimedia.org/T147002#2679322 (Pcoombe)
[08:33:20] Fundraising Sprint Stirring The Pot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Fatal exception when cloning/saving banners with translatable messages - https://phabricator.wikimedia.org/T146880#2679324 (Pcoombe) p:High>Unbreak!
[16:33:13] I'm just here to keep an eye on slander...
[16:33:31] But how's the sh*tstorm?
[16:59:18] awight|lurk: looks like things are OK now, I'm about to turn all the banners back on
[17:05:47] * AndyRussG anticipates hearing a "crunch" sound when CN breaks...
[17:17:37] d'oh--no slander
[17:18:02] bad slander, no cookie
[17:20:07] hehe
[17:26:59] borken.
[17:27:30] baaha. UTC.
[17:30:07] fr-tech: How come we never talk anymore?
[17:30:08] -- discuss.
[17:32:01] awkward. just a test of the automated annoyance system.
[17:46:20] love it
[17:54:28] (PS1) Ejegg: Add check for stable data before running GC charges.
[wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313606 (https://phabricator.wikimedia.org/T144557)
[17:54:52] (CR) Ejegg: [C: 2] Add check for stable data before running GC charges. [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313606 (https://phabricator.wikimedia.org/T144557) (owner: Ejegg)
[17:55:18] (Merged) jenkins-bot: Add check for stable data before running GC charges. [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/313606 (https://phabricator.wikimedia.org/T144557) (owner: Ejegg)
[17:57:00] !log updated civicrm from 18e59abac57ba85ee9d9dbd50f9f25df64522974 to e2b5bbfbdaaad29925fc60586ce7a2da8297cc2d
[17:57:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[18:43:25] (PS1) Ejegg: Add required gross_currency to PP refunds [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313607
[18:43:39] fr-tech some code review if you please? ^^
[18:44:14] We just got a couple of failmails, and I think we need ^^ to actually get PayPal refunds imported from the listener
[18:44:23] (CR) Cdentinger: [C: 2] Add required gross_currency to PP refunds [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313607 (owner: Ejegg)
[18:44:27] thanks!
[18:44:34] thank you!
[18:44:51] deploying, let's see if it gives apache the fits again
[18:45:10] (Merged) jenkins-bot: Add required gross_currency to PP refunds [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313607 (owner: Ejegg)
[18:45:50] that was so weird
[18:46:52] (PS1) Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313608
[18:47:12] do we run mod php or fpm?
[18:47:20] (CR) Ejegg: [C: 2] Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313608 (owner: Ejegg)
[18:47:31] (Merged) jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - https://gerrit.wikimedia.org/r/313608 (owner: Ejegg)
[18:48:27] cwd think it's mod php there
[18:49:33] !log updated SmashPig from 8ff1950ccd87c649f1748f25e1a0a708c3337206 to 4b36376f4b206406b5b88661cfcecf1b588d5bcf
[18:49:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[18:55:20] i guess that's why php could crash apache, if that's what happened
[18:59:17] yeah
[19:03:39] (PS2) Ejegg: Eradicate Stomp from crm repo [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/312443
[19:03:41] (PS3) Ejegg: Stop mirroring to stomp from audit processors [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/312442
[19:03:43] (PS2) Ejegg: Remove obsolete and broken wmf_unsubscribe module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/312441 (https://phabricator.wikimedia.org/T145419)
[19:07:49] (PS3) Ejegg: Use ct_id to find completed, avoid race [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/312563 (https://phabricator.wikimedia.org/T141477)
[19:16:52] (PS3) Ejegg: WIP use redis for Adyen jobs [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313130
[20:01:34] Fundraising Sprint Stirring The Pot, Fundraising-Backlog, Unplanned-Sprint-Work: List the teams/system that could accidentally break us - https://phabricator.wikimedia.org/T147096#2680851 (DStrine)
[20:35:09] (PS2) Ejegg: PDO: create/delete table doesn't need arg [wikimedia/fundraising/php-queue] - https://gerrit.wikimedia.org/r/287942
[21:05:03] (PS4) Ejegg: WIP use redis for Adyen jobs [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313130
[21:25:01] (PS5) Ejegg: Use redis for Adyen jobs
[wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313130
[21:26:05] (PS1) Ejegg: Remove bogus 'inflight' store [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/313622
[21:33:18] (CR) Ejegg: [C: 2] Cleanup: Use helper functions [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/312949 (owner: Awight)
[21:52:02] (PS13) Ejegg: MariaDB strict mode [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/310457 (https://phabricator.wikimedia.org/T145591)
[21:53:40] (CR) Ejegg: "Good catch AndyRussG - looks like the rows returned from the db still often have all the ints cast to strings. I undid all the new strict " [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/310457 (https://phabricator.wikimedia.org/T145591) (owner: Ejegg)
[21:55:17] relocating...
[22:51:43] AndyRussG: any luck reproducing the banner save error? I've got all the translate stuff enabled, and it's saving fine for me
[22:52:16] oh, let me try with group Fundraising to see if it's that weird workflow extn
[22:56:28] hmm, doesn't seem to
[23:14:30] ejegg: no, haven't yet... Was gonna try on the beta cluster
[23:14:54] Just looking at the stack traces, it's definitely related to translate extension
[23:17:16] oh, let me take a look
[23:17:25] To see, go to https://logstash.wikimedia.org, choose mediawiki-errors, timespan of the last 7 days, and enter CentralNotice in the search area
[23:18:08] Thanks!
[23:18:12] Note that there have been some unimportant changes in BannerMessage between master and wmf_deploy, so to see the real line numbers, go to and update your local wmf_deploy branch
[23:20:21] AndyRussG: I see a bunch of StaleCampaignExceptions there
[23:21:04] oh wait, i'm only looking at one day
[23:22:25] ejegg: yeah stale campaign stuff is all fine and dandy
[23:23:12] It used to be transparently caught and hidden because bad code but it's really just due to old JS that caches stuff (like choiceData) for tooooo loooong
[23:24:14] AndyRussG: the Invalid banner name supplied exceptions?
[23:25:06] some spammer asking for a banner that's naught... hopefully? are there a lot of those?
[23:25:14] that was the top one
[23:25:26] from the last hour?
[23:25:29] Or day?
[23:26:33] aaarg my network stretches the meaning of "connectivity"
[23:26:46] AndyRussG: last 7 days
[23:26:58] * ejegg can't find any stack traces
[23:27:07] ejegg: ah I think you didn't click on "mediawiki-errors" at the top
[23:27:14] That'll eliminate ones that are info or debug
[23:27:57] oh, i see
[23:31:03] AndyRussG: Content of revision X could not be loaded for validation?
[23:31:16] what exception message am I looking for?
[23:32:31] ejegg: yep that's the one. core/includes/Revision.php line 1553-1555
[23:33:51] cool, digging from there
[23:35:28] ejegg: interesting to note that $revId seems to be empty
[23:42:02] checking history for possibly related changes
[23:46:24] So somehow $content is returning falsy
[23:48:39] K the code that actually throws the error isn't new
[23:50:43] AndyRussG: is the $revId empty because we created it with newNullRevision?
[23:50:51] trying to understand that bit
[23:53:29] No idea. I think that's because it's not actually a real revision, just an action of protecting the page from edits?
[23:53:41] Heheh it could actually be related to the other bug, if it's because the content (message?) is null
[23:54:33] Off to pick up the kid. Have a good weekend.
[23:55:18] XenoRyet|afk: cya!
[23:56:08] ok, setting $wgTranslateWorkflowStates
[23:57:38] I think you said u set that up locally, i.e. not vagrantwise, right?