[00:01:37] oh, because it was just merged 6 days ago :) makes sense
[00:05:39] Wikimedia-Blog, MediaWiki-Core-Team: Finish up blog post for Cirrus - https://phabricator.wikimedia.org/T85176#972510 (Tbayer)
[00:30:26] ori: https://packagist.org/packages/wikimedia/arc-lamp
[00:30:51] bd808: dude, you're awesome. thanks very much.
[00:31:11] Do you have a packagist account that I can add as a co-maintainer?
[00:34:58] bd808: no; i'll create one now.
[00:35:20] bd808: 'atdt'
[00:35:47] {{done}}
[00:36:40] thanks!
[00:40:58] bd808: should ext-redis be listed as a dependency? it ships with HHVM, and this extension doesn't work with PHP5.
[00:41:48] eh it works fine. hhvm properly claims to include redis
[00:42:07] cool. wfm.
[00:42:30] I tested on mw-vagrant when I made the manifest
[00:43:19] * ori nods
[00:43:29] I thought briefly about trying to make a vagrant role to set it up but then didn't because the install process seemed like it would be a pita
[00:44:18] yes, the codebase is still a few source files slapped together; the pieces would have to be better integrated before that would be viable
[00:47:24] ori: https://gerrit.wikimedia.org/r/#/q/status:open+project:mediawiki/services/jobrunner,n,z
[00:48:05] AaronSchulz: https://gerrit.wikimedia.org/r/#/c/184412/ should have a comment in-line
[00:49:38] MediaWiki-Core-Team: MediaWiki Developer Summit 2015 proposal: Why MediaWiki slows us down - https://phabricator.wikimedia.org/T86280#972574 (RobLa-WMF) @daniel, thanks for agreeing to head up this session! I've pinged both @bd808 and @smalyshev as people that may be able to help give this session more shape...
[04:14:41] MediaWiki-API, MediaWiki-Core-Team, MediaWiki-General-or-Unknown: Fatal Error: Call to a member function disable() on a non-object (wfHttpError should check, if wgOut is an instance of OutputPage) - https://phabricator.wikimedia.org/T86398#972739 (Florian) Open>Resolved a:Florian
[06:13:29] thank the lord, mw.format() now provides sprintf-like functionality in javascript
[06:13:51] a few more decades and we'll have a decent development environment
[06:15:02] i love this, too:
[06:15:04] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/isInteger
[06:15:14] in boldface: "This is an experimental technology, part of the Harmony (ECMAScript 6) proposal."
[06:15:33] "Summary: The Number.isInteger() method determines whether the passed value is an integer."
[06:16:08] just imagine, we're only a few years away from this being available on most browsers
[08:11:45] _joe_: i didn't try any of the kernel params in the end, too many distractions
[08:11:54] so hopefully we can just go to the new package instead
[08:30:28] <_joe_> ori: I'll try with the kernel params this morning, so we have time to properly test the twemproxy package and the use of sockets
[08:30:37] <_joe_> I have all the time to do it
[08:30:46] cool
[08:30:51] <_joe_> one question: how do I configure mediawiki in beta?
[08:30:54] i tested the socket thing on osmium, it seemed to work
[08:31:39] oh, either by having a realms guard ( if ( $wmfRealm === 'labs' ) ... ) or by placing the config in one of the labs specific files
[08:31:48] <_joe_> I'd like to keep beta in the condition "prod + new stuff" as much as possible
[08:31:56] <_joe_> oh, ok so mediawiki-config as usual
[08:32:42] jenkins automatically pulls master every 5 mins IIRC, so it's not effective on sync-file/dir, but we typically sync changes to the whole cluster even if they are labs-only just so that there are no unsynced changes in the queue for the next deployer
[08:32:58] <_joe_> nod
[08:36:34] <_joe_> ori: one last thing and I'll stop harassing you. What is a typical memcache error on the problematic servers?
[08:38:16] 'CONNECTION FAILURE'. Tim devised this command line for aggregating by server: grep 'CONNECTION FAILURE' /a/mw-log/memcached-serious.log | awk '{print $3}' | sort | uniq -c (simple, but spares you from having to count fields :))
[08:38:34] <_joe_> ori: ok so only on fluorine
[08:38:43] <_joe_> I hoped to see those directly on the server
[08:39:07] 'tail -f /a/mw-log/memcached-serious.log | grep mw1232'?
[08:39:24] <_joe_> yeah, on fluorine :)
[08:39:31] <_joe_> not a real issue btw
[08:39:46] I'm also not sure if the errors are generated continuously, or if they are only apparent during peak traffic hours
[08:40:01] <_joe_> they are
[08:40:34] At some point last week I did a 'tail -10000 memcached-serious.log | grep -c FAILURE' and got very few and convinced myself (somewhat prematurely) the problem was resolved
[10:14:34] hi all. Will wikimedia be at FOSDEM 2015 in brussels (in two weeks) ?
[10:15:48] Yes, I think so
[10:15:55] but don't ask me about details, I have no clue
[10:18:16] ok
[10:27:31] <_joe_> ggherdov`: well not sure what you mean by "wikimedia being at fosdem"; quite a few of the WMF engineers will be attending of course
[10:27:34] <_joe_> me included
[10:38:35] _joe_: I think there's something official going on
[10:38:51] I saw something in phab I think
[10:39:08] <_joe_> hoo: I'm unaware, but I am terrible in following those things
[10:39:19] <_joe_> hoo: will you be @fosdem btw?
[10:39:57] No, can't make it... mediawiki summit in SF is already eating up more time than reasonable :/
[10:42:13] <_joe_> oh you'll be in SF anyways
[10:42:28] <_joe_> so we'll have time for a beer then :)
[10:43:25] Yeah :)
[11:27:00] _joe_: are we also going to try the unix domain socket? i'd feel like an idiot if all that work was for nothing :/
[11:27:18] <_joe_> ori: I am working on the package right now
[11:27:30] \o/
[11:27:37] i really did go to sleep, just woke up again :/
[11:27:38] <_joe_> ori: so yes, I just want to do that a bit more cautiously
[11:27:53] yeah makes sense
[11:27:56] <_joe_> ori: also, we should think of a way of getting 'response times' at the apache level
[11:28:00] <_joe_> we have nothing there
[11:28:13] <_joe_> and I'd really like to have per-server graphs of those
[11:28:23] <_joe_> so that we can see the effects of changes we make
[11:28:24] yeah, do you have any particular approach in mind?
[11:28:28] <_joe_> nope
[11:28:31] <_joe_> :)
[11:28:35] me neither, will do some research tomorrow
[11:34:04] _joe_: just FYI: tested new hhvm version on twn and still started timeouting
[11:36:42] <_joe_> sigh
[11:36:57] <_joe_> Nikerabbit: are you going to be in SF?
[11:37:30] _joe_: yes 20th-30th approximately
[11:40:24] <_joe_> ok, hopefully we can get some time to pair up on that
[11:48:11] _joe_: I'll add a note to myself to ping you
[11:48:51] <_joe_> k :)
[11:51:40] or is it "for myself"?
[11:51:50] for myself
[11:52:31] actually, i think either would work in this context.
[11:53:22] <_joe_> yeah, 'note to self' is commonly used
[11:53:32] <_joe_> well, it doesn't mean it's correct
[12:39:05] * Nikerabbit imagines writing a note to his hand
[12:40:07] Does your hand read notes
[13:04:11] MediaWiki-Core-Team, MediaWiki-extensions-TitleBlacklist: Title blacklist intermittently failing, allowing users to edit things they shouldn't be able to - https://phabricator.wikimedia.org/T85428#973343 (Aklapper) @tstarling: Any progress/success finding out the reason? Any news to share?
[14:27:37] AaronS: i'll look into the linksupdate
[14:29:51] "RandomDSdevel awarded T70361: Section edit containing "/*" is mis-parsed a Haypence token."? WTF is a "Haypence token"?
[14:51:04] <^d> anomie: tokens are silly :)
[14:51:13] <^d> Little things you give tasks
[14:51:16] <^d> To show stuff
[14:51:20] <^d> Like a cactus
[14:51:25] <^d> Or a haypence
[14:52:45] "I am going to put a cactus on this task, because I am silly"?
[14:53:10] <^d> I gave you the toucan
[14:54:00] And yet the notification calls it a pterodactyl.
[14:54:14] <^d> They all have silly not-quite-what-it-is names :)
[14:54:38] <^d> Well, maybe not
[14:54:47] <^d> Just the toucan and the cactus
[14:54:51] <^d> "baby tequilla"
[14:54:54] 9/16 of them do
[14:55:01] "Evil Spooky Haunted Tree"?
[14:55:17] <^d> well normal trees aren't as fun :)
[14:56:12] * anomie awards the "Manufacturing Defect?" token to T899
[15:33:39] this makes me happy http://www.gog.com/game/grim_fandango_remastered
[15:33:51] I never got to play it when I was a kid and I've heard it's great
[15:42:48] MediaWiki-Core-Team, operations: Deploy multi-lock PoolCounter change - https://phabricator.wikimedia.org/T85071#973808 (fgiunchedi) Open>Resolved a:fgiunchedi resolving, changes deployed including https://gerrit.wikimedia.org/r/#/c/184640/
[16:40:21] If anyone wants something that should be easy to +2, https://gerrit.wikimedia.org/r/#/c/184662/1
[17:25:43] MediaWiki-Core-Team, MediaWiki-API, MediaWiki-General-or-Unknown: Fatal Error: Call to a member function disable() on a non-object (wfHttpError should check, if wgOut is an instance of OutputPage) - https://phabricator.wikimedia.org/T86398#973998 (Anomie) a:Florian>Anomie
[17:29:30] csteipp: legoktm: Any idea why a password matching could take 30s?
[17:29:44] hoo: For one password?
[17:30:45] pbkdf2 is cpu intensive, so anything else running on the cpu is going to slow it down. Not super memory intensive, so swapping shouldn't affect it.
[17:31:09] mh
[17:31:19] Looking again, it was different servers
[17:31:30] so the user was hitting the special page several times, nevermind
[17:31:36] but still... why did that time out before
[17:32:02] Takes 10s now (for that user) and can match 3 accounts by mail
[17:32:15] and until yesterday it didn't work... how?
[17:32:36] Maybe memory is a problem (but I don't know where exactly... in my tests on osmium it wasn't)
[17:37:18] ori: T86305 is incredibly frustrating
[17:42:08] i chimed in
[17:42:29] csteipp: https://github.com/facebook/hhvm/issues/3740 could that hit us?
[17:42:42] Do we iterate over stuff that often? (I have no idea)
[17:43:02] ori: Warning: Invalid argument supplied for foreach() in /srv/mediawiki/wmf-config/StartProfiler.php on line 79
[17:43:18] again?!
[17:43:29] Looks like the xenon array doesn't always have "phpStack" data?
[17:43:53] I see 41 in the last hour in logstash
[17:44:08] well, there's "asyncStack", but we're not using Hack
[17:44:38] there's also "ioWaitSample" which looks very interesting
[17:44:40] Maybe we have a host that is out of sync... that line number is the closing brace of the current foreach loop
[17:44:42] anyways, i'll add a check
[17:44:47] oh
[17:44:49] hmmmm
[17:45:11] let me see what hosts it comes from
[17:45:22] MediaWiki-Core-Team: New serializable DataUpdate classes - https://phabricator.wikimedia.org/T86597#974021 (aaron)
[17:47:54] ori: It's from a variety of hosts and only 1 or 2 times from each host over the last hour
[17:48:18] so I guess just sometimes the data isn't there
[17:48:19] well, xenon only fires once every ten minutes per host
[17:48:30] so one or two times per host per hour is pretty often
[17:48:37] https://logstash.wikimedia.org/#dashboard/temp/vL3ik2niTeCWWOAUjJReSA
[17:48:47] anyways, i'll add a check
[17:48:49] thanks for flagging
[17:48:51] CR in a sec
[17:49:31] MediaWiki-Core-Team, SUL-Finalization, MediaWiki-extensions-CentralAuth: Investigate and fix OOMs caused during account globalization - https://phabricator.wikimedia.org/T78727#974023 (hoo)
[17:49:54] MediaWiki-Core-Team, SUL-Finalization, MediaWiki-extensions-CentralAuth: Investigate and fix OOMs caused during account globalization - https://phabricator.wikimedia.org/T78727#851988 (hoo) I'm fairly sure now that https://github.com/facebook/hhvm/issues/3740 hits us over here.
[17:52:17] ori: https://github.com/facebook/hhvm/issues/3740
[17:52:36] Do we have any procedure to push for that? Or shall we rather try working around it?
[17:53:00] swtaarrs: ^^
[17:53:12] it got flagged as high-prio, and you provided a minimal repro case (which is excellent of you)
[17:53:27] so i imagine it'll get a fix pretty soon
[17:53:46] if you want to give it a shot yourself you'd need to be cozy with gdb
[17:55:29] I gave it a brief shot myself... no luck (see issue on github)... valgrind also didn't help (to no surprise on such a thing with GC and stuff)
[17:56:17] yeah debugging memory leaks in hhvm is tricky. i know the HHVM folks have a build with ASAN (http://www.chromium.org/developers/testing/addresssanitizer)
[17:56:24] we don't at the moment
[17:57:28] how bad is it right now?
[17:57:58] hoo@fluorine:/a/mw-log$ grep -c MergeAccount apache2.log
[17:57:58] 100
[17:58:05] that's the special page failing
[17:58:20] We also had the problem with automerge... so I guess it's a blocker for SULF
[17:58:57] oh hm
[17:59:02] just read sgolemon's comment
[18:00:29] hoo: some stuff in HHVM is only freed when the request context is destroyed, so the fact that memory usage is growing is not in itself proof that anything is wrong. try changing the repro case so that it runs only once (no loop), and then run it with hhvm --count=1000
[18:00:33] (or whatever)
[18:00:46] ori: mh
[18:01:14] but the leak was happening after a certain amount of memory... eg. 30000 worked, but 30001 didn't (numbers totally made up)
[18:01:40] Seeing the timestamps on the comment, I guess she only posted it after I poked around
[18:03:53] ori: Can we tweak jemalloc to allocate differently (and if only for testing)?
[18:07:57] yes
[18:09:41] see https://phabricator.wikimedia.org/T820
[18:10:34] <3 if you can add a link to that in wikitech's HHVM/Debug
[18:10:41] * ori has 1000000 tabs open
[18:41:00] bd808: because of the weird failure to refresh fatal error handler code you may still see these errors until all hosts are restarted
[18:41:10] i'm not going to initiate a cluster-wide restart because it doesn't seem worth it
[18:41:31] yeah that seems crazy
[18:42:06] hhvm persists handlers across request?
[18:43:09] that's how things appeared at one point
[18:43:15] weird
[18:43:17] and i think you observed something like that too, no?
[18:43:27] but i haven't done the requisite leg-work to confirm / isolate the bug
[18:43:31] so i'm still not 100% sure
[18:43:40] I'm old and forget things :)
[18:43:52] * bd808 looks for his keys and glasses
[18:44:04] "Warning: could not unserialize value, no igbinary support" ugh
[18:44:06] still
[18:44:42] we should hack HHVM to print the key in such cases
[18:45:43] Same for the case where it fails to unserialize -- https://github.com/facebook/hhvm/issues/4517
[18:55:39] legoktm: I wonder if you should add a test for DeferredStringifier that tests that the callback isn't called when the object isn't stringified.
[18:57:17] anomie: I can do that
[19:02:10] anomie: so this is what my test case ended up looking like... http://fpaste.org/169203/21175705/raw/ is that useful?
[19:02:40] legoktm: Looks good to me.
[19:03:06] * anomie tries to remember if there's a better idiom than "assertTrue( true )"
[19:05:11] updated https://gerrit.wikimedia.org/r/184205
[19:22:00] VisualEditor, MediaWiki-Core-Team, VisualEditor-MediaWiki, MediaWiki-Configuration: convertExtensionToRegistration.php does not set defaults for globals, so $wgResourceModules += array(...) and similar cause fatals - https://phabricator.wikimedia.org/T86311#974254 (Legoktm) Open>Resolved
[19:33:59] MediaWiki-Core-Team, MediaWiki-Configuration: convertExtensionToRegistration.php should write out extension.json files in a nice order - https://phabricator.wikimedia.org/T86608#974327 (Legoktm) p:Low>High
[19:49:58] typing while icing your hand is weird
[19:51:13] ow. what happened?
[19:51:46] tendinitis
[19:52:00] * ori nods
[19:52:15] doctor twinged my tendon today and I didn't scream but I very firmly told her not to do it again
[20:17:31] <^d> manybubbles: Ouch.
[20:18:15] <^d> Totally different subject...I had mentioned before that Elastica had implemented "connection strategies" since we pulled last. We might be able to use the callback-based strategy to do some sort of multi-write.
[20:18:27] <^d> When we get to the "need to support codfw" point
[20:18:38] <^d> I was poking it last night.
[20:20:20] ori: hmm has that actually been confirmed as a leak?
[20:22:49] ^d: I think i'd prefer a fork at the job level
[20:22:58] swtaarrs: hoo may be able to provide more data but I think what has been confirmed is unexpected max memory usage. It may be jemalloc quirks rather than a leak.
[20:23:09] ok
[20:23:20] that seems a bit less brittle
[20:23:22] ContentTranslation-Deployments, MediaWiki-Core-Team, MediaWiki-extensions-ContentTranslation: Content Translation Beta Feature security review - https://phabricator.wikimedia.org/T85686#974448 (Jsahleen) @CSteipp All the security related patches are merged and on beta-labs now. Please verify that things are...
[20:23:25] yeah unless usage keeps growing in an infinite loop that doesn't count as a leak
[20:23:37] <^d> manybubbles: I also considered that. We'd just have to tack on the destination on but that'd be easy.
[20:23:50] ^d: da
[20:23:59] swtaarrs: Ok, mh... so what now?
[20:24:03] we could give it a different name so we could turn it on and off quick
[20:24:05] That is a huge problem :/
[20:24:41] hoo: what's the situation this was causing a problem in? 2MB of extra memory usage isn't a problem is it?
[20:24:46] <^d> manybubbles: Doubling our # of jobs sounds like it could be fun. We already back up on a major enough template jobs.
[20:24:50] <^d> Only reason I shied away from it
[20:25:35] ^d: we're backed up on those because we throttle them. we could play with the throttle i think and be ok
[20:26:09] <^d> If they have different names does that mean we'd need a duplicate of each job type?
[20:26:10] swtaarrs: We do that in a loop, so I guess it adds up
[20:26:15] <^d> Just to subclass and handle the diff url?
[20:26:39] hoo: but if you change that small repro to an infinite loop, does usage keep growing over time or does it top out at 2mb?
[20:26:44] But I don't really have details... our memory limit is 330MiB (or something along these lines) and the base usage is probably less than 50MiB
[20:27:18] swtaarrs: Not sure, I only tested the case where it started the behavior
[20:29:04] meh, my hhvm is not up to date...
[20:29:11] I'll try it on my machine
[20:31:10] hoo: ok this is definitely a leak
[20:31:13] I'll comment on the task
[20:31:26] thanks
[20:47:57] MediaWiki-Core-Team, Wikidata, wikidata-query-service: Figure out quantity representation - https://phabricator.wikimedia.org/T85298#974503 (Smalyshev) May also be related to T86528
[20:49:22] MediaWiki-Core-Team, Wikidata, wikidata-query-service: Deploy a Wikidata complex query service into production - https://phabricator.wikimedia.org/T85159#974508 (Smalyshev)
[20:49:23] MediaWiki-Core-Team, Wikidata, wikidata-query-service: Evaluate Titan as graph storage/query engine for Wikidata Query service - https://phabricator.wikimedia.org/T76373#974507 (Smalyshev) Open>Resolved
[20:54:46] MediaWiki-Core-Team, Librarization, MediaWiki-General-or-Unknown: [Regression] MediaWiki should detect absent or outdated vendor - https://phabricator.wikimedia.org/T74777#974519 (bd808) >>! In T74777#951776, @Legoktm wrote: >>>! In T74777#951775, @Legoktm wrote: >> MWLogger is a class inside MediaWiki (as th...