[00:36:29] Phabricator, MediaWiki-Core-Team, Security-Reviews: Install PHPExcel so I can export reports - https://phabricator.wikimedia.org/T152#1026737 (Dzahn) >>! In T152#987801, @Chad wrote: > Let's drop the owncloud package. It's completely unrelated and we don't need it. We'd just deploy this via git-deploy or some...
[00:39:33] Phabricator, MediaWiki-Core-Team, Security-Reviews: Install PHPExcel so I can export reports - https://phabricator.wikimedia.org/T152#1026745 (Dzahn) quote from that bug "Since ownCloud recently decided not to ship PHPExel anymore, I lost interest into packaging it. Please note that OLE is DFSG-incompatible a...
[01:39:03] ^d: Any idea why my mw-vagrant install would be loading extensions/CirrusSearch/tests/jenkins/Jenkins.php on each request?
[01:39:28] It's causing the hooks for jenkins stuff to run all the time
[01:40:51] eval.php says $wgWikimediaJenkinsCI is undefined
[01:41:31] oh.... PHP_SAPI !== 'cli'
[01:44:32] no that is guarded by an and. Hmmm
[02:53:27] <^d> bd808|BUFFER: I hate that file.
[03:08:51] the recent contentmodel change is causing tons of problems
[03:09:05] users are finding ways to create pages with the wrong content model, which are then permanently unfixable
[03:21:19] jackmcbarn, are there bugs for these issues?
[03:21:46] MaxSem: i think there are (they're rather old issues, I don't really remember), but they're just now becoming bad
[03:22:01] a major aggravating factor is that nobody is allowed to change a page's content model for "security" reasons
[03:49:50] jackmcbarn: lets grant the userright to sysops by default?
[03:50:04] as a stopgap maybe
[03:50:15] but i see no reason to even have it
[04:07:40] how do people even manage to screw content model up?
[04:39:23] ^d: Doh. It's loaded on purpose. https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/elasticsearch/templates/CirrusSearch.php.erb
[04:44:20] bd808, nevertheless it's a pita as it turns wikis into wiktionaries
[04:46:20] its triggering a redis connection error for me too. Trying to track that down. Something is passing an empty string as the redis connection password instead of a null
[04:47:26] hmm, i think i've seen an empty redis password somewhere
[04:47:37] wasn't that CS itself?
[04:48:47] yup. it's in that jenkins file. Hadn't looked there for some dumb reason
[04:49:25] does that mean that under Jenkins we have a redis server that takes a password but it needs to be empty?
[04:49:48] on mw-vagrant redis has no password rather than an empty password
[04:52:03] now guess why I tried https://gerrit.wikimedia.org/r/#/c/155861/ ? :P
[04:52:15] heh
[04:52:58] yawwwn
[04:53:02] going home
[06:42:42] operations, Incident-20150205-SiteOutage, MediaWiki-Core-Team, Wikimedia-Logstash: Prototype Monolog and rsyslog configuration to ship log events from MediaWiki to Logstash - https://phabricator.wikimedia.org/T88870#1027134 (bd808) The syslog transport will pass messages up to 65023 bytes minus the syslog mes...
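The bug bd808 is chasing above hinges on a distinction worth spelling out: an empty-string password is not the same as no password. A client that only checks truthiness conflates the two, so "" gets sent as an AUTH command against a server with no password set. A minimal sketch of the correct handling (illustrative Python, not the actual Redis client code in MediaWiki or CirrusSearch):

```python
def redis_auth_commands(password):
    """Return the AUTH commands a client should issue on connect.

    None means "no password configured": send no AUTH at all.
    "" means "authenticate with the empty string": AUTH is sent,
    and will fail against a server with no requirepass set.
    """
    if password is None:
        return []
    return [("AUTH", password)]
```

The jenkins file was supplying "" where mw-vagrant's passwordless redis needed None, hence the connection errors.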
[10:04:00] MediaWiki-extensions-SpamBlacklist, Continuous-Integration, MediaWiki-Core-Team: Figure out a system to override default settings when in test context - https://phabricator.wikimedia.org/T89096#1027386 (hashar) NEW
[10:04:17] MediaWiki-extensions-SpamBlacklist, Continuous-Integration, MediaWiki-Core-Team: Figure out a system to override default settings when in test context - https://phabricator.wikimedia.org/T89096#1027397 (hashar)
[10:06:56] MediaWiki-extensions-SpamBlacklist, Continuous-Integration, MediaWiki-Core-Team: Figure out a system to override default settings when in test context - https://phabricator.wikimedia.org/T89096#1027423 (hashar)
[11:34:13] MediaWiki-extensions-SpamBlacklist, Continuous-Integration, MediaWiki-Core-Team: Figure out a system to override default settings when in test context - https://phabricator.wikimedia.org/T89096#1027711 (Florian) +1 for a new hook, but the documentation should be really clear, that this hook should be used ver...
[14:37:34] MediaWiki-Interface, Mobile-Web, MediaWiki-Core-Team, Parsoid: Switch MobileFrontend over to using Parsoid for its read HTML - https://phabricator.wikimedia.org/T76970#1028024 (marcoil)
[15:00:58] <^d> I've decided I don't like UsageException
[15:10:46] Any particular reason?
[15:11:51] <^d> Our use of it in Cirrus is kind of lame. We throw it as an error condition deep down.
[15:12:05] <^d> Really, that code should support bubbling some error back to the user instead of an explosion
[15:12:25] <^d> It's not so much UsageException, just our use of it.
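^d's point above — a failure too common for exception logs should bubble an error value back to the user instead of being thrown — can be sketched. This is illustrative Python only, not CirrusSearch's actual code; the Status shape loosely mirrors MediaWiki's Status class, and the function and error key are made up:

```python
class Status:
    """Minimal Status-style result object: success flag plus either a
    value or a list of user-presentable errors."""
    def __init__(self, ok, value=None, errors=None):
        self.ok = ok
        self.value = value
        self.errors = errors or []

def search(query):
    # A common, expected failure (e.g. an empty query) is reported as
    # data rather than raised, so callers can surface it to the user
    # without an exception landing in the logs.
    if not query.strip():
        return Status(False, errors=["empty-query"])
    return Status(True, value=[])  # hypothetical result list
```

Exceptions then stay reserved for genuinely exceptional conditions, which is the behavior ^d is asking for.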
[15:13:19] <^d> Too common a failure condition for an exception to hit the logs :)
[15:39:27] Possible-Tech-Projects, MediaWiki-Core-Team: Removing inline CSS/JS from MediaWiki - https://phabricator.wikimedia.org/T89134#1028165 (NiharikaKohli)
[16:11:54] operations, Incident-20150205-SiteOutage, MediaWiki-Core-Team, Wikimedia-Logstash: Prototype Monolog and rsyslog configuration to ship log events from MediaWiki to Logstash - https://phabricator.wikimedia.org/T88870#1028237 (bd808) If we wanted to use rsyslog as an intermediary, it would need configuration si...
[16:16:41] bd808: btw, re better error pages help, what exactly is needed from my team? I kinda missed that part. :)
[16:18:02] greg-g: Not reported on the ticket, but apparently during the outage on Friday he found that the error page being served wasn't the one that I had pointed him to in mediawiki-config
[16:18:17] where he == ?
[16:18:49] https://phabricator.wikimedia.org/p/Nirzar/ (had to look it up)
[16:19:16] This really needs better tracking in phab
[16:19:17] were no patches linked to this ticket? where's the work, man?
[16:19:32] * greg-g is just having still-waking-up-fun
[16:19:33] in his dev machine so far I guess
[16:19:36] gotcha
[16:20:01] I deramed up a wya for him to test medaiwki-config changes under vagrant
[16:20:07] *dreamed
[16:20:17] who the typos there
[16:20:32] heh. "wow" /me stops typing
[16:20:35] <^d> google docs
[16:22:26] bd808: thanks, I commented. I don't see any messages from anyone with "we want this out the door by $date" so I'm not going to bust my butt on it yet
[16:22:57] <^d> Isn't this just 503.html in mw-config?
[16:23:07] <^d> and 404.html?
[16:23:20] *nod* the ticket has talk of needing sign off on final wording etc. Mostly Jarred wanted a point of contact for asking tech questions
[16:23:35] bd808: ahhh, ok, that's easy enough :)
[16:23:52] ^d: yeah and whatever page is directly served by varnish when things are really borked
[16:24:14] plus w/404.php and ... I think I found one more error page
[16:24:24] <^d> Yeah I'm having trouble finding the varnish one
[16:24:54] It's buried in ops/puppet somewhere. Let me find my emails on this
[16:25:09] <^d> Gah, which one of these 404s do we use?!?
[16:25:28] w/404.php
[16:25:46] the other one is only for secure.wikimedia.org
[16:27:01] ^d, greg-g: email forwarded with some stuff I dug up previously.
[16:27:12] <^d> bits & secure use 404.html
[16:27:18] <^d> Everything else uses 404.php
[16:27:41] bd808: ty
[16:27:50] hhvm-fatal-error.php and wmf-config/missing.php are also used
[16:28:00] so many error paths
[16:28:13] ...are they functionally different?
[16:28:14] * ^d 's head explodes
[16:29:16] they are. hhvm-fatal is the hhvm 503 page. missing.php is for a non-active language for an active project
[16:29:21] kind of a meta 404
[16:29:30] <^d> Yeah, missing.php is special
[16:29:35] <^d> I have 6 error pages open in my editor for wmf-config.
[16:30:29] but we don't want all of them to have the donate button, right?
[16:31:00] who the hell knows
[16:31:28] right, just push the buttons, techy
[16:31:36] techie, whatever the right spelling is
[16:31:38] the comp I saw was working hard to hide the "ugly error message" too
[16:31:44] "monkey" I think is it
[16:32:17] It was twitterified a bit much I think
[16:32:42] "oops the server kitties ran away" sort of stuff
[16:34:54] right
[16:35:39] <^d> How about focusing on having less errors? Then everybody wins
[16:36:11] bd808: https://github.com/wikimedia/FirefoxWikimediaDebug/pull/1 Looks like you forgot that wikdata thing :D
[16:36:46] who uses that site? ;)
[16:37:43] <^d> loll github
[16:37:50] hoo: merged. I'll do a new release tonight
[16:38:02] Nice :)
[16:38:51] hoo: you should give a patch to or's chrome version too. https://github.com/wikimedia/ChromeWikimediaDebug
[16:41:28] https://github.com/wikimedia/ChromeWikimediaDebug/pull/1 There you go :)
[16:42:16] tgr mail bombed me last night with mediawiki-vagrant patches :)
[16:42:50] the best kind of mail bombing
[16:43:19] it is except now I feel like I have to review them...
[16:46:13] or dan :)
[16:48:47] bd808: I might copy/paste your 404/error page email to a phab paste and link it from that ticket
[16:49:03] greg-g: +1
[17:40:11] http://taylorswift.tumblr.com/post/110643780215/style-music-video-friday-feb13style in case you were wondering!
[17:50:00] operations, Incident-20150205-SiteOutage, MediaWiki-Core-Team, Wikimedia-Logstash: Prototype Monolog and rsyslog configuration to ship log events from MediaWiki to Logstash - https://phabricator.wikimedia.org/T88870#1028470 (Anomie) >>! In T88870#1025205, @bd808 wrote: > What it does is prepend a partial RFC...
[17:51:33] operations, Incident-20150205-SiteOutage, MediaWiki-Core-Team, Wikimedia-Logstash: Prototype Monolog and rsyslog configuration to ship log events from MediaWiki to Logstash - https://phabricator.wikimedia.org/T88870#1028474 (bd808) >>! In T88870#1028470, @Anomie wrote: >>>! In T88870#1025205, @bd808 wrote: >>...
[18:10:58] legoktm: https://gerrit.wikimedia.org/r/#/c/188721/
[18:11:29] * legoktm looks
[18:14:51] MediaWiki-Core-Team: ObjectCacheSessionHandler should avoid pointless writes in write() - https://phabricator.wikimedia.org/T88635#1028560 (Legoktm)
[18:20:30] bd808: I guess I need the "elk" role to test that patch of yours, but I still can't seem to convince it to actually log anything.
[18:20:58] hmm... it "should" work. Works on my machine ;)
[18:21:19] bd808 / legoktm / anomie: I'm running late (waiting for wife's car to get fixed). My user stories are up. Any chance we can delay our meeting until noon pst? Or you guys can continue without me.
[18:24:49] noon works for me
[18:31:52] anomie: any object to moving our meeting from 19:00Z to 20:00Z?
[18:31:57] *objection
[18:32:05] bd808: not really
[18:32:18] k. let's do that then for csteipp
[18:32:52] * bd808 gets a lunch hor now today
[18:33:19] wowie. the typing still evades me today
[18:38:45] ^d: SHould I take you off of the mw-core monthly backlog grooming meeting invite list? I'm glad to have you if you want to come but ...
[18:39:01] <^d> Please do :)
[18:45:48] <^d> bd808: https://gerrit.wikimedia.org/r/#/c/188597/
[18:51:27] ^d: {{done}}
[18:51:42] <^d> :)
[18:51:47] <^d> yay, less profile calls
[18:56:32] ^d: looks like https://gerrit.wikimedia.org/r/#/c/186946/ and the flow could merged by overriding jenkins
[19:09:45] AaronS, you broke stuff: https://integration.wikimedia.org/ci/job/mediawiki-extensions-zend/2739/console :P
[19:35:59] AaronS: I'm not seeing why that patch is failing...?
[19:36:12] (the BagOStuff callback one)
[19:42:35] legoktm: Do you have user stories for CentralAuth, local only, and/or 2-factor? I'm happy to bang out any that are missing... just not sure if you have any not in the etherpad yet
[19:43:37] <^d> bd808|LUNCH: Crud, I missed some
[19:45:16] <^d> legoktm: Easy :) https://gerrit.wikimedia.org/r/#/c/188590/
[19:45:26] <^d> https://gerrit.wikimedia.org/r/#/c/188595/ too
[19:49:11] csteipp: I have them in a local notepad right now, i'll put them in the etherpad in a few minutes
[19:50:15] ^d: merged :)
[19:51:12] <^d> sweet
[19:51:16] <^d> merged your ED stuff too
[19:52:41] thanks :D
[19:56:52] ^d: want to deploy https://gerrit.wikimedia.org/r/#/c/188586/ ?
[19:57:15] <^d> Not this second
[19:57:47] sure, just sometime...I'm not at my "deploying" box atm
[20:01:50] ori: https://gerrit.wikimedia.org/r/#/c/186410/ just deleting code
[20:03:55] VisualEditor, VisualEditor-Performance, MediaWiki-Core-Team: Parsoid performance analysis - https://phabricator.wikimedia.org/T85870#1028846 (Jdforrester-WMF)
[20:08:40] MediaWiki-Core-Team: Memory stash BagOStuff wrapper that relays all writes - https://phabricator.wikimedia.org/T88342#1028852 (aaron)
[20:12:31] AaronS: needs rebase sorry
[20:13:09] lots of rebasing today
[20:20:37] <^d> AaronS: Can you sanity check https://gerrit.wikimedia.org/r/#/c/189786/ for me?
[20:23:06] 'directory' => '/usr/local/apache/images',
[20:23:11] ^d: what's that for?
[20:23:23] <^d> Oh yeah, where it caches thumbs
[20:23:26] <^d> Bad directory
[20:23:33] you shouldn't need that with 'backend'
[20:24:51] <^d> I don't want backend here probably...
[20:25:01] <^d> This was all dumped from $wgInstantCommons defaults
[20:26:15] it's wikitech, so I guess the local FS is fine
[20:26:24] so just setting directory is probably easiest
[20:26:51] FileBackendGroup will see it and automatically make/register a backend and setup.php will also make the config be pointing to this implicit backend
[20:26:58] ...so many layers of b/c...
[20:31:17] <^d> So many ways to configure this crap
[20:31:24] <^d> s/So/Too/
[20:31:54] VisualEditor, VisualEditor-Performance, MediaWiki-Core-Team: Parsoid performance analysis - https://phabricator.wikimedia.org/T85870#1028906 (Jdforrester-WMF)
[21:02:08] legoktm (also csteipp, bd808): Are we doing authz here too? I see it's in the RFC, but for the most part I'm not sure it makes much sense (only for OAuth, probably).
[21:02:41] anomie: Tyler originally put it into the RFC, then pulled most of it out at Tim's request
[21:02:50] ^ that
[21:03:00] I think for us, we're only focusing on authn right now
[21:03:02] He seems to have missed a bunch.
[21:03:02] but thinking about authz is allowed obviously
[21:03:35] * anomie thinks authn will be hard enough without mixing authz in too
[21:05:11] Yeah, it would be nice, after the design for the authn side is mostly done to do a quick version of this exercise for OAuth/authz, and make sure there aren't glaring incompatibilities. But yeah, just authn is going to be difficult enough.
[21:08:56] I wonder if OAuth is really "authn", or if it falls under "session provider" versus the standard session provider where all the existing authn methods are just fancy ways to get the session-key cookie set..
[21:12:58] bd808: Do we have a phab ticket about fatal logs?
[21:13:24] hoo: not that I'm specifically aware of
[21:15:53] <^d> bd808: Missed some :p https://gerrit.wikimedia.org/r/#/c/189802/
[21:16:34] <^d> Only calls in core left are lang converter
[21:16:37] <^d> Looking at it now
[21:17:54] One thing I note about the RFC, he seems to completely punt on UI for each provider. The one semi-relevant example in the "User interface" section seems to imply a form with separate username and password boxes for each method.
[21:21:41] bd808: https://phabricator.wikimedia.org/T89169
[21:21:53] not sure who to CC :/
[21:25:34] me and joe
[21:28:13] Done
[21:33:02] bd808: PM question!!! We have https://phabricator.wikimedia.org/tag/mediawiki-logging/ but nothing for logging generally, right? (something like WMF-Logging or somesuch?)
[21:33:23] Release-Engineering, MediaWiki-Core-Team: Log php fatals with full backtraces again (fatal.log on fluorine) - https://phabricator.wikimedia.org/T89169#1029177 (bd808)
[21:33:36] bd808: basically, I'm trying to think where to put that ticket other than team boards ^ :)
[21:34:06] I added hhvm because that was the proximal cause
[21:34:11] and, by extension, if there's anything else that might be falling through the cracks in #operations or #core or #mwgen or whatever
[21:34:24] I'm sure tons of stuff :(
[21:34:39] That's keeping me up at night right now
[21:34:46] which part?
[21:34:51] (and really?)
[21:35:01] things not getting triaged
[21:35:11] sort of really
[21:35:23] :/
[21:36:16] triage has at least been front of mind the last couple of days
[21:36:39] wondering how it really happens today. I know we find things but I don't know what we don't find
[21:36:41] yeah, understood
[21:37:04] I think we rely on andre and the community mostly
[21:37:09] I have other things keeping me up at night, but that one is also on my mind
[21:40:23] Release-Engineering, MediaWiki-Core-Team, Wikimedia-Logstash: Log php fatals with full backtraces again (fatal.log on fluorine) - https://phabricator.wikimedia.org/T89169#1029201 (bd808)
[21:41:26] Release-Engineering, MediaWiki-Core-Team, Wikimedia-Logstash: Log php fatals with full backtraces again (fatal.log on fluorine) - https://phabricator.wikimedia.org/T89169#1029116 (bd808) Whatever we end up doing here, the events should also end up in #wikimedia-logstash.
[21:44:26] hoo: how important is debug message capture loss to you as a consumer of the logs? Meaning how hard do you think we should be working to make sure that every wfDebugLog call ends up on fluorine and in logstash?
[21:44:59] Obviously no logging sucks, but is some amount of loss reasonable?
[21:45:19] You mean from the stuff we have set as what we want to log? In InitializeSetting
[21:45:30] d
[21:45:31] * s
[21:45:36] Asking this in context of https://phabricator.wikimedia.org/T88732
[21:45:51] the redis thing was an attempt to be really lossless
[21:45:55] what blew up
[21:46:00] *that blew up
[21:46:22] now trying to find real requirements to inform the next gen design
[21:46:45] I can't give you any numbers, but I had various occasions in the past where I needed one specific log entry
[21:47:13] So having loss there is a rather huge trade off
[21:47:52] I can say pretty confidently that udp2log loses some unquantified amount of messages
[21:48:00] I know :/
[21:48:12] Also api.log is now sampled and all things like that :/
[21:48:47] Especially for the SULF stuff I had quite some occasions where I just couldn't investigate because of missing logs
[21:48:54] +1 ^
[21:48:56] *nod*
[21:49:02] But before the hhvm switch it mostly just worked fine
[21:49:15] (and full api logs are also important, to stress that)
[21:49:36] care to add some "customer" feedback on T88732 about impact of gaps in logs?
[21:50:20] I think there is work underway to ensure that api.log can be un-sampled. Ti.m seemed to be equally adamant that it should be complete
[21:50:32] eg not sampled
[21:53:26] <_joe_> bd808: there is no work going on ATM. I should work on it and I am 100% at work on redistributing the memcached machines atm
[21:54:02] _joe_: We each only have so many hands. It's at least a known issue that we plan to address
[21:54:52] <_joe_> btw, I have seen that the redis instances on the memcached hosts contain session data. Anyone knows something more about them?
[21:55:03] legoktm: I put some initial thoughts on the bottom of the etherpad.
[21:55:55] * legoktm looks
[21:58:18] <_joe_> I see keys like loginwiki:session:a3be28d804686391d884cf698672afd7
[21:58:21] <_joe_> in redis
[21:58:32] <_joe_> containing serialized php data
[22:00:40] CommonSettings.php:$wgSessionCacheType = 'sessions';
[22:01:10] <_joe_> hoo: where are the IPs of those redis hosts set?
[22:01:36] Would be in operations/mediawiki-config somewhere
[22:01:58] mediawiki-config/wmf-config/session.php
[22:01:59] <_joe_> bd808: next question is, how are those keys sharded?
[22:02:13] <_joe_> ok I'll look at the code directly
[22:02:16] <_joe_> hoo: thanks!
[22:02:20] https://github.com/wikimedia/operations-mediawiki-config/blob/35056c1c906505e30f4492a771882a54332f5402/wmf-config/session.php
[22:02:49] _joe_: It's essentially a RedisBagOStuff
[22:03:06] <_joe_> my next question would be why we don't use nutcracker to connect to redis as well
[22:03:14] <_joe_> given we use it for memcache
[22:03:16] https://github.com/wikimedia/operations-mediawiki-config/blob/ef3f5c8f0f47653e57439d8d33b09e118005a756/wmf-config/CommonSettings.php#L363-L371
[22:03:23] That's where the list gets used
[22:03:50] <_joe_> ok
[22:04:00] <_joe_> thanks a lot
[22:04:20] I'd have to read code to see what RedisBagOStuff does to pick a server. AaronS would know for sure
[22:04:39] <_joe_> yeah I'll look
[22:04:55] <_joe_> if it's aaron's code it's going to be readable :)
[22:04:57] ArrayUtils::consistentHashSort
[22:05:26] <_joe_> ok, so no way to assign labels whatsoever, I guess
[22:05:34] no
[22:05:40] it's just using md5 basically
[22:05:52] <_joe_> md5?
[22:06:19] On the cache key
[22:06:30] <_joe_> yeah, but after that
[22:06:35] <_joe_> to assign it to a server
[22:06:41] <_joe_> that's the interesting part
[22:06:44] <_joe_> I'll look
[22:07:03] That's what I meant
[22:07:07] md5( $elt . $separator . $key )
[22:07:16] $elt is the server name or ip
[22:07:44] <_joe_> I want to understand which percentage of users will be unable to login while we move a server and how many will lose it once we re-insert it with a new IP
[22:08:08] <_joe_> I'll do the math tomorrow
[22:08:09] The distribution should be pretty even
[22:08:22] <_joe_> hoo: consistent hashing doesn't mean that
[22:08:42] <_joe_> it means that rebalancing is kept to a minimum upon a change in the hash slots
[22:09:03] _joe_: But session keys should be very well random
[22:09:46] <_joe_> hoo: yes, but what you showed there was a way to hash the key to a md5 hash, not how to shard it in 16 slots :)
[22:09:54] <_joe_> I'll look tomorrow
[22:09:57] uasort( $array, function ( $a, $b ) use ( $hashes ) { return strcmp( $hashes[$a], $hashes[$b] ); } );
[22:09:59] ok :P
[22:31:57] ^d: are we profiling individual db queries already?
[22:32:03] <^d> Yeah
[22:32:12] <^d> Should be?
[22:32:14] <^d> AaronS: ^
[22:32:23] <^d> Or is it just slow queries?
[22:32:42] well, iirc that was a ver slow query :P
[22:32:50] very even
[22:33:28] <^d> I'll amend to drop that bit
[22:33:50] ^d: oh, I just +2'd
[22:34:41] <^d> meh
[22:47:50] bd808: Can you fill me in (or link to doc) about logging? E.g. the path from wfDebugLog() to logstash. And at which point they meet and join the path that hhvm/apache2 log takes as well.
[22:48:11] A fair amount has changed, and not sure I still get it.
[22:49:38] Making a doc that makes sense is still on my "todo list" unfortunately. I'm free in about an hour if you'd like an irc or voice chat about it then. Maybe by answering your questions I can create the outline of a proper doc
[22:50:27] There is some info on wikitech that is horribly out of date -- https://wikitech.wikimedia.org/wiki/Logstash
[22:50:58] udp2log to logstash is gone and it doesn't discuss the PSR-3/Monolog layer in MW
[22:52:29] I see some stuff in logstash (though not all that used to be), and some stuff on fluorine.
[22:52:41] Cool, yeah, that'd be cool. I'll ping you in a bit
[22:53:08] logstash is down since the outage on Friday. I hope to get it back up "soon"
[22:53:32] But need to change some things about the architecture of message delivery
[22:54:06] https://phabricator.wikimedia.org/T88732 and https://phabricator.wikimedia.org/T88870
[22:54:09] https://phabricator.wikimedia.org/T89169 requests logging of fatal. Since your comment says to also send to logstash, what does the bug ask for? For fatal.log on fluorine to come back?
[22:54:25] yep, on fluorine
[22:54:33] I personally don't use logstash often :P
[22:54:43] yes. we need to do something to get better fatal data out of hhvm
[22:55:04] today we have events with no stack traces in hhvm.log and logstash
[22:55:22] supposedly we can configure hhvm to add stack traces
[22:55:29] it may be as easy as that
[22:55:58] MaxSem looked at the issue some over the summer so he may be of help
[22:56:47] pfft. one config variable
[22:59:56] ^demon|away: when you have some time, could you look over https://www.mediawiki.org/wiki/Extension:ExtensionDistributor/tardist and make sure it's sane?
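The md5-and-sort scheme quoted above (ArrayUtils::consistentHashSort) can be sketched in Python. This is an illustrative translation, not MediaWiki's actual code; the separator default and server names are made up. Each server is ranked by md5(server + separator + key), and the key lives on the first server in that order, which is why removing one server only remaps the keys that ranked it first — the property _joe_ cares about for session loss:

```python
import hashlib

def consistent_hash_sort(servers, key, separator="/"):
    """Order servers for a key by md5(server + separator + key).

    The key is assigned to the first server in the returned order.
    Because each server's rank depends only on (server, key), removing
    a server leaves the relative order of the survivors unchanged.
    """
    return sorted(
        servers,
        key=lambda elt: hashlib.md5((elt + separator + key).encode()).hexdigest(),
    )
```

So when a redis host is pulled, only sessions whose first-ranked server was that host move; everything else stays put.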
[23:14:12] bd808, Config::Bind(LogNativeStackOnOOM, ini,
[23:14:12] error["LogNativeStackOnOOM"], false);
[23:24:52] VisualEditor, VisualEditor-Performance, MediaWiki-Core-Team: Parsoid performance analysis - https://phabricator.wikimedia.org/T85870#1029418 (tstarling) Open→Resolved
[23:30:42] MediaWiki-Core-Team: Remove cache anti-dependencies - https://phabricator.wikimedia.org/T89184#1029444 (aaron) NEW a: aaron
[23:35:59] Release-Engineering, MediaWiki-Core-Team, Wikimedia-Logstash: Log php fatals with full backtraces again (fatal.log on fluorine) - https://phabricator.wikimedia.org/T89169#1029470 (bd808) @maxsem pointed out that HHVM has a configuration option to add stacktraces to OOM errors. I'm not sure, but the flag name...
[23:36:22] Krinkle: I've got some time to chat if you're free.
[23:46:27] Release-Engineering, MediaWiki-Core-Team, Wikimedia-Logstash: Log php fatals with full backtraces again (fatal.log on fluorine) - https://phabricator.wikimedia.org/T89169#1029488 (MaxSem) Considering all these insane implicit parameter name conversions deep in HHVM's guts, it should probably be called hhvm.er...
[23:46:57] bd808, ^^^^
[23:47:26] bd808: Heya, thx
[23:47:29] MaxSem: Just read it.
[23:47:46] MaxSem: So it's not really what hoo wants :(
[23:48:00] Krinkle: hangout or irc chat?
[23:48:07] legoktm: https://gerrit.wikimedia.org/r/#/c/189780/ passes now
[23:48:14] bd808: So yeah, where does it go first from wfDebugLog() in prod?
[23:49:15] wfDebugLog -> MWLoggerFactory -> MWLoggerMonologSpi -> Monolog
[23:49:22] bd808: I'd prefer chat if that's alright. Doing a bit of non-blocking stuff at the same time (also with people here).
[23:49:49] hm, there's also FullBacktrace but I'm not even sure if it's even used after grepping
[23:49:56] yeah text works I hope. I'll end up with a log to work from later too
[23:51:25] Krinkle: MWLoggerFactory::getInstance() returns some PSR-3 logger implementation. In our prod config that is a Monolog\Logger instance created by the MWLoggerMonologSpi factory
[23:52:07] The log factory config is in https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/logging.php
[23:52:09] bd808: So I haven't actually gotten familiar with what PSR-3 or Monolog constitutes for this purpose. Is it a string format, communication protocol for udp or tcp, both?
[23:52:26] PSR-3 is just a standardized log interface for php
[23:52:26] PSR-3 is just the PHP object model
[23:52:31] Right
[23:53:02] When Monolog is used you pick Handlers (transport) and Formatters to process the log events
[23:53:30] you can have a stack of them such that each message is handed to N Handlers each of which has a Formatter
[23:54:26] The config I wrote for prod has two Handlers in most "channels". One that sends packets to udp2log and another that sends to Logstash
[23:55:01] bd808: Ah, we're publishing to logstash directly from apaches?
[23:55:06] yes
[23:55:10] Interesting
[23:55:21] well we were until Friday and will again soon
[23:55:26] :)
[23:55:37] The Redis transport I picked failed under high load
[23:55:50] We will be switching to another transport
[23:55:50] bd808: I suppose the udp2log (which was previously aggregated to logstash via fluorine?) is now terminated at fluorine, right?
[23:56:22] yeah. I just removed the log2udp forwarding rules from it when I got the redis transport working
[23:56:40] The transport to logstash being redis? Does that mean logstash has built-in Redis server? Or is it one we set up that is then connected to logstash?
[23:56:57] The likely next candidate is udp syslog datagrams sent either directly to logstash or relayed from the local rsyslog service
[23:57:26] Redis as an intermediary. Both Monolog and Logstash connected to it
[23:57:56] beta is still running with that config and the current role::elk in mediawiki-vagrant does as well
[23:58:14] So Formatter and Handler.
[23:58:26] Handler being a publisher. And formatter speaks for itself.
[23:58:34] *nod*
[23:58:38] K
[23:58:58] https://github.com/Seldaek/monolog/blob/master/src/Monolog/Handler/AbstractHandler.php
[23:59:09] yeah, we'd want the transport to be lightweight. I guess TCP would be too much to do lots within a request handled in PHP.
[23:59:38] the redis side failed basically under higher than normal error logging
[23:59:42] bd808: At which point(s) do we filter/forward messages?
[23:59:55] Or is everything in logstash by default based on keys directly from mediawiki?
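The handler/formatter fan-out bd808 describes — one channel, N handlers, each with its own formatter — has a direct analogue in Python's stdlib logging, used here purely to illustrate the shape (the buffers standing in for the udp2log and Logstash transports, and the channel name, are made up for the sketch; Monolog itself is PHP):

```python
import io
import logging

def build_channel(name):
    """One logical log channel fanned out to two handlers,
    each with its own formatter, mirroring the Monolog
    handler/formatter stack described in the conversation."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    udp2log_buf, logstash_buf = io.StringIO(), io.StringIO()
    # "udp2log-style" handler: flat text lines
    file_style = logging.StreamHandler(udp2log_buf)
    file_style.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    # "logstash-style" handler: structured records
    json_style = logging.StreamHandler(logstash_buf)
    json_style.setFormatter(
        logging.Formatter('{"channel": "%(name)s", "msg": "%(message)s"}'))
    logger.addHandler(file_style)
    logger.addHandler(json_style)
    return logger, udp2log_buf, logstash_buf
```

One log call then reaches both transports, each serialized by its own formatter — the same event, two wire formats.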