[02:43:11] TimStarling and ori: re 226220, i just found a really ugly issue that will be quite hard to fix now that it's been in production
[02:43:23] links table entries for modules are gone, and they won't come back on their own
[03:02:22] ori: wanna relicense BufferingStatsdFactory under MIT so it can be upstreamed?
[03:04:21] although maybe StatsdService will obsolete it
[04:02:15] jackmcbarn: https://gerrit.wikimedia.org/r/#/c/226220 ?
[04:02:26] tgr|away: sure
[04:02:58] tgr|away: let me know if it's still relevant
[13:08:01] jackmcbarn: Ugh, good catch on 226260.
[16:31:50] AaronSchulz: have a moment ?
[16:32:15] * AaronSchulz counts his moments
[16:32:29] AaronSchulz: https://phabricator.wikimedia.org/T106444 comment please ? :)
[18:34:49] tgr: do you still want me to relicense BufferingStatsdDataFactory?
[19:57:51] ori, I'll ask the maintainer if he is interested
[19:58:21] bd808: is there a convenience function for catching and suppressing exceptions?
[19:58:42] um, I mean catching, logging and suppressing
[20:00:13] MWExceptionHandler::logException() maybe?
[20:00:40] is it fine to just invoke that out of the blue?
[20:00:52] yeah
[20:01:48] that log channel is expected to be nasty errors though
[20:02:46] MWExceptionHandler::getLogContext() might be more what you want
[20:09:43] legoktm: https://gerrit.wikimedia.org/r/#/c/226256/
[20:11:29] AaronSchulz: "Caches than need" --> Caches that* need?
[20:11:35] also yay release notes :D
[20:19:18] legoktm: also https://gerrit.wikimedia.org/r/#/c/226212/
[20:24:59] bd808: do we have a vagrant role for testing statsd?
[20:25:21] yes! I just merged it yesterday.
[20:25:37] it logs to a file rather than pushing to graphite
[20:25:47] but useful for testing
[20:27:11] great, just what I needed
[20:27:21] thank gilles
[20:40:36] bd808: what's up with puppet/modules/service/manifests/init.pp ? what is that supposed to do?
[20:40:52] does it extend the core service resource somehow?
[20:41:17] it's just a placeholder for vars. service::node is the real meat
[20:41:18] just wondering why does the statsd role start with include ::service
[20:41:35] so that the $service::* vars are in scope
[20:41:47] it's a bit hacky
[20:41:56] I see
[20:42:10] (using ::service in ::statsd)
[21:28:18] bd808: I'm getting 21:19:46 liuggio/statsd-php-client: 1.0.16 installed, 1.0.12 required.
[21:28:22] 21:19:46 Error: your composer.lock file is not up to date, run "composer update" to install newer dependencies
[21:28:41] in https://integration.wikimedia.org/ci/job/mediawiki-extensions-qunit/5045/console after bumping the composer.json version
[21:28:47] but the lockfile is not committed
[21:28:53] what am I missing?
[21:29:05] is there a logfile in CI somewhere?
[21:29:38] *lockfile*
[21:29:47] tgr: it uses mediawiki/vendor's lockfile
[21:29:55] tgr: https://phabricator.wikimedia.org/T88211
[21:30:16] ah, thx
[21:31:27] I think StatsdService gets rid of the need for ori's BufferingStatsdDataFactory doesn't it?
[21:32:09] bd808: sort of although I would prefer ori's approach
[21:32:26] StatsdService is a mediator and you should keep those simple
[21:33:11] elegant design have never been my strong suit
[21:33:32] * bd808 smashes code with a hammer until it fits the problem
[21:33:41] https://github.com/liuggio/statsd-php-client/issues/45
[21:35:09] if that gets fixed, the buffering factory can be discarded IMO
[21:35:30] or maybe replaced with something super simple to do the key normalization
[21:36:09] oh yes we'd still need that somewhere
[21:36:48] not sure what it is good for actually, looks like remainders from the profiler
[22:14:28] bd808: do you think https://gerrit.wikimedia.org/r/#/c/225552/ has a snowball's chance in hell or should I just drop it and people will do it when they actually need to use classes and can test them?
[22:15:43] It looks pretty sane to me. Maybe poke mutante to see what he thinks of it?
[22:32:23] bd808: I don't remember the end of the discussion about sticking a 'statsd' key into structured logging data
[22:32:46] where should the code go? WikimediaEvents or core?
[22:33:05] for a handler?
[22:33:18] yeah
[22:33:38] I think it would be ok in includes/debug/logger/monolog
[22:33:56] unless you want to make a stand alone lib for it that we can import
[22:34:31] seems too simple for that
[22:35:25] WikimediaEvents == EventLogging right now and this is not that
[22:35:56] I've already got a custom syslog handler in there
[22:36:20] because the upstream one is kinda broken
[22:49:25] AaronSchulz: is there a reason APC isn't installed on the scalers? it means that we aren't caching any of the extension.json files..
[22:53:03] legoktm: is that this error? -- https://phabricator.wikimedia.org/T84842#1473663
[22:53:59] bd808: the exception is caused by Gadgets probably, aaron just fixed https://phabricator.wikimedia.org/T106743#1477257 . But extension.json stuff will fall back to no-cache if APC isn't installed rather than throwing exceptions
[22:55:23] the one in T84842 is a hhvm box...
[22:55:32] uhh what
[22:56:30] yeah it was very confusing since apc should be built into hhvm I thought
[22:57:13] php5 is also installed on the trusty machines, because it is part of the default trusty server install
[22:57:32] and the apache rules can default to php5 if the hhvm proxymatch rules didn't match
[22:57:37] which could be what is happening here
[22:57:43] meaning the response is actually generated by php5
[22:57:49] legoktm: mw1155 looks to be running hhvm too
[22:58:05] ori: ah. that would sort of make sense
[22:58:59] I'm just context-switching from something unrelated, give me a second to look.
[22:59:30] don't update the tasks yet
[22:59:55] I just linked the two that were reporting the same error
[23:00:04] nod
[23:02:43] yeah, I think this is what is happening.
[23:02:45] $ curl -sIH 'host: commons.wikimedia.org' 'localhost/w/thumb_handler.php/2/2e/Mirage_III_A_01_Mus%0Aee_du_Bourget_P1020118.JPG/424px-%0AMirage_III_A_01_Musee_du_Bourget_P1020118.JPG' | grep X-Powered-By
[23:02:45] X-Powered-By: HHVM/3.3.0-static
[23:03:08] $ curl -sIH 'host: commons.wikimedia.org' 'localhost/w/index.php' | grep X-Powered-By
[23:03:08] X-Powered-By: HHVM/3.6.1
[23:03:31] 'HHVM/3.3.0-static' is leftovers from Zend / HHVM split-testing which _joe_ never cleaned up
[23:03:59] it's what Apache sets if there is no X-Powered-By header, which is the case when Zend handles the request.
[23:04:27] if your immediate reaction is to exclaim, "holy mother of god, what a fucking mess", you are correct.
[23:04:52] ProxyPassMatch ^/w/(.*\.(php|hh))$ fcgi://127.0.0.1:9000/srv/mediawiki/docroot/mediawiki/w/$1
[23:05:10] that doesn't allow for /w/thumb_handler.php/8/83/...
[23:05:22] in modules/mediawiki/files/apache/sites/main.conf
[23:05:36] Yep.
[23:06:20] We don't have any pooled Precise scalers to fall back to, so I don't want to apply a fix across the fleet.
[23:06:40] Instead, I'll disable puppet on one of the scalers, fix the config, and see what happens.
[23:06:49] this means he hasn't been testing what the thought he was testing
[23:06:56] s/the/he/
[23:07:04] Yes, much face-palming.
[23:07:08] What should the regexp be?
[23:07:32] maybe a separate one for /w/thumb_handler.php/.* ?
[23:07:38] not sure really
[23:07:52] maybe just drop the '$'?
[23:07:56] or remove the $ anchor
[23:07:58] yeah
[23:07:58] yeah
[23:08:24] OK, I'll try that on mw1153 and !log
[23:09:55] Prepare for nasty surprises.
[23:12:04] bd808: Any suggestions as to the best log to tail / dashboards to watch?
[23:12:38] maybe https://logstash.wikimedia.org/#/dashboard/elasticsearch/default with "host:mw1153" as a query?
[23:19:00] ori: I'm still seeing X-Powered-By: HHVM/3.3.0-static. did you hup apache?
[23:19:19] Yeah, I was just puzzling over that. I did a full-blown hard restart.
[23:19:27] hmmm
[23:19:54] This could happen if we were setting an ErrorDocument for 500 responses, but we don't.
[23:20:17] commons vhost is actually in modules/mediawiki/files/apache/sites/remnant.conf
[23:20:37] and has no fcgi blocks?
[23:21:15] well, https://commons.wikimedia.org/wiki/Special:Version is definitely on HHVM
[23:21:58] yeah, wtf.
[23:22:45] OK, progress.
[23:23:08] yeah see it now -- X-Powered-By: HHVM/3.6.1
[23:23:17] I changed 'php_admin_flag engine on' for commons vhost to 'off'
[23:24:15] I think we need a discrete rule for thumb_handler.php, as you suspected above
[23:24:26] sneaky. with "on" mod_php takes over the routes I guess
[23:24:54] it's weird enough to draw attention to it I think
[23:26:44] I'm seeing thumbnailaccess logs and no crazy errors for mw1153 now
[23:26:52] which is a great start
[23:29:57] well, it's returning a 404
[23:29:57] some rsvg-convert segfaults but that shouldn't be an hhvm problem
[23:30:11] but this is so confusing -- is a 404 a problem, or the correct behavior?
[23:31:47] I don't see a spike of 5xxs or 404s in , so we don't appear to be breaking anything, at least.
[23:32:42] ori: this one works -- curl -sIH 'host: commons.wikimedia.org' 'localhost/w/thumb_handler.php/5/56/Villa_Eugenia_%28Hechingen%29.JPG/200px-Villa_Eugenia_%28Hechingen%29.JPG'
[23:33:28] nice.
[23:38:12] ori: I've got to run. dinner plans
[23:49:58] bd808: kk, thanks
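
Editor's sketch of the ProxyPassMatch pattern discussed between 23:04:52 and 23:07:58. It uses Python's re module as a stand-in for Apache's regex engine, which is an assumption; the two should agree for a pattern this simple, but this is not a test of the real config. The thumbnail path and its file name are hypothetical, since the path in the log is elided.

```python
import re

# Pattern copied from the ProxyPassMatch rule quoted at 23:04:52.
ANCHORED = re.compile(r"^/w/(.*\.(php|hh))$")
# The same pattern with the trailing '$' dropped, as suggested at 23:07:52.
UNANCHORED = re.compile(r"^/w/(.*\.(php|hh))")

# Hypothetical request paths; the file name is made up for illustration.
INDEX_PATH = "/w/index.php"
THUMB_PATH = "/w/thumb_handler.php/8/83/Example.jpg/200px-Example.jpg"


def captured_script(pattern, path):
    """Return the text captured by group 1, or None if the path does not match."""
    match = pattern.match(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name, pattern in (("anchored", ANCHORED), ("unanchored", UNANCHORED)):
        for path in (INDEX_PATH, THUMB_PATH):
            print(f"{name:10} {path} -> {captured_script(pattern, path)!r}")
```

Under these assumptions the anchored pattern rejects the thumbnail path, while the unanchored one matches it but captures only thumb_handler.php, so the trailing path segments are not part of $1. What Apache does with that remainder is outside what this sketch can show.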