[04:34:28] Betawiki(s) seems down. Any status updates on this or where I can get more info? [04:35:21] er, do you mean twn or betalabs? [04:35:36] twn seems fine... [04:36:05] > It's not just you! http://en.wikipedia.beta.wmflabs.org looks down from here. [08:24:45] hashar: beta is borked? [08:24:51] I don't see any skin [08:28:16] @notify hashar [08:28:16] This user is now online in #wikimedia-labs. I'll let you know when they show some activity (talk, etc.) [08:28:23] petan: maybe : [08:28:24] ) [08:28:29] let's clear bits cache [08:28:34] mhm... I would love it getting fixed :D [08:28:57] I decided to wide-deploy huggle to beta cluster for some massive testing :3 [08:29:09] don't kill the cluster please! :D [08:29:18] if we had any problems with spammers we will not in future :P [08:29:22] ahah [08:29:30] have you looked at RCStream on beta? [08:29:35] yes [08:29:45] but it's pretty complex to implement in C++/Qt4 [08:29:58] websocket? [08:30:07] don't you have some nice C++ lib to handle websocket? [08:30:10] I will probably have to create my own instance that will just convert the new stream to the old IRC format until this new technology gets decent support in Qt4 [08:30:13] no [08:30:23] not only is there no nice C++ lib for websocket, there isn't even a lib for JSON [08:30:37] !log deployment-prep restarted varnish on deployment-cache-bits01 . 
Hoping to clear bits cache [08:30:40] Logged the message, Master [08:30:42] Qt4 can only handle XML so far [08:31:00] and it will never support anything else, Qt5 does support JSON somehow though [08:31:19] the problem is that Qt5 isn't supported in 98% of linux distros and MacOS [08:31:41] the only distro that can handle Qt5 is debian SID :P [08:32:28] petan: https://wikitech.wikimedia.org/wiki/RCStream points to a C++ lib: https://github.com/hannon235/socket.io-poco [08:32:38] I am sure there are some others floating around [08:32:51] you probably don't want to reimplement your own websocket lib :D [08:32:51] I know, but these aren't that easy to use, they have an incredible dependency tree and aren't truly cross platform [08:33:08] it would make it a pain in the ass to build huggle smoothly everywhere [08:33:12] rewrite huggle in java :-] [08:33:15] lol [08:33:21] we already had it in c# before [08:33:30] c# sucks everywhere except windows [08:33:33] java sucks everywhere [08:33:40] and for json, I am sure there are a bunch of nicely written json libs [08:34:58] if I find anyone willing to volunteer half a year of their life to try to implement it (and eventually fail doing so) I will happily let them try, but I myself don't really have time for that... creating some temporary converter somewhere on labs is a thousand times easier [08:35:35] petan: ahhh libboost has a json parser : http://www.boost.org/doc/libs/1_55_0/doc/html/property_tree/reference.html#header.boost.property_tree.json_parser_hpp :D [08:35:49] the only requirement is: it must be super easy to compile on all known platforms: ./configure && make etc... :P [08:35:57] doc about the property tree: http://www.boost.org/doc/libs/1_55_0/doc/html/property_tree.html [08:36:03] yes, libboost is one of the things I would love to avoid having to use :P [08:36:16] any reason? 
[08:36:23] even linus torvalds has nightmares about libboost [08:36:30] let me find his email :) [08:36:33] boost seems to be reasonably cross platform and is definitely available in all distros [08:36:46] he invented some new swear words when he was talking about the boost lib :D [08:36:53] lol [08:37:00] he has different expectations though [08:37:27] and he would write a json parser straight in the kernel in a matter of an hour anyway [08:38:03] "using the 'nice' library features of the language like STL and Boost and other total and utter crap" that may "help" you program, but cause "infinite amounts of pain when they don't work" and "inefficient abstracted programming models" [08:40:26] hashar: yes that's the thing... sometimes writing your own JSON parser may seem easier than trying to implement one of these "nice and easy to deploy anywhere" libs... I did that with google breakpad, which is very well maintained by a large company. It still has a zillion issues. I can imagine that some random JSON library maintained by 1 guy will have many more problems [08:40:48] libboost definitely has more than one maintainer though [08:41:13] it took me 1 month to implement google breakpad and about 1 month to unimplement it after hundreds of complaints from people who stopped being able to build huggle themselves afterwards [08:41:25] :-D [08:42:44] there is a big difference between server-side development, where you can use thousands of various libs without problem, and development of an application that is meant to be built or installed by users on their own machines [08:43:27] I would prefer to wait for Qt5 to get decent support everywhere and then eventually switch to that, as it already has both websocket and JSON parsers... but now it's too early [08:46:11] 3Wikimedia Labs: Replicate centralauth.renameuser_status table to labs - 10https://bugzilla.wikimedia.org/68356#c4 (10Kunal Mehta (Legoktm)) 5RESO/FIX>3REOP It's been 24+ hours and the table still isn't available... 
[09:02:37] hashar: importing to en.wikipedia doesn't work :/ Import failed: Expected tag, got [09:02:46] but the XML file I upload has a mediawiki tag :(( [09:03:03] I did Special:Export on the English wiki [09:04:07] (03PS1) 10Legoktm: Remove api.py [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149814 [09:04:11] petan: import under hhvm is broken [09:04:19] I need to import these pages: https://tools.wmflabs.org/paste/view/dc68a582 [09:04:22] who can do that [09:04:42] what is hhvm [09:08:02] https://www.mediawiki.org/wiki/HHVM [09:09:20] petan: bits works with ?debug=true http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Main_Page&debug=true [09:09:25] petan: something is wrong in the caches :-/ [09:09:43] skins do work for me now [09:09:48] but import does not [09:10:32] ahh [09:10:46] (03CR) 10Yuvipanda: [C: 032 V: 032] Remove api.py [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149814 (owner: 10Legoktm) [09:10:48] some CSS loaded from load.php is garbage [09:12:26] http://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php?debug=false&lang=en&modules=ext.flaggedRevs.basic%7Cext.gadget.PagesForDeletion%2CwmfFR2011Style%7Cext.uls.nojs%7Cext.visualEditor.viewPageTarget.noscript%7Cext.wikihiero%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.interface%7Cmediawiki.ui.button%7Cskins.vector.styles%7Cwikibase.client.init&only=styles&skin=vector&* [09:12:27] :-( [09:16:13] it is compressed twice apparently [09:22:22] petan: I have logged the CSS issue on beta as https://bugzilla.wikimedia.org/show_bug.cgi?id=68720 [09:22:25] no clue what's happening [09:22:25] 3Wikimedia Labs / 3deployment-prep (beta): beta: ResourceLoader CSS URL gzipped twice, causing skins to be broken - 10https://bugzilla.wikimedia.org/68720 (10Antoine "hashar" Musso) 3NEW p:3Unprio s:3normal a:3None One of the CSS resource loader URLs yields content which is gzipped twice :-( The conte... 
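The "compressed twice" diagnosis above can be checked mechanically: doubly gzipped content still starts with the gzip magic bytes after one decompression, so the browser ends up rendering garbage CSS. A minimal sketch (the CSS bytes are illustrative, not the actual beta cluster response):

```python
import gzip

css = b"body { margin: 0; }"                        # stand-in for the real stylesheet
double_gzipped = gzip.compress(gzip.compress(css))  # the bug: output gzipped twice

# A client that decompresses once still sees a gzip stream, not CSS
once = gzip.decompress(double_gzipped)
assert once[:2] == b"\x1f\x8b"  # gzip magic bytes: still compressed

# Only a second decompression recovers the stylesheet
assert gzip.decompress(once) == css
```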
[09:36:45] (03PS1) 10Legoktm: Move configuration into a JSON file [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149817 [09:36:53] YuviPanda: ^ [09:38:06] * legoktm looks up logging [09:38:12] legoktm: ^ [09:38:12] (03CR) 10Yuvipanda: [C: 04-1] Move configuration into a JSON file (033 comments) [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149817 (owner: 10Legoktm) [09:38:15] bah [09:38:16] legoktm: ^ [09:40:21] (03PS2) 10Legoktm: Move configuration into a JSON file [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149817 [09:40:27] legoktm: also set the logformat to something saner than the default? "pid asctime message" maybe? [09:40:30] (03CR) 10Legoktm: Move configuration into a JSON file (033 comments) [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149817 (owner: 10Legoktm) [09:40:37] legoktm: err, or "asctime pid message'? [09:40:40] or just asctime message? [09:40:58] * legoktm nods [09:41:42] (03CR) 10Yuvipanda: [C: 032 V: 032] Move configuration into a JSON file [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149817 (owner: 10Legoktm) [09:44:22] legoktm: why is it /srv/src/extensions and not /srv/src? [09:45:05] So it would be easier to support skins in the future [09:45:29] legoktm: hmm, ok [09:45:42] namely the "Vector" extension and the "Vector" skin [09:45:55] legoktm: are you not doing this for core? [09:46:28] not yet [09:46:37] legoktm: hmm, ok [09:46:57] for core we have https://tools.wmflabs.org/snapshots/#!/mediawiki-core/master [09:47:01] legoktm: can I just create /srv/src in puppet, and you can create subdirectories there as you wish? 
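The "asctime pid message" log format being suggested maps directly onto Python logging format codes. A sketch, not the actual nightly.py change; the logger name and handler choice are assumptions:

```python
import logging

# "asctime pid message", roughly as suggested in the review above
formatter = logging.Formatter("%(asctime)s %(process)d %(message)s")
handler = logging.StreamHandler()
handler.setFormatter(formatter)

log = logging.getLogger("extdist")  # logger name is illustrative
log.addHandler(handler)
log.setLevel(logging.INFO)

# emits a timestamp, the pid, then the message text
log.warning("could not checkout origin/REL1_19")
```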
[09:47:11] sure [09:47:15] legoktm: that way, puppet changes won't be required when you need a new directory [09:47:16] legoktm: cool [09:48:01] (03PS1) 10Legoktm: Setup logging to conf.LOG_FILE [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149818 [09:48:04] YuviPanda: ^ [09:49:24] (03CR) 10Yuvipanda: [C: 032 V: 032] Setup logging to conf.LOG_FILE [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149818 (owner: 10Legoktm) [09:49:25] cool [09:52:12] legoktm: that would mean SRC_PATH will also point to /srv/src and you should construct the extension path from that [09:52:19] ok [09:52:36] legoktm: also only non null config params will be specified [09:52:50] hmm [09:52:55] let me remove "SUPPORTED_VERSIONS": null [09:54:29] (03PS1) 10Legoktm: Remove conf.SUPPORTED_VERSIONS and conf.EXT_LIST [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149819 [09:56:51] (03PS1) 10Legoktm: Add and use conf.EXT_PATH, conf.SRC_PATH no longer implies /extensions [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149820 [09:57:15] YuviPanda: ^ [09:58:39] (03CR) 10Yuvipanda: [C: 04-1] "You shouldn't blanket set them, I think. Set None to be default, but overridable from conf?" [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149819 (owner: 10Legoktm) [09:59:19] (03CR) 10Yuvipanda: [C: 032 V: 032] Add and use conf.EXT_PATH, conf.SRC_PATH no longer implies /extensions [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149820 (owner: 10Legoktm) [09:59:22] legoktm: ^^ [09:59:32] YuviPanda: why would we want to override on conf? I just need a global object to lazy load them in [09:59:58] legoktm: it's just good practice. so if you *do* want to skip things in the future, you don't need to modify the py files... [10:00:11] hmm ok [10:00:52] YuviPanda: so...do I just keep them as "null"? 
[10:01:19] legoktm: thing is, conf.json in the repo won't be loaded at all, since I'm putting a /etc/extdist.conf [10:01:27] [02:52:37] legoktm: also only non null config params will be specified <-- not sure how that will be affected. [10:01:39] legoktm: so all you have to do is ensure that the code doesn't crash if there's a null value in the conf [10:01:45] legoktm: err, if there's a *missing* value in the conf [10:01:49] that's null [10:03:43] !ping [10:03:44] !pong [10:03:45] ok [10:06:16] (03PS2) 10Legoktm: Don't die if conf.SUPPORTED_VERSIONS and conf.EXT_LIST aren't set [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149819 [10:06:22] YuviPanda: fixed [10:07:19] (03CR) 10Yuvipanda: [C: 032 V: 032] Don't die if conf.SUPPORTED_VERSIONS and conf.EXT_LIST aren't set [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149819 (owner: 10Legoktm) [10:07:40] legoktm: bah, needs manual rebase: https://gerrit.wikimedia.org/r/#/c/149820/ [10:07:44] * legoktm does [10:09:51] (03PS2) 10Legoktm: Add and use conf.EXT_PATH, conf.SRC_PATH no longer implies /extensions [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149820 [10:10:07] YuviPanda: ^ [10:10:47] (03CR) 10Yuvipanda: [C: 032 V: 032] Add and use conf.EXT_PATH, conf.SRC_PATH no longer implies /extensions [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149820 (owner: 10Legoktm) [10:10:48] legoktm: right. [10:11:18] legoktm: so another thing you should add (not right now, but eventually) is a pid file, so more than one instance of the nightly script doesn't run [10:12:02] can it be just a text file? 
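The "don't crash on a missing value" approach being discussed could look roughly like this; the config key names come from the review above, while `DEFAULTS` and `load_conf` are made-up names for the sketch, not extdist's actual code:

```python
import json

# Keys that may legitimately be absent from the deployed config
# (e.g. /etc/extdist.conf); the None defaults are assumptions
DEFAULTS = {"SUPPORTED_VERSIONS": None, "EXT_LIST": None}

def load_conf(path):
    """Load the JSON config, filling absent keys with None instead of KeyError-ing later."""
    with open(path) as f:
        conf = json.load(f)
    merged = dict(DEFAULTS)
    merged.update(conf)
    return merged
```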
[10:12:52] legoktm: a pid file is just a text file with the process id of the process that created it [10:13:09] legoktm: and then upon start the next time, you check the file, see if that process is still running, and if it is you just abort yourself [10:13:16] ok [10:13:31] legoktm: the path of that should be configurable as well [10:14:26] ok, let me find some food first [10:14:37] legoktm: err, erroring with 'error: pathspec 'origin/REL1_20' did not match any file(s) known to git.' again [10:14:50] is it actually dying? [10:14:53] legoktm: ya [10:14:53] it should just keep going [10:14:56] hmm [10:15:09] on extdist-test? [10:15:39] legoktm: ya [10:15:45] where are the full logs? [10:16:03] legoktm: /var/log/extdist [10:18:47] that just has stuff like 2014-07-28 10:12:32,369 WARNING:could not checkout origin/REL1_19 [10:18:51] which is fine [10:19:06] legoktm: yeah but it stopped [10:19:20] legoktm: hmm, error isn't being sent there? [10:19:26] it's no longer an error [10:19:30] legoktm: run sudo -u extdist python nightly.py [10:20:13] those are sent to stderr by git and I can't control them [10:20:18] but they're safe to ignore [10:21:10] legoktm: yeah, but the process stopped [10:21:13] legoktm: is that all it was doing? [10:21:17] yeah [10:21:21] by default it just does VE [10:21:22] legoktm: hmm, maybe it stopped because it was done? :| lol [10:21:31] need to pass --all to run everything [10:21:46] legoktm: lol [10:21:48] legoktm: again. fine [10:22:24] legoktm: you proposed some Jenkins jobs for labs/tools/extdist to add pyflakes + pep8 [10:22:32] legoktm: do you know about tox ? :)D [10:22:35] legoktm: yeah, use tox man. [10:22:49] legoktm: and set linelength to 120. steal it from the definition for quarry/web [10:22:52] hashar_: I just looked at whatever was already in the file...anything works :D [10:22:58] legoktm: it is a thin wrapper around virtualenv. Would let you define a target in your repository and jenkins will just invoke that target. 
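The pid-file protocol YuviPanda sketches above (record your pid; on the next start, read it back and bail out if that process is still alive) might look like this in Python. This is an illustrative sketch, not the actual labs/tools/extdist change; the path and function names are made up:

```python
import os

PID_FILE = "/tmp/extdist-nightly.pid"  # illustrative; should come from conf

def pid_running(pid):
    """True if a process with this pid exists (signal 0 probes without killing)."""
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    return True

def acquire_pidfile(path=PID_FILE):
    """Return False if a previous run is still alive, else record our pid and return True."""
    try:
        with open(path) as f:
            old_pid = int(f.read().strip())
        if pid_running(old_pid):
            return False  # previous nightly.py still running: abort
    except (IOError, ValueError):
        pass  # no pid file yet, or garbage in it: safe to proceed
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return True
```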
This way you will not have to change the jenkins jobs. [10:23:02] legoktm: hehe [10:23:10] sounds good! [10:23:10] legoktm: will get you a tox setup then :-] [10:23:25] this way your repo just provides the entry point [10:23:29] and you do whatever you want [10:23:34] jenkins becoming very lame in the process [10:23:42] legoktm: also, you could potentially redirect the output of git into the log file? or not worth it? [10:24:01] https://testrun.org/tox/latest/ looks awesome [10:24:32] YuviPanda: I could. Don't think it's worth it unless git starts erroring, at which point running it manually would be easier [10:24:38] legoktm: yeah, true [10:26:51] legoktm: boom http://extdist-test.wmflabs.org/ [10:27:03] :DDD [10:27:06] legoktm: you should make them into a nice folder structure [10:27:13] YuviPanda: can we put them in /dist/? [10:27:28] legoktm: sure but why? [10:27:41] legoktm: and what would you put in /? [10:27:46] I don't like polluting the web root [10:28:40] (03PS1) 10Hashar: Tox entry point to run flake8 [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149822 [10:29:05] legoktm: heh. but we should have *something* for / [10:29:05] legoktm: https://gerrit.wikimedia.org/r/#/c/149822/ :D [10:29:17] legoktm: let me know if I should add a README file explaining how to install tox and run tests [10:29:28] YuviPanda: can we just redirect to https://www.mediawiki.org/wiki/Special:ExtensionDistributor ? 
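A minimal tox entry point of the kind hashar proposes (flake8, with the 120-character line length YuviPanda suggests) could look like the following. This is a sketch under those assumptions, not the contents of the actual gerrit change:

```ini
; tox.ini — jenkins just runs `tox`, so the lint config lives in the repo
[tox]
envlist = flake8
skipsdist = True

[testenv:flake8]
deps = flake8
commands = flake8 nightly.py

[flake8]
max-line-length = 120
```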
[10:29:32] legoktm: sure [10:30:23] hashar_: not needed [10:30:37] (03CR) 10Legoktm: [C: 032 V: 032] "Thanks" [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149822 (owner: 10Hashar) [10:32:18] (03PS1) 10Legoktm: Fix flake8 whitespace errors [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149824 [10:32:41] > congratulations :) [10:35:08] legoktm: heading out to lunch, will do later :) [10:38:19] YuviPanda: ^ whitespace patch [10:38:46] (03CR) 10Yuvipanda: [C: 032] Fix flake8 whitespace errors [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149824 (owner: 10Legoktm) [10:39:09] V+2 still [10:39:41] (03CR) 10Yuvipanda: [V: 032] Fix flake8 whitespace errors [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149824 (owner: 10Legoktm) [10:43:09] legoktm: you should also log generic 'started script', 'script ended with X changes' for some value of X [10:43:29] ok [10:43:43] doing pidfile right now [10:44:15] legoktm: redirect set [10:45:17] legoktm: and everything seems to work \o/ [10:45:20] legoktm: so yay :D [10:45:28] legoktm: I wonder at what point I can remove the [WIP] [10:46:01] http://extdist-test.wmflabs.org/dist/ says 404? [10:52:54] (03PS1) 10Legoktm: Add a pid file to check if nightly.py is already running [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 [10:53:00] YuviPanda: ^ [10:53:19] legoktm: eating, brb [10:53:52] /nick EatingPanda :P [11:31:51] hey all, we're trying to create a user account for the rijksmuseum on the beta server with the name Rijksmuseum Collection Information, however the server says that that name has been blacklisted. any ideas on what might be wrong with that user name? 
[12:13:59] (03CR) 10Hashar: "recheck" [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 (owner: 10Legoktm) [12:18:13] legoktm: fixed the /dist [14:22:55] !log deployment-prep rebuilding cirrus search indexes to pick up a speed-up to the all field [14:22:57] Logged the message, Master [14:23:42] !log deployment-prep or not - looks like I can't! [14:23:45] Logged the message, Master [14:27:08] 3Wikimedia Labs: Replicate centralauth.renameuser_status table to labs - 10https://bugzilla.wikimedia.org/68356#c5 (10Marc A. Pelletier) ..? MariaDB [centralauth_p]> select * from renameuser_status; Empty set (0.01 sec) Works for me! -- Marc [14:32:18] hashar: are you involved in this "Please install the Mantle MediaWiki extension." business? [14:34:08] manybubbles: sounds like a missing dependency [14:34:14] where have you received that message? [14:34:26] Must be some extension depending on MobileFrontend that is missing Mantle [14:34:34] (yeah, handling our deps manually sucks) [14:35:39] hashar: bleh :( - it's on deployment-bastion [14:35:43] ohhhh [14:35:58] must be a load order issue in the operations/mediawiki-config.git files [14:36:05] I think I have seen a patch about it [14:36:10] let me find it [14:36:33] https://gerrit.wikimedia.org/r/#/c/149797/ Move Mantle dependency check into efMobileFrontend_Setup [14:36:51] The Mantle check is currently done in /MobileFrontend.php [14:37:00] since that might be included before Mantle ... You get an error :D [14:37:28] the patch seems fine to me. I would have merged it this morning but preferred to wait for mobile [14:37:52] another way to fix it would be to have Mantle loaded before MobileFrontend in the mediawiki config [14:37:59] hashar: yeah - I'll +1 it and gripe [14:41:57] <^d> Changing the load order in wmf-config should be easy. [14:42:13] <^d> (And should totally be done regardless of the error message's location) [14:44:32] I guess so [15:04:11] andrewbogott: heya! 
wanna CR https://gerrit.wikimedia.org/r/#/c/149486/? [15:04:50] wow, patchset 27? [15:05:13] YuviPanda, I have a window to update wikitech + turn on OAuth in an hour, will you be around to help me verify that oauth is working? I don't really know what to look for. [15:05:30] andrewbogott: yeah, sure. can I get the appropriate rights to grant OAuth things? [15:05:47] andrewbogott: my patch testing workflow is write locally, push to gerrit, pull on puppetmaster, run, test, repeat [15:06:08] andrewbogott: plus the approach itself changed halfway through that patch :) [15:06:12] andrewbogott: Just do some test edits with Widar, it uses OAuth [15:07:35] andrewbogott: I'll make sure to be around in an hour. [15:09:07] YuviPanda: I don't think I can turn on oauth rights until I actually turn on oauth… we will see. [15:13:17] andrewbogott: heh, of course ;) [15:18:57] !log deployment-prep Restarted apache on deployment-mediawiki01. 65 children and non-responsive to requests. [15:18:59] Logged the message, Master [15:20:22] !log deployment-prep Restarted apache on deployment-mediawiki02. 65 children and non-responsive to requests. (same as mediawiki01) [15:20:24] Logged the message, Master [15:21:22] 3Wikimedia Labs / 3deployment-prep (beta): beta: ResourceLoader CSS URL gzipped twice, causing skins to be broken - 10https://bugzilla.wikimedia.org/68720#c1 (10Antoine "hashar" Musso) Proper command: curl 'http://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php?debug=7Cext.uls.nojs%7Cext.visual... [15:24:52] 3Wikimedia Labs / 3deployment-prep (beta): beta: ResourceLoader CSS URL gzipped twice, causing skins to be broken - 10https://bugzilla.wikimedia.org/68720#c2 (10Antoine "hashar" Musso) Niklas reported it to be fixed in Vagrant: hhvm: set GzipCompressionLevel = 0 to avoid double-gzip https://gerrit.wikimed... 
[15:27:25] 3Wikimedia Labs / 3deployment-prep (beta): beta: ResourceLoader CSS URL gzipped twice, causing skins to be broken - 10https://bugzilla.wikimedia.org/68720 (10Bryan Davis) [15:32:00] !log deployment-prep Restarted hhvm on deployment-mediawiki0[12]. All apache children were stuck waiting for hhvm to respond. [15:32:02] Logged the message, Master [15:32:04] ori: ^ [15:33:23] ori: I'm not sure what the cause was, but hhvm seemed pretty stuck. I tried restarting the apaches first but they filled right back up waiting on the hhvm backend. [15:35:06] ori: Somebody needs to get hhvm configured to log to syslog and aggregate that on deployment-bastion to make debugging this kind of problem slightly easier [15:37:17] !log deployment-prep rebuilding elasticsearch indexes to build a weighted all field we'll try to use to improve performance [15:37:19] Logged the message, Master [15:46:06] andrewbogott: no, it's not running anywhere. currently we use... github for that (producing tarballs) [15:46:25] andrewbogott: so right now a merge there would be a nop [15:46:37] YuviPanda: ok [16:04:58] Coren: up? [16:07:26] YuviPanda: I'm messing with other things, but… you have oauth admin on wikitech, want to see how it treats you? [16:08:39] andrewbogott: sure, moment [16:08:51] andrewbogott: Que pasa? [16:09:20] Coren: I'm trying to update wikitech and it throws exceptions when I switch to the new code. Looking for help with diagnosing... [16:09:27] since I'd prefer not to leave it in a broken state for hours at a time :) [16:10:22] andrewbogott: proposing a new one says 'A database query error has occurred. This may indicate a bug in the software.' [16:10:31] Coren: Right now it's running with slot1; in slot0 is a copy of slot1 with a checkout of a new mediawiki branch + a submodule-update. [16:10:36] YuviPanda: well, that's discouraging. [16:10:42] yeah :| [16:10:50] andrewbogott: did you create the database tables needed? 
[16:10:51] YuviPanda: stay tuned, let me figure out the upgrade issue, it might be related. [16:10:54] andrewbogott: ok! [16:12:38] Coren: So… if you have brain space to help, I'll break wikitech again and you can help me hunt down the problem? [16:12:55] debugging in production :( [16:13:36] andrewbogott: Erm; can you hold off on that ~30m or so? I'm knee-deep in nova-compute. [16:14:01] Coren: well, I'm in a scheduled deploy window. It's ok, I'll hack on it without you [16:14:26] Hm. Hang on, lemme see if I can disable the node in the meantime to help. [16:15:11] Coren: any news ?? [16:15:47] GerardM-: Sorry no. No news since the last update by chris telling me the wipe was in progress. I'll corner him when he gets online. [16:16:10] ok ... he promised me a photo :) [16:16:13] happy to use it [16:16:42] andrewbogott: Yeah, I can spare a bit. Tell me when you're ready. [16:17:10] Coren: it may be simple… the immediate bug I see is complaining about a missing cache file /srv/org/wikimedia/controller/wikis/slot1/cache/l10n_cache-en.cdb.tmp.2146306129 [16:17:41] So probably I'm missing a step when I switch from slot1 to slot0. Failing to clear something... [16:19:24] andrewbogott: That's part of what scap does, isn't it? (rebuilding localization caches) [16:19:46] um… scap is definitely not involved in this process [16:19:52] It is, but wikitech doesn't use scap [16:19:56] I know, but that's what scap does in prod. [16:20:06] I'm guessing that's our missing step. [16:21:55] * Coren reads scap, figures out what it does. [16:21:58] andrewbogott: rebuildLocalisationCache.php is the maintenance script you probably need to run [16:22:16] bd808: yep, running, thanks... [16:22:25] (parallel conversations here and in #wikimedia-operations) [16:22:35] * bd808 goes back to lurking [16:23:00] * Coren emerges from reading scap with nearly the same conclusion. [16:23:12] whoah, wikitech fonts are Different! [16:23:19] I guess that's a normal feature of upgrading, huh? 
[16:23:26] Typography update. [16:23:36] Happened some time back in prod. [16:23:40] do we get the flamewar as well? [16:24:15] YuviPanda: my next thought after noticing that was "For god's sake, man, don't express an opinion about fonts!" [16:24:21] andrewbogott: :D [16:26:17] YuviPanda, ok, try your oauth thing again? [16:26:50] andrewbogott: moment [16:27:12] andrewbogott: Special:Version is full of happy. [16:27:36] Coren: Yep, that was fairly painless overall. Thanks for your help. [16:28:07] What little help I actually did. :-) [16:28:19] * Coren goes back to battle with Nova. [16:28:24] You suggested the same thing that everyone else suggested, which was very reassuring :) [16:28:59] andrewbogott: proposing seems to work [16:29:07] andrewbogott: let me find out how to approve [16:34:58] andrewbogott: BTW, poor icinga is desperately trying to /server-status wikitech. Either we should stop it from trying or allow it. :-) [16:35:42] How do I allow it? Is that a firewall thing or does a client need to run on virt1000? [16:35:58] andrewbogott: We just need to Allow the right IPs in the Apache config. [16:36:25] Right now it's getting 403 client denied by server configuration: /var/www/server-status [16:36:49] 208.80.154.18 and 208.80.152.32 [16:38:08] andrewbogott: OAuth seems to work, but hitting some other issues with the final step. checking [16:42:36] Coren: So, to make sure I understand… /server-status is a URL that icinga wants to hit, and that is not generally visible? [16:45:03] Coren: e.g. https://gerrit.wikimedia.org/r/149895 ? [16:45:56] "ext.semanticforms.fancybox: Error: Module ext.semanticforms.fancybox has failed dependencies" when visiting the project documentation edit page on wikitech [16:46:05] That's a javascript pop-up [16:46:28] Or at least chrome is making it one [16:49:05] bd808: ok, looking... 
[16:50:38] * andrewbogott LOVES googling an error and only finding source code [16:55:14] bd808: any thoughts on how to find out /what/ dependencies are failing? there's nothing in the apache logs :( [16:55:29] I see the same error, but have also verified that all the SMW bits are the same version now as before [16:55:49] andrewbogott: It's a client side js thing. Something not being added to the resource loader config, I'd guess [16:56:23] * bd808 knows very little of the front-end code for anything [16:56:25] bd808: Where is the resource loader config defined? [16:56:30] Is that just part of the standard MW config? [16:56:36] (because that was also unchanged) [16:56:52] andrewbogott: might also be a cache somewhere. poke Krinkle or Roan? [16:59:00] Krinkle, consider yourself pinged :) [16:59:43] andrewbogott: That error is self explanatory. The SMW extension (which I don't maintain) likely has a recent change made to it that adds a dependency on something that doesn't exist. [16:59:48] maybe a typo. [17:00:00] Krinkle: I did not update SMW, it's running the same version I've been running for months. [17:00:09] Alternatively, someone updated mediawiki and mediawiki dropped support for a module that SMW still depends on. [17:00:12] I did a submodule update in the MW tree, but have verified that nothing in SMW updated. [17:00:17] So, more likely the latter. [17:00:26] always run master with master :) [17:00:35] That's not possible in this case. [17:00:37] and wmf branches with wmf branches. Does SMW not have wmf branches or master? [17:00:47] or a REL branch if we want stability. [17:00:56] MAN I am so tired of having this conversation. I should print out little index cards [17:01:05] Anyway, no, for the time being we are committed to using this old version of SMW [17:01:26] I would enjoy actually knowing what component is missing, rather than just that there is some unknown thing missing. [17:01:26] Anyway, I don't know wikitech. 
[17:01:50] But, honestly, the state of the art is to pop up a JS error saying "Something is missing" and providing no other info? [17:01:56] It's not logged in an exception someplace? [17:02:13] Krinkle: not your fault for asking, btw, it's SMW I'm annoyed at, not you :) [17:02:27] I just know there's nothing I can do about the error. Something was removed in master mediawiki (as we do almost every other day), probably mentioned in the release notes. Extension maintainers should update their code, and per our convention that we've had for years, site admins need to take care to make sure stuff is compatible. If you do things differently for whatever reason, that's totally valid, [17:02:28] but that means you need to get creative. [17:03:22] Consider it an error like new User(); but User is not in the auto loader and/or User.php is missing. [17:03:26] there's not much else to report. [17:04:06] meeting; tell me later why we don't run (master|wmf|REL) mediawiki with (master|wmf|REL) of the extensions on wikitech [17:04:17] Oh! I think there's a lot more to report. Like, for instance, /what/ dependency is missing? [17:05:10] Krinkle: I think andrewbogott's annoyance stems from the message saying that X "has failed dependencies" without saying what the failed dependencies /are/ :-) [17:05:21] andrewbogott: which URL is this? [17:05:28] * YuviPanda could try some client side debugging [17:05:32] andrewbogott: javascript doesn't provide that information unfortunately. it's all a mixed soup of code. If you reproduce the error with debug=true, you should get a useful stack trace with all you need. [17:05:53] YuviPanda: https://wikitech.wikimedia.org/wiki/Special:FormEdit/Nova_Project_Documentation/Nova_Resource:Deployment-prep/Documentation is an example [17:05:54] Well, actually, I'm not trying to express annoyance, but optimism. Specifically: surely the actual source of the failure is logged someplace or can be investigated somehow...? 
[17:06:03] Krinkle: ok, that might help, thanks. [17:06:24] andrewbogott: 208.80.152.32 - - [28/Jul/2014:17:05:52 +0000] "GET /server-status HTTP/1.1" 200 16795 "-" "Python-urllib/2.7" [17:06:27] Yeay! [17:06:39] yep, log is much quieter now :) [17:08:01] andrewbogott: it's not playing well with newer jquery [17:08:06] Hm, looks like 'fancybox' is maybe a skin, so maybe I can use a different skin? [17:08:07] With ?debug=true I see jquery.fancybox.js crashing because "$.browser" is undefined. [17:08:09] andrewbogott: missing the jquery browser module and the .live function [17:08:32] andrewbogott: you'll have to either update smw, or roll back mw (or live hack jquery into an older version, but that'll probably break other things) [17:08:48] dang [17:09:00] The pinned old SMW was bound to create issues eventually. [17:09:10] yeah, I knew this day would come [17:09:21] So, I guess no updates for today. [17:09:30] That probably means that OAuth won't work either, but we'll see. [17:09:31] :( [17:09:34] it might [17:09:39] unless our mw is pre 1.22 [17:09:39] andrewbogott: AFAICT, the error is cosmetic. [17:09:51] Oh, hm, you're right. [17:09:54] Lemme try actually editing [17:10:03] Coren, andrewbogott: yeah edits seem to work [17:10:06] andrewbogott: Some fancy display thingy is presumably missing. [17:10:20] Coren: true, but it might not be in other cases, since an error in JS actually aborts execution of the entire script [17:10:31] also, these are the js ones, possibly there are php ones that could be worse :) [17:10:32] So yeah, maybe I can fork SMW and have it just not do that. [17:10:38] Presuming it doesn't happen… elsewhere [17:10:39] YuviPanda: Sure, but it means we could simply axe whatever it's trying to do. [17:10:51] true, true, but that seems like a rabbit hole [17:11:05] going down a rabbit hole playing a game of whack-a-mole... [17:11:33] andrewbogott: What prevents us from upgrading SMW though? 
[17:11:47] Oh, I never get tired of answering that question! [17:11:50] Coren: composer, I think [17:11:56] Ah. Composer. [17:12:12] Yes, I'm waiting for bd808's composer-on-prod RFC to become reality, then will form it for mediawiki, then... [17:12:15] we can upgrade SMW [17:12:41] So a stopgap now would be not entirely unreasonable then. [17:13:04] That or we can send some gangsters through a time machine and have them encourage the five-years-ago SMW developers to rethink their packaging scheme. [17:13:18] But, yeah, I'm all for a stopgap. Reading SemanticForms code now [17:16:39] Interesting, fancybox is a third-party library licensed under CC-NC. [17:16:44] I'd prefer to not know that [17:23:23] 3Wikimedia Labs / 3Infrastructure: Database upgrade MariaDB 10: Lock wait timeouts / deadlocks in a row - 10https://bugzilla.wikimedia.org/68753 (10metatron) 3UNCO p:3Unprio s:3critic a:3None Since start of upgrade process, I'm experiencing lots of Lock wait timeouts / deadlocks. Usecase: Xtools cr... [17:24:41] bd808: is that js error now mysteriously fixed? [17:25:42] andrewbogott: well... yes? [17:25:46] Coren: MariaDB [centralauth_p]> select * from renameuser_status; [17:25:46] ERROR 1146 (42S02): Table 'centralauth_p.renameuser_status' doesn't exist [17:25:52] hm... [17:25:55] Coren: am I doing something wrong? >.> [17:26:06] Coren: this is using the "legobot" tool [17:26:20] legoktm: I... wut? [17:26:28] legoktm: I don't even. [17:26:38] * Coren tries to figure out what goes. [17:27:23] Oh! OH! Yeah, nothing you're doing wrong, just an odd side effect of the migration-in-progress. New changes to schemas, etc are applying only to the new combined instances! [17:28:03] legoktm: Use the centralauth_p on c2.labsdb. The difference will be gone in a couple days when we migrated the rest of the databases. [17:28:38] sweet :D [17:28:39] works! 
[17:28:51] * legoktm re-closes bug [17:29:17] legoktm: Sorry 'bout that; I hadn't realized that because the maintenance scripts have been converted, new views wouldn't show in the old databases. [17:30:08] 3Wikimedia Labs: Replicate centralauth.renameuser_status table to labs - 10https://bugzilla.wikimedia.org/68356#c6 (10Kunal Mehta (Legoktm)) 5REOP>3RESO/FIX [10:25:46] Coren: MariaDB [centralauth_p]> select * from renameuser_status; [10:25:46] ERROR 1146 (42S02): Table 'centra... [17:41:38] !log deployment-prep Updated hhvm to latest 3.3-dev+20140728 build on deployment-mediawiki0[12] [17:41:41] Logged the message, Master [17:46:05] YuviPanda|brb: when you're back, according to /var/log/extdist, the cron hasn't run since 10am [17:46:24] Nemo_bis: Ping? [17:49:42] YuviPanda|brb: I just did a manual run of VE which worked fine... [17:57:30] * 0 * * * /usr/bin/python /srv/extdist/nightly.py --all [17:57:38] that's...not right :P [18:01:08] Coren: pong? [18:01:31] YuviPanda|brb: fixed the patch, but I don't remember how to deploy it to extdist-test [18:01:33] If it's for /data/scratch, it's going to have plenty of space in an hour or so [18:21:23] legoktm: hmm? [18:21:27] legoktm: what did I miss? [18:33:13] Coren: you know about the s3 connection problem i talked with you about last week? this was not caused by the connection from my script to the s3 database as i thought. the reason is a broken federated connection used by wikidata_f_p on s3 [18:33:52] Oooooh! That would make sense, the federated databases to the new combined instances are probably all broken. [18:33:56] but i think creating a new bug for this is useless? [18:34:16] Yeah, kinda pointless. With luck, we'll have moved all databases to MariaDB 10 by the end of the week. [18:36:43] i still have some performance problems but most of them seem to be solvable by rewriting some queries. So i am still working on it. but deletion+join is really slow.
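(A note on the crontab entry quoted above: `* 0 * * *` fires every minute while the hour is 0, i.e. sixty runs per night, which is presumably why legoktm called it "not right"; a once-a-night run would be `0 0 * * *`. A tiny sketch of the first two cron fields, just to make the difference concrete — the matcher below is hypothetical and only handles literal values and `*`:)

```python
# Cron fields are: minute hour day-of-month month day-of-week.
# "* 0 * * *" reads "any minute, hour 0" -> 60 firings between
# 00:00 and 00:59, not one nightly run.
def matches(entry, minute, hour):
    """Minimal matcher for the minute and hour fields only."""
    m, h = entry.split()[:2]
    ok = lambda field, value: field == "*" or int(field) == value
    return ok(m, minute) and ok(h, hour)

assert matches("* 0 * * *", 0, 0)       # fires at 00:00...
assert matches("* 0 * * *", 59, 0)      # ...and still at 00:59
assert not matches("* 0 * * *", 0, 1)   # quiet outside hour 0
# The likely intent, a single midnight run:
assert matches("0 0 * * *", 0, 0)
assert not matches("0 0 * * *", 30, 0)
```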
[18:37:50] We're still tuning things and will soon return the actual replicas to SSD which will help (the new combined instances live on spinning rust temporarily because of the migration) [18:38:19] Merlissimo: Also, users can now show explain on running queries; if you find issues we can help tune, do point them out. [18:38:54] Coren: there's going to be an 'analyse' command in mariadb 10.1, which should be available to everyone. yay :) [18:39:24] That may help, though I doubt we'll do another round of upgrades before a couple more months. [18:39:58] Coren: true, I'm hoping we can get that in say, 6 months [18:40:16] That's probably the kind of timeline that is likely. [18:40:45] Coren: currently i am working on a sql-only script with 2 seconds runtime on ts, but 11 minutes on labs. i am still trying to improve it in the next days, but maybe i need some help if i am failing [18:41:24] Merlissimo: Two orders of magnitude difference sounds odd. Is there a missing index you can see? [18:46:07] Coren: no index exists. example long running query (>2 min vs 0,2sec on ts): (CCats is an Aria table on my database) DELETE cc [18:46:07] FROM CCats cc [18:46:07] INNER JOIN commonswiki_p.page ON page_namespace=14 AND page_title = cat [18:46:07] INNER JOIN commonswiki_p.categorylinks ON cl_from = page_id [18:46:08] WHERE cl_to='Hidden_categories' AND page_namespace=14; [18:47:21] maybe maria does not like mixing Aria with Toku [18:51:56] matanya: everyone's piling into that room now.
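(One rewrite worth trying for the slow multi-table DELETE pasted above is the subquery form, `DELETE ... WHERE cat IN (SELECT ...)`, which sometimes gives the MariaDB optimizer a better plan than the `DELETE cc FROM ... INNER JOIN` form — no guarantee it closes the gap to Toolserver's 0.2s, and it should be compared with EXPLAIN. A self-contained sketch of that rewrite, using SQLite via Python's stdlib as a stand-in since SQLite only accepts the subquery form; table and column names follow the query in the log, the sample rows are invented:)

```python
import sqlite3

# Minimal stand-ins for the tables in the pasted query.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE CCats(cat TEXT);
CREATE TABLE page(page_id INTEGER, page_namespace INTEGER, page_title TEXT);
CREATE TABLE categorylinks(cl_from INTEGER, cl_to TEXT);
INSERT INTO CCats VALUES ('Hidden_cat'), ('Visible_cat');
INSERT INTO page VALUES (1, 14, 'Hidden_cat'), (2, 14, 'Visible_cat');
INSERT INTO categorylinks VALUES (1, 'Hidden_categories');
""")

# The DELETE ... INNER JOIN from the log, rewritten as a subquery:
# the matching set of category titles is computed first, then rows
# in CCats are deleted by membership in that set.
db.execute("""
DELETE FROM CCats WHERE cat IN (
    SELECT page_title FROM page
    JOIN categorylinks ON cl_from = page_id
    WHERE page_namespace = 14 AND cl_to = 'Hidden_categories')
""")
remaining = [row[0] for row in db.execute("SELECT cat FROM CCats")]
print(remaining)  # only the category not marked hidden survives
```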
[18:52:07] i see :) [19:37:34] (03PS1) 10Legoktm: Add some more debug info [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/150005 [19:40:22] (03CR) 10Yuvipanda: [C: 04-1] Add a pid file to check if nightly.py is already running (033 comments) [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 (owner: 10Legoktm) [19:40:48] (03PS2) 10Legoktm: Add some more debug info [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/150005 [19:40:59] (03CR) 10Legoktm: [C: 032] Add some more debug info [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/150005 (owner: 10Legoktm) [19:42:55] 3Wikimedia Labs / 3deployment-prep (beta): Unable to upload new version of images in commons beta lab - 10https://bugzilla.wikimedia.org/68760 (10Vikas) 3UNCO p:3Unprio s:3normal a:3None Created attachment 16068 --> https://bugzilla.wikimedia.org/attachment.cgi?id=16068&action=edit Output of the er... [19:43:56] (03PS2) 10Legoktm: Add a pid file to check if nightly.py is already running [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 [19:44:04] (03CR) 10Legoktm: Add a pid file to check if nightly.py is already running (033 comments) [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 (owner: 10Legoktm) [19:53:26] (03CR) 10Hashar: "recheck" [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 (owner: 10Legoktm) [19:58:25] please make grrrit-wm reload config somebody? [19:58:40] seems like it did not get config changes yet that are merged [19:58:48] mutante: heh, in about 5min [19:58:50] mutante: i'll do that [19:59:44] YuviPanda: thanks;0 [19:59:47] :) [20:02:00] mutante: done [20:03:04] YuviPanda: thanks, hmm, maybe there is a config error then [20:03:18] i expected output in ops channel that did not appear [20:03:31] when changing stuff in wikimedia/bots/jouncebot [20:04:02] mutante: oh yeah, it killed it :) [20:04:18] damn, i guess i broke it ? 
[20:04:21] with https://gerrit.wikimedia.org/r/#/c/149201/ [20:04:30] mutante: yah, missed a comma [20:04:33] missing.. [20:04:35] that :p [20:04:36] fixing [20:06:22] 3Wikimedia Labs / 3Infrastructure: WMFLabs: Diamond not running / won't start - 10https://bugzilla.wikimedia.org/68444#c4 (10Krinkle) 5PATC>3RESO/FIX Thx. [20:07:00] (03PS1) 10Dzahn: fix missing command in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 [20:07:03] (03CR) 10jenkins-bot: [V: 04-1] fix missing command in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 (owner: 10Dzahn) [20:07:24] (03PS2) 10Dzahn: fix missing comma in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 [20:07:27] (03CR) 10jenkins-bot: [V: 04-1] fix missing comma in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 (owner: 10Dzahn) [20:16:03] I have the impression that since the new hack is introduced no error messages are printed any longer even if the php variables are set? does that make sense? [20:19:04] (03PS3) 10Dzahn: fix missing comma in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 [20:19:20] mutante: that seems to have worked [20:19:39] (03CR) 10Dzahn: [C: 032] fix missing comma in grrrit-wm bot config [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/150015 (owner: 10Dzahn) [20:20:03] YuviPanda: yea, sorry for that, merged [20:20:09] mutante: :) 'tis ok [20:20:17] mutante: restarted [22:55:03] (03CR) 10Yuvipanda: [C: 032] Add a pid file to check if nightly.py is already running [labs/tools/extdist] - 10https://gerrit.wikimedia.org/r/149831 (owner: 10Legoktm) [22:56:34] valhallasw`cloud: hey! wikitech has OAuth now, but flask-mwoauth doesn't seem to work with it :( can you try take a look when you have time? 
[22:56:48] valhallasw`cloud: the get_current_user() fails because the response from the server is blank page [22:56:56] YuviPanda: what the heck. [22:57:07] how can a call to api.php *ever* return a blank page?! [22:57:09] valhallasw`cloud: I know, I suspect it's a wikitech problem, but would like to confirm / know wtf is happening. [22:57:11] maybe http vs https? [22:57:12] valhallasw`cloud: if php dies [22:57:48] valhallasw`cloud: my current theory is that php dies when presented with header used by oauth, for... some reason. [22:57:59] valhallasw`cloud: why would that be a problem? plus I set my URLs to be https [22:58:18] no, I mean that maybe you try connecting to wikitech over http [22:58:25] you set the wikitech url somewhere, right? [22:59:05] valhallasw`cloud: yeah, and I set that url to https [23:00:52] hm. [23:00:57] maybe try wiresharking it? [23:01:23] (if you have a local dev environment set up, at least) [23:01:26] valhallasw`cloud: yeah, I can do that tomorrow and see wtf is happening. Just wanted to let you know beforehand [23:01:27] valhallasw`cloud: I do [23:04:51] * YuviPanda pokes andrewbogott with https://gerrit.wikimedia.org/r/#/c/149486/ again before going to sleep [23:05:24] YuviPanda: is it done and ready for merging? Dev still seems pretty active :) [23:05:34] andrewbogott: nah, I just added the last 'nice-to-have' [23:05:38] 'k [23:05:38] which was a pid file [23:06:09] andrewbogott: yeah, and legoktm just verified and +1'd :) [23:06:24] ok, I will merge when Jenkins catches up [23:06:30] andrewbogott: w00t, t y [23:06:32] :D [23:06:33] thanks! [23:06:54] legoktm: you should be able to figure out how to add it to wikitech as a role and then apply it to a new instance in extdist [23:07:02] legoktm: it should be in 'manage puppet groups' on the sidebar [23:07:06] ok [23:10:39] andrewbogott: w00t! 
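(For reference, the pid-file guard merged above is a common pattern for keeping a cron job from piling up overlapping runs. The actual change is in Gerrit; the sketch below is only an assumed shape of such a guard — the pidfile path, and the stale-pid handling via signal 0, are illustrative, not the merged code:)

```python
import os
import sys

PIDFILE = "/tmp/nightly.pid"  # hypothetical path, not from the patch


def already_running(pidfile=PIDFILE):
    """Return True if the pidfile names a process that is still alive."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        return False  # no pidfile, or garbage in it
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence
    except OSError:
        return False  # stale pidfile: the previous run died
    return True


def acquire(pidfile=PIDFILE):
    """Exit if another instance is running, else record our own pid."""
    if already_running(pidfile):
        sys.exit("nightly.py is already running")
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
```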
Thanks :) [23:10:55] :) [23:11:25] 3Wikimedia Labs / 3(other): Puppetize extdist.wmflabs.org - 10https://bugzilla.wikimedia.org/68609#c3 (10Yuvi Panda) 5PATC>3RESO/FIX Puppetized and merged \o/ [23:30:52] just created a new instance [23:30:57] puppet is failing with [23:30:58] Error: /Stage[main]/Role::Labs::Instance/Mount[/data/project]: Could not evaluate: Execution of '/bin/mount /data/project' returned 32: mount.nfs: mounting labstore.svc.eqiad.wmnet:/project/extdist/project failed, reason given by server: No such file or directory [23:31:08] andrewbogott: ^ [23:31:14] andrewbogott: I got this too, but then it seemed to not matter? [23:31:26] legoktm broke the internet [23:31:42] puppet hates me [23:31:44] legoktm: why would you do that?! [23:31:48] If your project doesn't have shared storage enabled it will always do that... [23:31:56] killing all the cats! [23:31:57] legoktm: pro tip, puppet hates everyone [23:32:34] andrewbogott: so I can just ignore it? [23:32:50] legoktm: what project? [23:32:54] extdist [23:33:14] https://wikitech.wikimedia.org/wiki/Special:NovaInstance is saying puppet status is "failed" everything else is "stale" or "ok" [23:33:20] instance is extdist2 [23:33:37] hm, no, that has shared storage enabled. [23:34:01] andrewbogott: it does, I checked when it failed on me [23:34:07] legoktm: also the puppet status thing on wikitech is bogus :) [23:34:09] most of the time [23:34:31] legoktm: when that error hit me, I just ignored it and went on. no issues, since the thing it said it couldn't mount mounted anyway [23:35:22] okay [23:35:26] I'll look when I finish dinner, might be a bit. [23:35:43] But, yeah, you should be able to forge ahead in the meantime [23:36:31] ok, I shall go to sleep for real now [23:36:32] night [23:36:48] gnite [23:39:46] http://extdist2.wmflabs.org/ awkward.... [23:40:04] legoktm: heh, try a 'sudo service nginx restart'? [23:40:25] worked! [23:40:35] legoktm: yay. I need to fix the nginx module...
[23:41:02] extdist2.wmflabs.org/bar/ redirects to mw.o.... I guess that works [23:41:56] legoktm: heh, yeah. I can restrict it to just / if you want, later [23:42:03] legoktm: but /dist works, shows empty directory [23:42:08] nah it's fine for now [23:42:16] legoktm: cool :) [23:42:19] bbl [23:42:21] legoktm: so let's just wait for cron to run [23:42:21] go to sleep! [23:42:33] legoktm: finnne [23:56:27] legoktm: may I reboot extdist2?