[01:22:37] Wikimedia Labs / tools: Unable to access user database from tools-submit host - https://bugzilla.wikimedia.org/69081#c1 (Philippe Elie) NEW>RESO/FIX Working now, someone fixed it.
[03:05:37] Wikimedia Labs / Infrastructure: Database upgrade MariaDB 10: Metadata access in INFORMATION_SCHEMA causes complete blocks - https://bugzilla.wikimedia.org/69182#c3 (Sean Pringle) Setting tokudb_empty_scan=disabled looks to have improved the speed of TokuDB /opening/ tables. Stack traces indicate the...
[04:02:37] Wikimedia Labs / Infrastructure: Database upgrade MariaDB 10: Metadata access in INFORMATION_SCHEMA causes complete blocks - https://bugzilla.wikimedia.org/69182#c4 (jeremyb) UNCO>ASSI This certainly sounds like confirmed...
[04:06:07] Wikimedia Labs / tools: Unable to access user database from tools-submit host - https://bugzilla.wikimedia.org/69081#c2 (jeremyb) RESO/FIX>REOP It's not FIXED unless you actually know that someone did something to fix it. Dupe of bug 69182? (which is still open)
[06:46:35] good morning: we typed this command: git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/compat.git pywikipedia, to create the pywikipedia folder in WinSCP. So, is there another command to delete the folder?
[08:33:52] (PS3) Legoktm: Add some more debug info [labs/tools/extdist] - https://gerrit.wikimedia.org/r/150005
[08:38:22] ori: I am trying that stream again
[08:38:24] now I get DEBUG:socketIO_client.transports:[packet received] 5::/:{"args":["no_such_namespace","The endpoint you tried to connect to doesn't exist: /"],"name":"error"}
[08:45:51] haha, i had that too
[08:46:03] don't remember the solution
[08:50:02] petan: try /rc?
[08:50:20] why is it not in https://wikitech.wikimedia.org/wiki/RCStream#Python
[10:23:09] Wikimedia Labs / tools: Unable to access user database from tools-submit host - https://bugzilla.wikimedia.org/69081#c3 (Andre Klapper) REOP>RESO/WOR Closing as WORKSFORME as reporter cannot reproduce anymore.
[12:16:53] Wikimedia Labs / Infrastructure: Database upgrade MariaDB 10: 600 seconds timeout - https://bugzilla.wikimedia.org/69110#c10 (Incola) After switching back to SSD no error is reported and the queries are run with their previous timing.
[12:35:33] * benestar waves
[12:35:40] * SPQRobin waves back
[12:38:57] https://wikitech.wikimedia.org/wiki/New_Project_Request/Structured_Wikiquote
[12:39:09] how to get that approved quickly? :p
[12:39:57] andrewbogott: could you give us some advice? :)
[12:40:20] * SPQRobin praises Vogone for his glorious timely preparations
[12:41:02] Vogone: I'll read...
[12:41:44] Vogone: Project names need to be all lower case, no spaces.
[12:41:57] andrewbogott: thanks, we will have our talk in less than two hours :P
[12:42:10] andrewbogott: alright, will fix
[12:42:23] *one hour xD
[12:42:41] Vogone: --^
[12:42:58] andrewbogott: is "structured-wikiquote" fine? :p
[12:43:10] sure
[12:43:12] k
[12:43:13] fixed
[12:43:16] :D
[12:43:51] is the talk recorded/streamed?
[12:44:09] gifti: I have no idea actually
[12:44:11] aren't they all?
[12:44:43] andrewbogott: at Wikimania?
[12:44:46] all I have attended were at least recorded
[12:45:07] Vogone, ok, done -- please fill in docs here: https://wikitech.wikimedia.org/wiki/Special:FormEdit/Nova_Project_Documentation/Nova_Resource:Structured-wikiquote/Documentation
[12:45:17] benestar: nope, in Minnesota.
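
(An aside on the RCStream error petan hit above, 08:38–08:50: the "no_such_namespace" reply means the client connected to the root namespace instead of '/rc'. Below is a minimal sketch of a subscription that also prints each change, assuming the socketIO_client API and the stream host as documented on wikitech's RCStream page at the time; the wiki name and the printed fields are only examples.)

    import socketIO_client

    class RCNamespace(socketIO_client.BaseNamespace):
        def on_connect(self):
            # Pick the wiki whose recent changes you want to receive.
            self.emit('subscribe', 'en.wikipedia.org')

        def on_change(self, change):
            # 'change' arrives as a plain dict; print a couple of its fields.
            print('%s edited %s' % (change.get('user'), change.get('title')))

    socketIO = socketIO_client.SocketIO('stream.wikimedia.org', 80)
    socketIO.define(RCNamespace, '/rc')   # '/rc', not the root namespace
    socketIO.wait()
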
[12:45:26] :(
[12:45:32] ty :)
[12:50:21] andrewbogott: thanks for the merges
[12:51:14] yuvipanda: sure thing; everything working ok?
[12:51:20] andrewbogott: yeah!
[12:52:43] andrewbogott: the main box is on its own puppetmaster atm anyway, I'll move it to regular boxen in a few weeks when they're more 'stable'
[12:54:13] andrewbogott: hey, Vogone tried to make me also an admin of the project but it doesn't work
[12:54:32] what happened?
[12:54:32] we just don't do it the right way ...
[12:54:32] also what has to be done now that we can actually connect to the instance?
[12:54:47] andrewbogott: actually nothing
[12:54:52] you have to be a member before you can be an admin
[12:55:03] of course
[12:55:22] similarly, you need to create instances before you can connect to them :)
[12:55:33] *head to wall* :P
[12:56:07] andrewbogott: he added me to https://wikitech.wikimedia.org/wiki/Nova_Resource:Structured-wikiquote
[12:56:15] that is just wiki stuff
[12:56:16] anything else to do then?
[12:56:34] you want to be using the 'Manage xxx' links in the sidebar.
[12:56:45] Editing wiki pages won't do much :)
[12:56:47] glrious
[12:56:51] *glorious
[13:02:02] oh joy
[13:09:36] qchris: thanks for setting up the repo for my labs tool!
[13:09:46] dbrant: yw.
[13:10:03] Currently, you're the only owner.
[13:10:12] If you want more in there, just drop me a line.
[13:10:23] qchris: right; is there a way to add members to the group?
[13:10:48] qchris: yes; can we have BearND and yuvipanda?
[13:11:06] https://gerrit.wikimedia.org/r/#/admin/groups/819,members
[13:11:13] ^ dbrant that is the relevant group
[13:11:37] I can't add to it...
[13:11:47] Ok.
[13:11:51] Let me add them then.
[13:12:05] dbrant: done.
[13:12:20] qchris: great! thanks
[13:12:24] yw
[13:22:24] (PS1) Dbrant: Add .gitignore [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/152907
[13:25:14] (CR) Dbrant: [C: 2 V: 2] Add .gitignore [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/152907 (owner: Dbrant)
[13:29:09] ori: can I have some better example of https://wikitech.wikimedia.org/wiki/RCStream#Python which eventually uses the data somehow?
[13:29:29] the example does basically nothing, it just idles; what about printing the recent changes to the terminal or something
[13:29:37] I don't even see how I could access them?
[13:36:44] Petan: ??? You get a dict with data
[13:37:47] And it does print.. That's what on_change does
[13:38:40] If it's not working, add the debugging lines I showed you yesterday
[13:40:06] mhm... I see
[13:49:24] valhallasw`mania: so what datatype is "change"?
[13:49:46] python doesn't use explicit data types so it's very hard to guess that
[13:50:31] I need to figure out how to access the JSON elements provided
[13:52:32] It's a dict
[13:52:47] Aka hashmap
[13:56:33] !log deployment-prep jobrunner01 : running apt-get upgrade
[13:57:03] valhallasw`mania: hello!
[13:57:38] valhallasw`mania: I have added an entry to the lovely Git/Reviewers page to have myself added as a reviewer on repos matching integration/*, but I am not sure it actually works. Relevant diff: https://www.mediawiki.org/w/index.php?title=Git/Reviewers&diff=1092873&oldid=1084950
[14:02:15] hashar: not sure if that is supported, I think "*" is a special case. Can be added though.
[14:06:48] hashar: https://github.com/valhallasw/gerrit-reviewer-bot/blob/master/add_reviewer.py#L47
[14:07:39] That should maybe be a more generic matcher
[14:09:31] tarrow: ^ maybe something for you? :p
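
(On the "more generic matcher" idea just above: a rough sketch of matching a Gerrit project name against wildcard patterns such as integration/* taken from the Git/Reviewers page. This is an illustration only, not the actual add_reviewer.py code; the function name and data shape are made up, and fnmatch is just one way to do shell-style matching.)

    import fnmatch

    def reviewers_for(project, entries):
        """entries: list of (pattern, [reviewer, ...]) parsed from Git/Reviewers."""
        matched = []
        for pattern, reviewers in entries:
            # '*' (the currently supported special case) and wildcard patterns
            # like 'integration/*' both work with shell-style matching.
            if fnmatch.fnmatch(project, pattern):
                matched.extend(reviewers)
        return matched

    print(reviewers_for('integration/zuul',
                        [('integration/*', ['hashar']), ('*', ['qa-list'])]))
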
[14:09:53] Wikimedia Labs: /srv won't mount on new 'large' instance - https://bugzilla.wikimedia.org/69161#c2 (Andrew Bogott) I just tried this and it worked for me :( Can you reproduce, and then tell me what project it's happening in?
[14:10:39] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c2 (Antoine "hashar" Musso) There is no component for jobrunner yet in bugzilla (bug 68318). Ccing authors Aaron and Ori.
[14:11:38] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c3 (Antoine "hashar" Musso) Created attachment 16158 --> https://bugzilla.wikimedia.org/attachment.cgi?id=16158&action=edit /etc/jobrunner/jobrunner.conf on deployment-jobrunner01.eqiad....
[14:11:56] !log deployment-prep Beta cluster job queue is not running because jobrunner.conf is invalid json {{bug|69272}}
[14:12:19] valhallasw`mania: yeah I get a regex match would do
[14:12:31] valhallasw`mania: want me to file the feature request as a bug somewhere?
[14:13:04] Sure. On github, I think...
[14:13:30] valhallasw`mania: it is not urgent at all. Don't waste your Wikimania time on it :D
[14:13:49] hashar: :D
[14:15:51] filed https://github.com/valhallasw/gerrit-reviewer-bot/issues/2 !
[14:15:53] thank you :]
[14:19:38] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c4 (Antoine "hashar" Musso) The file has inline comments using // which is not supported by PHP json_decode(). Removing the comments fixes the issue.
[14:24:38] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c5 (Antoine "hashar" Musso) I believe the new jobrunner service is only used on HHVM. So adding keyword hiphop.
[14:26:47] hashar: I think we just need to update to the latest jobrunner in beta
[14:27:06] The comments are stripped in aaron's code
[14:28:23] Wikimedia Labs / deployment-prep (beta): Give all members of the deployment-prep project sudo - https://bugzilla.wikimedia.org/69269#c1 (Antoine "hashar" Musso) That has been done because we want to deploy real SSL certificate on the HTTPS handlers (on beta cluster, that is nginx on the varnish instanc...
[14:29:45] !log deployment-prep Fixed permissions in /srv/deploy with `chgrp -R wikidev *`
[14:29:48] bd808: isn't the service self-updating?
[14:30:19] I guess that will need yet another jenkins job to do so
[14:30:24] or hack in wmf-beta-autoupdater
[14:31:14] !log deployment-prep Updated jobrunner to 5c927f9
[14:31:30] hashar: It's trebuchet. We haven't automated that.
[14:31:34] ohhh
[14:32:31] and ... not starting
[14:32:52] /var/log/syslog has the php errors
[14:33:01] wait where does this run now
[14:33:10] Still on jobrunner01 right?
[14:34:23] blerg. local host patch there
[14:34:27] *hto
[14:34:31] *hot
[14:36:42] !log deployment-prep Removed local patch for jobrunner on jobrunner01
[14:37:26] hashar: Looks like it's running now ... "2014-08-08T14:35:55+0000: Starting job loop(s)..."
[14:40:45] bd808: you are awesome
[14:40:53] sorry, was eating some cheese / milk with family
[14:41:17] hashar: :) thanks. I don't like to see broken things that I know how to fix
[14:41:27] bd808: so that was just comments not being stripped in the json file, right?
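
(On the comment-stripping point just above: strict JSON parsers — PHP's json_decode() as well as Python's json module — reject // comments, which is exactly what broke jobrunner.conf. A rough sketch of the strip-then-decode idea, written in Python purely for illustration; the actual jobrunner does the equivalent in its PHP config loader, and note that a naive regex like this one would also eat '//' inside string values such as URLs.)

    import json
    import re

    def load_commented_json(path):
        with open(path) as f:
            text = f.read()
        # Drop whole-line '// ...' comments before handing the text to the
        # strictly RFC-compliant JSON decoder.
        stripped = re.sub(r'^\s*//.*$', '', text, flags=re.MULTILINE)
        return json.loads(stripped)
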
[14:41:46] bd808: I guess you can bugsmash https://bugzilla.wikimedia.org/show_bug.cgi?id=69272 now :D
[14:42:01] and strike it on the whiteboard tracking bugs fixed during Wikimania (if such a thing exists)
[14:48:53] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c6 (Bryan Davis) NEW>RESO/FIX The deployed version of jobrunner in beta had lagged behind the configuration. Additionally there was a local hotpatch on deployment-jobrunner01 that pr...
[14:50:01] bd808: that was quick, thank you. I guess you can enjoy talking to real people now :]
[14:50:53] hashar: I'm listening to the Q&A part of the CirrusSearch talk by manybubbles and ^d
[14:52:53] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c7 (Bryan Davis) (In reply to Antoine "hashar" Musso from comment #5) > I believe the new jobrunner service is only used on HHVM. So adding keyword > hiphop. The new jobrunner is actually...
[15:14:24] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c8 (Antoine "hashar" Musso) a:Bryan Davis Excellent. Thank you very much :]
[15:22:03] hashar: Would you be sad if we get rid of the nda sudoers group in beta? That would basically mean that we are giving up on adding "real" ssl certs in beta.
[15:27:25] bd808: having real ssl certs was the only reason we restricted cluster sudo
[15:27:33] so if we abandon real certs, it is all good :]
[15:27:47] bd808: you might want to poke chris mcmahon about ssl certs
[15:28:02] the idea was to have browsers in SauceLabs instances test some HTTPS scenarios against beta cluster
[15:28:15] hashar: Yeah I'll ask him too, but I think I know the answer
[15:28:21] I am sure we can do with a self-signed certificate and figure out a way to have the SauceLabs browser allow the cert
[15:28:35] or maybe we can give the browser the public CA cert that we used to self-sign
[15:28:42] I think that he doesn't want to be forced to use ssl in tests
[15:28:56] What I don't know is if he wants the option
[15:29:12] the whole topic occurred when we worked on the centralized login thingie which uses ssl
[15:32:42] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c9 (Kunal Mehta (Legoktm)) RESO/FIX>REOP Thanks for looking into this quickly. I still see the same number of jobs (well there are more now...) queued though? htmlCacheUpdate: 81 qu...
[15:35:39] !channels
[15:35:40] | #wikimedia-labs-nagios #wikimedia-labs-offtopic #wikimedia-labs-requests
[15:36:26] bd808: ^ I re-opened the job queue bug
[15:36:53] Wikimedia Labs / deployment-prep (beta): Give all members of the deployment-prep project sudo - https://bugzilla.wikimedia.org/69269#c2 (Bryan Davis) Making this change would essentially mean that we are abandoning the quest for installing commercially provided ssl certificates in beta and closing bug...
[15:37:20] legoktm: Ok. Have you seen Aaron? We should make him look at it. :)
[15:37:31] no, I don't see him in the room
[15:38:10] <^d> hashar: bleh, botspam again.
[15:38:11] * ^d cries
[15:38:21] yeah
[15:38:25] just ignore the stupid bot :D
[15:38:35] <^d> The bot's not stupid, the spam is :p
[15:38:35] or have it stop spamming #wikimedia-dev :-]
[16:02:41] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c10 (Bryan Davis) It looks like we have a configuration issue. I see `"runners": 0,` for all of the groups in the config file. This is probably a puppet problem.
[16:05:46] !log deployment-prep Fixed merge conflict that was preventing updates on puppet master
[16:05:50] Logged the message, Master
[16:06:53] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c11 (Antoine "hashar" Musso) role::beta::jobrunner has: class { '::mediawiki::jobrunner': aggr_servers => [ '10.68.16.146' ], queue_servers => [ '10.68.16.146' ],...
[16:13:04] legoktm: Can you check the job count again?
[16:13:18] yay it's going down!
[16:13:21] EchoNotificationJob: 425 queued; 0 claimed (0 active, 0 abandoned); 0 delayed
[16:13:25] Sweet
[16:13:32] I'll make a real patch
[16:15:04] \O/
[16:15:18] bd808: a while ago I had a configuration hash varying by $::realm
[16:15:27] but it never landed :-/
[16:16:01] hashar: https://gerrit.wikimedia.org/r/152931
[16:16:23] Do you know if those are reasonable numbers of runners?
[16:16:44] edited summary to point to the bug
[16:16:51] well
[16:16:52] prod has 20, 20, 7 and 1
[16:16:57] the class supports a statsd_host to report to
[16:17:02] so I guess we can enable that as well
[16:17:08] and have some graph to figure out what is going on
[16:17:26] I guess we can refine later on
[16:17:31] Does my statsd server actually work? :)
[16:17:39] there is only one instance doing the generic job running
[16:17:50] not sure about statsd hehe
[16:19:40] bd808: basically whoever did the puppet change forgot beta entirely :-(
[16:20:01] hashar: Gee. I wonder who would do that? ;)
[16:20:11] Oh yeah... everybody but you and me
[16:20:18] hehe
[16:20:30] we should eventually get rid of the role::.*beta classes
[16:20:33] and use configuration hashes
[16:20:39] they can later be migrated to hiera()
[16:21:00] cherry pick https://gerrit.wikimedia.org/r/#/c/152931/ :-D
[16:21:01] hiera will make the world a better place
[16:21:14] definitely
[16:21:44] anyway it is time to close that laptop and take care of my little family
[16:21:47] be back on monday
[16:21:50] have a good conference :]
[16:22:04] bye hashar. we missed you here.
[16:22:09] yeah :-(
[16:22:13] on duty!
[16:22:21] child++ can be unleashed anytime now!
[16:22:36] see you
[16:26:34] andrewbogott: Do you have a way that I can check my filesystem usage on my instance?
[16:27:03] Negative24: you mean, besides 'df'?
[16:27:23] Wikimedia Labs / deployment-prep (beta): Beta cluster job queue not running - https://bugzilla.wikimedia.org/69272#c13 (Bryan Davis) [17:13] < bd808> legoktm: Can you check the job count again? [17:13] < legoktm> yay it's going down! [17:13] < legoktm> EchoNotificationJob: 425 queue...
[16:27:48] I was using df and it showed / as being used by 14%, but my git submodule update is failing because of no space left.
[16:28:56] I think df is incorporating the shared storage such as /public/backup
[16:29:07] well… df should be per volume.
[16:29:11] Where is your submodule?
[16:29:59] It's in /var/www/wmf/extensions
[16:30:20] what instance is this?
[16:30:49] performance-testing
[16:31:22] df shows /var as being full.
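
(A side note on the disk confusion above: / and /var are separate volumes on these instances, so a submodule checkout under /var/www can hit "no space left" while / still shows 14% used — df reports usage per volume, as andrewbogott says. Below is a quick per-mount check via os.statvfs; the mount points listed are only examples, and the numbers will differ slightly from df because of reserved blocks.)

    import os

    def usage(path):
        st = os.statvfs(path)
        total = st.f_blocks * st.f_frsize
        free = st.f_bavail * st.f_frsize
        return total - free, total

    for mount in ('/', '/var', '/srv', '/mnt'):
        if os.path.ismount(mount):
            used, total = usage(mount)
            print('%-5s %5.1f%% of %.1f GiB used'
                  % (mount, 100.0 * used / total, total / 2.0 ** 30))
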
[16:31:55] andrewbogott
[16:32:02] Must have missed that
[16:32:55] you can create a new bigger volume if you want… of course then you'll need to move your mediawiki install there, which might or might not be easy.
[16:33:15] Would that be a bigger instance?
[16:33:34] Nah, you can just enable a puppet class like role::labs::lvm::mnt or role::labs::lvm::srv
[16:33:43] Wikimedia Labs / deployment-prep (beta): Make the password for logstash-beta.wmflabs.org available to all users with sudo - https://bugzilla.wikimedia.org/69267#c1 (Bryan Davis) p:Unprio>Normal a:Bryan Davis Password has been placed in deployment-bastion:/root/secrets.txt The apache auth mess...
[16:33:49] that'll create a big partition and mount it in /mnt or /srv, respectively.
[16:34:27] Okay. I'll try that. Why is /var mounted on a different volume?
[16:34:53] Mostly so if you have something that creates a boundlessly giant logfile it doesn't cause the whole box to lock up :)
[16:35:30] But /var is more than a logfile store. It has the whole apache documentroot.
[16:36:25] You mean, with apache install defaults? Or are you using a puppet setup that puts the docroot there?
[16:37:26] role::lamp::labs puts the docroot there.
[16:37:50] That's the only puppet config I enabled.
[16:39:22] Ah, I see… I think that's just what the Apache .deb package does. Generally people set up a custom site with the storage on its own volume.
[16:39:36] I don't entirely know why that's the convention, but it's pretty much the only thing I see done.
[16:40:06] I guess I'll move my docroot somewhere else.
[16:40:44] but that means I'll be doing configuration without puppet, which from what I understand shouldn't be done.
[16:41:47] Well, sort of… I mean, I presume everything /in/ that docroot is already unpuppetized anyway
[16:41:49] which is fine...
[16:42:10] If you want to use puppet to set up mediawiki entirely, there are a few classes that can do that.
[16:44:36] labs-vagrant is nice for that, it lets you do some work on a local VM and then transfer it to a labs instance without too much messing about.
[16:52:02] andrewbogott: So should just software installation be handled by puppet, or also the configuration of those packages?
[16:54:45] Negative24: It really depends on the application. Generally folks start with as much ready-made puppet stuff as possible (e.g. labs-lamp or labs-vagrant) and then build what you need by hand. That's because of the improvisatory nature of labs -- usually you don't know what you're going to do until you do it.
[16:55:16] Then if you wind up with something that's going to be an important service going forward you would puppetize it all, so that if an instance crashes or you need to scale up you can recreate things quickly.
[16:56:50] Negative24: but even in production the actual MW install isn't puppetized; that's handled using other scripts and such.
[16:57:24] So only the LAMP stuff should be managed by puppet.
[16:58:00] Well, as I said, the flip answer is 'whichever things you aren't actively developing' can/should be.
[16:58:24] So if you're working on a mediawiki extension, it's simplest to have puppet install and manage mediawiki entirely, and just insert your extension in there.
[16:59:16] That's what I'm doing. I guess I'll tinker with puppet for a bit...
[16:59:44] If you're not in too deep I'd encourage using the labs-vagrant class.
[16:59:58] Then you can make a role to install your extension, that's pretty trivial… and you're fully puppetized just like that.
[17:00:38] Nice.
[17:00:59] Vagrant (local vagrant, not labs-vagrant) is nice for local development of MW extensions too.
[17:01:09] I think… that labs-vagrant might only work with Precise at the moment though. I'm not sure.
[17:01:18] For some reason the manage interface panel still shows my instance as rebooting.
[17:01:39] Yeah, I saw that too. Haven't had time to investigate why
[17:01:50] andrewbogott: Nope. It's better supported on trusty actually
[17:01:56] ok then!
[17:03:10] Well that's helpful.
[17:03:22] No need to trash my instance.
[17:38:05] dewiki database is not replicated anymore (last edit: 11:45 UTC)
[22:03:16] hello, Labs people! Not sure exactly who can field this one:
[22:03:59] we've got an account on toollabs, into which we've installed the Android SDK (for continuous integration building of our Android app)...
[22:05:04] now, the SDK has some binary executables that need to be used for the build process...
[22:05:46] here's the thing: when we log in through tools-login, we're able to execute the binaries. However, when we log in through tools-dev, we can no longer execute them!
[22:07:18] anyone have an idea why? ^^
[22:07:50] dbrant: no errors?
[22:07:58] error messages?
[22:08:17] "No such file or directory"
[22:08:52] hmmm, and they are binaries?
[22:09:01] not scripts?
[22:10:09] quite sure they're binaries... (and that they exist!)
[22:25:31] dbrant: maybe not in your path?
[22:25:36] try ./binary ?
[22:25:47] yep, tried it.
[22:26:32] hmm, then I don't quickly have an idea
[22:27:13] I think you have to wait till an admin is awake; as they are likely all at Wikimania, they are on a European timezone now
[22:46:09] dbrant: IIRC YuviPanda faced a similar problem some time ago. I think the problem was that the binaries required some libraries that (at the moment) are only installed on tools-login for testing. There was also a Puppet change to install them permanently, currently on hold.
[22:47:54] scfc_de: aha, that makes a bit more sense!
[22:50:59] although, when I do "ldd" on the binary, it doesn't say which libraries it's missing...
[23:00:49] dbrant_: on tools-login I do get output for ldd, see http://pastebin.com/fvD3txXs
[23:05:14] well, missing libraries does indeed make sense, so thanks scfc_de. We'll just need to track down yuvipanda...
[23:59:12] dbrant: the same aapt not found error happens when I submit a build job to the grid (see wikipedia/submit_build_job.sh)
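
(A footnote on the tools-dev problem above: when an existing binary fails to execute with "No such file or directory", a common cause is missing shared libraries, as scfc_de suggests, or — for 32-bit SDK tools such as aapt — a missing 32-bit loader, in which case ldd itself can be unhelpful, which would match what dbrant saw. Below is a rough diagnostic sketch that shells out to ldd and flags unresolved libraries; the './aapt' default is only a placeholder.)

    import subprocess
    import sys

    def report_missing_libs(binary):
        try:
            out = subprocess.check_output(['ldd', binary],
                                          stderr=subprocess.STDOUT,
                                          universal_newlines=True)
        except subprocess.CalledProcessError as err:
            out = err.output
        # ldd prints "not found" next to every library it cannot resolve.
        missing = [line.strip() for line in out.splitlines() if 'not found' in line]
        if missing:
            print('Unresolved libraries for %s:' % binary)
            for line in missing:
                print('  ' + line)
        else:
            print('ldd resolved every library it listed for %s.' % binary)

    if __name__ == '__main__':
        report_missing_libs(sys.argv[1] if len(sys.argv) > 1 else './aapt')
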