[00:32:20] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 45625] /home/l10nupdate/.ssh/authorized_keys has input output error - https://bugzilla.wikimedia.org/show_bug.cgi?id=45625
[00:33:22] [bz] (NEW - created by: silke.meyer, priority: Unprioritized - normal) [Bug 45609] Input/Output errors in a /home directory - https://bugzilla.wikimedia.org/show_bug.cgi?id=45609
[01:03:34] hmm does labs editing work for anyone?
[01:10:46] Ryan_Lane, seems like I was able to make a few edits, but now any edits don't finish saving and I can't even log in to my account through an alternative browser (IE instead of Chrome)
[06:11:28] where are dumps stored in the bots project?
[06:11:41] searching labsconsole is now impossible since they merged...
[06:14:12] /public/datasets/public/xxwiki
[06:14:21] would be nice if there was a -latest alias that could be used
[06:14:25] Coren: poke :)
[06:14:27] ^
[06:40:37] BTW, re: dumps, is their creation atomic, i.e. if a file exists beneath /public/datasets/..., is it going to change?
[07:41:13] legoktm?
[07:41:28] hi
[07:41:34] dumps??
[07:41:37] yeah
[07:41:42] of what
[07:41:45] wikipages
[07:41:47] i found them
[07:41:48] but
[07:42:03] toolserver has like "enwiki-latest-pages-articles.xml" or something
[07:42:06] rather than by date
[07:42:21] so you just hardcode that path, rather than having to figure out which date is the latest
[07:42:22] ah that
[13:58:00] petan: i cant ssh into bots-bnr1
[13:58:04] legoktm@bastion1:~$ ssh legoktm@bots-bnr1
[13:58:05] Permission denied (publickey).
[14:01:54] let me check
[14:03:39] legoktm can you type ssh -vvvv
[14:03:46] sure
[14:03:53] legoktm I think everything is ok
[14:04:01] you have a key accessible there and it's readable
[14:04:10] so you are either not forwarding your key or I don't know
[14:04:18] did you try logging in to another one?
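Note on the -latest request (06:14 and 07:42 above): until such an alias exists, a bot can work out the newest dump itself. The following is only a rough sketch. It assumes each wiki has one subdirectory per dump named YYYYMMDD, which the log does not confirm, and `latest_dump_dir` is a made-up helper name.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: dump runs live in subdirectories named by date, e.g. 20130215.
DATE_DIR = re.compile(r"^\d{8}$")


def latest_dump_dir(wiki_root: Path) -> Optional[Path]:
    """Return the newest YYYYMMDD-named subdirectory of wiki_root, or None."""
    dated = [
        p for p in wiki_root.iterdir()
        if p.is_dir() and DATE_DIR.match(p.name)
    ]
    # YYYYMMDD names sort chronologically when compared as strings.
    return max(dated, key=lambda p: p.name, default=None)
```

A bot would call `latest_dump_dir(Path(...))` with the per-wiki dump directory instead of hardcoding a date. This does not answer the 06:40 question about atomicity: a directory that exists may still be in the middle of being written.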
[14:05:04] http://dpaste.de/s1q1G/raw/
[14:05:09] i tried it from bots-3 as well
[14:35:02] petan: any ideas whats up with http://ganglia.wmflabs.org/latest/?r=day&cs=&ce=&c=bots&h=bots-sql3&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[14:35:17] that cant be all me
[14:56:00] legoktm did bots-3 work ok?
[14:56:42] legoktm@bots-3:~$ ssh legoktm@bots-bnr1
[14:56:43] If you are having access problems, please see: https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[14:56:43] Permission denied (publickey).
[15:02:52] For what it's worth, I can ssh from bastion to bots-bnr1, but not from bots-3 (same error).
[15:04:46] It says: debug3: no such identity
[15:12:01] hah legoktm I fixed it ;p http://ganglia.wmflabs.org/latest/?r=2hr&cs=&ce=&c=bots&h=bots-sql3&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[15:12:11] heh
[15:12:13] what was it?
[15:12:35] I think just too many requests
[15:12:49] which means I need to fix this before wikidata rolls out otherwise that server is going to DIE
[15:12:49] xd
[15:13:20] heh
[15:13:21] ok
[15:13:40] * addshore thinks he should have his own dedicated 8GB ram boxes, 1 for db and 1 for running on ;p
[15:15:04] heheh
[15:23:39] I broke it again? O_o http://ganglia.wmflabs.org/latest/?r=20min&cs=&ce=&c=bots&h=bots-sql3&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[16:22:05] legoktm: I fixed my code :P
[16:22:10] yay!
[16:22:21] I was doing about 200 individual delete requests every min or so
[16:22:22] xD
[16:22:30] * addshore combined them all to 1 query :P
[16:22:35] heh
[20:39:25] Coren around?
[20:39:38] addshore: Semi. What's up?
[20:39:39] or petan :)
[20:39:57] :P Is it possible to increase memory and cpu on an instance?
[20:40:38] addshore: Probably.
[20:41:10] it would be great if bots-sql3 could have a bit more, maybe of both, else I have a feeling that when wikidata rolls out and I run my bot it is going to get buthered :/
[20:41:12] addshore: Though not, seemingly, from the interface.
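Note on the 16:22 fix (about 200 individual deletes combined into one query): the log does not show addshore's actual code, so this is only a guess at the general shape. It uses an in-memory SQLite database so it runs anywhere, whereas bots-sql3 was a MySQL server, and the table `items` and column `id` are hypothetical.

```python
import sqlite3


def make_demo_db(rows=300):
    """Create an in-memory table with ids 1..rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO items (id) VALUES (?)",
                     [(i,) for i in range(1, rows + 1)])
    conn.commit()
    return conn


def delete_ids(conn, ids):
    """Delete every row whose id is in ids using a single DELETE statement."""
    ids = list(ids)
    if not ids:
        return 0
    # One placeholder per id keeps the values parameterized.
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        f"DELETE FROM items WHERE id IN ({placeholders})", ids)
    conn.commit()
    return cur.rowcount
```

With `conn = make_demo_db()`, `delete_ids(conn, range(1, 201))` sends one statement where the old approach sent 200, so the server sees one round trip and one transaction.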
[20:41:14] *butchered
[20:41:58] addshore: The ops team was so busy this week I basically had no opportunity to get access and pointers to anything beyond the project-level stuff.
[20:42:07] oh hi Coren. Where can I stick requests for things that are super annoying but trivial to fix?
[20:42:23] legoktm: You mean on the tools project?
[20:43:01] Tools/bots not sure. It has to do with the availability and access to dumps
[20:43:19] legoktm: bz is the right place then.
[20:44:18] ok will do thanks
[21:01:01] [bz] (NEW - created by: Legoktm, priority: Unprioritized - normal) [Bug 45646] Create -latest alias for dumps - https://bugzilla.wikimedia.org/show_bug.cgi?id=45646
[21:01:08] Coren: ^
[21:11:27] I don't think it's possible to increase the memory of a running instance
[21:11:33] (although that would be handy)
[21:12:03] Platonides: Possibly not while running, but I know OpenStack allows rejiggering it while down.
[21:12:40] Platonides: (While running is cool, but would require a kernel with hot-swap memory support as well as the VM supporting the right ACPI voodoo)
[21:13:15] that's why I said "of a running instance" :)
[21:13:32] adding memory while shut down is much easier
[21:13:58] but you have to shut down the running processes
[21:14:04] which is usually the problem
[21:14:16] otherwise you could also create a new instance
[21:14:55] dynamic memory allocation to labs instances would be very nice
[21:15:16] we wouldn't need to overallocate memory
[21:15:36] and increasing the available memory of a host would be as simple as increasing a limit in the interface
[21:20:46] addshore: no
[21:21:13] Coren: never try that ;)
[21:21:42] Damianz: There's a lot of things that should never be tried. Which specific one did you have in mind? :-)
[21:22:04] Last time I asked laney to try re-sizing an instance in labs... it never came back... ever... like badly...
[21:22:29] Idea behind using puppet...
just throw it out and make a new one, this is why the bots mysql servers suck because I'm lazy and want pizza and cbfa finishing off the puppet stuff
[21:22:50] Ah. Puppet can't do everything though.
[21:23:10] what's cbfa?
[21:23:19] can't be fucking assed
[21:23:35] Coren: Heathen!
[21:24:04] Damianz, what's the problem with puppetising it?
[21:24:09] it shouldn't be too hard
[21:24:19] Damianz: Don't get me wrong; puppet is great for most things.
[21:24:38] you could also copy some puppet config intended for wikimedia db servers...
[21:24:43] Time... I don't have any spare, more things to work on than time to do it.
[21:24:49] Damianz: And you can do /almost/ everything if you actually can write the node entry. But the way it's done now we can't actually do that.
[21:25:01] It's mostly done - just needs standard base classes adding for apps servers, the data dumping, instances creating and the data importing.
[21:39:17] Coren: yes, yes you can
[21:39:22] Coren: it's stored in ldap
[21:39:33] the "configure" page manages what gets stuck in there
[21:39:43] you can manage, per project, what is shown in the configure page
[21:39:58] you can not, though, use parameterized classes in the node
[21:39:59] Ryan_Lane: You mean I /hypothetically/ can. I still don't have any write access to the LDAP, remember? :-P
[21:40:03] dude.
[21:40:14] every single project admin can do this
[21:40:19] o_O
[21:40:36] With what, my user credentials?!
[21:40:45] Coren: https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=51f3a377-90b5-4f46-8590-5624c50c5dbf&project=tools&region=pmtpa
[21:40:52] that writes to ldap
[21:41:01] Coren: https://wikitech.wikimedia.org/wiki/Special:NovaPuppetGroup
[21:41:11] that allows you to modify what's available via configure
[21:41:17] I know, but doesn't allow freeform writing to the node entry. :-)
[21:41:27] I.e.: parametrized classes, for instance.
[21:41:38] we don't want parametrized classes in the node entries
[21:41:43] at all
[21:41:54] ... why?
[21:42:09] put those into roles, and have the roles use parameterized classes
[21:42:24] Ryan_Lane, I get a blank page for that link
[21:42:36] or maybe that's because I changed the project
[21:42:39] Platonides: because you don't have config access on that instance, or in that project
[21:42:47] it shouldn't return a 500 error, though
[21:42:50] Ryan_Lane: That doesn't help; you still have to use a shitload of variables for configuration parameters.
[21:43:05] why?
[21:43:13] a role defines a specific type of thing
[21:43:27] if you're needing to pass in a million variables, you're not using a role properly
[21:43:45] roles aren't meant to be generic
[21:43:54] they are meant to be specific to us
[21:44:00] so, you configure almost everything in the role
[21:44:01] it may be because the instance doesn't belong to that project
[21:44:08] yeah, likely
[21:44:13] Hmm..... on a scale of 12-12 are exported resources as much of a pain to work with as I think?
[21:44:15] Platonides: mind entering a bug for that 500?
[21:44:35] Damianz: they slow down puppet very much and are also not secure to use across projects
[21:44:59] Ryan_Lane: Wait, you guys use roles as a 1:1 to node config? That's... very much not puppet SOP
[21:45:15] Coren: puppet's SOP isn't exactly wonderful
[21:45:41] when you have a large number of systems it's silly to configure everything in the node
[21:46:00] When you have a large number of /identical/ systems, sure.
[21:46:03] yes
[21:47:05] Ryan_Lane, done: bug 45649
[21:47:09] Use case: I have compute nodes. To me, SOP would be to make a role for "compute node" but parametrize it in the node. "This node has role compute node with X Y and Z properties"
[21:47:10] Platonides: thanks
[21:47:34] Surely for a compute node, each sub-group would be a separate role?
[21:47:41] Coren: are all the compute nodes configured very differently from each other?
[21:47:58] It depends where you want your logic though - personally I'm thinking of shoving it out of puppet and abusing the ENC to get around this debate.
[21:48:49] Ryan_Lane: No, but there certainly is going to be a few exceptions for dedicated queues.
[21:48:53] s/is/are/
[21:48:57] right, so, use variables for that
[21:49:12] Ryan_Lane: Every time you use a variable, a puppy is eviscerated.
[21:49:26] I don't see why
[21:49:40] only because puppet has abused them historically
[21:50:02] and is poorly architected
[21:50:11] Ryan_Lane: No, because there is no clear line of reference. For all you know, some defined type invoked from a class you didn't use specifically is using it.
[21:50:22] namespace it, then
[21:50:24] I'll be happy when everything is modules, the role classes are auto imported into labs console, labsconsole has an api to interact with everything from the command line and on a puppet change jenkins can build the instance (via the api) and run integration tests... oh and did I mention us actually having unit tests for puppet?
[21:50:33] * Damianz notes he's easily pleased
[21:50:38] Damianz: yes, I would also like this
[21:50:40] very much
[21:51:07] Ryan_Lane: Sure, that works for me, but *you* don't know how I used it and it's not clear how to find its effects. I have to think about the other sysadmins who are going to be looking at that config someday. :-)
[21:51:22] it only affects things that are being included
[21:51:29] it doesn't affect everything else
[21:51:57] when I say use variables, I mean define it in the node, then use it in the role
[21:52:05] don't use it anywhere underneath that scope
[21:52:35] the biggest problem here is that the ldap backend doesn't actually support parameterized classes
[21:52:38] Ryan_Lane: Hm. In other words, use it like a parameter to a class. Kinda like a... parametrized class.
:-)
[21:53:09] we could write an ENC that does
[21:53:29] we have some other reasons why that would be a nice thing to have, as well
[21:53:46] Ryan_Lane: That'd certainly simplify a few other things.
[21:54:05] yes, because then we could also define variables and such based on other parts of the DIT
[21:54:40] for instance, we have information defined about projects, that would be nice to include into the nodes as variables
[21:54:54] I don't have time to write that ENC, though, and I hate ruby ;)
[21:54:57] Ryan_Lane: See, /that/ is a good use of variables. :-)
[21:55:05] I guess it's possible to write ENC in any language
[21:55:12] I could write it in python
[21:55:43] Sure. Also using your Copious Free Time(tm)
[21:56:19] what are you meaning by ENC?
[21:56:20] indeed
[21:56:35] Platonides: http://docs.puppetlabs.com/guides/external_nodes.html
[21:57:05] ooh...
[21:57:35] it would be a pain to make a parameterized class in ldap
[21:57:40] which is likely why no one has done it ;)
[21:58:45] We could use mysql =D
[21:58:48] shoot me now
[21:59:12] heh
[21:59:36] Ryan_Lane: Incidentally, about a per-project OU, do you know when your TODO will reach that point?
[21:59:38] It kind of would be interesting to make an ENC for openstack... then you're not tied to ldap... and you can directly take its data.
[22:00:25] yeah, we could actually define classes/variables and such as metadata, then
[22:00:49] Coren: probably pretty soon
[22:03:52] Ryan_Lane: Are you using the LDAP node terminus or an ENC that pulls from LDAP?
[22:04:01] ldap node terminus
[22:07:23] Damianz: sad times :/
[22:07:29] (finally got your ping)
[22:08:45] Coren: how is some sort of db for tools coming along? :)
[22:10:59] addshore: Contingently on some sort of infrastructure to make one. Ideally, a physical box would be best but if push comes to shove I'll make a temporary one in a VM.
[23:17:39] Coren: The webserver for PHP works well, thanks!
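Note on the ENC idea (21:53 to 22:00 above): an external node classifier is just an executable that Puppet runs with the node name as its only argument and that prints a hash with `classes` and `parameters`. Below is a very rough Python sketch of one that supports parameterized classes. Nothing here is the Labs setup: the node name, the class `role::compute_node`, its `queue` parameter, and the `project` variable are all invented, and the lookup table stands in for whatever LDAP or OpenStack query a real one would make. It prints JSON on the assumption that Puppet's YAML parser accepts it; a stricter version would emit YAML proper.

```python
#!/usr/bin/env python3
import json
import sys

# Hypothetical node data; a real ENC would query LDAP or OpenStack metadata.
NODES = {
    "compute-01.example": {
        # Hash form of "classes": class name -> class parameters.
        "classes": {"role::compute_node": {"queue": "dedicated"}},
        # Top-scope variables, e.g. per-project information.
        "parameters": {"project": "tools"},
    },
}

# Fallback for nodes without an explicit entry (also an assumption).
DEFAULT = {"classes": {"role::compute_node": {}}, "parameters": {}}


def classify(node_name):
    """Return the classification hash for node_name."""
    return NODES.get(node_name, DEFAULT)


def main(argv):
    """Print the classification for the node named in argv[1]."""
    print(json.dumps(classify(argv[1]), indent=2))
    return 0


# Only act when called with a node name, so a bare run or import is a no-op.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Running `./enc.py compute-01.example` prints the role with its `queue` parameter, which is the "role parametrized in the node" shape Coren describes at 21:47, while the role itself stays specific, as Ryan_Lane wants.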
The cgi-bin bit (http://tools.wmflabs.org/wikilint/cgi-bin/test.sh) doesn't, though, even though the config doesnlooks
[23:17:48] doesn't look wrong.
[23:18:16] Also, both -login and -webserver-01 demand "*** System restart required ***".
[23:18:32] tools.wmflabs.org needs a pretty homepage like bots.wmflabs.org
[23:18:58] scfc_de: Yeah, they do that on some updates. Definitely not a worry. :-)
[23:19:42] Damianz: Is that puppetized? :-)
[23:20:09] The file is put there by puppet... we're not quite using the class yet though
[23:20:37] Damianz: I've asked jorm for a Tool Labs logo. :-)
[23:20:56] jorm? he a weird guy that makes pretteh things?
[23:21:09] == Brandon
[23:21:13] Ahh
[23:21:56] He makes me jealous of awesome hair
[23:22:52] scfc_de: Something wrong with my rewrite rule. Checking.
[23:23:17] Ah. Ordering.
[23:23:33] Damianz: Is that already in the puppet repo or do you use a local puppetmaster? (Always confused why "master" is named "production" in the former.)
[23:23:52] Hm. And not actually CGI either.
[23:24:25] It's pending a merge into production - currently it's local puppetmaster on dev and manually done on 'prod'
[23:26:13] scfc_de: Should work now.
[23:26:44] Yep, my small sample works, now to fix the bugs on my part :-).
[23:28:34] Something different than Toolserver: http://tools.wmflabs.org/wikilint/cgi-bin/test.sh gives the expected output, http://tools.wmflabs.org/wikilint/cgi-bin/wikilint the script source.
[23:29:10] http://tools.wmflabs.org/wikilint/cgi-bin/wikilint.pl works.
[23:34:52] And now both do. Coren, did you change something?
[23:35:07] (Both *work*, even though it doesn't look like it.)
[23:39:34] scfc_de, is it executable?
[23:39:50] I get a 500 on http://tools.wmflabs.org/wikilint/cgi-bin/wikilint
[23:40:01] or with the .pl ...
[23:40:57] That's an error in my script (wrong lib directories), but if the server treats them both the same, it's working :-).
[23:41:16] that's probably apache config
[23:41:17] 'In French with English subtitles.' boo
[23:41:50] which then leads to confusions...
[23:42:06] good night
[23:42:41] Good night!
[23:45:48] Coren these dbs make me cry :<
[23:46:07] * Damianz pats addshore and gives him milk and cookies
[23:47:39] I just went and checked on my scripts and turns out they cant edit the db now :P
[23:47:56] Lock wait timeout exceeded; try restarting transaction :<
[23:48:03] scfc_de: I didn't change anything since earlier.
[23:49:08] scfc_de: Perhaps your browser kept a local cached version of the page?
[23:49:37] * addshore thinks about this
[23:51:46] it cant handle deleting 100 rows at once >.<
[23:53:07] do I go back to individual delete requests that actually go through :/ gragh
[23:56:15] Coren: Probably.
[23:56:57] Note for self: log split for the access and error logs.
[23:58:18] Ah. Doesn't switch the UID right for cgis.
[23:58:21] * Coren fixes that.
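Note on the 23:51 dilemma (100 rows at once times out, one row at a time is slow): a middle road is to delete in small chunks and commit after each, so no single transaction holds its locks for long. This is only a sketch and the log gives no evidence that it would have cured the lock wait timeout on that server. It is written against Python's sqlite3 module so it runs as is; the table `items`, the column `id`, and the chunk size are placeholders.

```python
import sqlite3
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most size items from iterable."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def delete_in_chunks(conn, ids, chunk_size=25):
    """Delete rows by id in small batches, committing after each batch."""
    deleted = 0
    for batch in chunks(ids, chunk_size):
        placeholders = ",".join("?" for _ in batch)
        cur = conn.execute(
            f"DELETE FROM items WHERE id IN ({placeholders})", batch)
        # Committing per batch keeps each transaction short.
        conn.commit()
        deleted += cur.rowcount
    return deleted
```

The chunk size is the knob: 1 is the old one-request-per-row behaviour, and a value at least as large as the id list is the single big query.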