[00:00:03] I could do it with 2.4 [00:00:50] I guess it's complaining about unicode() now? [00:01:22] unicode() doesn't exist. [00:01:23] fwilson: No, more u'foo' strings at line 23 [00:01:32] legoktm: exactly :) [00:01:33] fwilson: Run 2to3 on your script first. [00:02:21] u'compare' on line 38 [00:02:35] fwilson: Listen to legoktm. He speaks sooth. :-) [00:02:38] :) [00:02:58] Ryan_Lane: Do we have 2.4 deployed anywhere? [00:03:00] 2to3 works now, so I did do that [00:03:07] no clue [00:03:20] does it come with precise? [00:03:20] ImportError: No module named urllib2 [00:03:26] Wait what? [00:03:28] No urllib2? [00:03:43] oh [00:03:44] fwilson: I may need to add python3 libs. :-) [00:03:45] urllib.request [00:03:49] no, you shouldn't need to [00:04:21] Can't use string pattern on a byte-like object? [00:04:50] Ryan_Lane: Nope; even raring seems to be on 2.2.22 [00:04:55] then nope [00:05:03] * Coren curses. [00:05:26] *sigh* [00:05:35] Coren: python3 libs should be default with the package. [00:05:47] fwilson: Just do import urllib.request as urllib2 [00:05:54] That should fix most things [00:06:03] legoktm: i'm from urllib.request import * [00:06:10] Ahhhhhhhhhhhhhhhhhhh [00:06:13] Never do that [00:06:14] Okay, okay [00:06:18] Ssshhh [00:06:20] * legoktm trouts fwilson  [00:06:20] Don't tell Vacation9 [00:06:40] !trout fwilson [00:06:43] :/ [00:06:50] Nope, no helpmebot :) [00:06:55] Bwahahahahaha! *ahem* [00:07:26] !trout is *catches a fresh trout from the river, then slaps $1 with it a few times [00:07:27] Key was added [00:07:32] !trout fwilson [00:07:32] *catches a fresh trout from the river, then slaps fwilson with it a few times [00:07:37] ... [00:07:37] close enough. [00:07:46] Be back in a bit [00:07:50] [Thu Mar 14 00:07:16 2013] [error] [client 10.4.1.89] import urllib2.request as urllib2 [00:07:51] [Thu Mar 14 00:07:16 2013] [error] [client 10.4.1.89] ImportError: No module named urllib2.request [00:08:06] >.< [00:10:51] I note that, in python3, the lib seems to be named just 'urllib' [00:11:09] viz. /usr/lib/python3.2/urllib [00:11:45] cf. /usr/lib/python2.7/urllib2.pyc [00:17:56] Yeah. [00:18:10] Well python2 has a plain "urllib" too. [00:18:11] python3 merged the two [00:22:01] Okay, that's fixed [00:22:29] but what in the world does "attempted to use string pattern on a bytes-like object" mean? [00:22:54] google it [00:23:37] I did, it doesn't help very much, all I can tell is that it is in some way related to json [00:23:44] Which I have a feeling tha it isn't [00:24:10] Oh, wait a second [00:25:51] I'd love to be more helpful, but I have level zero in Python-fu [00:26:44] Coren: $ python -c "import this" [00:27:03] Oh yes, you should do that [00:27:57] Just out of curiousity, why didn't we start with unicode in the first place [00:31:18] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Notepad was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=659541 edit summary: [+8] +python3 [00:34:54] !logs [00:34:54] logs http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-labs [00:35:27] is it possible to have own domain pointing to labs? [00:37:06] Why? [00:37:35] Just out of curiousity, is it possible to get data from bots project in tools project? [00:39:06] fwilson: rsync them from an instance to another? 
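For reference, the Python 3 port being debugged above boils down to two small changes once 2to3 has stripped the unicode() calls and u'' literals: urllib2 becomes urllib.request (legoktm's "import urllib.request as urllib2" shim), and anything read from urlopen() now comes back as bytes, which is what triggers the "string pattern on a bytes-like object" error when the raw response is fed to re or json. A minimal sketch, not fwilson's actual script; the URL and pattern are placeholders:

    # Minimal Python 3 sketch of the two fixes discussed above; the URL and
    # regex are placeholders, not the real script.
    import re
    import urllib.request as urllib2   # the shim suggested in the discussion

    response = urllib2.urlopen("https://example.org/w/api.php")
    text = response.read().decode("utf-8")   # bytes -> str before using a
                                              # string pattern or json.loads()
    match = re.search(r"compare", text)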
[00:39:21] Danny_B: though yes, I'd also like to know why [00:39:22] hashar: but it's cross-project, i'm not sure how i would do that [00:39:42] labs isn't meant for production use [00:40:51] for instance when somebody runs some wiki-related tools on own server with own domain but would like to migrate to labs [00:41:38] why not redirect their domain to labs? [00:42:03] short answer: eys [00:42:07] keep urls? [00:42:07] *yes [00:42:19] i'm just checking possibilities [00:44:02] Ryan_Lane: "ls: cannot access /data/project: Transport endpoint is not connected " [00:44:11] which instance/project? [00:44:13] cvn-app1 [00:44:16] one sec [00:44:26] Also, cvn-apache2 appears to be critical in ganglia, ssl handshake? [00:44:43] cvn-app1 doesn't have that problem. I set up both the same way afaik [00:45:38] well, these are gluster errors I haven't seen before [00:45:56] either way [00:45:58] fixed [00:46:06] /etc/init.d/autofs restart [00:46:13] ok [00:46:21] looking at apache2 [00:46:27] I just rebooted apache2 [00:46:38] (right after you said gluster was fixed on app1) [00:46:50] I'm on apache2 [00:47:06] !resource i-00000339 [00:47:06] https://labsconsole.wikimedia.org/wiki/Nova_Resource:i-00000339 [00:47:21] nagios shows no issue for it [00:47:24] err [00:47:25] icinga [00:47:28] i-00000339 [00:47:29] err [00:47:32] http://icinga.wmflabs.org/cgi-bin/icinga/extinfo.cgi?type=1&host=cvn-apache2.pmtpa.wmflabs [00:48:13] Ryan_Lane: http://nagios.wmflabs.org/cgi-bin/icinga/status.cgi?hostgroup=cvn&style=detail [00:48:24] "cvn-apache2.pmtpa.wmflabs " has critical errors [00:48:38] "CHECK_NRPE: Error - Could not complete SSL handshake. " for load, disk, ram and processes [00:48:45] ah [00:49:11] I recall seeing those last year one a fresh instance, one had to run puppet manually once to fix it [00:49:19] I know there was some issue with nrpe.... [00:49:27] however this instance isn't fresh anymore, it's been on for a while and only now does it start showing the errors since a few days [00:49:31] but killing it and restarting it should have worked [00:49:38] Damianz: ^^ ? [00:49:42] mutante: ^^ ? [00:49:55] Ryan_Lane: after the restart just now it resolved it for some of them [00:49:59] yep, i killed and restarted nagios-nrpe-server on all instances the other day [00:50:03] they're slowly disappearing [00:50:11] but this is just on one instance and i think it wasnt the same error [00:50:34] btw, why didn't these get reported in #wikimedia-labs-nagios? I have a stalkword in that channel for projects I'm in. [00:50:42] duration 6d ... so that isnt that new either..hmm [00:50:49] they only get reported when it initially goes critical [00:50:57] ok [00:50:59] mutante: well, it didn't hit all instances [00:51:04] oh wait, it just fixed itself [00:51:07] mutante: only ones that had salt responding [00:51:12] I killed nrpe and restarted it [00:51:16] must've been the reboot then [00:51:17] at least disk space is OK now [00:51:23] gotcha Ryan [00:52:11] we had similar issues in the past in prod. with nagios-nrpe-server not restarting properly [00:52:30] and once attempted to fix it, i think successfully, by adding a "sleep" in the init script [00:52:42] and then we cleaned up/removed that old init script when we switched to icinga .. [00:53:30] but it also didn't happen anymore, last time i made a change to NRPE checkcommands, which triggers a restart, it did not break like it used to back in the days.. [00:57:37] mutante: could it be the puppet subscribe[] not working properly ? 
[00:58:49] well, i don't know if this is the exact same issue, but the old issue wasn't that, it was puppet doing a service restart and then it would stop but not come back [00:59:50] I tried out on a frozen instance [01:00:31] puppet never stopped the nrpe process despite claiming it did [01:01:04] notice: /Stage[main]/Nrpe::Service/Service[nagios-nrpe-server]/ensure: ensure changed 'stopped' to 'running' [01:01:13] keep doing it despite me having /usr/sbin/nrpe -c /etc/icinga/nrpe.cfg -d running [01:03:45] confirmed ..ugh [01:04:07] just using the init script , not puppet related, does not properly stop it [01:04:11] and if you try to use /etc/init.d/nagios-nrpe-server stop [01:04:19] it says OK despite the process still running [01:04:25] suspecting the pid file is incorrect [01:05:43] running as user pid 4294967295 ? [01:05:55] uid [01:06:25] if [ ! -d "$PIDDIR" ]; then mkdir "$PIDDIR" chown nagios "$PIDDIR" [01:07:02] PIDDIR owned by nagios .. but process not running as nagios [01:07:11] good catch [01:08:18] Welp, 13 hour days is 'nuf for me. [01:08:19] * Coren waves. [01:08:43] mutante: /etc/icinga/nrpe.cfg has user icinga [01:08:47] # id icinga [01:08:47] id: icinga: No such user [01:08:49] so hmm [01:09:03] include user::icinga; require => User['icinga'] [01:11:40] systemuser { icinga: name => "icinga", home => "/home/icinga", groups => [ "icinga", "dialout", "nagios" ] } [01:13:03] but does whatever class install nrpe on labs include that icinga::monitor class? [01:13:08] Ryan_Lane: service is setup in manifests/nrpe.pp [01:13:16] class nrpe::service { [01:18:17] fixing [02:00:18] Is ganglia down? [02:01:19] hm [02:01:25] looks like it kind of is [02:02:06] hm [02:02:10] looks like it's out of memory [02:02:14] ssh: Could not resolve hostname bastion3.wmflabs.org: nodename nor servname provided, or not known [02:02:15] :( [02:02:58] hashar: fixed [02:03:01] \O/ [02:03:11] stupid pdns ldap backend [02:03:19] cant we get a bastion.wmflabs.org entry pointing to all bastions? :] [02:03:29] heh [02:03:36] that won't make things better ;) [02:03:37] or even a failover between both datacenters [02:03:52] expecting ssh to try all the IN A records it is given [02:04:00] yeah, it doesn't do that [02:04:08] and it would give back different keys for each [02:04:16] ah true [02:04:16] unless we sync'd all the host keys [02:04:20] ;-] [02:04:21] but that has another set of problems [02:04:43] FastLizard4: fixed [02:04:46] restarted gmond [02:04:56] seems we need a larger instance for ganglia [02:04:59] Ryan_Lane: Aha, danke :) [02:05:02] * Ryan_Lane goes to create one [02:12:00] bah. both aggregator1 and aggregator2 need to be recreated [02:12:21] or resized.... [02:12:27] hm. maybe I'll get resize working [02:13:09] !log deployment-prep Trying out geoip module from {{gerrit|53714}} on deployment-integration [02:13:12] Logged the message, Master [02:21:42] booo [02:21:49] geoip does not play well in labs [02:22:00] !log account-creation-assistance apt-get update/upgrade on all instances [02:22:01] Logged the message, Master [02:22:45] hashar: in what way? [02:23:03] attempt to access the private volatile storage to get the GeoIP datafiles [02:23:04] :-D [02:23:09] oh [02:23:09] right [02:23:11] of course [02:23:11] I thought it worked [02:23:17] different puppet master [02:23:20] using a different set of files [02:23:22] we don't have that enabled on virt0 [02:23:39] hashar: can you add a bug? 
[02:23:44] wikimedia labs, infrastructure [02:23:49] sure [02:24:25] hmm geoip-database package has been installed [02:28:56] https://bugzilla.wikimedia.org/show_bug.cgi?id=46093 [02:28:58] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 46093] virt0 puppet master lacks GeoIP files - https://bugzilla.wikimedia.org/show_bug.cgi?id=46093 [04:20:32] !log deployment-prep Upgrading apache32, apache33, video05 and jobrunner08 [04:20:35] Logged the message, Master [04:22:05] !log deployment-prep Restarted apache on apache32,33 [04:22:07] Logged the message, Master [04:22:17] !log deployment-prep killed job runners on jobrunner08 and restarted service [04:22:19] Logged the message, Master [08:05:12] is there a general problem ? git fetch origin does not work [08:06:46] hmm, was slow but is working now [08:52:47] addshore, legoktm were there any stability issues over past days? [08:53:03] compared to the TS? lol nope [08:53:08] great [08:53:36] I mean since we launched scheduler [08:54:03] nah its great [08:59:59] Damianz is CB working now? [09:00:03] I see little activity on sql [09:07:21] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Low - enhancement) [Bug 46104] reduce the number of wiki on beta - https://bugzilla.wikimedia.org/show_bug.cgi?id=46104 [10:22:46] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 46105] Install libmediawiki-api-perl - https://bugzilla.wikimedia.org/show_bug.cgi?id=46105 [10:24:10] !log wikidata-dev wikidata-testrepo: disabled AbuseFilter extension in the local puppet files, because it led to error messages when editing lang links. Also: Installed php5-xdebug for a better apache error log. [10:24:13] Logged the message, Master [11:14:33] !log webtools rebooted webtools-login & webtools-apache-1 [11:14:34] Logged the message, Master [11:20:43] petan: :D [11:21:40] one thing I was going to say is we should always try and keep usage at 100^% of all resources - 1 instance, that way if one goes down / we need to do something to it all is fine :) [12:02:27] !log wikidata-dev puppety-pupp: Deleted obsolete puppet test instance [12:02:29] Logged the message, Master [13:10:43] addshore what [13:11:49] addshore I don't care about wiki bots being on loaded instance - but I DO CARE about interactive bots being on these [13:12:08] it's like mixing elephant and glasses in one room - glasses will crash [13:12:37] interactive bots should live on instance with load near to 0 to work properly [13:12:53] that's it [13:18:39] @notify addshore [13:18:39] This user is now online in #huggle so I will let you know when they show some activity (talk etc) [13:44:50] petan: im here for a bit [13:45:07] ok what did you talk about [13:45:23] well firstly looking at ganglia we should probably have another instance for OGE [13:45:33] ok - I will make it then [13:46:18] secondly with interactive bots we should have yet another instance which OGE uses, but have a seperate queue for that instance [13:46:23] * addshore now has to go ;p [13:46:30] ill be back latrter! [13:47:40] addshore yes I agree with that [13:50:08] !gridbots is http://helms-deep.cable.nu/~rwh/blog/?p=159 [13:50:08] Key was added [14:48:14] Coren, when reading the conversation between Petr and you, I got the question, if volunteers can really help in tools already or is Petr right that you want to do the first steps alone? 
[14:48:51] Jan_Luca no volunteers in tools so far [14:48:56] btw I am Petr Bena [14:49:36] Jan_Luca: I'm not /opposed/ to having help but, at this point, it's more likely than not that the actual time spent would increase as additional sysadmins are involved. At least some minimal documentation would be needed, and that needs the base groundwork to have been settled. [14:49:55] Jan_Luca: tl;dr in a couple weeks, most likely. [14:50:37] Coren: You think that volunteers would cause problems [14:50:40] And yes, petan will be welcome to step forward personally if he wants to. :-) [14:50:42] I? [14:50:45] ? [14:51:10] Because I think there are many persons that could help you with some things [14:51:17] Jan_Luca: No, just that right now the time needed for coordination would be bigger than the time saved with the extra hands. Most of the documentation exists in my head only. :-) [14:51:25] well, that's what I think as well, hence all my emails [14:51:50] but I can of course wait working within bots project only of course [14:52:14] In a couple of weeks, once the greater part of the design has gelled and is at least minimally documented, extra help will become useful. [14:52:49] Coren: I think the idea of open source/open knowledge is that everybody can access it and help to extend it [14:53:40] And I understand labs as a platform for working together! to help the Wikimedia projects [14:55:59] At moment I can understand DaB. why he has his problems with the migration to Tool Labs when the first steps are happen without the community [14:57:18] Jan_Luca: I think that the dichotomy is imaginary. I am no less "the community" than everyone else, except that I have the opportunity to devote my full attention to this; I'm not sure where that meme originates that I am somehow and adversary, but that canard needs to stop. [14:59:08] <^demon> Coren's old school :) [14:59:46] ^demon: Oy. *Elder* school. :-) [14:59:57] (Ancient?) [14:59:58] :-) [15:00:33] <^demon> Well, I meant it as "Been on-wiki since at least '06 or earlier" :) [15:00:59] Jan_Luca: If people needed to be root to "work together", Toolserver would have two users. [15:02:50] Jan_Luca: And, again, nobody every said there wouldn't be volunteer roots; just that right now it's not as useful as it would be in a couple weeks. [15:03:12] <^demon> "Repository corruption is not good. Sounds like there are still some bugs in JGit's GC class that need to be investigated and fixed before this is unleashed on the masses." [15:03:13] <^demon> Aww :( [15:03:37] ^demon: I would tend to agree. Repo corruption is "not good". [15:04:29] <^demon> Not good at all. [15:05:35] <^demon> Yeah, I think it collects some stuff that's not garbage. That's kinda really bad. [15:05:54] Coren: My problem is that persons who like to help to make the Tool Labs a good alternative to Toolserver get the answer that they are at moment not useful [15:06:26] with this answer there would be nobody in a couple weeks to help then [15:08:31] Jan_Luca: Read and comment on Coren's proposals on-wiki, become a member of the project and beta-test its current implementation, etc., etc., etc. [15:08:33] Jan_Luca: That... I have no answer to that. That assertion is neither realistic nor rational, and I do not see what I could possibly say that is relevant. People also will not get root access to the Openstack underlying the VM infrastructure, nor enable access on the routers. How is that relevant? [15:09:15] scfc_de: That too, yes. 
[15:10:00] Jan_Luca: The difference is that I am saying "not yet, I'm still preparing things" as opposed to "no, you don't get to touch that" [15:10:40] Coren: I don't know if you know about DaB. and his position about Tool Labs [15:11:01] Jan_Luca: I do, and I'm disapointed that he chose to not get involved; his help would have been invaluable. [15:11:59] Jan_Luca: But then again, he is free to occupy his time in the manner he choses. We can't very well coerce him into participating! [15:12:06] <^demon> "Build path is incomplete...cannot find java.lang.Object" [15:12:10] <^demon> wtf eclipse? [15:14:52] So my question is "Why?" Why do DaB. choose this position? I think he shares among other things the doubts that petan and I have and the less participation that the comunity has at moment because you decide the way to go [15:18:51] Jan_Luca: I think DaBPunkt is alive and well and can answer your question much better. [15:19:08] Jan_Luca: I very much doubt that Daniel based his decision on how I approach the job months before I even applied for it. If you want his reasons, then you should ask him. [15:23:41] Coren: I don't think that there is any dichotomy but I (and petan seems to do this, too) think that the way of the first steps of Tool Labs could be better [15:23:48] well, I have doubts, indeed, but I am not really opposed to the idea of tools labs should the philosophy of its maintenance change in future - and you said it will. The reason why I am standing out of tools is unlikely same as DaB's - he's not the part of project because he doesn't want to, I am not part of it because I am not allowed to :P [15:26:20] Coren: I don't say that your work is bad, I only want that Tool Labs is more open for the volunteers [15:26:27] petan: You are welcome as a part of the project, just not *yet* as root :-). [15:27:16] scfc_de, which kind of block me from doing anything useful there - I am working as sysadmin on bots, not as bot operator - I can't help very much with testing... [15:27:27] I have some bots, but... [15:27:53] they aren't designed to run in grid, so even if some would run there, they wouldn't help to test anything [15:28:07] Jan_Luca: What do you want to do that you can't do? [15:28:28] scfc_de: What should I test? How the webserver for my tools works? [15:29:01] Jan_Luca: No, I asked what you *want* to do. [15:29:55] Jan_Luca: Yes, they are different approaches. Perhaps ^demon is right in calling me "Old school", and I do my original work with a "too many cooks spoils the sauce" approach. I believe that the initial design and setup phase needs to be done with a consistent and unified approach that is not amenable to multiple simultaneous designers. This doesn't mean that I won't appreciate input and [15:30:24] Jan_Luca: advice (and, indeed, I've already made several changes to the design according to feedback like Tim's here) [15:31:03] scfc_de: My problem is not that I have less permissions but that the project is too closed at moment [15:31:09] Jan_Luca: And, every bit of testing helps. Bring tools over, and note what breaks, or what needs to be documented. [15:31:29] scfc_de: When I need root I could get my own project ;-) [15:31:31] Jan_Luca: ... what? How is it closed? [15:32:17] <^demon> That's not what I meant by old school, but meh :) [15:32:35] ^demon: Hey, quoting out of context FTW! :-P [15:32:40] Coren: Too closed for the offered help [15:33:04] Jan_Luca: ... you haven't offered help for me to deny. 
[15:33:57] (I thought my initial question was some kind of offering help...) [15:34:27] Coren, when reading the conversation between Petr and you, I got the question, if volunteers can really help in tools already or is Petr right that you want to do the first steps alone? [15:35:44] My intention of the question was to help Tool Labs [15:36:29] so I asked if volunteers (like me) can help set up the envirment [15:36:48] Jan_Luca: I did not understand the implicit question. [15:37:44] Jan_Luca: The answer, then, is yes you can; the best help possible atm is to try to run code and see what is missing (dependencies, environmental contraints, etc). Same for the webserver; are there missing settings to have thing work right, are there difficulties with the rewrite rules, etc. [15:39:20] Jan_Luca: Trying the grid scheduler to see if things run as expected is also useful, though that one is not entirely configured yet so problems are probable. [15:40:23] Coren: The problem is that I have only one tools that I think about to migrate [15:40:46] Jan_Luca: Other invaluable help includes documentation of things you found unclear; there is a lot I assume given my experience and the fact that I am setting the environment up that may be obscure or incomprehensible to others; finding those so I can document them is something I cannot do myself. [15:42:57] (For instance, it never occured to me that saying 'sudo to the tool user' wouldn't automatically mean 'sudo -iu' to most) [15:43:58] Jan_Luca: You can also help by assisting Silke in completing her inventory of actual tools and their dependencies. Last I checked, she was overwhelmed with the amount of work. [15:45:12] If you want, I can probably find even more things that could use help. :-) [15:45:52] Coren: To get a good end of the discussion, I ask you to add me to project so I can look into it [15:46:00] my user is jan [15:46:13] Jan_Luca: Sure. Do you want a tool user set up too? [15:46:21] yes [15:46:54] Jan_Luca: What's the name for it? (That's going to be part of the webserver URI, amongst other things) [15:47:32] is there some convention? [15:47:54] Not really; users right now take the name their tool is usually known as, like "voxelbot") [15:48:14] then use "commonshelper2" [15:48:15] The actual username will be 'local-xxx' so no conflicts there. [15:48:43] that is the tool I think about to migrate [15:48:45] {{doing}} [15:49:44] Coren: Creating a tool is the first thing I would add to the documentation ;-) [15:50:21] There's going to be a self-serve interface on wikitech for that; manual creation is a temporary hack. [15:51:25] Jan_Luca: {{done}} [15:51:32] You can hop onto tools-login now [15:52:03] Switching to your tool uses "sudo -iu local-commonshelper2" [15:54:17] Your tool has a default mysql database available, the credential are in its ~/.my.cnf [15:54:47] Also, the tool's public_html and cgi-bin are served by http://tools.wmflabs.org/commonshelper2/ [15:55:44] .. and since I've repeated that schpeil at least five times in the past two days, I go document it now. :-) [15:56:47] bbiab [15:57:43] One thing I would do is to setup a direct login to get it simlar to Toolserver [16:03:40] Jan_Luca: You can't log into MMPs on Toolserver. The setup is the same. 
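To recap the setup Coren walks through above: you switch to the tool account with sudo -iu local-<toolname>, its web files live in public_html and cgi-bin under http://tools.wmflabs.org/<toolname>/, and the MySQL credentials are generated into the tool's ~/.my.cnf. A script running as the tool can reuse that file directly; a sketch only, assuming python-mysqldb (or pymysql) is available on the host:

    # Sketch: connect as the tool user using the generated ~/.my.cnf.
    # Whether the server host and database name also need to be passed
    # explicitly depends on how the tool's database was set up.
    import os
    import MySQLdb   # assumption: python-mysqldb is installed

    conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"))
    cur = conn.cursor()
    cur.execute("SELECT 1")
    print(cur.fetchone())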
[16:04:26] scfc_de: No, I mean a login adresss like tools-login.wmflabs.org [16:04:42] because not every toolserver user will understand bastion [16:06:58] Jan_Luca: https://wikitech.wikimedia.org/wiki/Help:Access#Using_ProxyCommand_ssh_option [16:07:42] scfc_de: I can use bastion but I think not every toolserver user can use it without problems [16:08:40] Jan_Luca: What problems? [16:09:48] for example use ProxyCommand [16:10:11] What do you mean by that? [16:13:47] I think that there will be some users that have problems with bastion and to avoid problems I suggest to set up a direct login server for Tool Labs [16:14:39] That would just be a new bastion [16:14:47] It wouldn't solve any issues [16:15:12] well it would solve some issues as you wouldn't need to mess with forwarding etc. I think he has a point [16:15:16] What problems? [16:15:25] not all people are unix guru's [16:15:33] Wouldn't need to mess with forwarding? [16:15:44] Krenair: yes [16:15:46] Krenair [16:15:55] A direct login server sounds like something you'd have to proxy through to get to the tool labs instance you want [16:16:12] Just a new bastion [16:16:17] Krenair you don't need to get to instance you want - you submit jobs from login instance ;) [16:16:20] that's how it works [16:16:28] There's one central instance..? [16:16:28] login instance is master server of grid [16:16:30] yes [16:17:47] I'm not sure what blocks being able to SSH directly to instances at the moment... [16:17:57] Krenair: Ryan :-). [16:18:13] Oh, ops is against it? [16:18:19] what's the point of bastion if you allow direct access to tool ? I agree some people will have trouble with forwarding and they'll need to login in two steps [16:18:20] nothing really - I had one instace with direct ssh long time ago [16:18:30] Krenair: (Just an assumption.) [16:18:50] but that was just because firewall wasn't operational back then, later I disabled it [16:19:02] Ryan_Lane: Want to chime in? [16:19:28] The problem is that direct ssh to instances would need a public IP for every instance [16:19:37] Not for every instance [16:19:39] and the number of this IPs are not very big [16:19:44] Just for one central tool labs instance [16:19:55] (I think) [16:20:07] Krenair: This is my idea [16:20:11] ;-) [16:20:28] .... [16:20:34] You said all instances... [16:21:00] Maybe my intention was not clear ... [16:21:04] The problem is that direct ssh to instances would need a public IP for every instance [16:21:09] every instance -> multiple [16:21:19] one central instance -> single [16:21:34] I mean: I think that there will be some users that have problems with bastion and to avoid problems I suggest to set up a direct login server for Tool Labs [16:23:02] the other was the answer to: I'm not sure what blocks being able to SSH directly to instances at the moment... [16:26:37] * Coren burps. [16:27:24] I'm not opposed to making tools-login accessible directly from the outside. I'll have to ask internally in Ops, but I think that's entirely feasible. 
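The ProxyCommand setup linked above is a few lines of ssh client configuration rather than something typed on every login. A sketch of the ~/.ssh/config entries; the bastion hostname, the instance suffix, and the username are placeholders, and the -W form assumes a reasonably recent OpenSSH (older clients use the netcat variant from the wikitech help page):

    # ~/.ssh/config -- sketch only; substitute your own shell username and
    # whichever bastion / instance hostnames apply to you.
    Host bastion
        HostName bastion1.wmflabs.org
        User your-shell-username

    Host *.pmtpa.wmflabs
        User your-shell-username
        ProxyCommand ssh -W %h:%p bastion

With that in place, something like "ssh tools-login.pmtpa.wmflabs" hops through the bastion in one command, which is roughly what a dedicated login address would hide from the user.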
[16:28:04] And no, the primary reason why we can't ssh directly to the instances is "not enough public IPs" [16:28:10] tools already has a public IP [16:29:21] Strictly speaking, though, the tools project never needs users logging in directly to instances other than -login [16:36:20] Coren: I mean that the public IP problem is one reason for creating bastion [16:36:51] and not access every instance (in all projects) with public IPS [16:37:03] Jan_Luca: Yes, but since the tools project already has a public IP, it should be okay to use it for SSH also to log in directly to tool-login [16:37:48] Coren: you cannot access two instances (-login and -webproxy) with one public IP [16:39:41] Jan_Luca: Sure you can, on different ports. But I'm probably simply going to use a second public IP if I can convince Ryan_Lane to assign it; I agree with you that given the variable level of unix expertise in tools maintainer, it's a good idea to reduce the complexity. [16:41:12] Coren: Yes, with a complex way you need only one IP but I think for Tool Labs you get a second (bastion has three) [17:00:26] petan: You should read https://wiki.toolserver.org/view/Job_scheduling [17:00:38] there is the docu about qcronsub [17:04:26] Jan_Luca OK :) [17:05:45] petan: Maybe you can just copy it so you don't have to write a new wrapper [17:06:15] well, I will need to modify it for sure, this version consist of multiple parts [17:06:22] but I will try to make it working [17:46:21] [bz] (ASSIGNED - created by: Antoine "hashar" Musso, priority: Immediate - normal) [Bug 45084] autoupdate the databases! - https://bugzilla.wikimedia.org/show_bug.cgi?id=45084 [18:00:09] wikisource had a tool at wsexport.wmflabs.org but look like it vanished [18:02:18] petan: bnr3 looks lovely :) [18:02:34] slowly stripping load from bnr1 and 2 [18:08:32] next thing should be work on a 'lowload' queue to contain procs that must run on a server with low(ish) load [18:08:55] possibly using a smaller instance or 2 [18:12:01] addshore one small instance is pretty enough to hold many of these [18:12:06] at leat all we have so far [18:34:37] * Coren tries to find the right balance between 'simple' and 'flexible' for his wrapper scripts. [18:35:06] the answer is always more perl [18:36:31] The problem is that if I want to allow most (all?) qsub options through, I find myself having to parse all of qsub's options. [18:38:18] howdy [18:42:15] Coren: petan: would you be able to grant user 'jenkins-bot' the loginviashell user right please ? :-] [18:43:17] hashar: There is no user by the name "jenkins-bot". Check your spelling. [18:43:29] that is what I thought :( [18:43:50] No, no, I would not be able to grant it. :-) [18:44:07] jenkins-bot is a user in LDAP already [18:44:19] so I can't create the account on wikitech :-] [18:45:00] why would jenkins-bot need to login via shell? [18:45:16] is integration sshing into labs? [18:46:57] yeahh worked [18:47:05] I created a `jenkins` user [18:47:19] Ryan_Lane: so yeah indeed. The plan is to get the production Jenkins to connect to a labs instance [18:47:34] what'll it do in the labs instance? [18:48:20] jenkins-bot already existed [18:48:25] and has an ssh key [18:48:34] its cn is jenkins-bot [18:48:38] I think that one is used to report back to gerrit [18:48:46] ah [18:49:09] why would it have a key in ssh, then? :) [18:49:12] err [18:49:14] in ldap [18:49:24] to ssh to gerrit and submit a comment ? 
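On the wrapper problem Coren describes above (letting most qsub options through without re-parsing all of them): one way around it is to parse only the wrapper's own options and hand everything unrecognized to qsub untouched. A rough Python sketch, not the qcronsub port or Coren's script; the queue name and defaults are made up:

    #!/usr/bin/env python
    # Toy job-submission wrapper: only --queue is handled here; every other
    # argument (qsub flags, the job script, its arguments) is passed through
    # to qsub verbatim, so qsub's option set never has to be duplicated.
    import argparse
    import subprocess
    import sys

    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--queue", default="main")   # e.g. a 'lowload' queue
    opts, passthrough = parser.parse_known_args()

    sys.exit(subprocess.call(["qsub", "-q", opts.queue] + passthrough))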
[18:49:36] maybe I should have reused that one with a 'jenkins' login shell [18:52:18] hmm [18:53:00] that is never going to work [18:53:05] I hate myself [18:53:35] Ryan_Lane: do we have a way to change a labs user homedir ? [18:53:46] so it get /var/lib/ instead of /home ? :-] [18:55:07] why not use puppet systemuser for this? [18:55:24] I got something like that https://gerrit.wikimedia.org/r/#/c/53736/1/modules/jenkins/manifests/user.pp,unified [18:55:33] which add the ssh pub key for the jenkins user [18:55:41] and thought I could apply it to my instance [18:55:46] use systemyser [18:55:49] systemuser [18:55:51] not user [18:55:53] but then, I still need jenkins to be able to connect to the bastion [18:56:01] why to the bastion? [18:56:17] jenkins can connect directly to the instance [18:56:19] gallium -> bastion.wmflabs --> instance ? [18:56:21] ahh [18:56:33] even to the private IP, likely [18:56:37] if not we can give it a public IP [18:56:52] but I have a good feeling you can connect directly to the private [18:57:18] well, now the jenkins user exists in ldap [18:57:18] heh [18:57:28] you didn't log in with this user to gerrit, did you? [18:57:32] if not, I can delete it [18:57:38] Ryan_Lane: go ahead [18:57:51] Ryan_Lane: just created it in mediawiki [18:57:51] ok [18:57:51] err [18:57:51] on wikitech [18:57:52] I can't delete it there, I think [18:57:53] ssh: connect to host 10.4.0.58 port 22: Connection timed out [18:57:54] ;-] [18:57:56] but tha's fine [18:58:00] *that's [18:58:04] we'll give it a public IP [18:58:19] it's the "jenkins" user I'm deleting, right? [18:58:23] yeah [18:58:58] I can ping the instance from gallium though. I guess there is a security rule on the way [18:59:05] yeah [18:59:10] you need to change the ssh rule [18:59:17] ahh [18:59:18] true [18:59:24] also, we're going to need to change pam_security's config [18:59:34] since the user isn't in the project group [19:00:10] hashar: give me a bit. we're heading to lunch [19:00:17] sure :-] [19:00:39] <^demon> There's something probably evil about this: http://p.defau.lt/?UUE4yU4HJblwZry3SPW6Jw [19:01:50] !log deployment-prep updated security rule to allow TCP port 22 connection from gallium.wikimedia.org [208.80.154.135/32] [19:01:56] Logged the message, Master [19:02:57] ^demon: that is in production? doh [19:03:08] <^demon> No, I'm experimenting with something. [19:03:10] <^demon> localhost. [19:03:42] !log deployment-prep rebooting -bastion to find out whether the security rule is applied [19:03:44] Logged the message, Master [19:04:55] out of luck :-] [19:05:06] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Normal - enhancement) [Bug 36994] [OPS] Add disk I/O to ganglia reports - https://bugzilla.wikimedia.org/show_bug.cgi?id=36994 [19:05:25] petan: ^^^ I have created that bug a looooong time ago [19:14:57] Yep, if wm-bot could name the person that actually "changed" the bug, would be really nice. [19:15:31] BTW, https://bugzilla.wikimedia.org/36994 would work as well. [19:23:03] hi, is there a way to get temporary instance? [19:25:06] or get access to some existing instance [19:25:30] some sandbox machine [19:30:35] shantanoo: What do you want to do? [19:32:17] there are copyright freed books on site. each page of the book is tif. i plan to download all the pages for each book. generate pdf/djvu. upload it to commons or respective wikisource site of the language. 
[19:32:47] have downloaded ~20 books, size is ~600+ MB [19:33:19] was trying to do it on toolserver, but it has 512 MB space restriction. [19:33:35] shantanoo: You can use /mnt/user-store. [19:33:46] (Or just ask DaBPunkt to raise your quota.) [19:34:04] scfc_de: oh. ok. [19:34:06] Is that 600 MByte per book or overall? [19:34:29] for 20 books. [19:34:43] some books were 100+ MB [19:35:09] have moved them to http://download.dhoomketu.net.in/dli/mar [19:36:05] scfc_de: the content in /mnt/user-store can be link to ~/public_html so that it can be downloaded? [19:36:38] shantanoo: I don't know, but I assume not :-(. [19:37:05] i plan do download the books for the languages which i don't understand. so need to share them with people who understand those languages [19:37:34] scfc_de, I've working symlink in ~/public_html but that's very dangerous to allow that [19:38:35] shantanoo: Then that would work for you as well. But 600 MByte is not so much about quota, so I would suggest asking DaBPunkt to raise yours first. It will be much easier. [19:38:54] (He should be on #wikimedia-toolserver.) [19:39:50] hashar I am a nagios guy [19:40:00] but I will try to fix ganglia anyway :P [19:41:08] scfc_de regarding wm-bot and person - wm-bot can only provide information which bugzilla provide and name of person who changed the bug is not provided :/ [19:41:10] phe: you have a way to convert images to text (using OCR) through some cli? [19:41:29] scfc_de the reason why it second bot can do it is that it receive the data in emails, where it is [19:42:42] scfc_de: will ping DaB. any idea regarding the max limit allowed? [19:42:52] shantanoo, I can add text layer to djvu, but as you are starting from pdf I suggest you use internet archive as service to convert the pdf to djvu and do the ocr, it'll do better quality ocr than I can [19:43:47] phe: i have tif of each page. is it possible to create djvu? [19:44:23] i use 'convert' (imagemagik) for converting tif to pdf and then gs (ghostscript) for creating single pdf. [19:44:27] shantanoo: No :-). /home has 118G free, so it shouldn't be too much, but I don't think a few hundred MBytes will have a large impact. [19:44:38] scfc_de: :) [19:46:00] petan: What bot do you mean by "second bot"? [19:48:24] Ryan_Lane: gallium is now allowed to ssh on the labs instance, seems the security rule ended up being applied somehow :-] [19:48:54] Mar 14 19:48:43 i-00000390 sshd[861]: fatal: Access denied for user jenkins by PAM account configuration [preauth] [19:48:55] :) [19:53:38] phe: which all scripts are supported by the OCR program? [19:55:33] by tesseract not a lot, internet archive support a lot of them, unsure where is the list [19:56:09] !test is petan petan [19:56:10] This key already exist - remove it, if you want to change it [19:56:16] !test2 is petan petan [19:56:16] Key was added [19:56:18] !test2 [19:56:18] petan petan [20:00:11] shantanoo, supported lang by tesseract are listed here, check if a file tesseract-ocr-3.02.XXX.tar.gz exist for the lang code you seach: http://code.google.com/p/tesseract-ocr/downloads/list [20:02:04] supported lang/script byr internet archive are likely to be the same as those supported by finereader 8.0 but I'm not sure [20:02:20] phe: :(. didn't find the one which i am looking for for tesseract. [20:02:27] checking for finereader [20:03:44] if i have sample set of images from the scanned files, can i provide the map which can be used by tesseract for ocr? 
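The tif-to-PDF step described above (ImageMagick convert on each page, then Ghostscript to merge) scripts up in a few lines. A sketch with placeholder filenames, assuming the convert and gs binaries mentioned in the discussion are installed:

    # Sketch of the per-page convert + gs merge workflow described above;
    # filenames are placeholders. OCR (tesseract) would be a separate
    # per-page step and, for Devanagari, needs traineddata that may have
    # to be trained first, as noted in the discussion.
    import glob
    import subprocess

    pdfs = []
    for tif in sorted(glob.glob("page_*.tif")):
        pdf = tif[:-len(".tif")] + ".pdf"
        subprocess.check_call(["convert", tif, pdf])    # ImageMagick
        pdfs.append(pdf)

    subprocess.check_call(["gs", "-q", "-dBATCH", "-dNOPAUSE",
                           "-sDEVICE=pdfwrite",
                           "-sOutputFile=book.pdf"] + pdfs)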
[20:05:31] there is some tutorial on tesseract how to train tesseract, never used them, and I don't know if this sort of script are well supported by tesseract [20:06:09] shantanoo, wat lang/script is it ? [20:06:12] *what [20:06:23] marathi [20:06:38] script is devnagari [20:09:10] phe: i think i should spend some time on tesseract for training it to convert devnagari script [20:10:17] * shantanoo needs to sleep now. thanks scfc_de, phe for you help [20:15:56] petan: so no more bots-4? [20:16:18] rschen7754 bots-4 is running now but it should be removed [20:16:26] in future once all people move their bots from there [20:16:31] petan: how much longer will it be running? [20:16:39] until all bots from there are gone [20:16:45] hashar: yeah, that's the pam_security thing I was talking about :) [20:16:51] and is python-twisted on bots-gs? [20:16:58] yes [20:17:01] ok, cool [20:17:03] it's not on bots-gs [20:17:06] but it's on nodes [20:17:10] that's where you need it ;) [20:17:12] on all of them i assume? [20:17:57] Ryan_Lane: so you had to hack the instace? [20:18:04] no [20:18:07] I didn't change it yet [20:18:12] well it works :-] [20:18:16] it does? [20:18:22] you didn't get denied by pam? [20:18:25] I have updated the security rule in my project [20:18:32] ah sorry [20:18:36] that's what I mean ;) [20:19:08] Mar 14 20:18:17 i-00000390 sshd[2807]: Failed publickey for jenkins from 208.80.154.135 port 30092 ssh2 [20:19:08] Mar 14 20:18:17 i-00000390 sshd[2807]: fatal: Access denied for user jenkins by PAM account configuration [preauth] [20:19:18] but there is no jenkins user on that instance yet :-] [20:22:31] Ryan_Lane: isn't it enough to create the jenkins user on the box ? [20:24:45] somone as an idea why http://wsexport.wmflabs.org/ vanished ? the maintainer, Tpt is on holydays atm... [20:26:10] phe: does its instance still exist? [20:26:14] what project is it in? [20:26:27] we really need a semantic query on public dns names [20:26:56] Ryan_Lane, no idea, but how an instance could vanish w/o any notice ? [20:27:18] and yes, I've no idea of the instance name :) [20:28:37] it may have just OOM'd [20:28:40] phe: do you know the project? [20:28:46] I can look up the hostname [20:30:01] it's https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000004f0 [20:31:05] wikisource-tools [20:31:22] but http://ganglia.wmflabs.org/latest/?c=wikisource-tools doesn't show the instance wsexport [20:31:22] and it is dead http://icinga.wmflabs.org/cgi-bin/icinga/status.cgi?hostgroup=wikisource-tools&style=detail [20:31:36] oh [20:31:41] maybe reboot it from the wiki console? [20:31:42] I think that one was owned [20:32:00] let me see [20:32:30] or not [20:32:47] let me look at the console [20:33:30] I'll reboot it [20:33:54] it's rebooting now [20:35:22] phe: it's back up [20:36:07] \O/ [20:36:49] Ryan_Lane, works luke a charm, ty [20:37:19] yw [20:37:45] "sexport". IYKWIM. [20:38:14] Sorry about that. It was just /too/ obvious. 
:-) [20:38:29] @labs-resolve 04f0 [20:38:29] I don't know this instance - aren't you are looking for: I-000004f0 (wsexport), [20:38:40] !log deployment-prep applying jenkins::user to deployment-bastion [20:38:43] Logged the message, Master [20:40:56] err: /Stage[main]/Jenkins::User/User[jenkins]/home: change from /home/jenkins to /var/lib/jenkins failed: Could not set home on user[jenkins]: Execution of '/usr/sbin/usermod -d /var/lib/jenkins jenkins' returned 6: usermod: user 'jenkins' does not exist in /etc/passwd [20:40:58] seriously puppet [20:41:00] be a bit smarter [20:41:06] f*** create the user first! [20:42:04] To be adding a require? [20:42:56] maybe the user define is wrong indeed [20:43:46] ahhh [20:43:50] the user is in LDAP [20:44:10] * hashar shoots himself [20:44:31] Ryan_Lane: have you deleted the jenkins user I have created via the wikitech wiki? [20:44:38] uid=2947(jenkins) gid=500(wikidev) groups=500(wikidev) [20:44:45] it does exist in LDAP now :-] [20:44:53] which prevents puppet from adding a local jenkins user hehe [20:45:29] I didn't yet [20:45:33] let me delete it [20:45:54] thx [20:46:23] hashar: done [20:46:29] you'll need to purge nscd [20:46:31] nscd -i passwd [20:46:34] nscd -i group [20:46:58] you are ruining all the fun, I was actually writing to ask you about clearing the ldap cache :-] [20:47:05] :) [20:47:22] ok. now let me see about pam_security [20:47:52] and I will need a review/merge of https://gerrit.wikimedia.org/r/#/c/53736/1/modules/jenkins/manifests/user.pp,unified which adds the public key for the jenkins user :-] [20:47:55] what's the user's group? [20:47:58] same as user? [20:48:02] yeah jenkins too [20:48:03] seems we can already make this work [20:48:12] I already have this case covered :) [20:48:42] confirmed jenkins::group has group { 'jenkins': name => 'jenkins' … } [20:49:02] ugh [20:49:02] crap [20:49:04] maybe not [20:49:25] it only handles a single group right now [20:49:27] let me fix this [20:51:22] well, if I don't time this well, then I'm going to lock everyone out of the bastions [20:52:09] # id jenkins [20:52:10] uid=993(jenkins) gid=997(jenkins) groups=997(jenkins) [20:52:17] Ryan_Lane: seems puppet has been happy enough to create it [21:02:04] !log deployment-prep creating jenkins homedir manually on -bastion [21:02:06] Logged the message, Master [21:08:05] Ryan_Lane: so I got my jenkins user setup with the pub key in authorized key but still get rejected :/ [21:08:08] Mar 14 21:07:46 i-00000390 sshd[9970]: Failed publickey for jenkins from 208.80.154.135 port 43814 ssh2 [21:08:12] Mar 14 21:07:46 i-00000390 sshd[9970]: Connection closed by 208.80.154.135 [preauth] [21:08:19] not much more details unfortunately [21:11:28] hashar: ok. 
so, it's possible to fix this now [21:12:04] hashar: add a new puppet group to your project [21:12:16] called: restrictions [21:12:25] then add: restricted_to [21:12:27] as a variable [21:12:34] then, configure the instance [21:12:43] and add: (project-jenkins) (jenkins) [21:12:53] and run puppet on the instance [21:13:04] :-] [21:13:06] lovely hack [21:13:07] I hesitate to add this as a global option [21:13:17] this is how the bastions restrict access [21:13:26] there's also a "restricted_from" varaible [21:13:28] *variable [21:13:39] to disallow specific groups [21:15:23] !log deployment-prep on -bastion: Added group restrictions and set variable restricted_to = (project-jenkins) (jenkins) thanks ryan [21:15:26] ahh /etc/security/access.conf got updated [21:15:26] Logged the message, Master [21:15:40] and the project was project-deployment-prep [21:16:15] !log deployment-prep -bastion changed restricted_to to (project-deployment-prep) (jenkins) [21:16:17] Logged the message, Master [21:18:26] btw Ryan_Lane, any idea why webtools-login doesn't accept being accessed? [21:18:35] (ssh connection times out) [21:19:15] works for me [21:19:25] :O [21:19:38] public address, or private one? [21:19:54] private, it doesn't have a public address [21:19:59] yeah. works for me [21:20:12] ssh: connect to host webtools-login port 22: Connection timed out [21:20:43] from which host are you connecting? [21:20:45] from where? [21:20:50] bastion-restricted1 [21:21:11] oh [21:21:12] weird [21:21:15] it's timing out from bastion1 [21:21:35] I know why [21:21:42] 10.4.0.0/24 [21:21:47] bad security group rule [21:21:53] that should be 10.4.0.0/21 [21:22:01] oh [21:22:01] crap [21:22:05] let me check the wiki config [21:22:14] it also fails from bastion3 [21:22:26] yeah [21:22:28] I don't seem to be allowed to bastion-restricted1 [21:22:34] they're all in a subnet that's being deined [21:22:36] *denied [21:22:46] bastion-restricted is limited to the ops group [21:22:55] I wonder why /24 is set [21:22:55] * Platonides blames the dhcp server [21:22:57] the default is /21 [21:23:05] :D [21:23:09] it's not the dhcp server [21:23:17] it's a security group rule [21:23:45] I'm going to change ssh to 0.0.0.0/0 [21:23:56] we'll eventually want webtools-login to have a public ip [21:23:57] I mean for giving it the ip 10.4.0.24 [21:24:09] there's nothing wrong with the ip [21:24:43] :-( [21:24:53] the security group is limiting access to a smaller subnet than it should be [21:24:57] I thought you said that 10.4.0.0/24 was reserved? [21:25:15] our original subnet was 10.4.0.0/24 [21:25:20] now it's 10.4.0.0/21 [21:25:43] the security group rule was limiting port 22 to 10.4.0.0/24 [21:26:07] the problem is bastion ip [21:26:09] and bastion1-3 have ips in a range outside of 10.4.0.0/24 [21:26:19] so, I'm changing the rule to be wider [21:26:22] and it'll start working soon [21:26:42] where soon is: it is now working ;) [21:27:08] so /etc/security/access.conf has :::: -:ALL EXCEPT (project-deployment-prep) (jenkins) root:ALL [21:27:11] poor ryan :-] [21:27:27] hashar: that should be correct [21:27:40] I spent the last 20 minutes trying to figure out how to more verbose logging [21:27:42] failed :D [21:28:01] is that somehow wrong? [21:28:10] oh [21:28:15] I know why :( [21:28:17] damn it [21:28:30] the authorized keys configuration [21:28:45] it isn't in home directories [21:29:07] ohh [21:29:09] * Ryan_Lane grumbles [21:29:10] straight from LDAP ? [21:29:15] yes [21:29:16] isn't there a fallback to $HOME ? 
[21:29:19] no [21:29:26] we wanted to avoid that for security reasons [21:30:14] so I guess i need to use the LDAP user as well [21:31:14] that's not a great solution, though [21:31:30] at minimum I don't want it named jenkins [21:31:43] I guess it's the sanest solution [21:31:54] in fact, I'm going to disallow the name jenkins [21:31:58] and gerrit and gerrit2 [21:32:48] hashar: what will this be used for? [21:32:52] let's name it after the function [21:33:40] yeah lets do that [21:33:52] the aim is to automatically deploy merged change on beta [21:34:05] ah, ok [21:34:08] maybe autodeployer [21:34:12] or betaupdater [21:34:17] so, remove that restrict_to config [21:35:04] deployster! [21:35:17] jenkins-deploy ? [21:35:26] yeah that will work :) [21:37:27] !log deployment-prep removing restrictions from deployment-bastion . authorized_keys is not read when in labs :] (thx Ryan) [21:37:30] Logged the message, Master [21:45:38] Ryan_Lane: have you managed to create jenkins-deploy in LDAP ? :D [21:45:49] I thought you were going to do so? [21:45:53] ah [21:45:54] sorry [21:45:55] create it via labsconsole [21:45:56] err [21:45:57] wikitech [21:46:06] I will need the homedir to be /var/lib/jenkins-deploy , not /home [21:46:12] why? [21:46:18] cause that is going to get a few GB of data which would fill /home [21:46:29] ugh [21:46:30] or maybe not [21:46:39] well, it's gluster [21:46:47] it's better if we avoid that [21:47:02] I probably don't need to clone the repositories on the host [21:47:12] simply run a shell script that update the already existing ones [21:47:13] forget me [21:47:21] oh. so, it's fine to be in /home? [21:47:51] I should really let users choose their home directory between /home and /var/lib/ [21:48:03] well /home/ and /var/lib/ [21:59:39] !log deployment-prep adding jenkins-deploy to the project [21:59:42] Logged the message, Master [22:00:16] Ryan_Lane: yeah /home/ will be fine [22:00:37] Ryan_Lane: can you possibly grant loginviashell to the jenkins-deploy user ? :-] [22:00:50] yep [22:01:13] done [22:01:54] I am cursed [22:02:06] how so? [22:02:12] ah wrong project [22:02:17] tried to add it to bastion ghuhu [22:02:36] !log deployment-prep Successfully added jenkins-deploy to deployment-prep. [22:02:39] Logged the message, Master [22:02:56] jenkins@gallium:~$ ssh jenkins-deploy@10.4.0.58 date [22:02:57] Thu Mar 14 22:02:50 UTC 2013 [22:02:59] Ryan_Lane: THANK YOU ! [22:03:03] yw :) [22:03:10] I should send you binasher for a hug [22:03:20] damn I hope you had nothing planned this afternoon [22:03:37] hahahaha [22:03:47] well, we have wikilove enabled on wikitech now [22:05:45] if you ever need to date a geek, I can recommend you :-] [22:06:05] you got a message on your talk! [22:07:35] :D [22:08:10] we need a puppet barnstar [22:13:32] !log deployment-prep manually installing openjdk-7-jre on -bastion [22:13:34] Logged the message, Master [22:14:18] [03/14/13 22:14:10] [SSH] Starting sftp client. [22:14:19] seriously [22:14:25] the never ending spaghetti plate [22:15:09] hashar: :D [22:15:13] check your talk page [22:15:28] maybe I should sleep a bit [22:15:32] I have been awake for 22hours and been mostly working all that time [22:15:40] sheesh. go to sleep [22:16:01] NICE [22:16:02] ahah [22:16:10] I love that barnstar [22:16:16] yeah. 
it's pretty great :) [22:16:26] it's actually a sockpuppet barnstar [22:16:36] but I'll abuse it for this purpose [22:17:04] ah [22:17:05] Caused by: hudson.util.IOException2: Could not copy slave.jar to '/var/lib/jenkins/slave.jar' on slave [22:17:06] :-] [22:17:17] I knew it [22:17:44] :D [22:17:54] that works now [22:18:04] <===[JENKINS REMOTING CAPACITY]===>ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins. [22:18:16] I don't know if I should laugh or cry [22:19:54] connected [22:19:59] Ryan_Lane: thanks project sprinted! [22:20:12] heh [22:20:18] !log deployment-prep deployment-bastion is now a jenkins slave of the production Jenkins machine [22:20:21] Logged the message, Master [22:20:34] and Zuul is not bugged :-] [22:21:59] [bz] (ASSIGNED - created by: Antoine "hashar" Musso, priority: Immediate - normal) [Bug 45084] autoupdate the databases! - https://bugzilla.wikimedia.org/show_bug.cgi?id=45084 [22:23:24] Ryan_Lane: labsconsole still does not let us pass parameter to a puppet parameterized class can it ? [22:23:31] nope [22:23:37] need to make role classes do that [22:23:47] this is a limitation of puppet's ldap implementation [22:24:24] oook :-] [22:25:45] we should probably write an ENC one of these days [22:26:04] role are fine :-] [22:26:08] we have enough projects already [22:26:13] indeed [22:26:29] I just stuck a lame $::realm == 'production' instead :-] [22:28:11] Ryan_Lane: I got a change for production that adds the authorized_keys to the jenkins user : https://gerrit.wikimedia.org/r/#/c/53736/1 [22:28:22] would eventually need that to setup a slave in production one day ] [22:28:44] are we not using a jenkins package? [22:28:47] does it not install a user? [22:28:58] and again, this should use systemuser [22:28:59] not user [22:29:14] the jenkins package is not needed at all. Just need a ssh connection and java installed on the slave. [22:29:18] ah [22:29:19] ok [22:29:22] but yeah, systemuser [22:29:24] pretty simple :-] [22:29:37] or systemaccount, or whatever it's called [22:29:40] well it has always been a user, I am not sure what is the different [22:29:54] it'll get a system uid [22:30:18] and by default I think it uses /var/lib/ [22:30:42] like uid 561 ? [22:30:50] probably less than 500 [22:30:59] but either way, a system uid [22:31:03] okk [22:32:35] system => true, [22:32:39] it is set already :-] [22:33:02] ah and I need managehome false [22:33:10] can't remember why [22:33:43] there's another define for this [22:33:47] that we use throughout the codebase [22:35:59] yeah systemuser [22:36:01] grar [22:36:05] guess I can add a parameter to that class [22:36:12] to let us manage or not the home [22:43:35] Now depends on: [22:43:36] https://gerrit.wikimedia.org/r/53879 [22:43:36] https://gerrit.wikimedia.org/r/53880 [22:43:39] I am out for real [22:43:44] thank you again! [22:43:49] yw [22:43:52] night! [22:47:23] So is Jenkins totally down? [22:47:29] It doesn't seem to be reviewing? [22:48:21] heh. bad time for hashar to leave [22:48:29] yeah know issue [22:48:35] been talking about it in #wikimedia-dev [22:48:40] hashar, that's what I thought, just checking. [22:48:48] hashar: hey, shouldn't you be gone? :) [22:48:49] basically : tooooo many patch sets are sent to gerrit :-] [22:48:52] yeah I should