[01:38:43] legoktm: I think the problem with afch-updater's failing grid job submissions is that you use /usr/*local*/bin/jsub, and on webserver-01, that is an obsolete version. I'll fix that, and we probably need to put it in Puppet for those people who rely on it being located in that directory. [01:51:34] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 52258] Obsolete scripts in /usr/local/bin need to be "managed" - https://bugzilla.wikimedia.org/show_bug.cgi?id=52258 [01:51:50] Ryan_Lane: It lives. Its annoying stalls have gotten no better nor any worse. I wanted to switch over to labstore4 to eliminate hardware failure as a possibility (at least the controller's) but I'm not keen on making a change of that magnitude just before Wikimania. [02:01:48] !log tools tools-webserver-01: Symlinked /usr/local/bin/{job,jstart,jstop,jsub} to /usr/bin; were obsolete versions. [02:01:52] Logged the message, Master [02:24:56] !log wikistats added missing el.wikivoyage, fixed wikivoyage timestamps/updates, running wiki-site update, add 1.22wmf12 to good versions [02:24:58] Logged the message, Master [02:28:39] Coren: understandable [04:25:45] !demon [04:25:46] <^demon> Docs exist solely for developers to go "omg you didn't read the docs!" when people ask common questions. In practice, nobody reads docs before asking questions. [06:22:59] petan? [07:16:16] Petan?? [07:16:32] addshore: [07:16:48] Or coren :) [07:17:42] Hi yuvipanda [07:17:56] * YuviPanda waves [07:21:21] hi [07:21:27] @notify addshore [07:21:27] This user is now online in #huggle. I'll let you know when they show some activity (talk, etc.) [07:30:22] Pet an [07:30:32] *petan [07:30:33] add hore [07:30:44] heya [07:31:03] Can you "qdel -u 'local-addbot-" for me ? :) [07:31:26] you can't? :o [07:31:29] Without the extra dash ;p [07:31:42] I'm on my phone!
:( [07:32:00] !log tools petrb: deleted local-addbot jobs [07:32:03] Logged the message, Master [07:32:33] Cheers :) [07:33:08] *stop walking towards the internet like a mad man* [07:33:15] *stops [07:49:33] &info [07:49:34] http://tools.wmflabs.org/wm-bot/dump/%23wikimedia-labs.htm [08:23:56] hashar: is there an installation of ZMQ on beta? [08:24:03] or something we can test that new code on [08:24:16] lo [08:24:23] lo? [08:24:30] hello [08:24:32] :D [08:24:35] aha :D [08:24:39] petan: I asked for you first [08:24:44] there is an event logging instance in beta [08:24:50] zhuyifei1999 did you? [08:24:53] zhuyifei1999: what's up [08:25:06] (06:22:59) petan? [08:25:08] so there must be some ZeroMQ install on it. MaxSem or Ori-l would know about the EventLogging in beta -:] [08:25:12] hashar: how does it work, is it documented? [08:25:26] zhuyifei1999 ok that isn't really a question [08:25:37] hopefully it is set up the same as in production so the prod documentation should apply to beta as well [08:25:41] you need to add a text between petan and ? [08:25:46] petan: Why aren't you in offtopic? [08:25:48] hashar: ok [08:26:01] idk [08:29:00] petan: and a bug: https://tools.wmflabs.org/?Rules of the sidebar of https://tools.wmflabs.org/ links to a non-existing page [08:31:22] zhuyifei1999 I know, [08:32:13] fixed XD [08:37:39] [bz] (NEW - created by: Amir E. Aharoni, priority: Unprioritized - normal) [Bug 52249] localization messages loading issues in http://en.wikipedia.beta.wmflabs.org - https://bugzilla.wikimedia.org/show_bug.cgi?id=52249 [10:09:23] petan: do you know how to display how many log actions (file moves etc.) a user has? [10:09:33] Hm... is this possible via api? [10:23:48] Steinsplitter: I think so [10:24:31] petan: What's the rules fot Tool Labs? [10:24:37] ^^ [10:24:49] Steinsplitter: ? [10:24:58] hwo to? [10:25:00] *how [10:25:04] *of [10:25:29] Steinsplitter: finding...
[10:25:41] like this https://github.com/Pathoschild/Wikimedia-contrib.toolserver/tree/master/stalktoy O_O [10:25:48] but work with db :P [10:27:24] :/ [10:27:39] so much for an interactive session :P [10:27:48] where did tools-login go? :P [10:28:27] I think nfs is dead [10:28:28] &ping [10:28:29] Pinging all local filesystems, hold on [10:28:30] Written and deleted 4 bytes on /tmp in 00:00:00.0002330 [10:28:33] see [10:28:34] :/ [10:28:42] petan: can i sql my jobs? [10:28:45] *qdel [10:28:53] I don't think so [10:28:56] :< [10:29:02] Written and deleted 4 bytes on /data/project in 00:00:32.6104720 [10:29:09] 32 seconds for 4 byes [10:29:10] bytes [10:29:15] &ping [10:29:15] Pinging all local filesystems, hold on [10:29:16] Written and deleted 4 bytes on /tmp in 00:00:00.0005060 [10:29:17] Written and deleted 4 bytes on /data/project in 00:00:00.0054240 [10:29:18] managed it :) [10:29:25] everything just came back to life xD [10:32:52] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 1.20.0.16 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features [10:34:11] Reliability of the tools cluster seems to be suffering these days; too much unmanaged change, or new problems manifesting as the system is excercised by new services and users? [10:34:48] Or are we still in an alpha enviroment really? 
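The &ping replies above report how long it takes to write and delete 4 bytes on each filesystem. A minimal Python sketch of that kind of probe follows; it is not wm-bot's actual implementation, and the function name `probe_write_latency` and the use of `fsync` are assumptions made for illustration.

```python
import os
import tempfile
import time


def probe_write_latency(directory, payload=b"ping"):
    """Write a small file in `directory`, delete it, and return elapsed seconds."""
    start = time.perf_counter()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # ask the OS to flush the bytes instead of only caching them
    finally:
        os.close(fd)
        os.remove(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The two paths are the ones named in the log; missing ones are skipped.
    for directory in ("/tmp", "/data/project"):
        if os.path.isdir(directory):
            seconds = probe_write_latency(directory)
            print(f"Wrote and deleted 4 bytes on {directory} in {seconds:.7f}s")
```

Comparing the local /tmp figure with the shared /data/project figure is what separates a storage stall from a generally slow host, which is how the probe is used in the conversation.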
[10:36:11] heh, it always seems to recover well [10:36:57] im just interested to know what has been happening on the webserver xd http://ganglia.wmflabs.org/latest/?c=tools&h=tools-webserver-01&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [10:40:01] petan: no doc about &ping [10:54:50] &ping [10:54:51] Pinging all local filesystems, hold on [10:54:52] Written and deleted 4 bytes on /tmp in 00:00:00.0006030 [10:56:11] Written and deleted 4 bytes on /data/project in 00:01:20.1649200 [10:56:26] &ping [10:56:26] Pinging all local filesystems, hold on [10:56:27] Written and deleted 4 bytes on /tmp in 00:00:00.0002040 [10:56:28] Written and deleted 4 bytes on /data/project in 00:00:00.0057190 [11:07:19] petan: I am no officily off the old db :> [11:07:38] english please :P [11:07:55] i dont need bots-bsql01 ;p [11:08:00] anyway, there is a bunch of other people who do [11:08:11] I need to get them off [11:08:11] :> [11:08:16] Damianz: you are one of them :P [11:08:30] also, is there a limited number of jobs on the grid per tool? [11:08:33] petan: -.- [11:08:36] yes [11:08:38] i seem to remember ebing told before :P [11:08:39] Damianz: no worries [11:08:40] what is it [11:08:41] ? [11:08:48] Damianz: in fact I don't care, it's Ryan who want to have it deleted [11:08:58] addshore: like 15 [11:09:08] that means 15 running, you can have a bunch of waiting [11:09:09] yup, 16 makes sense :/ [11:09:10] I'll see if I have time this weekend [11:09:13] or 16 [11:09:17] Damianz: no problem [11:09:22] shame I cant make this go any faster then :< [11:09:31] Damianz: I have vacation next 2 weeks [11:09:35] I should write a mail [11:09:53] I have like omg work to do for the next 2 weeks heh [11:13:57] addshore: hehe, too much power? 
:P [11:14:16] mhhm, id like to get this run over the db done as fast as possible :P [11:14:25] but I can only have 15 jobs ;_; [11:14:47] ;_; indeed :P [11:14:50] im sure there is a way to reduce the number of jobs to one :P [11:14:51] only 15 jobs :P [11:15:09] addshore: what exactly are you doing? [11:15:22] addshore: also is this now 15 times faster than doing it in one job? [11:15:54] yus :> [11:16:04] :D [11:16:05] but i think there is a way to group things on oge [11:16:30] YuviPanda: I have 1.4 million ish things to check, and it takes me 6 hours to do about 50k ish [11:16:46] addshore: ah, hmm. How much of that is IO vs computation? [11:16:53] addshore: are you doing lots of database work? [11:16:58] or network work? [11:17:14] probably half network and half computational [11:17:18] and 1% db work [11:17:33] hmm [11:17:59] as the script runs for longer the number of network requests decreases though ;p [11:18:18] addshore: 6 hours to do 50kish with the 15 jobs? [11:18:29] yup, i left it running last night :) [11:18:39] ah :) [11:18:42] thats a very rough estimate ;p [11:18:47] addshore: perhaps ask Coren to lift the limit? :P [11:19:02] alternatives being multi threading (co-operative or otherwise) [11:19:08] but that's going to be very, very painful in PHP :P [11:19:09] [bz] (NEW - created by: Amir E. Aharoni, priority: Unprioritized - normal) [Bug 52222] fill http://he.wikipedia.beta.wmflabs.org/ with some useful data from he.wikipedia.org - https://bugzilla.wikimedia.org/show_bug.cgi?id=52222 [11:19:16] hmm, actually more than 50,000 [11:19:44] more like 150,000! [11:19:52] hmm [11:19:58] that's 600,000 per day [11:20:03] thats fine actually, ill be done in 2 days ;p [11:20:08] so that's still just 2 days [11:20:09] yeah ;) [11:20:24] addshore: you brought it down from 30 days to 2 days with Redis! :D [11:20:30] :> [11:20:48] addshore: write up about this to labs-l, no?
other people can probably also use similar techniques [11:23:06] Coren: about the nfs problems: a friend of mine thinks that openafs or ceph might be suitable nfs replacements [11:23:19] http://www.openafs.org/ http://ceph.com/ [11:23:27] JohannesK_WMDE: we've Ceph in production for... something. Not too favorable opinions of it, IIRC [11:23:38] (we used Swift before, and not too favorable opinions of that either) [11:23:38] i haven't tried them myself, but might be worth a look [11:26:43] something must be done, i get horribly annoying delays all the time. maybe try openafs if ceph isn't suited [11:27:14] logCursor.execute('INSERT INTO logs VALUES (?, ?)', (timestamp, unicode(str(*args).decode('utf-8')))) [11:27:14] OperationalError: disk I/O error [11:27:29] ^^ speak of the devil [11:27:33] Coren ^^ [11:27:47] &ping [11:27:48] Pinging all local filesystems, hold on [11:27:48] Written and deleted 4 bytes on /tmp in 00:00:00.0005930 [11:27:50] Written and deleted 4 bytes on /data/project in 00:00:00.0081680 [11:27:58] I think Coren was about to replace the controller this thing was on [11:28:49] what kind of controller? [11:28:52] disk [11:28:55] ok [11:36:55] hah petan There are few trained monkeys who know how it works. But unfortunately [11:36:55] not many of them, [11:37:10] Remember: if nothing works, typically killing the bot entirely works :-) even if it's evil. [11:41:55] JohannesK_WMDE: You're missing the point of the issue; NFS stalling is the /symptom/ of the controller becoming unresponsive until it gets soft reset by the kernel. NFS - EXT4 - LVS - LVS - RAID - RAID - [ driver - hardware ] <- only the last has issues. [11:42:32] i didn't know that. *you* talked about NFS having issues. [11:42:48] if it's the hardware, it should be repaired or replaced...
[12:38:19] JohannesK_WMDE: you have an idea how hard it is to repair a hard disk :-) like you need a MacGyver to fix it with his screwdriver or something [12:39:15] %ping [12:39:19] wait xD [12:39:21] &ping [12:39:21] Pinging all local filesystems, hold on [12:39:22] Written and deleted 4 bytes on /tmp in 00:00:00.0002900 [12:40:05] Written and deleted 4 bytes on /data/project in 00:00:44.1246410 [12:44:09] petan: sonic screwdriver or magnetized needle usually works best [12:50:36] YuviPanda: what's redis function to rbpull [12:50:39] or whatever it was [12:50:40] in python [12:50:59] rbpop? [12:51:36] redis has no attribute rbpop [12:55:37] [dispatcher-labs] benapetr pushed 1 commit to master [+0/-0/±3] http://git.io/jKqMYw [12:55:39] [dispatcher-labs] benapetr 36938c2 - implemented some more verbose output [13:01:11] petan: what's &ping? [13:01:33] [dispatcher-labs] benapetr pushed 1 commit to master [+0/-0/±1] http://git.io/N1lCPQ [13:01:34] [dispatcher-labs] benapetr 407a2ef - fixed some issues with save / load [13:01:43] &help [13:01:43] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 1.20.0.16 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features [13:01:53] @ping [13:02:07] &help ping [13:02:08] Unknown command type @commands for a list of all commands I know [13:02:48] petan: http://redis.io/documentation [13:02:58] petan: to be more exact, http://redis.io/commands [13:03:08] you'll see that it is called 'brpop' [13:03:20] YuviPanda: I meant python command name [13:03:26] should be the same. [13:03:33] k [13:03:43] most clients mirror the official redis command names [13:07:48] [bz] (NEW - created by: Yuvi Panda, priority: Unprioritized - normal) [Bug 52275] Status page should automatically refresh data - https://bugzilla.wikimedia.org/show_bug.cgi?id=52275 [13:08:51] petan: what's &ping?
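The command asked about above is `brpop`, which redis-py exposes under the same name. Below is a rough sketch of the producer/worker pattern that several grid jobs could share, assuming a client object with redis-py style `lpush` and `brpop` methods. `InMemoryQueue`, `enqueue`, `drain`, and the queue name are hypothetical and exist only so the sketch runs without a Redis server.

```python
import collections


class InMemoryQueue:
    """Hypothetical stand-in exposing the two redis-py calls used below."""

    def __init__(self):
        self._lists = collections.defaultdict(collections.deque)

    def lpush(self, name, *values):
        for value in values:
            self._lists[name].appendleft(value)
        return len(self._lists[name])

    def brpop(self, keys, timeout=0):
        # Real Redis blocks for up to `timeout` seconds; this stand-in returns
        # immediately and only looks at the first key it is given.
        name = keys if isinstance(keys, str) else keys[0]
        if self._lists[name]:
            return (name, self._lists[name].pop())
        return None


def enqueue(client, queue, items):
    """Push each work item onto the left end of the list."""
    for item in items:
        client.lpush(queue, item)


def drain(client, queue, handle, timeout=1):
    """Pop items from the right end until brpop times out; return the count."""
    processed = 0
    while True:
        popped = client.brpop(queue, timeout=timeout)
        if popped is None:
            return processed
        _key, value = popped
        handle(value)
        processed += 1


if __name__ == "__main__":
    client = InMemoryQueue()
    enqueue(client, "work", ["page-1", "page-2", "page-3"])
    drain(client, "work", print)
```

With a real server, `redis.Redis(...)` would take the place of `InMemoryQueue()`; redis-py returns bytes unless the client is created with `decode_responses=True`. Pushing on the left and popping on the right gives first-in, first-out order, and each item is handed to exactly one worker.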
[13:08:59] it pings the disk [13:11:30] <{{Guy}}> :) [13:11:36] <{{Guy}}> !ping [13:11:36] pong [13:16:29] !ping [13:16:29] !pong [13:18:30] [dispatcher-labs] benapetr pushed 1 commit to master [+0/-0/±4] http://git.io/XOSL4A [13:18:31] [dispatcher-labs] benapetr 25d8f0d - fixed missing check in Stream reader and some more [13:35:20] !ping [13:35:20] !pong [13:35:45] whose bot is Not-002 ? [13:36:49] YuviPanda: you know? [13:37:03] AzaToth: petan [13:37:14] <{{Guy}}> It's the github bot, no? [13:37:17] could figure ツ [13:37:42] * AzaToth hates people who change their nick [13:38:05] {{Guy}}: yes it is [13:38:11] unregistered of course [13:38:37] <{{Guy}}> What's unregistered? Not-002 ? [13:38:42] yup [13:39:09] and I can't parse the name [13:39:13] Not-002? [13:39:17] what does it mean? [13:39:18] petan: ? [13:39:42] notification [13:39:58] yes? [13:40:05] ah [13:40:14] can you please speak like petan: ? [13:40:20] not just petan? [13:40:25] petan: ↑? [13:41:08] <{{Guy}}> petan butter and jelly? [13:41:53] petan: I couldn't understand what Not-002 meant [13:42:03] not double zero two [13:42:03] it's a giyhub bot [13:42:06] * github [13:42:13] second notification bot [13:42:19] it is just like grrrit-wm, but it works for github repositories [13:42:28] petan: you might want to reg it [13:42:35] I can't, it's not my bot [13:42:36] second gerrit notification bot [13:42:43] it's a github service [13:42:47] k [13:43:00] anyway, I don't use gerrit because it takes months to get a repository set up [13:43:07] I need some system that is just as easy as github [13:43:30] petan: usually it takes a couple of hours max [13:43:37] you just need to poke the right person ツ [13:43:49] last repository I requested I had to wait more than 3 months for [13:43:53] lol [13:44:03] I was poking Chad frequently [13:44:05] ^demon|zzz: you hear that? [13:44:18] but that was when git was starting...
[13:44:22] I see [13:44:25] there was quite a backlog back then [13:44:34] I assume that could have been a issue [13:44:50] anyway I just didn't like the mere fact that I have to ask someone else to create a repository instead of making it myself like on github [13:45:17] petan: well, it's prettyu logical from a basic point of view [13:45:58] can't just allow anyone to create a repo at any time with any name containing anything [13:47:45] as it's wmf's they need to have some control [13:50:37] YuviPanda: perhaps we should have colorized [repo/bar] as well [13:52:59] * YuviPanda reads scrollback [13:55:56] YuviPanda: would be nice to have the repo name as the first item of grrit bot notifications [13:56:23] (CR) JohnDoe: [C: 1] Some commit summary [repository/IdontcareAbout] .. [13:56:41] I got to read all the first bits before figuring out the change happened in some random extension :] [13:57:05] hmm, we could put that right after (CR) [13:57:13] AzaToth: thoughts? [13:57:23] we could even put it in the bracket [13:57:29] (CR: mediawiki/core) [13:57:44] addshore: did you notice your 2 processes on -login were autokilled [13:57:55] the e-mail is wrong [13:57:56] what processes? [13:58:11] you had 2x php process each of them using 1.6g of ram [13:58:28] they were automatically killed but the e-mail is wrong :P [13:58:34] ohhhh!!! yes :> [13:58:37] it's the other way [13:58:46] I was 'activly debuggin' and managed to debug it into a never ending looop ;p [13:58:47] process ws 1.6g and free memory was 40mb [13:58:56] :> [13:59:04] it reported it as your process was eating 40mb and free memory was 1.6g [13:59:09] I need to fix that template [13:59:23] where does ti report it? :O [13:59:34] to your e-mail [13:59:36] type mutt [13:59:45] it does? :P [13:59:46] unless you forward it [13:59:48] which email? 
:P [13:59:56] e-mail of owner of process [13:59:58] so your tool [14:00:07] heh, never checked my tool email ;p [14:03:22] YuviPanda: yea, that could be fine [14:32:08] yoo andrewbogott [14:32:10] you there? [14:34:03] ottomata, what's up? [14:34:05] I think there's a problem with role::puppet::self since the puppet repo refactor stuff [14:34:17] getting Could not find class passwords::puppet::database for i-00000861.pmtpa.wmflabs at /etc/puppet/manifests/base.pp:84 [14:34:43] yeah… there's kind of a chicken/egg problem, I'm not sure how to automate it. [14:35:05] If you add this line to your puppet.conf it should work: modulepath = /etc/puppet/private/modules:/etc/puppet/modules [14:35:22] Oh, and you'll need to update your private repo as well. [14:35:24] hmmmmm [14:35:45] Yeah, not ideal. [14:35:55] Actually, just updating private repo might work, I haven't tried that [14:36:28] hmm, not sure if I have permissions to pull [14:36:36] shoudl I change rem otes? [14:36:41] origin ssh://labs-puppet@gerrit.wikimedia.org:29418/labs/private.git (fetch) [14:37:00] $ sudo GIT_SSH=/var/lib/git/ssh git pull --rebase [14:38:02] hm, ok, well, the puppet self stuff gets a custom puppet.conf [14:38:09] should it add that line to it when it gets setup? [14:40:03] oh, hm [14:40:09] andrewbogott: that line already exists in my puppet.conf [14:40:18] great! [14:40:26] Then updating private should do it. [14:40:28] ORRRR [14:40:29] hmm [14:40:30] wait no [14:40:31] hm [14:43:28] ahhh i see, shoudl work for new self hosted puppetmasters [14:43:33] just not ones that have already been deployed [14:43:33] got [14:43:34] it [14:44:14] well… I think it's reasonable to require that both repos be updated. It's not too crazy to have them make changes that depend on each other. 
[14:44:36] right sure, i was just confused [14:44:51] because I saw the modulepath line in my local puppet repo's 10-self puppet.conf [14:44:57] but not on my labs puppetmaster [14:45:08] but then I realized thats' cause you recently committed it [14:45:15] great, works now [14:45:16] oh… so maybe you do have to add that line by hand then. that's not great :( [14:45:20] yeah [14:45:28] it'll work for new setups [14:45:30] but not for old ones [14:45:43] because the new setups can't apply the change to puppet.conf unless they can run puppet [14:45:46] and they need that line to run puppet [14:45:50] sorry [14:45:52] *old setups* [14:47:34] hmmm, i dunno, something is weird, [14:47:43] it looks like one of my labs instances was just puppetized as if it was in production [14:47:48] got ganglia prod adddress, etc. [14:48:01] oo, ja that's not good [14:52:54] Coren: thoughts on zeroconf / avahi for toollabs? [14:52:54] or some form of service registry [14:52:54] to let tools communicate with each other over the network [14:53:13] YuviPanda: Not something I have any thoughts to at this time. Do you have a suggestion to make? [14:53:22] s/have/gave*/ [14:54:23] Coren: I'm looking at avahi, but am looking for an early -2 from you. Don't want to waste time on something that won't be implemented. [14:55:03] Coren: use case is to have continuous jobs that can connect to other ones without having to depend on dirty tricks like putting host and port number on a static file somewhere [14:55:34] Avahi is a reasonable approach to this, I wonder how well the OpenStack "networking" layer copes with multicast though. [14:55:57] Or, for that matter, /whether/ it does. [14:56:08] I don't think we need to have multicast for this. 
[14:56:14] we just need DNS-SD (service discovery) [14:56:18] not dynamic DNS [14:56:29] we perhaps should even explicitly disable dynamic DNS [14:56:41] anyway, considering that you think this is a reasonable approach, let me read up more on avahi :) [15:01:16] Coren: don't think we want to use Avahi. Has a DBus dependency [15:01:36] * YuviPanda looks for other ways to do service discovery [15:01:59] I don't mind dbus; it's a reasonable local delivery mechanism for events, and even supports remote bus connections. [15:02:55] yeah, but it makes me a little queasy. [15:03:07] plus we only need DNS-SD, not all of avahi. [15:06:56] man that feels too desktop centered :( [15:10:38] Coren: http://linux.die.net/man/8/sge_st [15:11:03] very poorly documented tho [15:11:51] That also doesn't seem to quite fit. [15:12:17] yeah [15:12:20] and seems rather abandoned [15:12:29] Honestly, I don't mind the desktop slant of either avahi nor dbus, and I can see a number of valuable uses for dbus anyways enough that I already considered it. [15:12:43] such as? [15:13:07] avahi requires the 'servers' to register manually, and the libraries for that seem unsupported [15:13:19] and very desktop oriented. [15:13:27] that, or the docs suck :P [15:13:50] I don't think you can get away from the registration requirement; I know of no discovery system that magically knows what a new daemon does. :-) [15:14:19] of course, but the wrappers that let you do registration should be simple enough. and maintained. [15:14:28] http://stackoverflow.com/questions/3430245/how-to-develop-an-avahi-client-server doesn't look like either :( [15:17:16] http://avahi.org/wiki/PythonPublishExample doesn't look so bad. [15:18:09] noooooo! :P [15:18:27] I'll probably just create files with appropriate perms for now. 
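The fallback settled on above, publishing a job's host and port in a file with suitable permissions, might look roughly like the following. This is a sketch under assumptions: the JSON layout, the 0640 mode, and the names `publish_endpoint` and `lookup_endpoint` are illustrative and not an existing Tool Labs convention.

```python
import json
import os
import tempfile


def publish_endpoint(path, host, port):
    """Atomically write the endpoint to `path` with mode 0640."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump({"host": host, "port": port}, handle)
        os.chmod(tmp_path, 0o640)
        os.replace(tmp_path, path)  # rename so readers never see a partial file
    except BaseException:
        os.remove(tmp_path)
        raise


def lookup_endpoint(path):
    """Read back the (host, port) pair written by publish_endpoint."""
    with open(path) as handle:
        record = json.load(handle)
    return record["host"], int(record["port"])
```

A continuous job would call something like `publish_endpoint("service.json", socket.gethostname(), 8000)` once it is listening, and clients would call `lookup_endpoint` before connecting. The file goes stale if the job moves to another host without rewriting it, which is the weakness a real service registry would address.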
[15:20:54] i haven't used my labs account in over a year I think ; and i'm looking to try (again) to create an instance of wikistream to run in the labs environment under this project https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikistream [15:21:24] is there a good place for me to start to see what I need to do to set up and test an instance of the software? [15:21:48] at the moment it's running on my own personal server http://wikistream.inkdroid.org/ [15:22:07] and i'd like to transition it to run in the labs environment [15:22:33] also, hi :-) [15:22:35] * YuviPanda waves at edsu [15:23:15] edsu: https://wikitech.wikimedia.org/wiki/Help:Contents? [15:23:54] YuviPanda: thanks! i was somehow missing that :) [15:24:00] :) [15:34:35] well that was easy ... [15:34:42] I think I need some handholding. I have a patch to a MW extension, and I want to test it on a labs instance. I'm not sure how to start. [15:42:38] ragesoss: Vagrant is most likely what you need, then. [15:43:07] ragesoss: what Coren said. It's far easier to do it that way than to create a labs project, set up mediawiki, etc [15:43:53] https://www.mediawiki.org/wiki/MediaWiki-Vagrant [15:43:59] * ragesoss licks [15:44:03] *clicks [15:44:14] hehe [15:44:22] <{{Guy}}> ... [15:45:08] <{{Guy}}> !ragesoss is [11:44] * ragesoss licks [15:45:08] Key was added [15:45:21] <{{Guy}}> !ragesoss [15:45:21] [11:44] * ragesoss licks [15:45:26] <{{Guy}}> Got it.. [15:51:37] I fail at vagrant. [15:52:08] I get a NoMethodError upon running 'vagrant up' [16:11:16] ragesoss: poke ori-l [16:11:30] he should be on -dev or somesuch [16:11:50] okay. [17:02:25] Coren, these spikes are growing out of control. [17:02:29] :| [17:03:09] &ping [17:03:10] Pinging all local filesystems, hold on [17:03:11] Written and deleted 4 bytes on /tmp in 00:00:00.0010400 [17:03:12] Written and deleted 4 bytes on /data/project in 00:00:00.0095740 [17:04:14] Cyberpower678: They're bound to actual disk usage, annoyingly enough. 
We still get < 1/h on average, but they tend to cluster around peak usage. I'm not going to be switching hardware right before Wikimania, but it's my first priority on return. [17:09:30] Coren, cool. [17:10:33] Coren: would uwsgi be done before wikimania? I'd hate to have to tell people to use CGI for python... [17:10:44] YuviPanda: I'm trying to. [17:10:58] ok :) [17:18:10] is someone mucking around with beta labs right now? it was 503 for a bit, now it is pure HTML? [17:24:57] manybubbles: you have many privileges on beta labs now [17:25:08] chrismcmahon: thanks1 [17:46:48] &ping [17:46:48] Pinging all local filesystems, hold on [17:46:49] Written and deleted 4 bytes on /tmp in 00:00:00.0005990 [17:46:50] Written and deleted 4 bytes on /data/project in 00:00:00.0081030 [17:49:27] <^d> I've got a couple instances I can't delete :( [17:49:57] I've got an instance that constantly locks up.... [17:50:36] <^d> Ryan_Lane: /deplotment-memc[0-1]?/ I screwed up configuring but it won't delete :( [17:50:45] * YuviPanda has 99 problems but right now an instance ain't one [17:51:02] * ^d gives YuviPanda an instance problem [17:51:06] <^d> Now you have 100 problems. [17:51:15] and instance is undefined [17:52:02] I was having problems last week where if I deleted an instance and tried to recreate it too quickly it'd be stuck in dns [17:52:55] <^d> YuviPanda: I got 99 problems and all of them are instances :( [17:53:11] you got a lot of instances, did you make them at an InstanceFactory? [17:53:44] <^d> Instance::factory() [17:54:12] <^d> Actually, I annotate them all as @Instance so I can just inject them with Guice. [17:54:25] pfft, that'll kill your startup time! [17:54:31] or at least, it does on Android :( [17:55:40] <^d> A-ha, I got one of them to delete. 
[17:59:26] greg-g: take it back, still getting no CSS [18:00:25] Ryan_Lane: https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource%3AI-00000601&action=history https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource%3AI-00000601&diff=79263&oldid=79262 [18:00:30] chrismcmahon: blugh [18:01:01] andrewbogott: ^^ [18:01:04] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 52258] Obsolete scripts in /usr/local/bin need to be "managed" - https://bugzilla.wikimedia.org/show_bug.cgi?id=52258 [18:01:12] andrewbogott: looks like the bot is adding whitespace on changes [18:01:43] an empty line on top and an empty line between the two templates [18:01:54] indeed! I'll have a look shortly. [18:02:00] builds up over time, had a page full of spaces :) [18:02:12] * Coren says evil things about uwsgi. [18:03:21] <^d> YuviPanda: I think I just had to keep pressing delete a bunch of times and refreshing. [18:03:26] <^d> Two out of 3 now deleted :) [18:09:06] Tools is spiking again. :/ [18:10:45] &ping [18:10:45] Pinging all local filesystems, hold on [18:10:46] Written and deleted 4 bytes on /tmp in 00:00:00.0005930 [18:10:47] Written and deleted 4 bytes on /data/project in 00:00:00.0121250 [18:15:12] ^d: you just spammed me with a lot of delete instances :) [18:15:41] <^d> YuviPanda: :D [18:19:50] ottomata, did adding that line to puppet.conf and updating the private repo make things work again? I'm thinking I should send an email...
[18:20:04] yes [18:20:19] cool [18:20:31] i had some other weirdness with realm, um [18:20:46] oh but i think that was my fault [18:20:48] yeah, so that works [18:21:03] for existing puppetmaster::self / role::puppet::self instances out there [18:21:08] that line needs to be added [18:21:12] they shoudl probably add it to both [18:21:21] yeps [18:21:25] puppet.conf and puppet.conf.d/10-self.conf [18:21:54] not sure which one takes precedence, puppet will cat everything in puppet.conf.d into puppet.conf [19:08:23] !log tools tools-webproxy: Purged popularity-contest and ubuntu-standard [19:08:25] Logged the message, Master [19:08:43] petan: Happy holidays! Could you disable terminatord before you leave, please? [19:09:06] why [19:09:38] it has saved -login from complete failure at least 5 times [19:09:40] Because it kills processes like gmond or "random" process like before the last reboot of tools-login? [19:10:06] it killed gmond because I gave it +19 priority [19:10:19] it works like OOM killer, just isnt so dangerous [19:10:43] low priority processes are first on list, and to be hones is gmond important at all? that thing restart itself anyway [19:11:08] It *killed* gmond, that's dangerous :-), and before the last reboot it apparently brought the house down. [19:11:17] nonsense [19:11:23] how could it happen? [19:11:38] it kills only user processes, it never touches anything what is running as root [19:11:47] I don't know, and I don't want to investigate yet another OOM killer when you're away. 
[19:11:58] gmond doesn't run as root [19:12:13] yes that is why it was killed, which IMHO make perfect sense [19:12:33] the machine was dying OOM so it killed processes which arent crucial in order to save it from dying [19:12:47] gmond definitely isnt crucial for system to work [19:13:18] well I can turn it off if you are afraid of that thing bringing server down [19:13:43] but it is nonsense, it can only prevent system from dying it can never kill it [19:13:48] unlike OOM killer in kernel [19:14:01] if the machine died it was probably caused by kernel OOM killer [19:14:11] you can always check syslog to confirm that [19:14:21] hi all, i think i've set up an instance of an app running on port 80 at wikistream-web.pmtpa.wmflabs ; at least i can see it locally and from bastion [19:14:58] hey edsu. to make it acccessible publicly you need a public ip. you can ask Ryan_Lane or andrewbogott_afk for one. [19:15:01] btw scfc_de: if it wasnt running today, -login would be rebooted again today [19:15:47] I think there is a good reason for similar thing to exist on toolserver [19:15:50] petan: Turning it off would be much appreciated. -login wouldn't have to be rebooted, because the kernel OOM killer would have kicked in. [19:15:56] YuviPanda: gotcha, thanks -- if i have a socks proxy to bastion and i've confiugred my browser to use it, shoudln't i be able to put http://wikistream-web.pmtpa.wmflabs in my browser to see it? [19:16:15] edsu: depends on how you've configured your proxy. [19:16:16] yes kernel OOM killer would have kicked in and probably would kill some important process which it usually does when it kicks in :P [19:16:19] i can browse other stuff ok, so i know the socks proxy is working ok [19:16:22] petan: /var/log/syslog.2.gz, line 22867 is the last entry before the reboot. 
[19:16:33] what that entry is [19:16:47] "Jul 29 00:55:16 tools-login terminatord: System is out of memory, only 104464384 bytes remaining, killing random process" [19:17:01] well, but there is no other record [19:17:10] before it would kill the process it would be logged [19:17:15] petan: Yes, because it probably killed something important. [19:17:27] what is important there and not running as root? [19:17:47] it completely ignores all processes that run as root, unlike kernel OOM killer which doesnt [19:18:02] * YuviPanda merges terminatord into the kernel [19:18:12] it could have kill some user process maybe, I doubt that would break the system [19:18:21] I don't know; I'll leave the design of OOM killers to people who are knowledgeable there. If you think your design is better, submit a patch. [19:18:38] YuviPanda: http://inkdroid.org/tmp/screenshot.png [19:18:51] why should I? I believe that OOM killer should never be a part of any kernel [19:18:55] it is a silly idea to have this in kernel, and many people actually agree on that [19:19:06] it should be completely separate service [19:19:13] edsu: hmm, unsure then. andrewbogott_afk or Ryan_Lane will know better, I think [19:19:25] edsu: if you want it to be publicly accessible, you need a public ip anyway. [19:19:41] YuviPanda: and i did a ssh edsu@bastion.wmflabs.org -D8080 [19:20:02] YuviPanda: true, i might as well ask for one, is in here a good place for that or on discussion list, priv email? [19:20:14] btw scfc_de: guess why ganglia.wmflabs.org is not crashing anymore... [19:20:29] edsu: just poke 'em here, I think? [19:20:35] petan: Again: I don't know. Apparently Debian, Ubuntu and some others disagree with you, and I don't have the knowledge to judge. [19:20:39] maybe it is related to replacement of freaking kernel OOM killer with terminatord which is more safe [19:21:00] hm whatever, I will turn it off on friday then [19:21:19] petan: Thank you. 
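The policy petan describes for terminatord (never touch root processes, kill low-priority processes first) can be written down as a small selection function. The sketch below is not terminatord's code: the `Proc` record and `pick_victim` are hypothetical, and breaking ties by memory use is an added assumption the log does not state.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Proc:
    pid: int
    uid: int        # 0 means root
    nice: int       # higher value = lower scheduling priority
    rss_bytes: int  # resident memory


def pick_victim(procs: Iterable[Proc]) -> Optional[Proc]:
    """Return the non-root process to kill first, or None if there is none."""
    candidates = [p for p in procs if p.uid != 0]
    if not candidates:
        return None
    # Lowest priority (highest nice) first; ties go to the largest memory user
    # (the tie-break is an assumption, not something stated in the log).
    return max(candidates, key=lambda p: (p.nice, p.rss_bytes))
```

Under this rule a non-root daemon running at nice 19, such as gmond in the discussion, is picked before a larger user job at the default priority, which matches the behaviour being debated.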
[19:21:32] Ryan_Lane: hi there, i'm trying to transition wikistream.inkdroid.org to wmflabs, i've got an instance (it's node) running behind varnish on wikichanges-web and was wondering if it could be publicly accessible [19:23:20] Ryan_Lane: is there a better place to put in a request like this? [19:23:30] Ryan_Lane: so that i don't interrupt whatever is going on right now? [19:34:16] YuviPanda: i emailed labs-l just in case it needs to be asynch :) [19:34:26] edsu: :) [19:34:41] edsu: btw, great work on wikistreams! hopefully some day we can get realtime stuff runnable on toollabs [19:34:46] YuviPanda: thanks for your help [19:35:04] edsu: yw! [19:35:09] YuviPanda: is tool labs the environment in labs for toolserver related projects? [19:35:15] edsu: it is! [19:35:18] nice [19:35:28] edsu: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [19:35:40] edsu: and more! we've, for example, redis :) [19:35:57] so one instance of redis for the various projects to share? [19:35:59] edsu: eventually I hope to have a hipache service running, so we can do websockets, etc from here [19:36:09] edsu: yeah, with some tweaks to ensure security [19:36:27] edsu: disabled all commands that let you 'list' keys, so you can't really mess with other people's data unless you know the key [19:36:37] edsu: and then ask people to use a long enough, openssl generated key :) [19:37:01] YuviPanda: neat, how much memory does it have? [19:37:25] edsu: it currently has 1G, but it'll get upped tdo about 4 or so in a week's time [19:37:59] edsu: let me know if you're planning on doing anything with it :) [19:39:05] YuviPanda: will do [19:39:10] edsu: :) [19:50:37] edsu: any reason that can't run on tools? [19:50:49] or is it already in another project? 
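Editorial sketch, not part of the log: the shared redis setup described above relies on each tool using a long random key prefix, because the commands that list keys are disabled. A minimal way to produce such a prefix, assuming Python's standard `secrets` module is an acceptable stand-in for an openssl-generated value (for example `openssl rand -hex 32`); the helper names are hypothetical.

```python
import secrets


def new_prefix(nbytes=32):
    """Return a random hex string to use as a private key prefix."""
    return secrets.token_hex(nbytes)


def namespaced(prefix, name):
    """Build the full redis key for `name` under a tool's private prefix."""
    return f"{prefix}:{name}"
```

A tool would generate the prefix once, keep it in a file only the tool can read, and build every key through `namespaced()` so that other users of the shared instance cannot guess its keys.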
[19:51:20] ah [19:51:24] sorry, just read backscroll [19:52:11] Ryan_Lane: can't run on tools because it needs websocket [19:52:49] (03PS1) 10Yuvipanda: Add separate process that listens for subscription changes [labs/tools/gerrit-to-redis] - 10https://gerrit.wikimedia.org/r/76764 [19:54:03] ah [19:54:06] * Ryan_Lane nods [19:54:48] * Damianz looks at Ryan_Lane nodding and wonders if his neck hurts yet [19:55:06] (03PS2) 10Yuvipanda: Add separate process that listens for subscription changes [labs/tools/gerrit-to-redis] - 10https://gerrit.wikimedia.org/r/76764 [19:55:19] Damianz: digital nodding is cheaper than RL? [19:55:41] You could digitally nod in a seperate thread [19:55:44] Damianz: whenever I digital nod, just assume I'm doing it IRL for like 20 mins [19:55:59] and if I do it more than once, I'm just nodding harder [19:56:00] * YuviPanda makes puppet joke [19:56:29] so, are people at opscode really good cooks? [19:56:51] if they make all their food with non edible precious stones, I dunno... [19:56:54] i want to see luke do a sockpuppet show [19:57:00] Ryan_Lane: I just imagine you muttering in the random understanding like it's a break though that was obvuios and now requires many mins of mumbling to ones self to figure out why it took so long to realise, while nodding. [19:57:26] heh, well, I did ask before reading the backscroll and that doesn't help :) [19:58:05] huh, Ryan_Lane could wear a helmet cam and then we can see it bouncing around when he nods [19:59:06] edsu: what's the labs project name your instance is in? [19:59:11] so that I can give it an IP [19:59:32] andrewbogott_afk: We miss you secretly ;) [19:59:36] Coren: petan has no sanity [19:59:40] * Damianz finished reading email [19:59:52] what [20:00:20] Damianz: I can haz contexts? [20:00:28] yes plis [20:00:40] Coren: labs-l mailing list for petan (or labs, depending on email) being offline for 2 weeks [20:00:46] * YuviPanda writes 0mq code [20:00:46] :D [20:00:50] What's sanity? 
[20:00:57] Though I think generally, he has no sanity is acceptable without context [20:01:07] puppetVar: instanceproject=wikistream [20:01:07] puppetVar: instancename=wikistream-web [20:01:10] Ryan_Lane: ^ [20:01:10] Damianz lol if you want I will just disappear silently next time [20:01:36] I do that all the time - it takes people about 3 months to realise... maintaining that's a good thing [20:02:56] petan: 3. 8. till 18. 8 does not compute [20:03:02] petan: iso8601 pls. kthx [20:03:16] what is that XD [20:03:35] 3 aug - 18 aug [20:03:48] https://en.wikipedia.org/wiki/ISO_8601 [20:04:23] that is kinda funny, taking labs offline for 2 weeks! [20:04:36] Ryan use to do it all the time [20:05:33] petan: wm-bot is still bots-labs? [20:05:41] yes [20:05:43] and why is there a wm-bota and a wm-bota ? [20:05:45] gah. [20:05:48] Lol [20:05:50] wm-bota and wm-bot [20:05:56] bota is test on tools [20:05:59] aha [20:06:04] &help [20:06:04] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 1.20.0.16 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features [20:06:06] I am testing how much suitable is it for my baby [20:06:07] er. [20:06:09] bot [20:06:36] are you building a nursery? [20:06:41] yes [20:06:49] !petan... [20:06:50] OMG Petan deleted me and now he is going to have to spend 2 days trying to put me back together..... [20:06:57] petan's hobby is mating bots [20:07:06] !petan [20:07:06] Petr Bena - http://enwp.org/User:Petrb (hates python) :D [20:07:06] * making [20:07:10] !petan. [20:07:10] OMG Petan deleted me and now he is going to have to spend 2 days trying to put me back together..... [20:07:16] !python [20:07:16] EEEEEEEEWWWWWWWWWWWWWWWWWW! [20:07:21] !petan...... [20:07:27] .. [20:07:28] !t13 [20:07:29] T13 really needs to stop telling various bots to ping addshore in various channels... 
[20:07:31] !python del [20:07:31] Successfully removed python [20:07:39] YuviPanda++ [20:07:44] !delete [20:07:44] petan deleted me once, and then andrewbogott came and deleted whole my server by accident. I tell you people, deleting software is evil and should be illegal. If you don't like some program, don't delete it, just shoot yourself or something... [20:07:47] !technical_13 [20:07:48] !delete del [20:07:49] Successfully removed delete [20:08:08] wm-bot: You should version control yourself [20:08:08] Hi Damianz, there is some error, I am a stupid bot and I am not intelligent enough to hold a conversation with you :-) [20:08:14] ewwwww, no instructions to shoot oneself, pls [20:08:29] jeremyb: yeah, deleted that. [20:08:31] they came from a bot! [20:08:44] ^ worst defence ever! :P [20:08:48] meh [20:08:57] given that I wrote it, yes [20:09:03] YuviPanda: defense* ! [20:09:11] defence [20:09:15] no! [20:09:16] that is british? [20:09:25] british people made english or not [20:09:35] maybe they should be the ones to decide what is correct :P [20:09:46] jeremyb: as someone who learnt english at an Anglo Indian Catholic School while watching american movies and cartoons, I seem to have an interesting mix :) [20:09:58] or maybe Czech people should decide on that [20:10:00] color but defence [20:10:03] because we got a beer [20:10:18] I like defenze more [20:10:28] YuviPanda: orly! [20:10:34] jeremyb: yarly! [20:10:35] I like defenestrate more [20:10:35] YuviPanda: flavor? [20:10:38] jeremyb: yep, I had tracked it down on wikitech, thanks though :) [20:10:43] jeremyb: flavor! [20:10:45] Ryan_Lane: k :) [20:10:59] YuviPanda: huh [20:11:03] &ping [20:11:03] Pinging all local filesystems, hold on [20:11:04] Written and deleted 4 bytes on /tmp in 00:00:00.0005660 [20:11:04] jeremyb: but lorry, not truck. [20:11:05] Written and deleted 4 bytes on /data/project in 00:00:00.0320580 [20:11:16] boot? 
[20:11:27] my baby is testing his toys [20:11:33] !ping [20:11:34] !pong [20:11:40] or her, whatever it is fucking bot [20:11:44] Why can't it test it and just be 'ok' or 'not ok' [20:11:47] is it he or she? [20:11:57] because I like details [20:12:04] I want to know how much good it is [20:12:07] Don't even start a conversation on what defines a gender [20:12:08] YuviPanda: boot? [20:12:15] jeremyb: shoe, more like? [20:12:24] YuviPanda: or car boot [20:12:43] or car brake shoe! [20:12:52] Damianz: you english people, you are so... restricted. can you speak about something? you cant say someont to shoot himself, you cant define gender, you cant say defence... what can you say? [20:13:09] jeremyb: ah, no. I've probably sat in a car for like ~20 times in my entire life, so no idea :D [20:13:23] do you really have to follow so many rules in your life [20:13:27] https://en.wikipedia.org/wiki/Brake_shoe [20:13:28] https://en.wikipedia.org/wiki/Car_boot [20:13:39] YuviPanda: are you urban? [20:13:43] !smb [20:13:43] YuviPanda: What's SMB? Some Magic Bean? [20:13:45] jeremyb: verily. [20:13:47] !smb del [20:13:48] Successfully removed smb [20:13:49] petan: Shooting is messy, drugs or hanging or bleeding ftw, gender is sterotyped too much and labels are boring.... no labels are fun [20:13:52] YuviPanda: aha [20:13:54] :( [20:14:01] petan: http://en.wikipedia.org/wiki/Gender-specific_and_gender-neutral_pronouns#Alternatives_to_generic_he ;) [20:14:31] jeremyb: usually take public transport if possible. Or walk. Or have someone give me a lift in a motorbike [20:14:33] mutante: don't get them started all over again! [20:15:00] YuviPanda: or manual bike? [20:15:08] I will not call my bot they [20:15:26] jeremyb: I move around between cities a fair bit, so do not usually have steady access to a bicycle [20:15:31] he is for male... she for female... 
maybe we need a new word bhe [20:15:41] jeremyb: plus the roads here are incredibly unsafe for bicycles, so I tend to walk rather [20:15:43] bhe for robots and machines [20:15:44] it's too bad citibike isn't closer to home. but i guess it will be within a year maybe [20:16:07] YuviPanda: ohhh, ouch [20:16:16] male, female, robomale... who cares [20:16:26] jeremyb: yeah, when I used to bike around I've escaped being killed by inches about 4-5 times [20:16:31] so... tend to avoid doing that much. [20:16:36] YuviPanda: http://lessig.tumblr.com/post/50418001718/the-unintended-consequences-of-bike-lanes [20:16:48] petan: [20:16:48] http://www.qwantz.com/index.php?comic=2079 [20:16:56] jeremyb: some cities have 'bike lanes' but, they mostly have cars parked [20:16:58] ottomata: tried citibike? [20:17:00] or motorbikes parked [20:17:08] jeremyb: naw, but I'm sure I will one day [20:17:11] I usually have mine w me [20:17:14] right [20:17:16] but there are times when I don't and wish I did [20:17:22] but they are awesome~! [20:17:25] yeah [20:17:29] so many more people biking, its great [20:17:31] they're nowhere near me [20:17:33] :( [20:17:41] i think I see different demographics biking than I usually would too [20:17:45] yeah, seen people using them [20:17:56] the other day I saw a old white bearded hasidic dude riding across the manhattan bridge! [20:17:58] so awesome@ [20:18:14] one guy i work with seems to be carrying a helmet with him everywhere so he can use them [20:18:52] i guess probably helmet use on citibike is much less than normal bikes [20:19:18] yeah for sure [20:19:24] but whaaatevs, its cool [20:19:28] thon bots... [20:19:40] THON SAY PATCHSET MERGED [20:19:56] that was tron... [20:20:35] ottomata: speaking of hasidic... seen http://cityroom.blogs.nytimes.com/2009/12/08/cyclists-redraw-the-lines-in-brooklyn/ ? [20:21:11] ha, yeah, heard of that for sure [20:21:24] !gender is my gender is a... box! I am a box. In a rack. You dont want to sleep with me. 
[20:21:24] Key was added [20:21:53] !gender del [20:21:53] Successfully removed gender [20:21:55] that should protect my baby from ugly bot-perverts [20:22:05] YuviPanda seems to be one of them [20:22:12] is there 3RR here? [20:22:35] 98RR only [20:22:35] we need more patience [20:23:20] !asl [20:23:28] @search asl [20:23:29] No results were found, remember, the bot is searching through content of keys and their names [20:23:33] @meh [20:23:43] YuviPanda: http://208.80.153.187/ [20:23:45] @kick jeremyb 22/m/India [20:23:55] lol [20:23:57] abuse! [20:24:07] grrrrrr [20:24:10] edsu: [20:24:11] * jeremyb stabs YuviPanda [20:24:12] jeremyb: that proves how pervert he is [20:24:13] edsu: \o/ [20:24:16] he like to abuse things [20:24:19] and powers [20:24:30] now i have to look at the logs! [20:24:45] YuviPanda: is it typical to want a wikimedia hostname to associate with the IP? [20:24:56] edsu: yeah, you should be able to get a DNS entry too [20:24:59] edsu: something.wmflabs.org [20:25:23] oh, didn't miss much [20:25:49] edsu: I... don't know how to do that, though. poke Ryan_Lane [20:26:10] edsu: go to "manage addresses" [20:26:16] allocate the ip [20:26:22] associate the ip with the instance [20:26:26] add a hostname to the IP [20:26:29] edsu: btw, if you're taking feature requests - can you add a flag there to show only mobile edits? They have a tag, I believe [20:26:33] not sure if they show up tho [20:26:34] oh right, it was right there thanks http://wikistream.wmflabs.org/ [20:26:41] yw [20:27:01] is there any sort of analytics tracking stuff that labs webapps use? [20:27:02] edsu: it's not working if you limit it to a certain wikipedia [20:27:08] jeremyb: thanks [20:27:33] edsu: like google analytics ? or just parsing UA strings? [20:27:36] idk [20:27:57] Ryan_Lane: would a in-labs Piwik installation be against our TOS? [20:27:59] (and referers) [20:28:22] YuviPanda: yes [20:28:25] and piwik suks [20:28:26] i guess there's not too much going on for wikistream. 
a relatively small set of possible things a user could be doing [20:28:27] *sucks [20:28:29] edsu: you can probably set one up. IIRC nothing automatic. [20:28:38] jeremyb: yeah, i need to disable the analytics code that's in there for my old hostname [20:29:01] jeremyb: yeah, it's pretty much look for a few seconds and then go do something else :) [20:29:15] well, I guess it wouldn't be against the ToS, depending on how it was implemented [20:29:19] Ryan_Lane: even if we set it up in-labs and disable ip tracking properly? [20:29:20] yeah [20:29:32] I guess all we need to do is to not expose raw data [20:29:38] edsu: but even the options at the top of the page. only so many permutations [20:29:44] if we did it in-labs, we'd need to make sure it was only NDA users [20:29:59] and we'd need to make sure it didn't expose anything private [20:30:03] Ryan_Lane: what's the alternative? make it an ops thing? [20:30:07] but again, piwik kind of sucks :) [20:30:11] Ryan_Lane: if it is implemented as a toollabs project, then can it be considered violating Priv policy? [20:30:16] jeremyb: no, the alternative is for it to not exist ;) [20:30:30] Ryan_Lane: since toollabs should already strip away most things [20:30:43] this doesn't really seem like a tool labs kind of thing [20:30:57] Ryan_Lane: sure, but will it still violate TOS? [20:31:10] if people without NDAs can access the raw data, yes [20:31:12] since it can't do anything tools can do by themselves. [20:31:23] idk what strip away most things means. what does tool labs do now? [20:31:36] jeremyb: no ips, no referrers, and I think no UA? [20:31:58] I guess that is all the 'private' info [20:32:40] YuviPanda: those things are all interesting. especially for a website taking advantage of newish html5 features to know what the market share of user agents hitting it is [20:32:48] i guess s/market // [20:33:05] I am not sure about UA [20:33:39] IP for geolocation. 
referer to see if you've been mentioned on reddit or NYT [20:33:55] yeah, I know. but I think our TOS doesn't let you get those two anyway [20:34:03] edsu, you can assign a name to your instance on this page, here: https://wikitech.wikimedia.org/wiki/Special:NovaAddress [20:34:14] IP I know is completely off limits (for the most part), and referrer too, I thinki [20:34:27] andrewbogott: done already, wikistream.wmflabs.org [20:34:39] ah, ok, should've read the backscroll :) [20:34:43] * YuviPanda just made a commit to https://git.wikimedia.org/summary/?r=USERINFO.git [20:34:57] edsu: idk the privacy rules so well but i guess maybe google fonts and google JS CDN would both be problems [20:35:42] I am pretty sure they are, sadly. [20:35:56] idk about "sadly" [20:36:28] Fonts is very nice, IMO [20:36:40] we can host them on labs too, of course. perhaps make a labs tool [20:37:07] Anyone else getting stuck while trying to ssh into tools-login.wmflabs.org? [20:37:34] I'm at debug2: we sent a publickey packet, wait for reply [20:37:49] that happened yesterday, and scfc_de restarted it [20:38:00] Ryan_Lane? [20:38:11] oh, there it goes [20:38:12] nevermind [20:38:13] Coren: ^^ ? [20:38:26] https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=Labs%2520NFS%2520cluster%2520pmtpa&tab=m&vn= [20:38:32] NFS [20:38:49] Krenair: WFM? [20:38:49] Ryan_Lane: Yep. Happy fun controller stall. Of Doom. [20:38:53] heh [20:38:58] jeremyb: ok good to know [20:39:11] heh YuviPanda http://ganglia.wmflabs.org/latest/?r=20min&cs=&ce=&c=tools&h=tools-mc&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS [20:39:24] just added 117,682 to the list ;p [20:39:27] jeremyb, yep, suddenly started working for me [20:39:29] Krenair: Known issue. The raid controller on the NFS server occasionally stalls for 1-2 minutes. It gets better quickly. [20:39:52] Fixing this is my first priority upon returning from Hong Kong. [20:40:01] :> [20:49:37] ^d: having fun creating and deleting instances? 
[20:49:45] <^d> Yep :) [20:49:48] * Krinkle got like the 100th notification now [20:49:54] <^d> I'm gonna get them right eventually. [20:50:01] no worries [20:51:06] ^d: you have another that won't die? [20:51:15] usually waiting a bit and trying again is best [20:51:22] <^d> No, it's fine now. [20:51:23] otherwise you're just flooding the queue :) [20:51:34] <^d> You had just deleted the one that was ok (when I didn't tell you I was fine) [20:51:37] <^d> So had to redo it [20:51:42] <^d> :D [20:51:45] oooohhhh [20:51:46] crap [20:51:47] sorry [20:51:54] <^d> No worries :) [20:51:54] it was in the ERROR state [20:51:57] <^d> I'm all good now. [20:59:51] <^d> What's the proper way to restart memc? There doesn't seem to be an init script :\ [21:01:04] ^d, service memcached restart? [21:01:16] <^d> That would be great, if an init script existed. [21:01:26] eh? [21:01:44] <^d> Derp, typo. [21:01:46] <^d> Ignore me. [21:02:16] Coren: scfc_de when SGE kills a process, does it send it SIGTERM? [21:02:18] or... soemthing else/ [21:02:23] ? [21:02:33] YuviPanda: SIGKILL, for great justice. [21:02:38] heh [21:02:45] I... guess I can't handle that? [21:02:46] It doesn't do half-measures. [21:02:53] No, by definition. [21:03:02] true [21:03:11] arr, durr. [21:03:36] Coren: so I'm writing to a file the hostname and port on which my service is running, so clients can connect to it. This file is writeable only by me, but world readable. so far so good. [21:04:11] Coren: however, security issue! someone can find a way to kill the process, and then open their own process in the same host and port [21:04:23] It's actually kinda somewhat possible in a twisted sort of way. I once saw a daemon that stored state in a mmap(), opened a pipe between the daemon and an overwatch, and had the overwatch handle SIGPIPE by saving state from the mmap() and exiting. :-) [21:04:26] and since the file is still there, clients will connect to the new service thinking it is the old one, but it isn't! 
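Editorial sketch, not part of the log: the "store the address in a file" approach described above (writable only by the owner, world readable) might look roughly like this. The function names are hypothetical. Because the grid engine stops jobs with SIGKILL, which cannot be caught, the cleanup function may never run, so readers should treat the file as a hint that can be stale.

```python
import os
import tempfile


def publish_address(path, host, port):
    """Atomically write host:port to `path` with mode 0644."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(f"{host}:{port}\n")
        os.chmod(tmp, 0o644)      # owner read/write, everyone else read-only
        os.replace(tmp, path)     # atomic rename: readers never see a partial file
    except BaseException:
        os.unlink(tmp)
        raise


def read_address(path):
    """Return (host, port) as published by publish_address()."""
    with open(path) as handle:
        host, port = handle.read().strip().rsplit(":", 1)
    return host, int(port)


def withdraw_address(path):
    """Remove the address file on a clean exit; never runs after SIGKILL."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

This does nothing about the substitution concern raised in the chat: a client that needs to know it is talking to the right service still needs some form of authentication on the connection itself.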
[21:04:58] it's only open for a small amount, since I'll ensure I always exit with -1, so SGE will keep restarting me. [21:05:00] but... still! [21:06:11] Coren: is this a legit concern with this 'store addresses in files' strategy, or am I missing / overstating something? [21:06:21] ... it's not clear that you'd want some system that is listening on a port being sensitive enough that being substituted by another would be a problem. [21:06:29] I.e. without auth. [21:07:11] Coren: true, the worst thing that can happen here is someone else empties your redis queue, so grrrit-wm is silent for a while :P [21:07:12] not much [21:07:15] Worse that happens is someone connects to the wrong thing, errors out because it's not what it expected, and connects back to your corrently restarted thing. [21:07:20] yeah [21:07:36] correctly* [21:07:53] Coren: also depends on how you do auth, right? if you do auth by connecting to the service and sending credentials, this can be used as a sort of 'phishing' attack [21:08:36] If you have clients send credentials over a socket, you're already doing it wrong. [21:08:53] heh, true. [21:09:01] I still have no idea how to do actual auth, though. [21:09:07] this one doesn't really need auth [21:09:10] but... still! [21:09:47] Coren: I guess the same 'attack' of reusing the port/host can be done even if you use DNS-SD or something, so I guess I don't have to worry about it too much [21:10:08] There are a number of ways to do real auth; I can probably look up some good books for you if you are really interested. Beware: security protocol design and implementation is a /hard/ field, and much toil lies that way (but it's fun) [21:10:28] indeed, and that is why I never want to do it myself :P) [21:10:41] was hoping to find something that can tell me something like 'this is LDAP user X, I verified!' 
[21:10:55] and now suddenly tools-login is not responding to any of my commands [21:11:19] Krenair: wait for two minutes :) [21:11:24] Even that is seriously nontrivial. [21:11:42] auth is easy :) [21:11:56] and amazingly terribly hard at the same time :( [21:11:57] Krenair: http://ganglia.wikimedia.org/latest/graph.php?r=1hr&z=xlarge&h=labstore3.pmtpa.wmnet&m=cpu_report&s=descending&mc=2&g=cpu_report&c=Labs+NFS+cluster+pmtpa shows the current ills. It's very variable over current usage and the worse is between 21h and 03h UTC [21:12:02] Coren: indeed. Best way to do security is to make sure I don't do it myself :P [21:12:22] YuviPanda: You have *no* idea how often I've wished that more devs thought that way. [21:12:25] Ryan_Lane: you mentioned something about 'keystone', and the docs for that that I could find read very much like Oracle's documentation... [21:12:36] (aka, bad) [21:12:49] Coren: heh. [21:12:51] YuviPanda: I did security for a long time. The basic rule is: any security written by the coders is worse than bad. [21:12:54] and now it finally works again [21:13:09] YuviPanda: keystone's docs suck [21:13:15] Ryan_Lane: yup. verily. [21:13:23] YuviPanda: basically, people pass a token to your service [21:13:29] and you verify it with keystone [21:13:56] Ryan_Lane: how do they get the token [21:14:07] unless you are using PKI tokens, then you just check the validity of the PKI token (correct CA, not expired) [21:14:18] YuviPanda: they get a token by authenticating to keystone first [21:14:29] I don't currently expose a user's token, but I could [21:14:33] aaaah [21:14:34] right [21:14:34] if there's a service that needs it [21:14:50] tokens currently last for a week [21:15:13] are they scope limited? [21:15:18] can I impersonate users with just their token? [21:15:27] yes [21:15:30] but... 
[21:15:33] they are scoped to a project [21:15:41] and keystone is looking at further scoping [21:15:48] I wish they'd just go with OAuth [21:15:50] but alas [21:15:58] ... wait, how did they solve the obvious replay attacks? [21:15:59] right, so tools will have their own scope [21:16:08] Coren: they didn't [21:16:12] haha :P [21:16:13] OAuth doesn't either, though [21:16:24] in fact, most protocols don't [21:16:31] Ryan_Lane: how hard is it to expose this? [21:16:33] they assume you are going to secure the channel [21:16:54] YuviPanda: I'd just need to unfirewall the keystone API [21:16:55] *sigh* [21:16:58] Is crap! [21:17:17] also, newer versions of the cli tools store the keys in a keyring [21:17:28] is the cli trivial? [21:17:37] I wouldn't use the CLI [21:17:40] I'd use the API [21:17:41] then? [21:17:46] the API is actually easier [21:17:52] there's libraries as well [21:18:02] if only there were docs... :P [21:18:13] there are [21:18:14] one sec [21:18:30] YuviPanda: http://docs.openstack.org/api/openstack-identity-service/2.0/content/ [21:18:57] Ryan_Lane: grr, that page has no mention of the word 'keystone' [21:19:07] yeah, it's also named openstack-identity [21:19:22] (and yes, this drives me insane too) [21:19:27] hehe [21:19:31] http://docs.openstack.org/api/openstack-identity-service/2.0/content/POST_authenticate_v2.0_tokens_.html [21:19:52] Ryan_Lane: 'password required'? [21:20:01] yep. [21:20:06] to get a token [21:20:17] and the token has an expiry and a scope [21:20:26] which makes it really annoying [21:20:38] how does that work for service accounts? [21:20:56] ah. right. keystone has no clue about them [21:21:00] lol [21:21:09] what would this be for? [21:21:15] also what password? LDAP? [21:21:23] yep. [21:21:26] what are you writing? [21:21:38] service users shouldn't be doing auth [21:21:46] we designed them with that in mind [21:21:47] Ryan_Lane: right now? 
I'm opening up gerrit-to-redis to whoever wants it [21:22:10] could add it as a wikitech interface [21:22:17] Ryan_Lane: /subscribe.py should generate a cryptographically secure prefix, and start publishing gerrit events to it [21:22:24] * Ryan_Lane nods [21:22:35] Ryan_Lane: how hard is that? [21:22:42] depends [21:22:49] heh [21:22:49] :P [21:22:56] if you make an API, and add keystone support... [21:23:15] Ryan_Lane: :P in this case, there is *some* form of auth - there is the secret key [21:23:17] then wikitech would just send commands to it and you'd return the credentials for redis [21:23:43] your service is just a means of handing out credentials [21:23:51] hmm, right. [21:23:51] but you also need to make it multi-tenant [21:24:01] so, you could store the creds in your own prefix [21:24:01] Ryan_Lane: well, 'credentials' in this case is just the secret key [21:24:06] Ryan_Lane: that's what I do now :) [21:24:21] _clients has the list of secret keys [21:24:22] but you need to know who is authorized to get them [21:24:30] ah, right. [21:24:36] but then that'll be bots, right? [21:24:38] that's what keystone is for [21:25:05] though in this situation, keystone doesn't know about the bots, it just knows about users [21:25:14] it can validate a user is who they are (identity + auth) [21:25:28] but it won't do authorization (this user is a member of this bot) [21:25:42] because keystone doesn't know that info [21:25:47] Ryan_Lane: yeah, so I'm not sure how that can be used? since it is bots that'll need auth. [21:26:00] the bots don't need auth [21:26:09] right. [21:26:09] just adding the key needs auth. 
[21:26:13] but in this case, I don't think it is worth it [21:26:15] the user that's getting the creds needs auth [21:26:30] the queues are restricted to 512 items anyway, so you can't even DoS the server [21:27:37] Ryan_Lane: btw, I want to work on hipache during wikimania :) [21:27:43] cool :) [21:27:49] I'll be there, I can discuss things with you [21:27:52] Ryan_Lane: yeah! [21:28:15] hm. openstack's call for presenters closes tomorrow [21:28:20] what do I want to give a talk on [21:28:22] Coren: ideas? [21:28:34] I was thinking LDAP integration for private clouds [21:28:45] Ryan_Lane: btw, have you seen the 'bots dispatcher' thread? [21:28:50] I have, yes [21:28:54] I purposely avoided it :) [21:29:01] Ryan_Lane: heh :) [21:29:19] Ryan_Lane: because of the bikeshedding that usually follows anything related to the IRC RC feeds? [21:29:20] I think you should showcase the mediawiki integration; it's kinda neat how you tied the two together. "Integrating openstack in your existing management system" or somesuch? [21:29:38] hm. could do that too [21:29:44] I should really put in for more than one [21:30:21] I'd at least want to clean up the projects page before then :) [21:30:28] that's like 3 months, though, so that's doable [21:31:50] [bz] (8NEW - created by: 2Yuvi Panda, priority: 4Unprioritized - 6normal) [Bug 52297] Setup tools-redis dedicated to redis - https://bugzilla.wikimedia.org/show_bug.cgi?id=52297 [21:43:45] YuviPanda: removed the google refs in wikistream, seems to be running smoothly now http://wikistream.wmflabs.org/ [21:43:51] Ryan_Lane: :) [21:44:01] edsu: cool :) [21:44:13] err, edsu :) [21:44:15] not Ryan_Lane :) [21:44:21] heh [21:44:23] That's just a bit trippy [21:44:46] Ryan_Lane: when we have hipache setup, wikistream can run on toollabs :) [21:44:54] yep [21:44:57] that'll be nice [21:53:23] (03PS1) 10Yuvipanda: Try to erase file whenever app exits. 
Also never exit cleanly [labs/tools/gerrit-to-redis] - https://gerrit.wikimedia.org/r/76837 [22:05:06] (PS1) Yuvipanda: Make android app commits go to #wikimedia-mobile [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76838 [22:05:37] (CR) Yuvipanda: [C: 2 V: 2] Make android app commits go to #wikimedia-mobile [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76838 (owner: Yuvipanda) [22:05:48] <^d> !log deployment-prep Memcached moved off of the apache instances to their own dedicated hosts (-memc0 and -memc1). Should have a lot more memc storage now. [22:05:52] Logged the message, Master [22:09:19] &ping [22:09:19] Pinging all local filesystems, hold on [22:09:20] Written and deleted 4 bytes on /tmp in 00:00:00.0005570 [22:09:24] NFS stuck again [22:10:22] Written and deleted 4 bytes on /data/project in 00:01:00.5361310 [22:10:32] ^d: I'm going to bring memc0 down [22:10:35] temporarily [22:10:40] <^d> Mmk. [22:10:50] ephemeral disks are using absurd amounts of disk space [22:10:57] because they are raw disks [22:11:08] I need to recreate their ephemeral disks [22:11:10] (/mnt) [22:11:23] memc0 uses a 160G ephemeral [22:11:49] &ping [22:11:50] Pinging all local filesystems, hold on [22:11:51] Written and deleted 4 bytes on /tmp in 00:00:00.0005430 [22:11:52] Written and deleted 4 bytes on /data/project in 00:00:00.0066440 [22:11:59] hmm, why is my 'git stash' hung then? [22:12:14] nevermind, unhung now [22:13:47] andrewbogott: hm. some change has made it so that I can't ssh into labs instances as root [22:14:00] * andrewbogott tries [22:14:27] hm, working for me [22:14:37] what instance? [22:14:39] to which instance? [22:14:42] most for me [22:14:44] surely the bastions [22:14:51] :) puppet-testing-3 [22:15:04] I can log in there [22:15:39] and, you're right, I'm shut out of bastion-restricted.
[22:15:59] oh, wow [22:16:09] there's a file and a directory in /etc/ssh/userkeys/root/.ssh [22:16:14] on puppet-testing-3 [22:16:16] with the same name [22:16:20] how is that even possible? [22:16:29] hm [22:17:08] root@puppet-testing-3:/etc/ssh/userkeys/root/.ssh/authorized_keys /public/keys/root# pwd [22:17:08] /etc/ssh/userkeys/root/.ssh/authorized_keys /public/keys/root [22:17:11] * Ryan_Lane twitches [22:17:13] whaaaaaat? [22:17:49] that is a self-hosted instance… looks like I can't ssh to regular instances (sample size = 2) [22:18:05] yeah [22:18:07] I can't either [22:20:43] you're seeing a dir called authorized_keys owned by root and a file with the same name owned by ganglia? [22:21:11] Jul 30 22:20:25 i-0000019b sshd[23524]: Authentication refused: bad ownership or modes for file /etc/ssh/userkeys/root/.ssh/authorized_keys [22:21:39] -rw------- 1 998 root 2379 Mar 28 17:06 /etc/ssh/userkeys/root/.ssh/authorized_key [22:21:41] (thanks to salt) [22:21:56] why in the world would that file have that ownership? [22:22:45] oh [22:22:45] wow [22:23:26] no owner defined :) [22:24:30] (PS1) Ryan Lane: Make root the explicit owner of its authorized_keys [labs/private] - https://gerrit.wikimedia.org/r/76841 [22:24:35] you'd think that puppet would be smart about this [22:25:20] It's because the files are owned by 'gitpuppet' or something on the server, and it tries to preserve that ownership? [22:25:28] likely, yeah [22:25:31] (CR) Ryan Lane: [C: 2] Make root the explicit owner of its authorized_keys [labs/private] - https://gerrit.wikimedia.org/r/76841 (owner: Ryan Lane) [22:25:38] (CR) Ryan Lane: [V: 2] Make root the explicit owner of its authorized_keys [labs/private] - https://gerrit.wikimedia.org/r/76841 (owner: Ryan Lane) [22:26:06] hm, well, easy fix. Weird.
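Editorial sketch, not part of the log: the sshd message "bad ownership or modes" quoted above comes from a strictness check on the key file. A simplified approximation of that rule, assuming the file must be owned by the target user or by root and must not be writable by group or others (the real sshd check also walks the parent directories); the function names are hypothetical.

```python
import os
import stat


def ownership_ok(st_uid, st_mode, user_uid):
    """Approximate sshd's ownership/mode test for an authorized_keys file."""
    owned = st_uid in (0, user_uid)
    writable_by_others = bool(st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    return owned and not writable_by_others


def check_path(path, user_uid):
    """Apply ownership_ok() to a file on disk."""
    st = os.stat(path)
    return ownership_ok(st.st_uid, st.st_mode, user_uid)
```

With the values from the `ls` output in the chat (owner uid 998, mode 0600, target user root), `ownership_ok(998, 0o100600, 0)` is false, which is consistent with the refusal and with the fix of declaring root as the explicit owner.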
[22:27:10] works
[22:28:31] yep
[22:28:41] fun times
[22:56:10] (PS1) Yuvipanda: Do the key generation for registering on the server [labs/tools/gerrit-to-redis] - https://gerrit.wikimedia.org/r/76845
[23:03:11] addshore: ping
[23:03:42] AzaToth: btw, if you want to review some python code, there's a patch series ^ :)
[23:03:52] opens up gerrit-to-redis subscriptions for everyone
[23:04:05] ok, I'll look into it asap
[23:04:12] just need to poke addshore some
[23:04:12] ty
[23:04:19] let me merge your nikserv patch now
[23:04:20] moment
[23:04:42] Coren: wanna poke you too
[23:05:03] Hmmm?
[23:05:34] Coren: is there any infrastructure on tools to in a secure way save and collect passwords and keys for bots?
[23:06:02] I meant passwords like nickserv password for irc bottie
[23:06:39] AzaToth: Well, proper permissions in the tool's home is a reasonably secure way to do it; once SUL is complete we'll be able to deploy OAuth, which has some advantages.
[23:07:07] yeah, making it o-r should be good enough, methinks
[23:07:14] (that's what I do for my current config files)
[23:07:24] I meant secure in the way to minimize the risk of loosing it
[23:07:46] (PS3) Yuvipanda: adding password [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76634 (owner: AzaToth)
[23:07:53] (PS4) Yuvipanda: Authenticate via password [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76634 (owner: AzaToth)
[23:08:27] YuviPanda: "Commit message was updated" still returns the old commit message
[23:08:35] AzaToth: gerrit bug :)
[23:08:39] @notify addshore
[23:08:39] This user is now online in #huggle. I'll let you know when they show some activity (talk, etc.)
[23:08:46] AzaToth: there's a bug for it on bugzilla also. No activity tho
[23:08:52] k
[23:09:57] (CR) Yuvipanda: [C: 2 V: 2] "This doesn't explicitly seem to send nickserv the password, but it does seem to work, so..." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76634 (owner: AzaToth)
[23:10:46] YuviPanda: it's "server password"
[23:11:13] YuviPanda: sadly node-irc doesn't handle certfp
[23:11:21] hmm
[23:11:25] I've no idea what either of them are :D
[23:11:28] this is my first 'IRC bot'
[23:11:29] as such
[23:11:40] so I'm going to take your word for it, AzaToth :)
[23:11:43] should read that RFC someday
[23:12:54] (PS1) Yuvipanda: Remove config.yaml. Shouldn't be commited, really [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76848
[23:13:14] (CR) Yuvipanda: [C: 2 V: 2] Remove config.yaml. Shouldn't be commited, really [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76848 (owner: Yuvipanda)
[23:13:52] (PS1) Yuvipanda: Revert "Authenticate via password" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76849
[23:14:06] (CR) Yuvipanda: [C: 2 V: 2] Revert "Authenticate via password" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76849 (owner: Yuvipanda)
[23:15:08] (PS1) Yuvipanda: Revert "Revert "Authenticate via password"" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76851
[23:15:32] (PS1) Yuvipanda: Revert "Remove config.yaml. Shouldn't be commited, really" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76852
[23:15:45] (CR) Yuvipanda: [C: 2 V: 2] Revert "Remove config.yaml. Shouldn't be commited, really" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76852 (owner: Yuvipanda)
[23:16:42] (CR) Yuvipanda: [C: 2 V: 2] Revert "Revert "Authenticate via password"" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76851 (owner: Yuvipanda)
[23:17:43] YuviPanda: wtf are you doin?
[23:17:58] AzaToth: warning about how I should never do anything when sleepy
[23:18:18] AzaToth: for some reason thought config only had passwords, made a commit to remove them. Realized I was an idiot, tried to revert it. Reverted wrong patch
[23:18:18] * AzaToth slaps YuviPanda with a wet
[23:18:21] then reverted that revert
[23:18:26] and then reverted my actual change
[23:18:33] now restartin :)
[23:18:36] hmm
[23:18:45] after that I go to sleep, I promise!
[23:18:58] !ping
[23:18:59] !pong
[23:19:02] ?ping
[23:19:08] &ping
[23:19:08] Pinging all local filesystems, hold on
[23:19:09] Written and deleted 4 bytes on /tmp in 00:00:00.0008770
[23:19:10] Written and deleted 4 bytes on /data/project in 00:00:00.0094000
[23:19:47] okay, now I go to sleep!
[23:19:56] (CR) Yuvipanda: "Testing!" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/76851 (owner: Yuvipanda)
[23:20:00] yup. works
[23:20:01] nite!
[23:20:09] nite
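
Note on the &ping output in this log: the bot reports how long it took to write and delete 4 bytes on each filesystem, which is how the stuck NFS mount showed up (about a minute on /data/project against well under a millisecond on /tmp). The following is a rough Python sketch of that kind of probe, not the bot's actual implementation. The function name ping_filesystem and the use of fsync are assumptions made for illustration.

```python
import os
import tempfile
import time


def ping_filesystem(directory: str, payload: bytes = b"ping") -> float:
    """Write and delete a small file in `directory`; return elapsed seconds."""
    start = time.perf_counter()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        # Assumption: force the write out to the filesystem so that a stalled
        # mount shows up in the timing instead of being hidden by caching.
        os.fsync(fd)
    finally:
        os.close(fd)
        os.unlink(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Only the local temp directory is probed here; a shared mount such as
    # /data/project could be added to the tuple where it exists.
    for directory in (tempfile.gettempdir(),):
        elapsed = ping_filesystem(directory)
        print(f"Written and deleted 4 bytes on {directory} in {elapsed:.7f}s")
```

A probe like this blocks for as long as the mount is stalled, so anything that runs it periodically would probably want it in a separate thread or process with a timeout.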