[00:34:51] Coren: https://wikitech.wikimedia.org/wiki/New_Project_Request/james
[00:35:06] he's really looking to join the tools project
[00:36:42] So he is. Have you responded to him anywhere?
[00:36:50] on the discussion page
[00:41:45] When DB replication arrives will it only be in the tools project?
[00:42:02] Or will labs users not in tools be able to use it as well?
[00:43:09] our first goal is tools for ams hackathon
[00:43:21] but I think it'll be very soon after we'll have it for all projects
[00:43:26] ok. gotta run
[00:54:02] Coren (and Ryan_Lane and anyone else): I have to say, Tool Labs is mostly working well for AnomieBOT.
[00:54:28] anomie: That... is the intent. :-) I'm glad it works.
[00:54:38] "Mostly"? Irritants to fix?
[00:58:00] Coren: Somewhere between 2013-04-30 17:44 and 17:30, an error "Lost connection to MySQL server during query" killed one of the bot processes. And a few times (most recently probably around 2013-04-25 18:10) the bot has been killed for no readily-apparent reason (I submit the bot's job with -notify, which sends SIGUSR2 a minute before terminating the bot processes, although due to an oversight in my code the USR2 was itself killing the "launcher" process)
[00:59:49] Is that UTC?
[01:00:21] Server local time, which looks like it's UTC
[01:01:06] 2013-04-25 18:10 is expected; there was a planned maintenance window. :-) (Are you subscribed to labs-l?)
[01:01:27] cat /etc/timezone
[01:01:42] Oh no, another mailing list to subscribe to. I'll go do that now.
[01:01:55] It's fairly low-volume. :-)
[01:02:08] I also announced it on toolserver-l though.
[01:02:21] * Coren checks the other outage.
[01:03:24] * anomie subscribed now
[01:10:25] anomie: I'm not seeing anything with the DB itself; no interruptions or errors in the log whatsoever.
[01:11:01] Coren: Could have been a network hiccup between the bot process and the server, I suppose.
[01:11:11] Still bugs the hell out of me.
[01:11:53] On the positive side, your bot cleanly restarted after a complete restart of the entire grid, as expected.
[01:12:36] Keep an eye out for that error again? I'll want to try to catch it "in the act"
[01:12:55] When was the restart?
[01:14:26] Ok. The bot logs its errors to /data/project/anomiebot/bot.err, although most of that file is not going to be stuff you'd be concerned about
[01:24:46] 04/25/2013 18:57:18
[01:25:55] Hmph. http://bugs.mysql.com/bug.php?id=27613
[01:29:29] I don't think the bot actually came back up after that one; I had to log in and restart it manually. Possibly because the USR2 killed the bot process. I'll keep an eye on it for the next time.
[01:32:46] hi, is it possible to assign a big block of IPs (private) to a lab instance
[01:33:25] I hope there will never be a next time needing to shut down the whole grid; but switching filesystems will do it. :-)
[01:33:25] but those individual ips must be visible to the production cluster
[01:33:48] yurik: Probably; this will need manual intervention though.
[01:34:09] are the labs instances on the same private network as the cluster?
[01:34:48] Coren, my goal is to imitate calls from different IPs to production
[01:34:57] but those IPs must always be the same
[01:35:10] yurik: It's not the same segment, but they're interconnected so it's workable.
[01:35:12] so I would reserve a 1000 ips to my name :)
[01:35:36] Coren, how would i go about setting it up?
[01:36:03] yurik: It's probably possible to give you a /19 or /20, but the right person to ask is Leslie.
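A minimal sketch of the SIGUSR2 handling discussed above: gridengine's -notify delivers SIGUSR2 shortly before terminating the job, and a process that does not install a handler is killed outright by the signal, which is exactly the launcher bug anomie describes. This is illustrative only, assuming a simple polling main loop; it is not AnomieBOT's actual code.

    import signal
    import sys
    import time

    shutting_down = False

    def on_usr2(signum, frame):
        # SGE's -notify sends SIGUSR2 about a minute before killing the
        # job; the default disposition for SIGUSR2 terminates the process,
        # so every process in the job (the launcher included) needs this.
        global shutting_down
        shutting_down = True

    signal.signal(signal.SIGUSR2, on_usr2)

    def do_one_unit_of_work():
        time.sleep(5)  # placeholder for one iteration of the bot's loop

    if __name__ == '__main__':
        while not shutting_down:
            do_one_unit_of_work()
        # flush/checkpoint state here, then exit cleanly before the grid
        # engine follows up with the real termination signal
        sys.exit(0)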
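And a hedged sketch of tolerating the "Lost connection to MySQL server during query" failure that killed the other bot process (MySQL client error 2013; 2006 is the related "server has gone away"): catch the error, reconnect, retry once. Shown with oursql, which comes up later in this log; the connection parameters are illustrative, and it assumes DB-API-style (errno, message) exception args.

    import oursql

    def connect():
        # credentials come from a defaults file; path and db are illustrative
        return oursql.connect(read_default_file='~/.my.cnf', db='enwiki_p')

    conn = connect()

    def query_with_retry(sql, params=()):
        """Run a SELECT, reconnecting once if the connection was lost."""
        global conn
        for attempt in (1, 2):
            try:
                curs = conn.cursor()
                curs.execute(sql, params)
                return curs.fetchall()
            except oursql.OperationalError as e:
                # args[0] carries the MySQL error number
                if attempt == 1 and e.args[0] in (2006, 2013):
                    conn = connect()  # reconnect and retry once
                else:
                    raise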
[01:36:11] The mistress of Networks. :-)
[01:36:21] hehe, i suspect she has gone home
[08:15:07] !ping
[08:15:07] pong
[08:31:29] petan / wm-bot: http://www.youtube.com/watch?v=glC9_8Ijt9k&t=0m51s
[08:36:20] ori-l http://www.youtube.com/watch?v=ga6zAEB9fOM
[08:37:21] petan: i'm not sure why, but the pairing with 'yellow submarine' is very funny
[08:38:12] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 47980] udp2log logs are recreated as root:root preventing logging - https://bugzilla.wikimedia.org/show_bug.cgi?id=47980
[09:19:11] @requests
[09:19:11] There are no shell requests waiting
[09:52:32] You can manipulate strings on the cluster *now*, so either
[09:52:33] staff was wrong or you misunderstood something.
[09:52:36] hehe
[10:12:08] !log wikidata-dev update dev system settings to use $wgWBRepoSettings and $wgWBClientSettings
[10:12:10] Logged the message, Master
[12:12:43] http://en.wikipedia.beta.wmflabs.org/ is broken: Original exception: exception 'MWException' with message 'LBFactory_Multi::newExternalLB: Unknown cluster "extension1"' in /data/project/apache/common-local/php-master/includes/db/LBFactory_Multi.php:163
[12:53:58] !sh
[12:53:58] http://bit.ly/10eZZoa
[12:54:05] Damianz: hello
[12:58:15] @notify Damianz
[12:58:15] This user is now online in #wikimedia-tech so I will let you know when they show some activity (talk etc)
[12:58:18] http://en.wikipedia.beta.wmflabs.org/ is broken: Original exception: exception 'MWException' with message 'LBFactory_Multi::newExternalLB: Unknown cluster "extension1"' in /data/project/apache/common-local/php-master/includes/db/LBFactory_Multi.php:163
[12:59:14] Warning: There is 1 user waiting for shell: Deyan (waiting 0 minutes)
[12:59:29] hashar: not sure whether to ping you or not about ^^^, but you fixed the "en.wp labs inaccessible" issue last time (though this time it does not look to me like gluster fun)
[13:00:27] andre__: ouch! please file a bug against Wikimedia Labs > deployment-prep (beta)
[13:00:35] andre__: that is a mediawiki-config issue I guess
[13:00:41] extension1 does not exist on beta
[13:02:52] alright, will do
[13:05:02] [bz] (ASSIGNED - created by: Antoine "hashar" Musso, priority: Normal - normal) [Bug 47980] [OPS] udp2log logs are recreated as root:root preventing logging - https://bugzilla.wikimedia.org/show_bug.cgi?id=47980
[13:05:03] [bz] (NEW - created by: Andre Klapper, priority: Unprioritized - critical) [Bug 47992] http://en.wikipedia.beta.wmflabs.org/ broken: Unknown cluster "extension1" - https://bugzilla.wikimedia.org/show_bug.cgi?id=47992
[13:08:41] * hashar digs in configuration
[13:12:48] Warning: There is 1 user waiting for shell: Deyan (waiting 13 minutes)
[13:14:12] andre__: should be fixed
[13:15:45] thx for the bug!
[13:16:30] hashar, yes. thanks!
[13:23:53] @notify binasher
[13:23:53] I will notify you, when I see binasher around here
[13:23:56] Hello labs
[13:24:08] what's wrong with /notify ? :)
[13:24:37] paravoid :calvino.freenode.net 421 petan NOTIFY :Unknown command
[13:24:43] that is :D
[13:26:09] !report
[13:26:09] to report new bug open: https://bugzilla.wikimedia.org/enter_bug.cgi
[13:26:13] :)
[13:26:18] Warning: There is 1 user waiting for shell: Deyan (waiting 27 minutes)
[13:26:26] @infobot-detail report
[13:26:26] Info for report: this key was created at N/A by N/A, this key was displayed 1 time(s), last time at 5/2/2013 1:26:09 PM (00:00:16.5173310 ago) this key is normal
[13:26:29] aha
[13:26:33] but this is not my key
[13:26:37] !rep
[13:26:37] to report new bug open: https://bugzilla.wikimedia.org/enter_bug.cgi
[13:26:50] !report del
[13:26:51] Successfully removed report
[13:27:12] !report is to report new bug open https://bugzilla.wikimedia.org/enter_bug.cgi?product=Wikimedia%20Labs
[13:27:12] Key was added
[13:27:16] @infobot-detail report
[13:27:16] Info for report: this key was created at 5/2/2013 1:27:12 PM by petan, this key was displayed 0 time(s), last time at N/A this key is normal
[13:27:21] !report
[13:27:22] to report new bug open https://bugzilla.wikimedia.org/enter_bug.cgi?product=Wikimedia%20Labs
[13:27:59] @search bugz
[13:27:59] Results (Found 5): bz, osm-bug, bug, rb, report,
[13:28:05] !rb
[13:28:05] broken? report a bug: https://bugzilla.wikimedia.org/enter_bug.cgi?product=Wikimedia%20Labs
[13:28:11] this is my key :D
[13:39:44] Warning: There is 1 user waiting for shell: Deyan (waiting 40 minutes)
[13:53:14] Warning: There is 1 user waiting for shell: Deyan (waiting 54 minutes)
[14:04:03] hello could anyone tell me something about transcoding queues? how does it work and all?
[14:06:44] Warning: There are 2 users waiting for shell, displaying last 2: Deyan (waiting 67 minutes) Oantnri (waiting 2 minutes)
[14:08:43] * hashar is shutting down Jenkins for a while. NO ETA
[14:09:19] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 47892] internal error for new user creating account and logging in - https://bugzilla.wikimedia.org/show_bug.cgi?id=47892
[14:20:18] Warning: There are 2 users waiting for shell, displaying last 2: Deyan (waiting 81 minutes) Oantnri (waiting 15 minutes)
[14:33:49] Warning: There are 2 users waiting for shell, displaying last 2: Deyan (waiting 94 minutes) Oantnri (waiting 29 minutes)
[14:47:23] Warning: There are 2 users waiting for shell, displaying last 2: Deyan (waiting 108 minutes) Oantnri (waiting 42 minutes)
[15:27:20] * jeremyb_ spies a sumanah "OFFLIST" :-P
[15:27:25] also, hi :)
[15:27:27] YES
[15:27:28] FAIL
[15:27:32] EMBARRASSMENT OF THE YEAR
[15:27:34] argh
[15:29:22] where?
[15:29:32] toolserver-l
[15:29:39] * jeremyb_ is reading halfak's message now
[15:30:13] that's not a huge fail :)
[15:30:39] i agree it's not the end of the world. but it does at least contradict itself
[15:30:51] oh it could have been much worse
[15:31:49] ok, yes, you're right, it could have been A LOT worse
[15:32:02] paravoid: http://lists.wikimedia.org/pipermail/gendergap/2013-May/003567.html ? :-)
[15:32:05] I think both times in the past year that I've accidentally sent an OFFLIST mail onlist
[15:32:18] have been "thank you" notes
[15:43:15] Hi Andrew. Can we talk about the logo update?
[15:48:03] The file /modules/mediawiki_singlenode/templates/labs-localsettings includes the orig/LocalSettings.php file and then overwrites the $wgLogo variable. Two lines need to be switched in labs-localsettings. The file should set $wgLogo and then include the orig/LocalSettings.php
[16:01:07] slevinski: That makes sense… let me look.
[16:05:48] slevinski: Bah, it looks like the default orig/LocalSettings.php also defines a logo, so moving the puppet line above doesn't quite work. Hm...
[16:12:29] hello petan
[16:13:30] is it considered abuse to serve apps on webservers provided by the Bots project, and should I use Webtools instead?
[16:18:38] slevinski, this should help, right? https://gerrit.wikimedia.org/r/#/c/61994/
[16:19:58] liangent: Alternately, the tools project supports combined web/bot tools.
[16:20:46] Coren: Are jsub & Co. in a repo somewhere?
[16:22:36] scfc_de: labs/toollabs in gerrit
[16:22:46] https://gerrit.wikimedia.org/r/#/admin/projects/labs/toollabs
[16:23:06] Warning: There is 1 user waiting for shell: Elad (waiting 0 minutes)
[16:23:43] Coren: Thanks.
[16:28:04] Coren: so is it going to be a rework of toolserver?
[16:29:10] how can I request being added?
[16:29:35] liangent: It's a new environment "strongly inspired" by the toolserver. The infrastructure is all new and built from scratch, but designed to be close to the TS so that maintainers don't have to start from scratch.
[16:29:47] liangent: Just tell me your wikitech username and you just have. :-)
[16:30:03] Coren: liangent
[16:30:45] liangent: Done. There is a reasonably useful guide at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help
[16:31:03] liangent: And I or petan are available for help almost all the time on this channel.
[16:32:00] Coren: so now it's a question, which one should I choose when I want to set up something? - bots or tools?
[16:32:27] andrewbogott: Is the change because of the ASL Wikipedia Project or other sites on Labs too?
[16:33:16] slevinski: It was labs-wide. The above patch should resolve your issue… it'll take an hour or so before I can roll it out though.
[16:33:29] thanks
[16:33:46] liangent: Bots is the right place for more experimental stuff where flexibility is needed; it's more of the "dev/experimental" environment and is entirely volunteer-supported. Tools is the "stable/semi-prod" environment where stability is the primary objective, but it has a slightly more restrictive environment and disciplined change management.
[16:34:08] liangent: So it depends on your needs.
[16:35:05] liangent: I would say that unless you are doing something really special, tools is the "low maintenance" option since it has dedicated support.
[16:36:41] Warning: There is 1 user waiting for shell: Elad (waiting 13 minutes)
[16:37:54] Coren: on the tools project, is it enforced that each tool should live in its own account?
[16:38:01] Hey Coren
[16:38:08] or can I create some tool called "liangent's tools"
[16:38:27] Remember that cgi python script you made for me to get started?
[16:38:52] liangent: You can create a multi-tool if you want; it may make things more complicated if you ever want to spin off one or more of them to other maintainers in the future though.
[16:38:55] you added a line in there to write the stderr to a file
[16:39:00] sys.stderr = open('/data/project/flask-stub/public_html/traceback.htm', 'w')
[16:39:03] liangent: But a tool account is, like, 3 clicks away. :-)
[16:39:13] greenrosetta: Yes?
[16:39:44] I'm trying to do something similar so that "print" statements go to a file as well, so I can debug with the browser instead of the terminal
[16:40:04] Ah, you just want to change sys.stdout then. :-)
[16:40:07] would stdout work?
[16:40:21] ah...
[16:40:22] :D
[16:40:36] Coren: for example on toolserver, I created a django project for all my tools and every tool is an app in it
[16:40:54] there's also some meta-app holding code shared by the others
[16:40:59] There is a gotcha, though: if you print before Flask outputs the headers, it could garble the response completely. You'll have to experiment a bit.
[16:42:40] liangent: Yeah, you could do the same here; create a django tool account and put everything in it. Alternately, if your meta-app's architecture lends itself to it, you could still split them into several tool accounts if the permissions are set right (every tool account is a "normal" user, so you can allow cross-use)
[16:43:24] liangent: That might be overly complicated for your needs, though. Just the one tool account will do, so long as you understand that this means you can only set maintainers for all of them and not just some of them (which may well be all you need)
[16:48:45] Coren: btw do you allow 'running large web applications'?
[16:50:11] Warning: There is 1 user waiting for shell: Elad (waiting 27 minutes)
[16:56:37] liangent: Define 'large web application'? Do you mean an application server a la tomcat?
[16:58:17] Coren: there's one saying "Large web applications (e.g. phpMyAdmin, MediaWiki, etc.) may not be installed." on https://wiki.toolserver.org/view/Rules
[16:58:47] there's also http://lists.wikimedia.org/pipermail/toolserver-l/2005-December/000086.html
[16:59:51] liangent: Do you want to set up a wiki?
[16:59:59] Ah, that's actually rather a bad way of saying what the intent of the rule is. The point is that "management" applications are generally insecure and problematic for a number of reasons, and should not be allowed on any instance for those reasons; phpMyAdmin is still not allowable, for instance.
[17:00:17] liangent: But MediaWiki is perfectly allowable. In fact, we already have one running.
[17:00:43] scfc_de: Coren: I may want to install a mediawiki with special pages used as web tools
[17:01:01] liangent: http://tools.wmflabs.org/peachy/wiki/
[17:01:15] liangent: That's quite allowable.
[17:02:52] liangent: Web applications have only two enforced requirements: (1) they must be unprivileged and (2) they cannot have long-running processes unless those are sent to the compute grid rather than run on the webserver.
[17:03:34] what does "be unprivileged" mean?
[17:03:40] running under a local-* account?
[17:03:47] Warning: There is 1 user waiting for shell: Elad (waiting 40 minutes)
[17:03:49] liangent: Right.
[17:04:25] liangent: That's actually a given on this architecture; you couldn't not do that if you tried. :-) But it means you can't run a webserver on -login, for instance, or on the grid.
[17:11:52] Coren: maybe it's possible to call sudo with a hardcoded password?
[17:11:56] I haven't tried it
[17:17:16] liangent: For sudo you need the password of the "originating" user, and tools don't have any, so this shouldn't work.
[17:17:18] Warning: There is 1 user waiting for shell: Elad (waiting 54 minutes)
[17:17:51] But su works apparently.
[17:20:20] scfc_de: It would, if we used password authentication. :-)
[17:20:45] Coren: Do you mean su or sudo?
[17:23:35] scfc_de: su. sudo we use, but with NOPASSWD: and only in selected directions.
[17:27:48] Coren: Well, I *can* "su - scfc" from local-dbreps. Is this supposed to be disabled?
[17:30:28] scfc_de: ... what?
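Picking up greenrosetta's question from earlier (16:38-16:41): a sketch of the redirection in a CGI-dispatched Flask stub. The traceback line is the one Coren quoted; the debug-log part is illustrative. Note Coren's gotcha: under CGI, stdout carries the HTTP response, so replacing sys.stdout wholesale risks garbling the reply if anything prints before the headers go out — a separate log file is the safer route for print-style debugging.

    import sys

    # tracebacks become viewable in the browser (the line Coren added)
    sys.stderr = open('/data/project/flask-stub/public_html/traceback.htm', 'w')

    # debug output goes to its own file rather than sys.stdout, since
    # stdout is the CGI response stream (path is illustrative)
    _debug = open('/data/project/flask-stub/debug.log', 'a')

    def debug(*args):
        # call debug(...) wherever a print statement would have gone
        _debug.write(' '.join(str(a) for a in args) + '\n')
        _debug.flush()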
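And a hypothetical sketch of the multi-tool layout liangent describes (16:40): one Django project, each tool an app, plus a meta-app for shared code. App names are invented and the import path is the 2013-era Django 1.x one.

    # urls.py of the umbrella project
    from django.conf.urls.defaults import patterns, include

    urlpatterns = patterns('',
        (r'^toolone/', include('toolone.urls')),  # one app per tool
        (r'^tooltwo/', include('tooltwo.urls')),
        # shared helpers live in a meta-app the tool apps import from
    )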
[17:30:55] Warning: There is 1 user waiting for shell: Elad (waiting 68 minutes)
[17:31:14] scfc_de: I'm not clear how that's supposed to be /possible/. Can you tell me what the exact command sequence is that gets you there?
[17:31:38] I mean, from scfc to local-dbreps and back?
[17:33:36] Coren: "ssh scfc@tools-login.wmflabs.org", "become dbreps", "su - scfc", enter my (scfc's) labs password -> "scfc@tools-login:~$"
[17:34:31] Oh! Okay. *phew*
[17:34:42] Coren: Works for "become wikilint" as well.
[17:34:44] I thought you meant it worked without a password. :-)
[17:35:05] Then I would be root now :-).
[17:35:16] scfc_de: Hence my... unease. :-)
[17:37:38] I thought "It would, if we used password authentication." meant that either su is supposed to be disabled or not to allow password authentication, which it is not.
[17:37:45] Anyway, till later.
[17:42:40] scfc_de: It isn't, for /tool accounts/ :-)
[17:44:26] Warning: There is 1 user waiting for shell: Elad (waiting 81 minutes)
[17:58:01] Warning: There are 2 users waiting for shell, displaying last 2: Elad (waiting 95 minutes) Corvus (waiting 11 minutes)
[18:11:33] Warning: There are 2 users waiting for shell, displaying last 2: Elad (waiting 108 minutes) Corvus (waiting 24 minutes)
[18:50:13] Coren: what about bug 47900?
[18:50:59] giftpflanze: Not for a couple of days; DB replication has priority atm
[18:51:03] ok
[20:10:22] using oursql, how do you get the number of rows returned in a query?
[20:27:49] hi Ryan_Lane
[20:27:56] howdy
[20:28:02] :)
[20:28:20] bah... everyone is quiet tonight
[20:28:32] labs sleep? :P
[20:30:41] addshore is your bot even running?
[20:30:49] addshore there used to be heavy load on bots
[20:49:24] !log tools petrb: uploaded motd to exec-N as well, with information about which server users connected to
[20:49:26] Logged the message, Master
[20:53:49] greenrosetta: shouldn't you just do a select count(*)?
[20:54:00] but if for some reason you need to do it in python, it would be
[20:54:15] cursor.execute('SELECT blah...')
[20:54:20] data = cursor.fetchall()
[20:54:23] len(data)
[20:54:54] legoktm, I'm opposing your RfA for incessantly pinging on IRC. :p
[20:55:22] heh
[20:57:46] legoktm link me
[20:57:48] to rfa
[20:57:55] er
[20:57:59] * petan is lazy
[20:58:05] i'll pm you
[20:58:11] lol
[20:58:16] Do you have the reporter?
[20:58:19] petan, ^
[20:58:24] Just use that to get there.
[20:58:53] !rfa is legoktm is having secret rfa cabal... ssssh
[20:58:53] Key was added
[20:58:58] * Ryan_Lane grumbles ganglia.wmflabs.org is down
[20:59:08] lol
[20:59:15] For all of you out there, this is NOT canvassing. :p
[20:59:36] I need to make that instance larger
[20:59:38] petan, give me some permissions. :p
[20:59:46] where
[20:59:51] so, if I do a resize it may just die
[20:59:52] to wm-bot
[20:59:56] ah
[21:00:00] did you read doc?
[21:00:06] doc?
[21:00:10] What's up doc?
[21:00:13] !wm-bot
[21:00:13] http://meta.wikimedia.org/wiki/WM-Bot
[21:00:16] this
[21:00:44] Successfully added .*@wikipedia/Cyberpower678
[21:00:44] @trustadd .*@wikipedia/Cyberpower678 admin
[21:00:49] try it
[21:00:52] don't break it
[21:01:02] @infobot-snapshot Cyberpower678
[21:01:02] Snapshot snapshots/#wikimedia-labs/Cyberpower678 was created for current database as of 5/2/2013 9:01:02 PM
[21:01:10] better safe XD
[21:01:20] petan, I'm an admin now?
[21:01:26] in this channel yes
[21:01:35] ugh. it can't even stay alive now
[21:01:39] I may as well do a resize
[21:01:54] worst case it dies. it's no worse than now
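Expanding legoktm's answer to greenrosetta's oursql question (20:10) into a self-contained sketch; the host and query are illustrative. If only the count is needed, a server-side SELECT COUNT(*) is cheaper, as suggested; otherwise len(fetchall()) is the reliable route, since DB-API allows cursor.rowcount to be -1 on unbuffered cursors.

    import oursql

    # connection parameters are illustrative
    conn = oursql.connect(host='example-db.wmflabs.org', db='enwiki_p',
                          read_default_file='~/.my.cnf')
    curs = conn.cursor()
    # oursql uses the qmark ('?') parameter style
    curs.execute('SELECT page_title FROM page WHERE page_namespace = ?', (14,))
    data = curs.fetchall()
    print len(data)  # number of rows returned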
[21:02:07] Cyberpower678 I thought you read the docs so you know it all :D
[21:02:24] * Cyberpower678 is reading it now.
[21:02:39] Ryan_Lane ganglia runs on an instance?
[21:02:44] yeah
[21:02:56] is it possible to get access there? I might help you keep it alive :P
[21:03:09] it's dying due to memory
[21:03:22] but yeah, I can give you access
[21:03:22] enable swap? :P
[21:03:25] ewww
[21:03:25] no
[21:03:32] @labs-project-instances ganglia
[21:03:33] Following instances are in this project: aggregator1, ganglia-test2, aggregator-test1, aggregator2,
[21:03:36] just initiated a resize
[21:03:44] so, hopefully it doesn't fail
[21:04:02] @labs-info aggregator1
[21:04:02] [Name aggregator1 doesn't exist but resolves to I-0000010c] I-0000010c is Nova Instance with name: aggregator1, host: virt6, IP: 10.4.0.79 of type: m1.medium, with number of CPUs: 2, RAM of this size: 4096M, member of project: ganglia, size of storage: 50 and with image ID: lucid-server-cloudimg-amd64.img
[21:04:12] 4gb ram?
[21:04:32] yeah
[21:04:32] well, which instance does ganglia itself live on?
[21:04:35] upping it to 8
[21:04:38] aggregator1
[21:04:42] what is 2?
[21:04:51] the 2nd aggregator
[21:04:58] @recentchange-off
[21:05:06] which may not even be used
[21:05:11] @recentchanges-off
[21:05:11] Feed disabled
[21:05:19] @recentchanges-on
[21:05:19] Feed is enabled
[21:05:23] cool
[21:05:34] wm-bot: he loves u :>
[21:05:34] Hi petan, there is some error, I am a stupid bot and I am not intelligent enough to hold a conversation with you :-)
[21:05:50] !log ganglia resizing aggregator1 to m1.large
[21:05:52] Logged the message, Master
[21:05:59] Ryan_Lane is it possible to scale ganglia to multiple nodes?
[21:06:04] or can it be on 1 box only
[21:06:26] I'd let everyone do resizes, but they have often resulted in dead instances
[21:06:40] it's not really possible
[21:06:45] aha
[21:06:51] and memory is only an issue because we need to run so many copies of gmond
[21:06:55] is it possible to resize disk storage?
[21:06:59] we should probably adjust the script
[21:07:03] aha
[21:07:13] what is it written in?
[21:07:13] to not run gmond for projects with no instances
[21:07:16] c?
[21:07:16] python
[21:07:18] oh
[21:07:22] the script is
[21:07:26] no clue what ganglia is written in
[21:07:29] petan, thanks for that. I'll be responsible.
[21:07:32] well, that eats some memory :P
[21:07:36] Cyberpower678 no worries
[21:07:50] the script just runs occasionally
[21:08:03] Cyberpower678 it's just a bot ;)
[21:08:07] ok
[21:08:21] it's the 165 gmond processes that eat the memory
[21:08:27] I could implode it though. :p
[21:08:39] well maybe killing the unused aggregator2 and enlarging the primary instance would do
[21:08:48] no need to kill the second
[21:08:52] I just resized the primary
[21:08:59] petan, admin rights aren't for everyone though.
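Ryan's proposed fix above — don't start a gmond for projects with no instances — might look roughly like this in the Python script he mentions. The actual script isn't shown in this log, so every name and path below is hypothetical.

    import subprocess

    def spawn_gmond(project):
        # hypothetical helper; the per-project config path is invented
        subprocess.Popen(['gmond', '-c', '/etc/ganglia/%s.conf' % project])

    def start_aggregators(projects):
        # projects: dict mapping project name -> list of instance names
        for name, instances in projects.items():
            if not instances:
                # skip empty projects: fewer of the ~165 gmonds, less RAM
                continue
            spawn_gmond(name)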
[21:09:03] well, just to safe some resources :P if it isn't used at all
[21:09:04] I want to see if aggregator2 is even being used
[21:09:08] * save
[21:09:34] seems everything is just pointing at one ost
[21:09:38] *host
[21:09:39] \
[21:09:52] aggregator2 can die
[21:10:14] * Cyberpower678 is finishing his new script for Cyberbot II
[21:10:28] I am killing some boxes in the bots project too; it doesn't need to be large as people are moving to tools
[21:10:35] * Ryan_Lane nods
[21:11:06] also since addshore optimized his bot, usage dropped by 90% :P
[21:11:14] hahaha
[21:11:15] nice
[21:14:10] petan: added you to ganglia project
[21:14:17] ok
[21:14:25] thank you
[21:14:29] yw
[21:14:41] all the stuff that controls it is in ganglia
[21:14:50] production is actually pretty similar to labs now
[21:14:52] that is aggregator1?
[21:15:04] yeah. it's resizing right now
[21:15:07] hopefully successfully :)
[21:15:14] what are the other instances for
[21:15:17] nothing
[21:15:19] I deleted them
[21:15:32] ok
[21:15:40] @labs-project-instances ganglia
[21:15:40] Following instances are in this project: aggregator1,
[21:15:46] they are probably on virt6 so they are taking a while
[21:19:56] Coren if someone sshes to an exec node and starts a process by hand
[21:20:04] is there any memory restriction on it?
[21:26:29] * legoktm feeds petan a plateful of cookies
[21:27:20] * petan eats.
[21:28:36] when connecting to irc.wikimedia.org from tools i get "Too many user connections (global)" but i somehow think that cannot be solved in labs but only within the irc module, but anyhow i try here …
[21:29:16] OHO
[21:29:23] !rb | giftpflanze
[21:29:23] giftpflanze: broken? report a bug: https://bugzilla.wikimedia.org/enter_bug.cgi?product=Wikimedia%20Labs
[21:29:32] this needs to be fixed on the ircd side
[21:29:45] any idea what's even broken?
[21:29:50] yes
[21:29:56] there is a user limit on ircd
[21:29:59] max user count per ip
[21:30:02] ah
[21:30:05] ok
[21:30:09] thx :)
[21:30:11] you can specify an exception for labs
[21:30:20] but for that one needs to have access to the box where ircd runs
[21:30:23] which is not me
[21:38:50] Warning: There is 1 user waiting for shell: Jakub Vrána (waiting 0 minutes)
[21:42:37] !sh
[21:42:38] http://bit.ly/10eZZoa
[22:08:12] Damianz: hello
[22:20:53] Coren: when i use the grid engine from cron it sends me emails that my "job has been submitted". is that intentional? is it to be redirected to /dev/null every time? will there be a special version for cron?
[22:21:41] Hm. I hadn't considered the annoyance of the email; I'll add a -quiet option to tell it to stfu. :-)
[22:21:54] hm, ok
[22:22:06] petan: No, which is why this will be made impossible soon.
[22:22:28] well, how do you want to make it impossible?
[22:23:03] if you restrict access to boxes, people won't be happy :o
[22:23:15] petan, who owns wm-bot?
[22:23:20] what will be impossible?
[22:23:21] Cyberpower678 community?
[22:23:39] Then we can't use it. :(
[22:23:50] petan: Happiness is unimportant on tools vs stability. :-)
[22:23:58] tbh wm-bot is a free bot, nobody owns it XD
[22:24:06] Cyberpower678 use what?
[22:24:11] wm-bot
[22:24:16] use for what
[22:24:27] Coren well, but important for users ;)
[22:24:42] we were considering adding it to #wikipedia-en-accounts, but there are too many security flaws.
[22:24:53] petan: They'll be happy the cluster is rock-solid. :-)
[22:24:53] security flaws in what
[22:25:14] Coren if you think... who knows
[22:25:51] But honestly, petan, why would they even care? There's nothing they can do there they can't do on -login, and plenty they couldn't. :-)
[22:25:51] all these restrictions bring some complexity into the operation of tools
[22:26:10] they can check how the bot behaves
[22:26:20] check exact values of ram it uses, cpu usage, threads etc
[22:26:25] qstat
[22:26:27] :-)
[22:26:32] qstat tells you nothing
[22:26:45] how do you get the number of threads your process is running in qstat?
[22:26:48] or cpu usage?
[22:26:48] It tells you "exact values of ram it uses, cpu usage" :-)
[22:26:57] huh really?
[22:27:00] qacct?
[22:27:00] Maybe threads, I didn't check /that/
[22:27:08] No, qstat, if you specify a job.
[22:27:12] petan, it's a private channel that handles and discusses private information daily. Such as user IP addresses and emails. We couldn't let a bot in that anybody could access.
[22:27:27] some flags to turn on?
[22:27:30] usage 1: cpu=00:00:07, mem=0.69950 GBs, io=0.00665, vmem=107.766M, maxvmem=107.766M
[22:27:39] qstat -j
[22:27:47] Cyberpower678 well, being owned by the community doesn't really mean anyone can access it :P
[22:27:54] oh
[22:28:07] Cyberpower678 they use it in #wikipedia-zh-admin and that is a private channel as well, but it's up to you of course
[22:28:07] Who can access it?
[22:28:17] everything can be hacked ;)
[22:28:27] me, Ryan, Damian... maybe addshore
[22:28:40] Then it can't join
[22:28:42] the list of maintainers is on @help
[22:29:07] poor bot :'(
[22:29:17] They can use it to log into the channel.
[22:30:03] why is that channel so secret?
[22:30:10] some cabal?
[22:30:11] Because the WMF will probably be pissed since they apparently consider ACC data to be as secure as CU/OS data :/
[22:30:11] :P
[22:30:20] or damn close to it
[22:30:38] ok, then probably nothing labs-hosted is appropriate there ;)
[22:30:51] petan, it's a private channel that handles and discusses private information daily. Such as user IP addresses and emails. We couldn't let a bot in that anybody could access.
[22:30:53] you will need your own bot
[22:32:10] this bot is supposed to be open source, maintained by the community of developers, just as almost everything on wikimedia projects IMHO should be; it's not CIA / FBI compatible ;)
[22:33:38] Coren that is cool, didn't know that
[22:34:14] the ACC tool itself is also open sourced
[22:34:21] indeed. please no private data in labs :)
[22:34:24] ok, that probably removes some of the need to be able to ssh; for other stuff it is always possible to create some tools
[22:34:52] FunPika ok but I hope it's not hosted on labs :P
[22:35:06] * FunPika is still wondering how stwalkerster's idea of moving the ACC tool from toolserver to labs, including the database with the private info, is ever going to happen <_<
[22:35:07] you probably shouldn't log that channel at all
[22:35:18] wait, wait, wait
[22:35:27] I'm going to PM you. is that ok?
[22:35:30] I think we were only planning on using it to watch changes of a few related pages on enwiki
[22:35:46] yeah i guess
[22:36:11] FunPika create another channel, like #wikipedia-acc-feed, and put the bot there?
[22:36:25] idk of any more secure solution
[22:37:56] giftpflanze, Coren: cron sends mail? Or do you mean the log file?
[22:38:08] petan, we just stuck it in #wikipedia-en-accounts-unreg and #wikipedia-en-accounts-devs
[22:38:14] cron does send mail
[22:38:17] ok
[22:38:18] which log file?
[22:48:29] scfc_de: By default, cron dumps any stderr or stdout to local mail.
[22:49:04] Coren: http://tools.wmflabs.org/flask-stub
[22:49:32] greenrosetta: Ooo. Nice!
[22:50:10] click about... brings you to the project page
[22:50:50] I was thinking it might be nice to have an admin be able to run this with "Foo" and then it would create a "Foo" project based off this
[22:50:51] heh. bitbucket?
[22:51:08] beats SSH
[22:51:10] we really need to get something set up in gerrit
[22:51:20] so that tool authors can create their own repos
[22:51:25] i just like working locally and want a few keypresses to get it online
[22:52:31] greenrosetta: It's really great that you made a skeleton like this available. It's going to be very instructive for newbies.
[22:52:45] that's the point
[22:53:11] * Ryan_Lane sighs
[22:53:13] I was thinking this might be a good template where people could write their own specific query tool and just add a new page
[22:53:15] I killed aggregator1
[22:53:31] we really need to upgrade openstack
[22:53:34] resize is totally unreliable in this version
[22:53:54] Have you penciled in an upgrade time?
[22:54:06] no. I need to do some testing in eqiad first
[22:54:20] and I need to update puppet
[22:54:34] Coren: I know, but mail is disabled on Tools, isn't it?
[22:54:40] was actually planning on doing the puppet stuff today
[22:55:02] scfc_de: Mail /forwarding/ is, atm, but local mail works.
[22:55:38] Coren: "echo Test | mail -s Test scfc", "mail" => "No mail for scfc".
[22:55:39] scfc_de: Legal has okayed getting some domain name to dedicate to the Tool Labs, though, so we'll get proper mail soon.
[22:56:10] scfc_de: I think only system processes can deliver atm. We don't have an actual MX installed. :-)
[22:57:03] hm. maybe I can fix aggregator1
[23:00:35] Coren: "echo '* * * * * /bin/echo test' | crontab" doesn't generate mail either. giftpflanze: Do you use those qsub parameters that send mail when an SGE job is started/finished/etc.?
[23:01:49] scfc_de: Huh. I hadn't actually tested it and just took guftplanze at his word. :-)
[23:01:55] /var/mail is empty as well, at least on tools-login.
[23:02:14] live damn you LIIIIIIIIVVVEEEEE
[23:02:24] scfc_de: Which is reasonable since I didn't install any MX. :-)
[23:02:25] * Ryan_Lane gives aggregator1 chest compressions
[23:02:48] well, it's pinging
[23:03:12] Coren: g*i*ftpflanze ("poisonous plant"), and I believe he is a she :-).
[23:03:15] heh. well, the database thinks it's been resized
[23:03:19] it, of course, has not
[23:05:46] well, I'll fix that :)
[23:06:05] * Ryan_Lane edits the files manually and undefines/redefines the instance
[23:06:57] and now it has 8GB :)
[23:11:32] [19:03:37] Yeah, the 'u' was a typo. Didn't know about gender though. :-)
[23:11:32] [19:05:16] Ryan_Lane: Yeay! Celebrate by giving https://gerrit.wikimedia.org/r/#/c/59969/ a +2! :-P
[23:12:41] hm
[23:18:11] Coren: I'll take a look when I get a chance :)
[23:37:37] \o/
[23:37:43] ganglia.wmflabs.org is back up
[23:37:47] and shouldn't die any more :)
[23:39:26] Up and not dying is good.