[00:46:41] [bz] (NEW - created by: Damian Z, priority: Unprioritized - normal) [Bug 40943] Fix the instance types - https://bugzilla.wikimedia.org/show_bug.cgi?id=40943
[00:47:40] that's a lot of color
[01:05:53] PROBLEM Total processes is now: WARNING on bots-salebot i-00000457.pmtpa.wmflabs output: PROCS WARNING: 172 processes
[01:10:52] RECOVERY Total processes is now: OK on bots-salebot i-00000457.pmtpa.wmflabs output: PROCS OK: 94 processes
[01:34:32] RECOVERY Total processes is now: OK on nova-precise1 i-00000236.pmtpa.wmflabs output: PROCS OK: 146 processes
[02:21:40] !project webplatform
[02:21:40] https://labsconsole.wikimedia.org/wiki/Nova_Resource:webplatform
[02:37:42] RECOVERY Free ram is now: OK on bots-sql2 i-000000af.pmtpa.wmflabs output: OK: 21% free memory
[03:10:43] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af.pmtpa.wmflabs output: Warning: 15% free memory
[03:33:52] PROBLEM Free ram is now: WARNING on ipv6test1 i-00000282.pmtpa.wmflabs output: Warning: 19% free memory
[03:38:53] RECOVERY Free ram is now: OK on ipv6test1 i-00000282.pmtpa.wmflabs output: OK: 22% free memory
[03:49:13] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f.pmtpa.wmflabs output: Warning: 14% free memory
[04:09:12] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f.pmtpa.wmflabs output: Critical: 4% free memory
[04:19:12] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f.pmtpa.wmflabs output: OK: 95% free memory
[04:59:12] RECOVERY Disk Space is now: OK on testing-arky i-0000033b.pmtpa.wmflabs output: DISK OK
[05:04:53] PROBLEM Disk Space is now: WARNING on echo-xmpp i-00000351.pmtpa.wmflabs output: DISK WARNING - free space: / 567 MB (5% inode=91%):
[05:06:53] PROBLEM Free ram is now: WARNING on ipv6test1 i-00000282.pmtpa.wmflabs output: Warning: 18% free memory
[05:07:13] PROBLEM Disk Space is now: WARNING on testing-arky i-0000033b.pmtpa.wmflabs output: DISK WARNING - free space: / 73 MB (5% inode=51%):
[05:39:53] PROBLEM Free ram is now: WARNING on ipv6test1 i-00000282.pmtpa.wmflabs output: Warning: 18% free memory
[06:44:52] RECOVERY Disk Space is now: OK on echo-xmpp i-00000351.pmtpa.wmflabs output: DISK OK
[06:52:13] RECOVERY Disk Space is now: OK on testing-arky i-0000033b.pmtpa.wmflabs output: DISK OK
[06:54:13] RECOVERY Disk Space is now: OK on conventionextension-trial i-000003bf.pmtpa.wmflabs output: DISK OK
[11:50:23] Is it possible to create a new project named "Wikisource" in order to migrate some Toolserver tools that are used by Wikisources like https://toolserver.org/~tpt/wsexport/book.php and https://toolserver.org/~phe/ ?
[12:10:47] Hi Hydriz
[12:14:29] hi Jan_Luca
[12:16:55] Hydriz: You do not know how to query Gerrit with a script without a user, do you?
[12:17:18] whua?
[12:18:02] I want to query Gerrit in a script and the only thing I found is: https://gerrit.googlecode.com/svn/documentation/2.1.5/cmd-query.html
[12:18:22] oh, yeah, I don't know how to query Gerrit, sorry :(
[12:20:36] Then I have to ask ^demon or sumanah to create a Gerrit user for the CentralAuth-project ...
[12:25:31] Jan_Luca: What do you need to query?
[12:32:13] PROBLEM Disk Space is now: WARNING on conventionextension-trial i-000003bf.pmtpa.wmflabs output: DISK WARNING - free space: / 78 MB (5% inode=51%):
[13:04:33] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 189 processes
[13:09:32] RECOVERY Total processes is now: OK on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS OK: 96 processes
[13:36:13] PROBLEM Disk Space is now: WARNING on testing-arky i-0000033b.pmtpa.wmflabs output: DISK WARNING - free space: / 78 MB (5% inode=51%):
[14:22:42] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5.pmtpa.wmflabs output: Warning: 14% free memory
[14:27:42] RECOVERY Free ram is now: OK on bots-3 i-000000e5.pmtpa.wmflabs output: OK: 40% free memory
[14:41:03] PROBLEM Total processes is now: CRITICAL on dumps-bot2 i-000003f4.pmtpa.wmflabs output: CHECK_NRPE: Socket timeout after 10 seconds.
[14:41:53] RECOVERY Free ram is now: OK on dumps-bot1 i-000003ed.pmtpa.wmflabs output: OK: 3995% free memory
[14:43:53] RECOVERY Current Load is now: OK on dumps-bot2 i-000003f4.pmtpa.wmflabs output: OK - load average: 0.11, 0.10, 0.05
[14:45:52] RECOVERY Total processes is now: OK on dumps-bot2 i-000003f4.pmtpa.wmflabs output: PROCS OK: 99 processes
[14:49:42] Change on mediawiki a page Developer access was modified, changed by Sharihareswara (WMF) link https://www.mediawiki.org/w/index.php?diff=593543 edit summary: /* User:Nad */ done
[14:51:16] 10/14/2012 - 14:51:15 - Creating a home directory for nad at /export/keys/nad
[14:52:17] Change on mediawiki a page Developer access was modified, changed by Sharihareswara (WMF) link https://www.mediawiki.org/w/index.php?diff=593549 edit summary: /* User:Chuscade */
[14:53:06] Change on mediawiki a page Developer access was modified, changed by Sharihareswara (WMF) link https://www.mediawiki.org/w/index.php?diff=593550 edit summary: /* User:Hendrik Brummermann */
[14:53:40] Change on mediawiki a page Developer access was modified, changed by Sharihareswara (WMF) link https://www.mediawiki.org/w/index.php?diff=593551 edit summary: /* User:Nx */
[14:53:53] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[14:56:13] 10/14/2012 - 14:56:13 - Updating keys for nad at /export/keys/nad
[15:07:32] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 189 processes
[15:24:12] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[15:27:32] RECOVERY Total processes is now: OK on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS OK: 95 processes
[15:50:57] Is there any process active that is heavily affecting the (how do I say this ...) efficiency of bots-2
[15:52:17] Probably for the last 3 hours and 20 minutes or so
[15:52:56] (linkwatcher.pl started suddenly to build up a backlog, first backup on 12:32 today)
[15:54:12] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[15:57:40] linkwatcher is using the most atm, it might be the sql server is slowish
[15:58:45] Did not see anything strange on bots-sql2 either
[15:59:30] No, it can not be mysql, the processes that have the backlog do not use the mysql ..
[15:59:58] it is the parserqueue that overloads, not the analyserqueue
[16:03:26] Or Wikipedia suddenly gets many more edits to chew somewhere, over rate of edits seems to slowly increase
[16:04:02] hmm
[16:04:47] Ganglia is showing some spikes of io wait, most of everything else seems to be network traffic
[16:05:00] yesterday the average speed of edits to parse was 520 per minute, now it is 531 per minute
[16:05:36] (overall average over a runtime of 5 days 8 hours now)
[16:06:33] average went up 2 over the last 40 minutes ...
[16:06:51] On a Sunday ..
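The rates quoted above are averages over the whole run, so a rise of 2 in 40 minutes implies a much higher rate during those 40 minutes. A rough sketch of that arithmetic, assuming the average is simply total edits divided by total runtime; the function name is made up for illustration, and the starting value of 529 is inferred as 531 minus 2 rather than stated in the log:

```python
def recent_rate(avg_before, avg_after, runtime_min, window_min):
    """Rate during the last window_min minutes implied by a change in a
    cumulative average (assumed to be total edits / total runtime)."""
    total_after = avg_after * runtime_min
    total_before = avg_before * (runtime_min - window_min)
    return (total_after - total_before) / window_min


# 5 days 8 hours of runtime, expressed in minutes
runtime = (5 * 24 + 8) * 60

# Average assumed to have moved from 529 to 531 over the last 40 minutes
rate = recent_rate(avg_before=529, avg_after=531,
                   runtime_min=runtime, window_min=40)
print(rate)  # edits per minute during the last 40 minutes
```

Under these assumptions the recent rate comes out far above the long-run average, which would be consistent with a parser queue that suddenly starts backing up.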
[16:21:26] I added 5 extra parsers .. queue is slowly slowly going down
[16:24:12] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[16:31:13] 10/14/2012 - 16:31:12 - Updating keys for nad at /export/keys/nad
[16:45:33] linkwatcher starts to eat backups again .. no clue what changed 3 hours ago
[16:45:47] Ryan_Lane: Is it possible to get a Gerrit user for a Labs-project so the Gerrit query command can be used?
[16:45:57] hmmm
[16:46:13] Jan_Luca: You could use the key puppetmaster::self uses
[16:46:24] yeah. could.
[16:46:43] Why puppetmaster::self?
[16:46:47] when open registration is open this is easier, since you could just create your own user
[16:46:49] * Damianz thinks he should finish bots stuff and get it merged before re-doing sql servers
[16:48:37] The problem is that Gerrit query (http://gerrit.googlecode.com/svn/documentation/2.1.5/cmd-query.html) only works with ssh and key ...
[16:49:04] I wish it just had a REST interface or such, ssh seems so silly for scripting stuff
[16:51:48] There should be an interface but there is no documentation...
[16:54:12] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[16:54:40] Do you mean the key from https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=blob;f=manifests/puppetmaster.pp;h=5c64428e1b597b82c6e5717b55bc6af8e7489189;hb=refs/heads/production
[16:58:19] labs-puppet-key yes
[16:58:31] Though imo we should kill that with fire and use https clones
[16:58:37] problem is the private repo is screwed permissions wise
[16:59:52] Ok build a new instance with the puppetmaster::self role ...
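The `gerrit query` command discussed above runs over SSH with a registered key. A minimal sketch of scripting it from Python, assuming an account whose key is already available to `ssh`; the user name, host, and query string in the usage lines are placeholders, not values from this log. 29418 is Gerrit's default SSH port, and `--format=JSON` prints one JSON object per line followed by a stats row:

```python
import json
import subprocess


def build_query_cmd(user, host, query, port=29418):
    """Build the ssh argv for a 'gerrit query' call (nothing is executed here)."""
    return ["ssh", "-p", str(port), f"{user}@{host}",
            "gerrit", "query", "--format=JSON", query]


def parse_query_output(text):
    """Parse one-JSON-object-per-line output, dropping the trailing stats row."""
    rows = [json.loads(line) for line in text.splitlines() if line.strip()]
    return [row for row in rows if row.get("type") != "stats"]


def run_query(user, host, query):
    """Run the query over ssh; needs network access and a registered key."""
    result = subprocess.run(build_query_cmd(user, host, query),
                            capture_output=True, text=True, check=True)
    return parse_query_output(result.stdout)


# Placeholder values for illustration only
cmd = build_query_cmd("labs-bot", "gerrit.example.org", "status:open")
sample = '{"project":"demo","number":"1"}\n{"type":"stats","rowCount":1}\n'
changes = parse_query_output(sample)
```

Building the command and parsing the output are kept separate from `run_query` so that both can be checked without a Gerrit server. Queries containing quotes would need extra escaping, since the remote side re-parses the command line.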
[17:03:52] PROBLEM Current Load is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused by host
[17:04:32] PROBLEM Current Users is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused by host
[17:05:12] PROBLEM Disk Space is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused by host
[17:06:02] PROBLEM Free ram is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused or timed out
[17:07:32] PROBLEM Total processes is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused or timed out
[17:07:52] PROBLEM dpkg-check is now: CRITICAL on centralauth-puppet i-000004c2.pmtpa.wmflabs output: Connection refused or timed out
[17:09:34] labs-nagios-wm: Silence, I know I have selected tiny -.-
[17:09:46] lol
[17:09:56] we need to kill tiny, I wonder if there's a bug for that
[17:11:50] Damianz: I tried :(
[17:11:55] there's instances using it
[17:12:07] do they actually work?
[17:12:08] even marking it as deleted causes 500 errors in queries
[17:12:11] no clue
[17:12:20] last time I tried puppet oomed out every run
[17:12:24] yep
[17:12:34] Hmm, we could filter it out of the interface on the select I guess
[17:12:37] Kinda lame but meh
[17:12:38] yeah
[17:12:40] very lame
[17:12:45] openstack needs a way to handle this
[17:12:53] maybe a "mark as hidden" option?
[17:13:02] or hell, even metadata would help here
[17:13:02] Or change the default
[17:13:08] but it isn't available through the api
[17:13:24] Jan_Luca: if I did that I may as well just filter it
[17:13:32] I'd need to make interface changes anyway
[17:13:35] It would be nice to be able to add metadata so we could put notes like 'DON'T USE THIS' on the form in a sane way
[17:13:36] ok
[17:13:41] boarding time for my flight
[17:13:48] :o have fun
[17:13:53] PROBLEM Current Load is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused by host
[17:14:03] I think this flight has wifi :)
[17:14:07] :o
[17:14:07] * Ryan_Lane waves
[17:14:29] My flight didn't have wifi, was running to the departure gate with my laptop open finishing up a conversation lol
[17:14:33] PROBLEM Current Users is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused by host
[17:15:12] PROBLEM Disk Space is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused by host
[17:15:52] PROBLEM Free ram is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused by host
[17:17:32] PROBLEM Total processes is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused or timed out
[17:18:12] PROBLEM dpkg-check is now: CRITICAL on centralauth-puppetmaster i-000004c3.pmtpa.wmflabs output: Connection refused or timed out
[17:24:12] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[17:24:52] PROBLEM host: i-000004c4.pmtpa.wmflabs is DOWN address: i-000004c4.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000004c4.pmtpa.wmflabs)
[17:28:53] PROBLEM Current Load is now: CRITICAL on centralauth-test123 i-000004c5.pmtpa.wmflabs output: Connection refused by host
[17:29:33] PROBLEM Current Users is now: CRITICAL on centralauth-test123 i-000004c5.pmtpa.wmflabs output: Connection refused by host
[17:30:12] PROBLEM Disk Space is now: CRITICAL on centralauth-test123 i-000004c5.pmtpa.wmflabs output: Connection refused by host
[17:30:52] PROBLEM Free ram is now: CRITICAL on centralauth-test123 i-000004c5.pmtpa.wmflabs output: Connection refused by host
[17:33:52] RECOVERY Current Load is now: OK on centralauth-test123 i-000004c5.pmtpa.wmflabs output: OK - load average: 0.41, 0.79, 0.47
[17:34:32] RECOVERY Current Users is now: OK on centralauth-test123 i-000004c5.pmtpa.wmflabs output: USERS OK - 0 users currently logged in
[17:35:12] RECOVERY Disk Space is now: OK on centralauth-test123 i-000004c5.pmtpa.wmflabs output: DISK OK
[17:35:52] RECOVERY Free ram is now: OK on centralauth-test123 i-000004c5.pmtpa.wmflabs output: OK: 4943% free memory
[17:36:24] I wish if an email is sent to 2 lists that you're a member of mailman just sent out 1 copy :(
[17:54:13] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[18:24:13] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[18:54:22] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[19:05:32] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 190 processes
[19:23:08] Change on mediawiki a page Developer access was modified, changed by Jussi24 link https://www.mediawiki.org/w/index.php?diff=593578 edit summary:
[19:24:23] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[19:42:55] Anyone have requests for software on bots instances? (before I start migrating them next week)
[19:44:11] hi there Damianz
[19:44:43] Damianz: (you may already be doing this) could you email the bots mailing list to ask?
[19:44:52] Hi Sharihareswara.
[19:45:01] ha!
[19:45:51] You mean labs-l? I will be doing at some point before I try and get these manifests merged. The main instances that need doing are sql stuff so not a huge issue package wise, application instances will matter more as we move into restricted 'production' ones where stuff is expected to be puppetized.
[19:46:47] Damianz: no, I mean the bots mailing list
[19:47:00] There's a bots mailing list?
[19:47:25] Damianz: https://lists.wikimedia.org/mailman/listinfo/wikibots-l
[19:47:41] very low-traffic but there
[19:48:09] never knew that existed
[19:48:12] [bz] (NEW - created by: Robert Hanke, priority: Unprioritized - normal) [Bug 41023] Make a stats table for 85 W3C wikis - https://bugzilla.wikimedia.org/show_bug.cgi?id=41023
[19:48:31] Damianz: yeah, it's unfortunately underpublicized and underused
[19:54:42] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[19:57:47] indeed, I don't think it has got more than a few mails in year
[19:57:53] *years
[20:00:33] PROBLEM Total processes is now: CRITICAL on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS CRITICAL: 283 processes
[20:09:20] https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-prn1/s320x320/525643_3630714859313_440093177_n.jpg lol so true
[20:09:44] Does anyone know what version http://deployment.wikimedia.beta.wmflabs.org/wiki/Special:Version is at?
[20:11:52] Krenair 8451240ac04c6d7001dc4f4ea3e575d49f7abe47
[20:13:12] hmm
[20:13:42] I swear I've asked for the git ref to be in the version before and then it was and now it's not....
[20:14:04] yes, it's odd
[20:14:12] you can see how it appears in http://en.wikipedia.org/wiki/Special:Version
[20:14:18] Platonides, that seems to include ContentHandler, what about Sites and High-res image support?
[20:16:12] RECOVERY Disk Space is now: OK on testing-arky i-0000033b.pmtpa.wmflabs output: DISK OK
[20:18:05] I really hate captchas, such a bleh on ux
[20:18:45] Hmm yep it's all there
[20:19:30] Damianz, jonglaur was noting a few minutes ago about so many accounts being registered on toolserver wiki
[20:19:42] > 500 in this month
[20:20:10] I didn't think the toolserver required a captcha tbh but it's been so long since I reg'd my account heh
[20:20:39] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 190 processes
[20:21:25] Imo Captchas should be removed and replaced with something that takes into account your browsing pattern and if there is something not normal in your workflow then it should give you puzzles to complete.
[20:21:57] I used to like reCaptcha, but then it gives you stuff that's impossible to type out on an English keyboard and it's a real urgh in ux on forms
[20:23:48] Damianz: what do you mean by requests for software, what migration?
[20:24:13] PROBLEM Disk Space is now: WARNING on testing-arky i-0000033b.pmtpa.wmflabs output: DISK WARNING - free space: / 78 MB (5% inode=51%):
[20:24:19] Damianz, the spam bots would first visit a random wikipedia page, then click edit, and go to create an account from there
[20:24:35] they would then edit that page
[20:24:43] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[20:24:50] (plz see this web for v1agr4)
[20:25:06] giftpflanze: common packages used (bot specific stuff should be puppetized separately), migration of sql servers to bigger hardware to fix our oom/ram/performance issues
[20:25:50] That sounds like the bots that were hitting beta until it got fancy captcha or w/e the ext is called
[20:26:19] do i have to worry about the packages that i have installed?
[20:27:09] not right now, at some point they should be puppetized as needed
[20:55:52] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[21:00:32] PROBLEM Total processes is now: CRITICAL on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS CRITICAL: 283 processes
[21:25:53] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[21:30:32] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 189 processes
[21:57:02] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[22:00:32] PROBLEM Total processes is now: CRITICAL on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS CRITICAL: 283 processes
[22:06:12] PROBLEM Disk Space is now: CRITICAL on deployment-apache32 i-0000031a.pmtpa.wmflabs output: DISK CRITICAL - free space: / 271 MB (2% inode=78%):
[22:11:13] PROBLEM Disk Space is now: WARNING on deployment-apache32 i-0000031a.pmtpa.wmflabs output: DISK WARNING - free space: / 308 MB (3% inode=78%):
[22:27:02] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[22:30:33] PROBLEM Total processes is now: WARNING on wikistats-01 i-00000042.pmtpa.wmflabs output: PROCS WARNING: 190 processes
[22:57:02] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[23:29:13] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)
[23:35:57] 10/14/2012 - 23:35:57 - User nemobis may have been modified in LDAP or locally, updating key in project(s): dumps
[23:36:12] 10/14/2012 - 23:36:12 - Updating keys for nemobis at /export/keys/nemobis
[23:40:58] 10/14/2012 - 23:40:57 - User nemobis may have been modified in LDAP or locally, updating key in project(s): dumps
[23:41:14] 10/14/2012 - 23:41:13 - Updating keys for nemobis at /export/keys/nemobis
[23:45:34] Change on mediawiki a page Wikimedia Labs/Toolserver features wanted in Tool Labs was modified, changed by Nemo bis link https://www.mediawiki.org/w/index.php?diff=593621 edit summary: +imagemagick
[23:49:12] RECOVERY Disk Space is now: OK on testing-arky i-0000033b.pmtpa.wmflabs output: DISK OK
[23:57:12] PROBLEM Disk Space is now: WARNING on testing-arky i-0000033b.pmtpa.wmflabs output: DISK WARNING - free space: / 78 MB (5% inode=51%):
[23:59:22] PROBLEM host: i-000003ef.pmtpa.wmflabs is DOWN address: i-000003ef.pmtpa.wmflabs CRITICAL - Host Unreachable (i-000003ef.pmtpa.wmflabs)