[00:00:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [00:00:27] Ryan_Lane, Wikinaut: works for me, now. [00:00:34] after reboot? [00:00:48] yep [00:00:50] yeah. it does now [00:02:32] RECOVERY Current Load is now: OK on etherpad-lite.pmtpa.wmflabs 10.4.0.87 output: OK - load average: 0.02, 0.04, 0.02 [00:03:50] ^confirmed ! [00:03:54] ty [00:04:03] I'm in [00:04:22] What shall we do without Ryan ? [00:04:39] we all would be lost [00:04:40] ;-) [00:04:52] PROBLEM Free ram is now: WARNING on aggregator1.pmtpa.wmflabs 10.4.0.79 output: Warning: 19% free memory [00:05:41] Ryan_Lane: re E:OpenID, i work together with Tyler [00:05:57] great [00:06:01] it's good to work in the same wiki [00:06:11] the wiki itself should pull from git [00:06:24] you mean core ? [00:06:25] so, you push changes into gerrit, then you pull them onto the wiki to test [00:06:31] ? [00:06:32] well, into your extension [00:06:40] oh my god [00:06:46] there's gerrit links to pull down the individual change [00:06:52] review is not installed yet [00:07:03] you don't necessarily need review [00:07:04] I hoped that is done by puppet [00:07:13] I will stick to my workflow [00:07:13] well, this is a manual process [00:07:17] puppet can't do this for you [00:07:18] Gt/Tutorial [00:07:30] that's fine [00:07:37] so I need to do this with the small python script ? [00:07:48] however you want to work is fine [00:07:58] I was suggesting an alternative that works with multiple people [00:08:02] Hm… I was thinking about this the other day. Ryan_Lane, what do you think about adding git-review to the standard package set in labs? 
[00:08:10] andrewbogott, there's only one business day remaining this week, let's just wait for Antoine [00:08:11] it's a good idea [00:08:18] there's only one thing that worries me [00:08:18] MaxSem: Fair enough [00:08:20] sudo easy_install pip [00:08:22] $ sudo pip install git-review [00:08:28] I don't really want people forwarding their agent past bastion [00:08:34] I've considered disabling it, in fact [00:09:00] Wikinaut: apt-get install git-review [00:09:09] andrewbogott: +1 [00:09:09] no need to use pip/easy_install [00:09:17] for adding git-review to the standard set [00:09:28] Ryan_Lane, what process would you use instead of forwarding? Keeping a different private key on the instance? [00:09:40] ^good question [00:09:41] Whoah, underlining! [00:09:56] what's wrong with forwarding? [00:12:04] sudo apt-get install python-pip ; sudo pip install git-review [00:12:25] Wikinaut: I think that forwarding to an untrusted host runs the risk of your key being compromised. Although I'm not sure how exactly. [00:12:55] but it's behind the bastion... [00:13:12] We must have the fingerprints of the instance servers [00:13:40] when you first connect, PuTTY or WinSCP show the fingerprints [00:14:05] and you have to compare ... with a trusted version of the fingerprint... [00:14:12] something different [00:14:15] Exception: Could not connect to gerrit at ssh://wikinaut@gerrit.wikimedia.org:29418/mediawiki/extensions/OpenID.git [00:14:44] from the instance, I cannot "git review -d ..." [00:15:11] that's another blocker. [00:15:37] root@openid-wiki:/srv/mediawiki/extensions/OpenID# git review -d Ie162fd53 [00:15:37] The authenticity of host '[gerrit.wikimedia.org]:29418 ([208.80.154.152]:29418)' can't be established. [00:15:38] RSA key fingerprint is dc:e9:68:7b:99:1b:27:d0:f9:fd:ce:6a:2e:bf:92:e1. [00:15:40] Are you sure you want to continue connecting (yes/no)? yes [00:15:41] Could not connect to gerrit. 
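An aside for readers following this exchange: the alternative to agent forwarding that Ryan alludes to is an SSH ProxyCommand through the bastion, which keeps the private key and agent socket on the local machine. A minimal sketch of the client config — the hostnames and the username placeholder are illustrative; the canonical version lives on the labsconsole Access page linked later in the log:

```text
# ~/.ssh/config (illustrative sketch, not the canonical labs config)
Host bastion.wmflabs.org
    User <your-shell-username>

Host *.pmtpa.wmflabs
    User <your-shell-username>
    # -a disables agent forwarding; -W tunnels stdio to the target host
    # through the bastion, so the key never leaves the local machine
    ProxyCommand ssh -a -W %h:%p bastion.wmflabs.org
```

(`ssh -W` needs OpenSSH 5.4 or later; older guides achieve the same with `nc %h %p` in the ProxyCommand.)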
[00:15:42] Enter your gerrit username: wikinaut [00:15:44] Trying again with ssh://wikinaut@gerrit.wikimedia.org:29418/mediawiki/extensions/OpenID.git [00:15:45] [00:15:47] We don't know where your gerrit is. Please manually create [00:15:49] a remote named "gerrit" and try again. [00:15:50] Traceback (most recent call last): [00:15:51] File "/usr/local/bin/git-review", line 863, in [00:15:53] main() [00:15:55] File "/usr/local/bin/git-review", line 798, in main [00:15:57] config['hostname'], config['port'], config['project']) [00:15:59] File "/usr/local/bin/git-review", line 389, in check_remote [00:16:00] add_remote(hostname, port, project, remote) [00:16:00] Wikinaut: use pastebin! [00:16:02] File "/usr/local/bin/git-review", line 250, in add_remote [00:16:03] raise Exception("Could not connect to gerrit at %s" % remote_url) [00:16:05] Exception: Could not connect to gerrit at ssh://wikinaut@gerrit.wikimedia.org:29418/mediawiki/extensions/OpenID.git [00:16:06] root@openid-wiki:/srv/mediawiki/extensions/OpenID# Exception: Could not connect to gerrit at ssh://wikinaut@gerrit.wikimedia.org:29418/mediawiki/extensions/OpenID.git [00:16:08] Exception:: command not found [00:16:09] root@openid-wiki:/srv/mediawiki/extensions/OpenID# [00:16:11] wanted yes [00:16:14] sorry [00:16:24] andrewbogott: proxycommand [00:16:28] and I see I have to set up the gerrit ermote [00:16:34] andrewbogott: but also, no pushes into gerrit from inside of labs [00:16:39] remote [00:16:42] I push from my local system [00:16:53] push into gerrit [00:16:55] and pull to the instance [00:17:02] or push directly to my instance [00:17:08] listing it as a remote [00:17:10] oh :( I like being able to develop on an instance. [00:17:20] you can do so by listing it as a remote [00:17:22] an pushing to it [00:17:26] *and [00:17:29] or... 
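The traceback above ends with git-review asking for a remote named "gerrit" to be created manually. Purely as a sketch — using the repo path and username from the log, and done here in a scratch repo rather than /srv/mediawiki/extensions/OpenID — the manual step looks like this (note the real blocker in the log was SSH connectivity from the instance, which adding a remote does not fix):

```shell
# Recreate, in a scratch repo, the remote that git-review failed to add.
scratch=$(mktemp -d)
cd "$scratch" && git init -q demo && cd demo
git remote add gerrit ssh://wikinaut@gerrit.wikimedia.org:29418/mediawiki/extensions/OpenID.git
git remote -v   # once SSH to gerrit works, 'git review -d Ie162fd53' can use this
```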
[00:17:34] develop on the instance [00:17:36] and pull from it [00:17:38] then push into gerrit [00:17:49] I've been developing on my instance since last week [00:18:03] but pushing into gerrit from labs is dangerous [00:18:05] Ryan_Lane: please, can you send me the remote definition [00:18:21] pls [00:18:46] git remote set-url labs openid-wiki.pmpta.wmflabs:/srv/mediawiki/extensions/OpenID [00:18:52] git pull labs [00:18:55] git push labs [00:20:34] that requires proxycommand to be configured [00:20:38] !access [00:20:38] https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [00:20:46] configuration for that is listed there [00:20:56] Ryan_Lane: It also presumes that the 'local machine' is something that can comfortably run a git client. [00:21:08] andrewbogott: yes [00:21:14] which in the case of windows isn't true [00:21:19] Which I guess is always possible, but… I don't even have git installed on my local machine at the moment [00:21:34] yeah [00:21:48] well, there's also the possibility of using bastion as your git location [00:21:59] and pushing/pulling from there [00:22:06] No such remote "labs" [00:22:09] uh [00:22:09] Yeah… that's much easier. [00:22:51] Anyway, if no one is supposed to push from an instance, that argues against having git-review installed :) [00:23:25] oh [00:23:26] whoops [00:23:32] I think you needed this: [00:23:41] git remote add labs openid-wiki.pmpta.wmflabs:/srv/mediawiki/extensions/OpenID [00:23:48] yep found it [00:24:00] because it wasn't yet in [00:24:03] yep [00:24:11] because I added a local git repo [00:24:20] which was my intention last week [00:25:52] ssh: Could not resolve hostname openid-wiki.pmpta.wmflabs: Name or service not known fatal: The remote end hung up unexpectedly [00:25:58] yeah [00:26:01] oh [00:26:08] pmtpa [00:26:11] not pmptpa [00:26:18] i see [00:26:24] heh. hard to type [00:26:48] btw, what does it stand for? 
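Ryan's "instance as a plain git remote" workflow above can be exercised entirely locally. In this sketch a bare repository stands in for openid-wiki.pmtpa.wmflabs:/srv/mediawiki/extensions/OpenID — in the real setup the remote URL is that SSH path, reached via the ProxyCommand config from the Access page:

```shell
# Local stand-in for the instance-as-a-remote workflow from the log.
work=$(mktemp -d)
git init -q --bare "$work/instance.git"     # stands in for the labs instance
git init -q "$work/local"
cd "$work/local"
git config user.email dev@example.org       # throwaway identity for the demo
git config user.name dev
echo demo > README
git add README && git commit -qm 'initial commit'
branch=$(git rev-parse --abbrev-ref HEAD)
git remote add labs "$work/instance.git"    # real case: the ssh path above
git push -q labs "$branch"                  # push local work onto the instance
git pull -q labs "$branch"                  # pull instance-side edits back
```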
[00:27:22] s/pmtpa/somethingmeaningfulandeasytoremember/ [00:29:00] It's the name of a data center… 'tpa' is Tampa, and 'pm' is… the name of the hosting company. [00:29:04] So, not very memorable. [00:29:07] ah [00:29:09] ty [00:29:19] and moved ... [00:29:35] yeah. it's annoying [00:30:10] git pull labs fails with Permission denied (publickey). [00:30:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [00:30:58] are you using an agent locally? [00:31:13] you want to use an agent, you just don't want to forward it [00:33:57] I re-connected with the PuTTY "forwarding" setting deactivated. Same error, sorry to bother you with this stupid problem [00:34:27] "Permission denied (publickey). fatal: The remote end hung up unexpectedly" [00:34:43] ah. right. windows. [00:34:44] heh [00:35:03] it's hard for me to troubleshoot this [00:35:11] I understand [00:35:25] and will now go to bed .. (01:35 local time Berlin) [00:35:33] * Ryan_Lane waves [00:35:42] thanks for the assistance so far [00:35:45] yw [00:35:58] git review -d and git pull labs fail [00:36:21] I will report tomorrow or so with some lines on pastebin [00:36:30] (trying to be concise) [00:36:34] GN8 [00:36:38] !log webtools Added Fox Wilson [00:36:40] Logged the message, Master [00:38:23] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 30% free memory [00:38:43] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 20% free memory [00:39:23] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 20% free memory [00:39:53] RECOVERY Free ram is now: OK on aggregator1.pmtpa.wmflabs 10.4.0.79 output: OK: 20% free memory [00:40:44] virtio, even without being able to saturate the instance's NIC, is 4x faster [00:41:06] Ryan_Lane: while we are at it: http://dpaste.org/n0MPX/ [00:41:42] PROBLEM Current Load is now: WARNING on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: WARNING - 
load average: 6.82, 6.58, 5.49 [00:41:50] ah. right [00:41:55] it's not going to work with ssh [00:42:03] uh [00:42:07] https:// [00:42:09] ? [00:42:14] oops [00:42:15] that will work for pulls [00:42:17] meant git [00:42:23] git:// will not work [00:42:32] I leave that problem to _you_ now [00:42:39] I think, I did nothing wrong [00:43:16] you can perhaps tell me here https://labsconsole.wikimedia.org/wiki/User:Wikinaut what I can do better [00:46:23] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 17% free memory [00:46:43] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 15% free memory [00:52:23] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 15% free memory [00:58:22] PROBLEM Disk Space is now: CRITICAL on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK CRITICAL - free space: / 280 MB (2% inode=84%): [01:00:24] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [01:07:43] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 171 processes [01:10:30] andrewbogott: did you do the sudo fix deployment? [01:10:46] I did -- is it working as you'd expect? [01:10:56] oh. haven't tried [01:11:25] Is it me or is labsconsole-testing broken at the moment? [01:11:42] nova-precise2? 
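On the https vs. git:// point at the top of this exchange: from inside labs the usual fix is to repoint the remote at gerrit's anonymous-http URL, which works for read-only pulls where the git:// port is blocked. The `/r/p/` path below is gerrit's customary anonymous-clone prefix at the time of this log — treat it as an assumption. The set-url mechanics in a scratch repo:

```shell
# Demonstrate swapping a git:// remote (blocked from labs) for https.
tmp=$(mktemp -d)
cd "$tmp" && git init -q .
git remote add origin git://gerrit.wikimedia.org/mediawiki/extensions/OpenID.git
git remote set-url origin https://gerrit.wikimedia.org/r/p/mediawiki/extensions/OpenID.git
git config --get remote.origin.url   # now the https URL; pulls work read-only
```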
[01:11:45] yup [01:11:49] I hope not [01:11:53] Krenair: If it's broken, that's a recent development [01:12:33] I've logged back in to find I can't access pages, I just get Fatal exception of type MWException [01:12:37] ah [01:12:42] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 96 processes [01:12:52] PROBLEM Free ram is now: WARNING on aggregator1.pmtpa.wmflabs 10.4.0.79 output: Warning: 19% free memory [01:13:19] maximum execution time [01:13:31] [Thu Jan 24 21:36:21 2013] [error] [client 10.4.0.54] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /srv/org/wikimedia/controller/wikis/w/includes/db/DatabaseMysql.php on line 46, referer: https://nova-precise2.pmtpa.wmflabs/wiki/Special:NovaInstance [01:13:35] does that mean it's ooming again? [01:13:47] nope [01:13:55] something in MW is causing it to time out [01:14:03] Also I noticed: PROBLEM Disk Space is now: CRITICAL on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK CRITICAL - free space: / 280 MB (2% inode=84%): [01:14:13] That looks bad. But unrelated [01:14:31] andrewbogott: can you announce the sudo change to the list? [01:14:37] Ryan_Lane: Sure. [01:14:38] yeah, disk size warning is ok [01:14:54] Shall I check out master OSM instead of my testing branch which is currently there? [01:15:08] maybe [01:15:13] something is causing issues with the db [01:15:23] Krenair, it's easy enough to switch branches and see if that helps... [01:15:44] heh. 30 seconds waiting on a db call is a really long time [01:16:10] Doesn't seem to have done anything [01:16:20] I wonder what's doing it [01:16:29] I'm going to start disabling extensions [01:18:30] it's none of the extensions [01:23:52] hmm [01:23:56] Any idea what's going on here? 
I'm trying to run VoxelBot [01:23:57] vacation9@bots-4:~/VoxelBot$ python bot.py [01:23:57] [01:23:57] c9f5db474712bb85489f1540b5822cad+\ [01:23:57] 2013-01-25T00:52:42Z [01:23:58] http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcstart=2013-01-25T00:52:42Z&rcend=2013-01-25T01:22:42Z&rclimit=5000&rcdir=newer&rcprop=comment|ids&format=xml [01:24:05] Connection to bots-4 closed. [01:24:15] It just closes the connection whenever I try it [01:24:22] You probably shouldn't go around posting login tokens on public channels, Vacation9. [01:24:35] Krenair: Just realized that [01:24:55] Anyway, just disregard that @_@ [01:30:59] Ryan_Lane, are you still looking at nova-precise2? When you lose interest we can just reboot it :) [01:31:24] No, you can reboot it :) [01:31:52] bots-4 is giving LDAP authorization failed now... [01:31:53] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [01:31:59] it worked literally one minute ago [01:33:00] anyone else having the problem? [01:33:19] A smattering of other instances will let me log in, but bots-4 will not. [01:33:34] same... [01:33:47] yeah. I cannot tell you what that means. [01:35:47] !log bots fwilson: Move VoxelBot to bots-3 [01:35:50] Logged the message, Master [01:36:29] Oh that was to Ryan_Lane... I need to be more careful with what I add to my highlight list! [01:39:37] And not IRC at this hour. gnight [01:46:37] well, I'm not sure a reboot is going to help [01:46:43] Krenair: good night! [01:47:22] oops, I rebooted a second before you said that [01:47:26] heh [01:47:26] I guess we will soon find out [01:47:27] is ok [01:48:17] Still getting Permission denied (publickey). [01:49:14] Vacation9, sorry, I'm talking about a different problem. I don't know what to think about labs-4. [01:49:40] andrewbogott: Oh sorry. [01:50:16] that seemed to work for nova-precise2 [01:50:22] Yeah, for the moment. 
[01:50:27] maybe bots-4 is oom'd [01:50:31] hmm, you talking about bots-4 Vacation9 ? [01:50:50] yes [01:50:54] just walked back to my pc and spotted Connection to bots-4 closed. [01:50:55] :< [01:50:55] VoxelBot wasn't running [01:50:58] switched to bots-3 [01:51:04] now I can't even log into bots-4 [01:51:15] yep I get Permission denied (publickey). [01:51:15] also [01:51:48] :( [01:51:53] did you break it again? [01:51:57] nope :P [01:52:02] :D [01:52:03] let me check its console log [01:52:30] OOM'd [01:52:34] let me reboot it [01:52:44] rebooting [01:52:47] may take a bit [01:53:22] PROBLEM Disk Space is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK WARNING - free space: / 399 MB (4% inode=84%): [01:56:21] !log bots addshore: bots3 installing php5-curl [01:56:23] Logged the message, Master [01:57:08] !log bots addshore: bots2 installing php5-curl [01:57:10] Logged the message, Master [01:58:55] Ryan_Lane, is /a on nova-precise2 your doing? [01:59:16] andrewbogott: puppet's maybe? [02:00:01] it's using half the volume, in a directory called 'backup' [02:00:07] Maybe that's krenair's thing. [02:01:53] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [02:03:23] PROBLEM Disk Space is now: CRITICAL on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK CRITICAL - free space: / 83 MB (0% inode=84%): [02:05:33] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 915% free memory [02:07:35] 915%... is that normal? [02:21:44] Vacation9: heh. it's a screwed up check [02:22:18] Ryan_Lane: I don't think that's enough though. Might want to consider upgrading the memory. 
[02:23:53] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Unable to read output [02:28:52] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [02:32:52] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [03:02:52] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [03:32:52] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [04:02:52] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [04:26:42] RECOVERY Current Load is now: OK on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: OK - load average: 4.84, 4.89, 4.99 [04:34:14] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [04:37:22] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [04:41:23] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 30% free memory [05:04:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [05:05:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 16% free memory [05:09:22] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 17% free memory [05:34:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [05:48:22] PROBLEM dpkg-check is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: CHECK_NRPE: Error - Could not complete SSL handshake. [05:48:52] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[05:53:23] RECOVERY dpkg-check is now: OK on aggregator2.pmtpa.wmflabs 10.4.0.193 output: All packages OK [05:53:53] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [06:04:23] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [06:10:54] PROBLEM Free ram is now: WARNING on swift-be3.pmtpa.wmflabs 10.4.0.124 output: Warning: 19% free memory [06:28:14] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 151 processes [06:28:34] PROBLEM Total processes is now: WARNING on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS WARNING: 154 processes [06:30:23] PROBLEM Total processes is now: WARNING on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS WARNING: 152 processes [06:33:33] RECOVERY Total processes is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS OK: 150 processes [06:34:23] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [06:43:13] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 145 processes [06:45:22] RECOVERY Total processes is now: OK on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS OK: 148 processes [06:53:22] PROBLEM Disk Space is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK WARNING - free space: / 420 MB (4% inode=84%): [06:58:54] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[07:05:24] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [07:23:53] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [07:36:53] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [08:07:52] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [08:37:54] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [08:39:23] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 30% free memory [08:40:23] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 25% free memory [08:40:53] RECOVERY Free ram is now: OK on swift-be3.pmtpa.wmflabs 10.4.0.124 output: OK: 22% free memory [08:49:02] PROBLEM Free ram is now: WARNING on swift-be3.pmtpa.wmflabs 10.4.0.124 output: Warning: 19% free memory [08:57:22] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 17% free memory [08:58:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 15% free memory [09:08:12] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [09:39:13] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [10:09:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [10:29:02] RECOVERY Free ram is now: OK on swift-be3.pmtpa.wmflabs 10.4.0.124 output: OK: 21% free memory [10:33:16] andrewbogott_afk, great! 
[10:39:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [10:51:52] PROBLEM Free ram is now: WARNING on swift-be3.pmtpa.wmflabs 10.4.0.124 output: Warning: 19% free memory [11:05:42] j #mediawiki-i18n [11:09:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [11:12:20] Yes! andrewbogott_afk the sudo thing is a good thing. Cool! [11:13:04] !log wikidata-dev wikidata-testrepo: turned memcached off again. m( [11:13:06] Logged the message, Master [11:39:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [12:05:32] PROBLEM Free ram is now: WARNING on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: Warning: 19% free memory [12:09:22] PROBLEM host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [12:37:23] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 30% free memory [12:37:48] ACKNOWLEDGEMENT host: orgcharts-dev.pmtpa.wmflabs is DOWN address: 10.4.0.122 CRITICAL - Host Unreachable (10.4.0.122) [12:38:23] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [12:40:32] RECOVERY Free ram is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: OK: 20% free memory [12:41:52] RECOVERY Free ram is now: OK on swift-be3.pmtpa.wmflabs 10.4.0.124 output: OK: 22% free memory [12:45:23] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 17% free memory [12:53:54] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Unable to read output [12:54:54] PROBLEM Free ram is now: WARNING on swift-be3.pmtpa.wmflabs 10.4.0.124 output: Warning: 19% free memory [12:58:52] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [12:59:52] RECOVERY Free ram is now: OK on 
swift-be3.pmtpa.wmflabs 10.4.0.124 output: OK: 21% free memory [13:01:23] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 15% free memory [13:12:44] PROBLEM Free ram is now: WARNING on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Warning: 19% free memory [13:13:23] PROBLEM Free ram is now: WARNING on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: Warning: 16% free memory [13:38:52] PROBLEM Free ram is now: WARNING on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: Warning: 18% free memory [13:57:53] PROBLEM Free ram is now: WARNING on swift-be3.pmtpa.wmflabs 10.4.0.124 output: Warning: 16% free memory [15:38:52] RECOVERY Free ram is now: OK on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: OK: 25% free memory [15:45:42] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 172 processes [15:50:42] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 99 processes [15:51:12] PROBLEM Current Load is now: WARNING on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: WARNING - load average: 1.43, 9.45, 7.44 [16:01:12] RECOVERY Current Load is now: OK on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: OK - load average: 0.14, 1.49, 4.01 [16:37:44] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 27% free memory [16:40:22] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 23% free memory [16:48:23] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 15% free memory [16:49:48] !log gerrit Installed maven2 and jython to compile on-server [16:49:49] Logged the message, Master [17:00:42] PROBLEM Free ram is now: WARNING on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Warning: 19% free memory [17:10:12] PROBLEM Free ram is now: WARNING on techvandalism-bot.pmtpa.wmflabs 10.4.0.194 output: Warning: 17% free memory [17:58:55] PROBLEM Free ram is now: UNKNOWN on 
aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Call to fork() failed [18:03:53] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [18:05:12] Krenair, do you know what's going on in the /a dir on nova-precise2? [18:12:57] There's an /a dir, andrewbogott? [18:13:26] Krenair: Yeah, and it's using ~5g of disk space. [18:13:32] No idea where it came from or if it's imporant [18:14:31] andrewbogott, loads of backup .gz files in /a/backup... Ryan'll probably know [18:14:42] He doesn't [18:19:12] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes [18:22:39] andrewbogott, well to me it just looks like something has been triggering some sort of backup every day since christmas day [18:22:48] yeah. [18:39:13] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 145 processes [19:13:10] Krenair: Turns out the openstack puppet class automatically sets up db backups. I linked things into /data/project so we have room to breathe. 
[19:13:22] RECOVERY Disk Space is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: DISK OK [19:18:54] PROBLEM Free ram is now: UNKNOWN on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Call to fork() failed [19:37:13] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 151 processes [19:48:53] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [19:54:43] PROBLEM Current Load is now: WARNING on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: WARNING - load average: 7.29, 6.78, 5.79 [20:38:23] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 23% free memory [20:40:42] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 27% free memory [20:40:52] RECOVERY Free ram is now: OK on techvandalism-bot.pmtpa.wmflabs 10.4.0.194 output: OK: 26% free memory [20:43:52] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:01:24] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 15% free memory [21:03:53] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory [21:03:54] PROBLEM Free ram is now: WARNING on techvandalism-bot.pmtpa.wmflabs 10.4.0.194 output: Warning: 18% free memory [21:13:42] PROBLEM Free ram is now: WARNING on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Warning: 19% free memory [21:59:08] Ryan_Lane: do you want to code review this https://gerrit.wikimedia.org/r/#/c/42757/4 --- or do you want me to remove you from the list ? [21:59:59] * Damianz high fives Wikinaut [22:00:06] that shizzle makes openid in labs possible [22:00:25] no yet fully, and thanks to TYLER ! 
[22:00:31] not yet fully: [22:00:45] 'progress' [22:00:51] because the auto-discovery is missing, and protocol-independent urls [22:01:08] Login page was a bit I couldn't figure out when I looked at it heh [22:01:15] Damianz: do you want to test please [22:01:42] http://openid-wiki.instance-proxy.wmflabs.org/wiki/Main_Page [22:01:55] login as you wisk (standard or with an OpenID) [22:02:02] add content to your user page [22:02:15] I can, not right now though - re-configuring spam filtering on some servers [22:02:20] then you can use your userpage URL as OpenID somewhere else [22:02:28] but we need testers [22:02:49] s/wisk/wish/ [22:38:52] PROBLEM Current Load is now: CRITICAL on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: Connection refused by host [22:39:32] PROBLEM Disk Space is now: CRITICAL on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: Connection refused by host [22:40:13] PROBLEM Free ram is now: CRITICAL on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: Connection refused by host [22:41:43] PROBLEM Total processes is now: CRITICAL on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: Connection refused by host [22:42:23] PROBLEM dpkg-check is now: CRITICAL on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: Connection refused by host [22:43:53] RECOVERY Current Load is now: OK on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: OK - load average: 0.75, 0.88, 0.50 [22:44:04] andrewbogott: Oh god passwordless sudo is nice... 
I just need to stop typing my password into bash now [22:44:25] I keep doing it too… so far I've avoided hitting enter and getting it in the history [22:44:33] RECOVERY Disk Space is now: OK on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: DISK OK [22:44:50] yeah [22:45:14] RECOVERY Free ram is now: OK on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: OK: 91% free memory [22:46:44] RECOVERY Total processes is now: OK on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: PROCS OK: 103 processes [22:47:24] RECOVERY dpkg-check is now: OK on damian-keytest.pmtpa.wmflabs 10.4.0.97 output: All packages OK [22:53:44] PROBLEM Free ram is now: CRITICAL on abogott-request-tracker.pmtpa.wmflabs 10.4.1.48 output: Critical: 5% free memory [22:57:49] hi Ryan_Lane [22:57:54] hi [22:58:31] I want to create a lab for testing [22:59:00] Ryan_Lane: this is my project https://github.com/harshkothari410/TwitterCards [22:59:53] Ryan_Lane: http://www.mediawiki.org/wiki/Extension:TwitterCards [23:01:04] * Ryan_Lane nods [23:01:07] I wonder if this is hard hmmm [23:01:51] we really need to set up a generic non-root-needed mediawiki project [23:02:08] indeed [23:02:20] I think it's doable with salt [23:02:27] less so with puppet [23:02:27] Quite a lot of extensions could be done on the same mw install also [23:02:40] harshkothari: let me create a project for you [23:02:50] well, best not to mix them together [23:02:57] thanks Ryan_Lane :) [23:03:07] otherwise one person's screw up will break everyone [23:03:07] it's a lot of cruft to have hundreds of mw installs [23:03:11] oh [23:03:13] I see what you mean [23:03:23] just like if(domain) then (load user shit) [23:03:32] only db breakage then, if you break the db you get shot [23:03:36] but… people may need to change core, too [23:03:42] PROBLEM Free ram is now: WARNING on abogott-request-tracker.pmtpa.wmflabs 10.4.1.48 output: Warning: 9% free memory [23:03:48] or multiple extensions [23:03:50] I dunno why anyone would ever want to do that :P 
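The salt idea being kicked around here — one shared core checkout plus a per-project extension list — might be sketched as a state file like the following. Every state ID, path, and pillar key below is invented for illustration; nothing like this existed as a labs module at the time:

```yaml
# mediawiki/init.sls — hypothetical salt state; all names are illustrative
mediawiki-core:
  git.latest:
    - name: https://gerrit.wikimedia.org/r/p/mediawiki/core.git
    - target: /srv/mediawiki

{% for ext in pillar.get('mw_extensions', []) %}
mediawiki-ext-{{ ext }}:
  git.latest:
    - name: https://gerrit.wikimedia.org/r/p/mediawiki/extensions/{{ ext }}.git
    - target: /srv/mediawiki/extensions/{{ ext }}
    - require:
      - git: mediawiki-core
{% endfor %}
```

Each project would then select its extensions through its own `mw_extensions` pillar list, which keeps one person's checkout from breaking everyone else's.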
[23:03:54] heh [23:04:05] the application servers would die pretty quickly, too [23:04:18] because they'd be running too many versions of mediawiki [23:04:33] I guess it's LRU, right? [23:04:40] so lots of stuff just wouldn't stay in cache [23:04:51] hmm yeah caching could suck [23:04:51] could make 2-3 large memory instances for it, I guess [23:05:14] we'd either need a deployment system or to run off of gluster [23:05:20] * Ryan_Lane sighs [23:05:32] yay gluster [23:05:56] at minimum the configuration would need to live in gluster [23:06:03] you could use salt modules to install a wiki with x extensions and update the wiki via a scheduled task pretty easily [23:06:09] yep [23:06:18] well, I wouldn't have it update the wikis [23:06:59] yay security [23:07:00] -.- [23:07:12] the wikis wouldn't be open [23:07:25] urgh wtf windows [23:07:33] I can't by default just edit the ldap schema to add an attribute [23:07:36] harshkothari: what's your labsconsole username. [23:07:37] ? [23:07:47] Damianz: what do you mean? [23:07:59] why would you do that? :) [23:08:04] Apparently you have to set a registry value to allow you to update the schema [23:08:09] oh [23:08:10] right [23:08:15] So I can store ssh keys on the user and have puppet deploy them? [23:08:54] just glusterfs mount them everywhere ;) [23:09:09] oh yeah because glusterfs is going to work awesomely over a 300ms latency link. [23:09:41] Damianz: ;) [23:09:57] I'd rather patch openssh to search ldap directly [23:10:05] I wouldn't mind that either [23:10:16] I don't want to manage that patch, though [23:10:19] There's a patch for it but no one will package it for security [23:10:29] I wish openssh would just fucking accept the patch [23:11:00] I wish you could do it at the level of pam.... [23:11:04] harshkothari: ? [23:11:24] Ryan_Lane: give me 1 sec [23:11:35] how about now? [23:11:37] now? [23:11:40] what about now? [23:11:42] done yet? 
:) [23:12:13] Someone give Ryan his pills :P [23:13:10] Ryan_Lane: Harshkothari410 [23:15:02] harshkothari: did you create this account right now? [23:15:34] no.. I created it on 6th November https://labsconsole.wikimedia.org/w/index.php?title=User:Harshkothari410&action=history [23:15:39] ok [23:15:39] just checking [23:16:12] http://doc.wikimedia.org/puppet/classes/__site__/role/ldap/server/labs.html < that's just ugly [23:16:35] :D [23:17:05] I hate labsconsole so much, it's slow, I have to click too much and the ux is ugly :( We should be able to do better in 2013 [23:17:14] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 145 processes [23:17:38] harshkothari: done [23:17:52] Damianz: http://www.mediawiki.org/wiki/Wikimedia_Labs/Account_creation_improvement_project [23:17:58] http://www.mediawiki.org/wiki/Wikimedia_Labs/Instance_creation_improvement_project [23:18:03] http://www.mediawiki.org/wiki/Wikimedia_Labs/Interface_usability_improvement_project [23:18:18] where is it Ryan_Lane? [23:18:21] add more projects, or bugs, or squash some bugs ;) [23:18:33] harshkothari: on labsconsole [23:18:34] harshkothari: you need to create an instance and such [23:18:36] @search mediawiki [23:18:36] Results (Found 13): morebots, labs-home-wm, labs-nagios-wm, labs-morebots, gerrit-wm, extension, revision, info, bots, labs-project, openstack-manager, wl, deployment-prep, [23:18:45] Damianz: The puppet docs are generated by a tool that ships with puppet. So I'm trying not to look a gift horse in the mouth. [23:18:54] hm. where's the docs for this?
[23:19:26] labsconsole is slower than normal right now it seems [23:19:36] harshkothari: https://labsconsole.wikimedia.org/wiki/Help:Single_Node_MediaWiki [23:19:49] * harshkothari looking [23:20:01] andrewbogott: 'nay' [23:20:33] harshkothari: Especially don't skip the part about security groups :) Pretty much everyone skips that part and then later I have to tell them to start over and it makes us both sad [23:20:54] that means the ux sucks if /everyone/ does it [23:21:17] thanks andrewbogott.. I will keep it in mind [23:21:19] Yep! [23:22:15] Damianz: I haven't thought of a great fix. We could have every project start out with a 'web' security group, and have the instance page select that group by default... [23:22:21] Or just add http ports to the default group [23:22:29] neither of those seem awesome, but maybe it's better than how things are now. [23:22:45] Really the entire workflow should be re-done into steps [23:22:58] Damianz, how so? [23:23:04] You mean the docs, or the actual process? [23:23:05] Rather than 'oh you need a group', 'go here', 'add one', 'refresh', 'start again entering info' [23:23:10] that's just lame [23:23:27] I should be able to set up rules and groups in-line when creating an instance [23:23:36] Hm... [23:23:45] That wouldn't solve the problem of people skipping that step [23:24:35] It would make the ui cleaner and people would see that 'section' rather than just skipping over a dropdown [23:24:41] andrewbogott: we could add support to nova for adding security groups after instance creation [23:24:42] Damianz, unless the GUI knows ahead of time what you're creating the instance for I don't know how it can guide properly. [23:24:55] Ryan_Lane, yeah, in the long run that's better.
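For reference, the group-and-rule juggling described above maps onto a handful of novaclient commands; a hedged sketch (group and instance names are made up, and attaching a group to an already-running server assumes a python-novaclient recent enough to ship `add-secgroup`):

```
# create a 'web' group and open http/https to the world
nova secgroup-create web "allow inbound web traffic"
nova secgroup-add-rule web tcp 80 80 0.0.0.0/0
nova secgroup-add-rule web tcp 443 443 0.0.0.0/0
# attach it to an existing instance (newer novaclient only)
nova add-secgroup my-instance web
```

Being able to run that last step after boot is exactly what removes the "start over because you skipped the security-groups section" failure mode.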
[23:24:55] well you'd tell it [23:25:02] that's the point of puppet [23:25:37] Damianz: yes, that would be ideal [23:25:55] Damianz: that realistically requires javascript [23:25:55] which requires an api [23:26:02] I have open bugs for this [23:26:18] it also requires puppet not fucking over instances on boot [23:26:43] What would store the knowledge that X puppet class needs Y security rules? [23:26:57] It doesn't really make sense to put it in puppet since puppet can't modify the security rules [23:27:00] Ideally in puppet [23:27:13] Damianz: what do you mean? [23:27:14] Labsconsole should be able to parse puppet manifests [23:27:18] what does puppet have to do with this at all? [23:27:54] Ryan_Lane: Different topic than ux [23:28:01] Ux is one issue (make a nice workflow) [23:28:14] another is provisioning instances in an automated fashion (puppet, security groups) [23:28:43] puppet has nothing to do with security groups [23:28:54] oh. you're saying it should? [23:28:56] It should [23:28:57] that wouldn't be easy [23:29:06] Else how can we ever auto-provision instances easily [23:29:23] Currently we could just map it in the interface (easily) [23:29:27] for that we'd need to allow every instance to modify security groups [23:29:32] But all the puppet stuff in osm should be auto-added [23:29:39] why?
[23:29:44] nova can tell what's assigned [23:29:46] it's all in ldap [23:30:03] ideally in the long run labsconsole won't manage puppet [23:30:05] but nova will [23:30:13] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes [23:30:17] labsconsole/nova makes no difference - they both use the same data [23:31:35] Since either way it would automatically know about available classes in puppet and variables they can take and doc strings they have, it can also know about security groups it might need imo [23:31:53] Downside is, nova really needs to be able to edit after creation still though [23:35:24] where would you actually do this, though? [23:36:30] Ideally everything would be in nova and osm would be a light, caching interface on top of the api [23:38:22] ah [23:38:26] yeah. I agree with that [23:38:34] the puppet part would need to be a plugin [23:39:41] Lovely would be if we could have a puppet reporter tell nova when the instance has applied stuff so it knows when it's 'built' [23:40:19] really what you're describing is something more like heat [23:40:32] this could also be done with salt and reactors [23:40:50] have puppet fire events when it needs a security rule added [23:41:13] heat? [23:41:21] and yes - ideally [23:42:08] Which reminds me, I need to look at events when someone triggers a server reboot
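The "salt and reactors" idea at the end would look roughly like this on the salt master — a sketch only; the event tag, file paths, and payload are all hypothetical:

```
# /etc/salt/master.d/reactor.conf
# Map a custom event tag to a reactor state file.
reactor:
  - 'labs/secgroup/add':
    - /srv/reactor/add_secgroup.sls
```

A minion (e.g. from a puppet exec or report handler) would fire the event with something like `salt-call event.fire_master '{"port": 80}' 'labs/secgroup/add'`, and the reactor sls would then call whatever module actually adds the nova security-group rule.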