[00:04:12] RECOVERY Current Load is now: OK on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: OK - load average: 0.13, 1.33, 3.90 [00:07:12] got a question - is there some sort of quota for CPU or memory use? i've been running a python IRC bot but it keeps quitting [00:09:43] rschen7754, where are you running it? :O [00:09:49] addshore: bots-4 [00:10:42] it's not running now, just quit during the last hour when i was driving to school [00:11:04] I got told the other day there was nothing that autokilled stuff [00:11:12] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes [00:12:09] hmm [00:12:34] maybe a storage limit? though i don't think the logs are that big [00:12:49] you have many many GBs of space :/ [00:13:06] what's your username? [00:13:31] I did have an issue with some of my scripts ending the other day on bots-4, their final output was just "Killed" [00:14:24] rschen7754 [00:15:10] the bot log says it's pinging fine until it stops right when the bot pings out [00:18:03] i think bots-4 might be running out of memory [00:18:15] I just tried to run something which couldn't as there was no memory to allocate [00:18:46] 17 mb free >.< [00:19:14] that might be a factor...
[00:19:17] glusterfs seems to be using 60% of the instance memory [00:19:34] * addshore pokes petan  [00:24:34] PROBLEM dpkg-check is now: CRITICAL on bots-4.pmtpa.wmflabs 10.4.0.64 output: DPKG CRITICAL dpkg reports broken packages [00:39:44] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 25% free memory [00:39:54] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 21% free memory [00:41:32] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [00:47:54] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 13% free memory [01:05:45] addshore: would running the bot on bots-3 work? [01:06:02] yep [01:07:42] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 173 processes [01:07:52] PROBLEM Free ram is now: WARNING on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Warning: 17% free memory [01:09:32] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [01:10:33] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 23% free memory [01:17:42] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 96 processes [01:36:13] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 146 processes [01:43:52] PROBLEM Current Load is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [01:44:32] PROBLEM Disk Space is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [01:45:12] PROBLEM Free ram is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [01:46:42] PROBLEM Total processes is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [01:47:22] PROBLEM 
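The check addshore ran on bots-4 (17 MB free, glusterfs at 60% of instance memory) can be reproduced with a short shell sketch; it assumes nothing beyond standard procps tools and a Linux /proc:

```shell
#!/bin/sh
# Report free memory the way the labs monitoring does, then list the
# top memory consumers (this is how glusterfs showed up on bots-4).
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "free: ${free_kb} kB of ${total_kb} kB ($((free_kb * 100 / total_kb))% free)"
# Five largest resident processes by %mem, largest first.
ps -eo pid,comm,%mem --sort=-%mem | head -n 6
```

An OOM-killed process typically ends with just "Killed" on its terminal, matching what addshore's scripts printed; checking the kernel log (`dmesg | grep -i oom`) would confirm whether the OOM killer fired.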
dpkg-check is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [01:48:52] RECOVERY Current Load is now: OK on proposals-1.pmtpa.wmflabs 10.4.0.189 output: OK - load average: 1.10, 1.03, 0.56 [01:49:32] RECOVERY Disk Space is now: OK on proposals-1.pmtpa.wmflabs 10.4.0.189 output: DISK OK [01:50:13] RECOVERY Free ram is now: OK on proposals-1.pmtpa.wmflabs 10.4.0.189 output: OK: 91% free memory [01:51:43] RECOVERY Total processes is now: OK on proposals-1.pmtpa.wmflabs 10.4.0.189 output: PROCS OK: 83 processes [01:52:23] RECOVERY dpkg-check is now: OK on proposals-1.pmtpa.wmflabs 10.4.0.189 output: All packages OK [02:05:22] PROBLEM dpkg-check is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.189 output: DPKG CRITICAL dpkg reports broken packages [02:26:05] i know ive mentioned this before, but here's an idea for a project i could start up: an opensimulator grid that allows users to create different models/interactive objects to maybe explain a concept/theory, as well as be able to hold meetings for somewhat hackathon-like events. think its a good idea? [02:29:32] JasonDC: can you send the idea to the list?
we may need to discuss whether it's feasible to do this in labs [02:29:45] sure thing [02:49:32] PROBLEM Disk Space is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.188 output: Connection refused by host [02:49:39] Change on mediawiki a page Wikimedia Labs/Account creation improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=637023 edit summary: [-119] /* Current account creation process */ [02:49:49] PROBLEM Total processes is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.188 output: Connection refused by host [02:49:52] PROBLEM Current Load is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.188 output: Connection refused by host [02:50:13] PROBLEM Free ram is now: CRITICAL on proposals-1.pmtpa.wmflabs 10.4.0.188 output: Connection refused by host [03:08:33] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 18% free memory [03:24:42] RECOVERY Current Load is now: OK on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: OK - load average: 4.59, 4.89, 5.00 [04:22:16] Change on mediawiki a page OAuth/Issues was modified, changed by 70.112.96.241 link https://www.mediawiki.org/w/index.php?diff=637029 edit summary: [-61] /* OAuth 1 */ Remove incorrect reference to signature and remove incomplete. [04:37:52] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 25% free memory [04:37:52] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 21% free memory [04:39:32] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [04:46:32] Hello all, I want to create an instance but I am told "No Nova credentials found for your account" at https://labsconsole.wikimedia.org/wiki/Special:NovaInstance [04:46:42] I am "Nicolas Raoul" on Labs, I added my SSH public key to both Labs and Gerrit. I can access the bastion via SSH.
[04:53:32] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 20% free memory [04:56:43] PROBLEM Free ram is now: WARNING on newprojectsfeed-bot.pmtpa.wmflabs 10.4.0.232 output: Warning: 16% free memory [04:58:24] NicolasRaoul__: log in/out [04:58:26] err [04:58:28] out/in [04:58:34] OK! [04:58:46] was that your first login? [04:59:22] I have an open bug on that, but it should be the only case in which the error appears [05:00:53] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 13% free memory [05:02:33] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [05:03:22] I can now access the instance creation dialog :-) but the instanceType and imageType selects have no selectable options :-/ [05:03:23] hm [05:03:29] sorry. I may have broken things today [05:04:48] yeah. definitely broke something [05:05:52] PROBLEM Free ram is now: WARNING on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Warning: 16% free memory [05:06:20] * Ryan_Lane grumbles [05:11:37] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 18% free memory [05:25:40] NicolasRaoul__: ok. fixed it [05:26:07] stupid keystone requiring global roles be defined even though there's no such thing as a global role [05:27:30] Works great, xsmall instance created :-) [05:27:45] cool [05:30:58] xsmall is still overkill... maybe there is a way to put the instance to sleep most of the time and just wake it up once a day?
I know it is virtualized but any OS has some load even when doing nothing [05:31:19] it's really not :) [05:31:35] as long as you're using it, don't worry about the resources it's eating [05:32:05] each system has 185GB of memory [05:32:10] and we have 7 per datacenter [05:33:52] PROBLEM Current Load is now: CRITICAL on generator.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [05:34:32] PROBLEM Disk Space is now: CRITICAL on generator.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [05:35:14] PROBLEM Free ram is now: CRITICAL on generator.pmtpa.wmflabs 10.4.0.189 output: Connection refused by host [05:38:54] RECOVERY Current Load is now: OK on generator.pmtpa.wmflabs 10.4.0.189 output: OK - load average: 0.32, 0.79, 0.52 [05:39:34] RECOVERY Disk Space is now: OK on generator.pmtpa.wmflabs 10.4.0.189 output: DISK OK [05:40:13] RECOVERY Free ram is now: OK on generator.pmtpa.wmflabs 10.4.0.189 output: OK: 83% free memory [06:25:56] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [06:28:43] Ryan_Lane: [06:28:49] ? [06:29:01] http://openid-wiki.instance-proxy.wmflabs.org/wiki/Main_Page broken. What happened [06:29:03] puppet problem ? [06:29:14] I sent you an email [06:29:16] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 151 processes [06:29:20] read this [06:29:30] I changed their configuration around so that they are using their internal hostnames [06:29:31] but the wiki as such does not work, have a look [06:29:39] arrrgjh. it breaks [06:29:40] since they can't talk to each other on their public hostnames [06:29:44] use a socks proxy [06:29:50] See the layout is broken [06:29:57] load.php does not work [06:30:28] because it's set to use internal hostnames [06:30:57] which is http://openid-wiki.pmtpa.wmflabs [06:30:59] http://i.imgur.com/EjAZ9ig.png [06:31:12] pls. do not break running systems [06:31:20] dude.
did you read your email? [06:31:27] yes [06:31:34] leaving now, need to go to work [06:31:43] 7:30 a.m. local time [06:31:54] checked yesterday, it works [06:32:05] * Ryan_Lane sighs [06:32:09] you're not listening to me [06:32:15] it's not *broken* [06:32:15] hehe [06:32:28] it's configured in a way that doesn't work through the instance proxy [06:32:44] set up a debug logfile on your consumer wiki [06:32:51] check for OpenID: lines [06:32:54] its current configuration requires a socks proxy [06:32:55] listen to me [06:33:00] ^ [06:33:12] set up a debug logfile on your consumer wiki [06:33:33] then the extension dumps several wfDebug lines with trailing OpenID: [06:33:40] send me these lines pls [06:33:57] next contact: this evening, i.e. +/- 8 p.m. UTC [06:33:59] bye [06:35:53] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 32% free memory [06:41:43] RECOVERY Free ram is now: OK on newprojectsfeed-bot.pmtpa.wmflabs 10.4.0.232 output: OK: 32% free memory [06:46:02] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [06:54:13] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 145 processes [06:55:54] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 27% free memory [06:56:02] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [07:00:19] Ryan_Lane, sorry for the incessant questions... cloud-init boot finished but when I try to login I get "Permission denied (publickey)" [07:00:40] I have read the doc many times but can't figure out what I am doing wrong [07:00:47] did it say that it finished running puppet? [07:00:51] what's the instancename?
[07:00:53] yes [07:01:00] generator.pmtpa.wmflabs [07:01:17] it's up [07:01:22] I am sshing from bastion, which I connected to with ssh -A [07:01:23] but you may not have a forwarded agent [07:01:32] on bastion, type this: [07:01:33] ssh-add -l [07:02:26] The agent has no identities [07:02:54] right, so either your agent isn't forwarded, or you don't have any keys added to it on your local system [07:03:04] https://labsconsole.wikimedia.org/wiki/Access#Using_agent_forwarding says: On bastion: ssh .pmtpa.wmflabs [07:03:14] yes [07:03:23] but you need to add the key to your agent on your local system [07:03:36] using ssh-add [07:10:41] On my machine I did: eval `ssh-agent`; ssh-add ~/.ssh/id_rsa.pub; ssh -A nicolas-raoul@bastion.wmflabs.org [07:10:54] same key as uploaded to Labs and Gerrit [07:16:23] log off of bastion [07:16:26] ssh-add -l [07:21:21] just "ssh-add" works whereas "ssh-add ~/.ssh/id_rsa.pub" did not work [07:23:20] ah [07:23:26] oh [07:23:30] you don't add the .pub [07:23:33] you add the private [07:23:48] so, just: ssh-add ~/.ssh/id_rsa [07:24:37] yes I just tried that now, sorry for wasting your time! [07:24:46] it's no problem [07:51:55] I believe I am set up, thanks a lot for the support Ryan_Lane!
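The working sequence, pieced together from the exchange above: the key loaded with ssh-add must be the private half, and the agent must exist in the current shell before ssh -A. A minimal sketch (key path and usernames as in the log; yours will differ):

```shell
#!/bin/sh
# On the local machine: start an agent, load the PRIVATE key, verify, connect.
eval "$(ssh-agent)"                # agent must be running in this shell
ssh-add ~/.ssh/id_rsa              # the private key -- not id_rsa.pub
ssh-add -l                         # should list a fingerprint, not "no identities"
ssh -A nicolas-raoul@bastion.wmflabs.org   # -A forwards the agent to bastion
# then, on bastion:  ssh generator.pmtpa.wmflabs
```

Note that a bare `ssh-add` with no argument (what eventually worked for NicolasRaoul__) loads the default key files, including ~/.ssh/id_rsa.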
[07:52:00] yw [08:37:35] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [08:47:20] PROBLEM Free ram is now: WARNING on nagios-main.pmtpa.wmflabs 10.4.0.120 output: Warning: 19% free memory [08:51:27] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [08:52:17] RECOVERY Free ram is now: OK on nagios-main.pmtpa.wmflabs 10.4.0.120 output: OK: 35% free memory [08:52:18] PROBLEM Current Load is now: WARNING on nagios-main.pmtpa.wmflabs 10.4.0.120 output: WARNING - load average: 8.02, 8.69, 7.02 [09:36:57] PROBLEM Free ram is now: WARNING on nagios-main.pmtpa.wmflabs 10.4.0.120 output: Warning: 11% free memory [10:01:13] RECOVERY Current Load is now: OK on nagios-main.pmtpa.wmflabs 10.4.0.120 output: OK - load average: 0.36, 0.69, 3.83 [10:03:33] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [10:13:33] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [12:33:33] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [12:40:34] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [12:48:33] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [12:48:33] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [13:46:34] PROBLEM Free ram is now: WARNING on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: Warning: 19% free memory [13:55:53] PROBLEM Free ram is now: CRITICAL on bots-4.pmtpa.wmflabs 10.4.0.64 output: Critical: 4% free memory [14:15:53] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 18% free memory [14:33:52] PROBLEM Current Load is now: CRITICAL on new-ns0.pmtpa.wmflabs 10.4.0.190 output: Connection refused by host [14:34:32] PROBLEM Disk Space is now: 
CRITICAL on new-ns0.pmtpa.wmflabs 10.4.0.190 output: Connection refused by host [14:35:12] PROBLEM Free ram is now: CRITICAL on new-ns0.pmtpa.wmflabs 10.4.0.190 output: Connection refused by host [14:36:42] PROBLEM Total processes is now: CRITICAL on new-ns0.pmtpa.wmflabs 10.4.0.190 output: Connection refused by host [14:37:22] PROBLEM dpkg-check is now: CRITICAL on new-ns0.pmtpa.wmflabs 10.4.0.190 output: Connection refused by host [14:38:52] RECOVERY Current Load is now: OK on new-ns0.pmtpa.wmflabs 10.4.0.190 output: OK - load average: 0.45, 0.82, 0.49 [14:39:32] RECOVERY Disk Space is now: OK on new-ns0.pmtpa.wmflabs 10.4.0.190 output: DISK OK [14:40:12] RECOVERY Free ram is now: OK on new-ns0.pmtpa.wmflabs 10.4.0.190 output: OK: 89% free memory [14:41:42] RECOVERY Total processes is now: OK on new-ns0.pmtpa.wmflabs 10.4.0.190 output: PROCS OK: 88 processes [14:42:23] RECOVERY dpkg-check is now: OK on new-ns0.pmtpa.wmflabs 10.4.0.190 output: All packages OK [15:03:34] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [15:23:42] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [16:11:28] !log wikidata-dev wikidata-heclient is the official Hebrew demo client (wmf8) until deployment in Jan 30th [16:11:31] Logged the message, Master [16:38:34] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [16:38:35] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 7% free memory [16:53:42] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [17:06:32] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [17:33:33] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [18:08:43] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 
output: Critical: 5% free memory [19:17:13] PROBLEM dpkg-check is now: CRITICAL on openstack-wiki-instance.pmtpa.wmflabs 10.4.1.49 output: DPKG CRITICAL dpkg reports broken packages [19:22:54] PROBLEM dpkg-check is now: CRITICAL on build-precise1.pmtpa.wmflabs 10.4.0.173 output: DPKG CRITICAL dpkg reports broken packages [19:28:13] hashar, can you catch me up on your beta/mobile work? Do you have a patch ready for review now and/or is there anything I can help with? [19:28:21] MaxSem: same question [19:28:35] yeah oh [19:28:37] so [19:28:50] andrewbogott: I made mark crying [19:28:53] andrewbogott, https://gerrit.wikimedia.org/r/#/c/46240/ [19:28:57] (not sure if that last sentence is proper english) [19:29:47] andrewbogott: and the varnish patch is at https://gerrit.wikimedia.org/r/#/c/44709/ [19:30:00] I had a funny debugging session [19:30:09] hashar: I thought you made mark cry two weeks ago… he's still displeased? [19:30:27] turns out varnishncsa (which logs queries in Apache format) has been patched by us to send logs over udp to a host:port [19:30:38] which overrides the local file logging :-D [19:30:52] andrewbogott: well my patches either make mark cry or facepalm [19:30:54] :-D [19:31:23] hashar, those both seem like necessary steps on the road [19:31:25] I'm going to be upgrading gluster in about a minute [19:31:33] <^demon> Will it be faster? [19:31:34] it's a point upgrade [19:31:43] no, but I do have some good news there [19:32:09] when we switch to the virtio network driver for the instances it'll be about 3x faster [19:32:30] <^demon> Hmm, so instead of 45s to run gerrit init, it'll take 15s.
[19:32:31] <^demon> Yay :) [19:32:39] andrewbogott: you guys need to set up Hiera, that would be a nicer way to load per-DC / per-realm configuration :-D [19:33:00] RECOVERY dpkg-check is now: OK on build-precise1.pmtpa.wmflabs 10.4.0.173 output: All packages OK [19:33:03] andrewbogott: so anyway I need mark to approve my varnish patch, it got deployed on the deployment-varnish-t3 instance already [19:33:23] http://en.wikipedia.m.beta.wmflabs.org/ <-- works! (despite the error message) [19:33:44] now I need to clean up the mediawiki configuration [19:34:09] hashar: are you in the ops team? [19:34:17] matanya: not at all [19:34:50] matanya: but I have some skills in writing puppet manifests and a 30'000 feet overview of Wikimedia tech infrastructure [19:35:04] yes, hiera would be nicer for the private passwords and such, too [19:35:09] I am a contractor for Platform Engineering, in charge of continuous integration (aka the Jenkins master) [19:36:10] hashar, are mark's comments attached to that patch someplace, or have you been talking to him elsewhere? (Patch doesn't seem obviously stupid to me.) [19:36:33] need to catch him [19:36:53] he commented about it over IRC [19:36:53] hashar: OK, so I'll leave you to it unless you think there's something I can do to help. [19:37:03] anyway that is deployed on the instance already [19:37:07] so I no longer consider it a blocker [19:37:33] andrewbogott: well you could review the patch, maybe there are some issues in it I did not catch https://gerrit.wikimedia.org/r/#/c/44709/ [19:38:09] Did mark want you to design it totally differently? e.g. refactor everything so that it can be done via a role? [19:38:35] not sure [19:38:52] that role::cache::mobile mimics what I have done for role::cache::bits already [19:39:18] but you are right, might be pointless to review this patch if mark prefers me to refactor it to something else [19:40:13] Meanwhile… I am rankly unqualified to review the cdb patch.
[19:41:13] well brad did already [19:41:20] I guess I will get it applied on beta and test it out there [19:46:17] <^demon> hashar: +2 on cdb patch. [19:46:22] <^demon> Test on labs, and we can mark verified. [19:47:02] ^demon: goooood [19:47:06] thanks for the review :-] [19:47:14] <^demon> yw. [19:47:26] Ryan_Lane: hi. I see that you're busy with other stuff. But I'm now available to fix the E:OpenID for you. Switch the openid-wiki to use proxy again [19:47:43] s/Switch/I switched/ [19:47:44] we can't test it using the proxy [19:47:57] the instances can't talk to each other over public hostnames [19:48:11] that is what I supposed [19:48:54] it's possible to test using a socks proxy [19:48:55] but were you able to use it "as such", I mean with "normal" consumer [19:49:27] uh. not my area of competence... [19:49:30] !socks-proxy [19:49:30] ssh @bastion.wmflabs.org -D ; # [19:51:25] are you going to test this now ? [19:52:33] otherwise, I will concentrate on debugging and fixing https://bugzilla.wikimedia.org/show_bug.cgi?id=44416 [19:53:38] did you put a fix in? [19:53:46] otherwise, what am I testing? [19:54:46] Wikinaut: it needs to be switched away from the instance-proxy for testing [19:55:03] ^yes, this is why I ask you [19:55:16] but what am I testing? that's what I'm asking [19:55:17] whether you want to test with socks-proxy now [19:55:28] did you make any changes? [19:55:46] as said, I switched to "normal" mode, the line you added [19:56:07] if you want, I can restore "your" mode [19:56:13] .... [19:56:27] if all you did was change it back to the instance proxy, what am I actually testing? [19:56:42] the same exact thing I tested yesterday. [19:56:46] it's going to fail in the same way [19:56:56] so what? [19:57:03] for me, it works [19:57:14] the Extension works [19:57:20] but not for your special case [19:57:22] no. it doesn't [19:57:30] are you crazy? [19:57:49] look. I'm not saying every single use case of the extension is broken [19:57:59] well...
[19:58:01] ;-) [19:58:05] stop taking major offense when I say that things are broken [19:58:14] accepted [19:58:18] I'm talking about the *specific* thing that we're testing being broken [19:58:20] but I invested so much time [19:58:29] and did not hear or read a nice word... [19:58:51] so I suggest this now: [19:59:00] let me work on that instance in "my" mode [19:59:10] so that I can fix the 44416 bug [20:01:15] Ryan_Lane: pls. don't touch the openid-wiki until further notice [20:01:23] hold on [20:01:30] I'm going to make it work in both modes [20:01:34] Yes! [20:01:39] this is what I tried already [20:01:44] but wanted to let you do this [20:01:48] because you are quicker [20:01:55] Damianz: around? [20:02:08] (standby) [20:02:31] Wikinaut: there. it works in both modes now [20:03:35] Ryan_Lane: nice. ty [20:03:42] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [20:04:49] :-) /orig/LocalSettings.php [20:06:06] btw, why not use that, rather than that file in the extensions directory? [20:06:32] it's my way of working on my wikis. [20:06:38] you said: self-service [20:20:02] Ryan_Lane, since i have a few mins, did you have any questions about the OpenSim idea?
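For reference, the socks-proxy route Ryan_Lane keeps suggesting (the !socks-proxy factoid earlier, whose username and port fields are blank in the log) amounts to the following; the username and local port 1080 here are placeholders, not values from the log:

```shell
# Open a dynamic (SOCKS) forward through the bastion host.
ssh yourname@bastion.wmflabs.org -D 1080
# Then point the browser's SOCKS5 proxy at localhost:1080 and open the wiki
# on its internal hostname, e.g. http://openid-wiki.pmtpa.wmflabs
```

This lets a browser reach instances on their internal pmtpa.wmflabs hostnames, which is why a wiki configured for internal hostnames works over SOCKS but not through the instance proxy.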
[20:27:24] PROBLEM Total processes is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS WARNING: 152 processes [20:28:34] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [20:38:33] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 7% free memory [20:41:34] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [20:47:25] RECOVERY Total processes is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS OK: 150 processes [20:53:23] PROBLEM Free ram is now: WARNING on newchanges-bot.pmtpa.wmflabs 10.4.0.221 output: Warning: 14% free memory [20:53:53] PROBLEM Current Load is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: Connection refused by host [20:54:33] PROBLEM Disk Space is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: Connection refused by host [20:55:12] PROBLEM Free ram is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: Connection refused by host [20:58:52] RECOVERY Current Load is now: OK on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: OK - load average: 0.16, 0.60, 0.45 [20:59:32] RECOVERY Disk Space is now: OK on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: DISK OK [21:00:13] RECOVERY Free ram is now: OK on checkwiki-web.pmtpa.wmflabs 10.4.0.24 output: OK: 89% free memory [21:09:33] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [21:28:34] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 5% free memory [21:33:40] login will fail for a couple minutes [21:33:47] while I upgrade gluster on labstore1 [21:33:59] fun [21:35:43] PROBLEM Total processes is now: CRITICAL on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS CRITICAL: 257 processes [21:36:33] RECOVERY Free ram is now: OK on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: OK: 92% free memory [21:37:34] bleh. 
gluster didn't start nfs [21:38:23] ugh. even worse, a lot of bricks are showing as down for volumes [21:41:42] PROBLEM Total processes is now: CRITICAL on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: PROCS CRITICAL: 718 processes [21:43:52] PROBLEM Current Load is now: CRITICAL on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: Connection refused by host [21:44:32] PROBLEM Disk Space is now: CRITICAL on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: Connection refused by host [21:44:56] ah. needed ulimit increased. different init script [21:45:11] we are here notpeter [21:45:12] PROBLEM Free ram is now: CRITICAL on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: Connection refused by host [21:45:22] matanya: woo! [21:45:31] listening [21:46:03] hey Ryan_Lane, matanya needs to reset their credentials [21:46:06] what's the process for this? [21:46:09] in ldap? [21:46:17] have labsconsole send a password [21:46:42] PROBLEM Total processes is now: CRITICAL on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: Connection refused by host [21:46:50] all I want is a shell to work on [21:47:15] I'll be the happiest man on earth with shell [21:47:22] PROBLEM dpkg-check is now: CRITICAL on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: Connection refused by host [21:47:47] matanya, you have shell rights already… can you explain what your specific issue or goal is? [21:48:04] how do I connect? [21:48:11] ssh what? [21:48:52] RECOVERY Current Load is now: OK on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: OK - load average: 0.45, 0.85, 0.52 [21:49:24] matanya, labs access is divided into projects. Are you part of a project already, or do you want to start a new one? [21:49:31] Or, are you an existing labs user and just lost your password?
[21:49:32] RECOVERY Disk Space is now: OK on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: DISK OK [21:49:52] I guess I want to start a new one [21:50:11] I'm quite sure I am an existing labs user [21:50:13] RECOVERY Free ram is now: OK on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: OK: 91% free memory [21:50:50] matanya: Yep, you definitely have a labs account. But you also need to be in a project to do anything. If you're starting a new project and working on your own, that's fine, I can set things up. [21:51:24] please do [21:51:33] I have a nice project to start :) [21:51:43] RECOVERY Total processes is now: OK on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: PROCS OK: 83 processes [21:51:56] the error I got: If you are having access problems, please see: https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [21:51:56] Connection closed by 208.80.153.207 [21:52:05] for background (in case you don't know this), labs isn't a single server, it's a cloud of virtual machines. So 'shell access' isn't really a thing on its own… you will want to create yourself a VM, to which you will subsequently get shell access :) [21:52:23] RECOVERY dpkg-check is now: OK on rt-puppetdev.pmtpa.wmflabs 10.4.0.201 output: All packages OK [21:52:26] What would you like the project to be called? [21:52:49] nagios [21:54:02] BTW, how do I create a VM there? any docs about it? [21:54:17] I fucking hate glusterfs so much [21:55:42] Ryan_Lane, mutante: matanya would like to do some nagios development. Should he use the existing nagios project (in which you two are sysadmins) or create his own? I know that the existing project is used for some semi-prod purposes. [21:55:43] RECOVERY Total processes is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS OK: 111 processes [21:57:14] Ryan_Lane, mutante: matanya would like to do some nagios development. Should he use the existing nagios project (in which you two are sysadmins) or create his own?
I know that the existing project is used for some semi-prod purposes. [21:57:22] we're having a gluster outage [21:57:29] thanks to a *point* upgrade [21:57:34] what a piece of shit [21:58:55] andrewbogott: he should probably use that project but not that instance [21:59:09] matanya: OK… for background, there is already a project called 'nagios'. It was set up mostly by petan, so you may want to coordinate with him. [21:59:19] andrewbogott: i suggest he creates a new, fresh, instance in that project and then tries to apply puppet classes used in production [21:59:31] matanya: In the meantime, I'm not going to give you sysadmin for the project but I will give you access rights and create a VM for you to use. [21:59:38] since the current Labs Nagios is different from prod Nagios/Icinga [21:59:40] Sound OK? [22:00:02] that way anything that is tested in Labs and works, can be transferred to prod [22:00:19] otherwise we will have manual setups that work but it does not help prod Nagios [22:00:42] matanya: and note that we are just about to switch to Icinga [22:00:52] matanya: http://neon.wikimedia.org/icinga/ [22:01:34] there is class icinga::monitor in puppet.. i would try that out on the new instance [22:01:48] matanya: Hm… I need to step away for half an hour or so… I will check in when I return. Meanwhile maybe you can peruse the labs docs :) [22:01:51] you can look at site.pp and find node "neon" to see how its done [22:02:07] https://labsconsole.wikimedia.org/wiki/Help:Contents [22:02:27] https://labsconsole.wikimedia.org/wiki/Git [22:02:48] this has the example of how to clone a git repo.. operations/puppet is the one you want [22:06:04] great [22:16:31] mutante: I don't have permission to see that [22:17:17] matanya: which one? i think all URLs i pasted should be public [22:17:29] http://neon.wikimedia.org/icinga/ [22:18:02] what do you get?
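The clone mutante points at can be sketched as below. The anonymous-HTTP Gerrit URL and the manifest path are what was conventional at the time and should be treated as assumptions; the labsconsole Git page linked above is the authoritative reference:

```shell
#!/bin/sh
# Fetch the production puppet tree and find the nagios/icinga manifests.
git clone https://gerrit.wikimedia.org/r/p/operations/puppet.git
cd puppet
ls manifests/misc/          # icinga.pp and nagios.pp live here, per the discussion
git log --oneline -5        # recent changes to the tree
```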
i didn't realize that was firewalled or something [22:19:24] request for user name and password [22:20:07] matanya: do you have https everywhere or some such? [22:20:13] http shouldn't ask for a pw [22:20:20] I do... [22:20:22] PROBLEM Total processes is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS WARNING: 151 processes [22:20:31] ah, you need to use http [22:20:39] https will attempt to log you in [22:20:51] oh, good catch.. yep [22:21:01] same with nagios.wikimedia.org [22:21:14] there i'm not requested [22:21:27] ah, now I see it. much nicer [22:21:37] yea, it's how i usually get to login.. i change http to https in the URL bar and hit enter :p [22:22:11] matanya: compare to nagios.wikimedia.org to see the difference between Nagios and Icinga [22:22:17] it should have identical checks [22:22:20] but a different UI [22:22:38] I care nothing for UI [22:23:17] links is my favourite browser [22:24:50] so all I need to do is puppetize the operations/puppet? no installing and no scripting? [22:33:33] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 6% free memory [22:39:34] PROBLEM Current Load is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: WARNING - load average: 6.00, 6.01, 5.31 [22:41:13] matanya: ideally everything is in puppet, exactly [22:41:24] I checked it out [22:41:37] I see I can help out quite a lot [22:41:43] it should install icinga for you and do all the rest [22:41:53] no need [22:42:03] reading it is enough for me [22:42:43] just need some time with someone to see the configuration in the real servers [22:42:59] http://neon.wikimedia.org/icinga/ requires me to log in, notpeter ... [22:43:15] matanya: but.. it is ..the configuration of the real servers ... [22:43:24] puppet configures them [22:43:34] Krenair: must use http [22:43:36] not https [22:43:41] yes [22:43:43] that is http [22:43:47] so where is /etc/nagios/checkcommands for example?
[22:44:48] Krenair: http works for me w/o password... [22:45:31] matanya: where do you see that path? [22:45:46] inside nagios.cfg [22:45:54] Ah, HTTPS Everywhere was causing it... [22:46:07] Krenair: heh. problem solved! [22:46:30] those are the main commands to run (apart from misc commands) [22:46:37] matanya: on which host? [22:46:43] matanya: are you talking about Labs Nagios? [22:46:53] that was exactly my point ...sigh [22:46:54] yes [22:47:02] Labs Nagios has nothing to do with Production Nagios [22:47:05] unfortunately [22:47:10] ah, got it [22:47:11] it is installed manually by petan [22:47:22] so I'll waste my time here? [22:47:46] trying to look at current Labs Nagios if you want to fix Production Nagios, yes [22:48:06] creating a fresh instance and applying the actual puppet classes ..absolutely not [22:48:22] I see nagios.pp, is that the one? [22:48:39] and icinga.pp [22:48:54] see we almost switched it over.. we kind of have both right now [22:49:36] does it matter on which one I work? [22:50:14] it seems to make more sense to use Icinga [22:50:24] we will be using that in prod very soon [22:51:38] i mean, we actually do.. it is on production servers and runs.. it is merely a question of which one has IRC bots and sends pages [22:51:44] any reason it is not in the manifests? [22:51:58] what? [22:52:21] oh, it is in ./misc [22:52:23] why is it under misc and not [22:52:27] ./manifests/misc/ [22:52:30] no reason [22:52:34] under manifests [22:52:38] ok [22:52:43] good answer [22:52:52] :) [22:53:06] I'll learn ruby and start being useful [22:53:11] :) [22:53:15] cool:) [22:53:22] thanks matanya [22:53:44] * matanya starts coding [22:56:39] matanya: When it comes time to write/test/debug puppet code, you'll want a box for puppet dev. That's documented here: https://labsconsole.wikimedia.org/wiki/Help:Self-hosted_puppetmaster [22:57:05] But I haven't made you an instance yet because we're having a minor labs outage.
Updates as events warrant :) [22:57:41] ok, thanks. that is ok. [23:00:22] RECOVERY Total processes is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS OK: 149 processes [23:11:10] mutante: a few more questions please? [23:11:35] hopefully the gluster issues don't turn into a bunch of corruption [23:12:12] matanya: what is it [23:13:06] I need to install a few packages in the process, do they go inside this icinga.pp or some other pp file? [23:13:33] PROBLEM Free ram is now: CRITICAL on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Critical: 4% free memory [23:14:40] matanya: in this case, it is all in that file. but the same file has different classes. and this would likely be class "icinga::monitor::packages" [23:15:15] matanya: if you want to make it even better, you can turn it into a role class and create ./role/icinga.pp which includes all classes you want to apply on a specific "role" of server [23:15:31] and even better than that would be also turning it into a puppet module [23:15:56] so i did it right. ok, thanks. does the puppet master have to have the files, or can it fetch them from the web? [23:16:29] you have basically 2 options.. use the default puppetmaster (not on your instance), means you have to merge changes for the master to see them... [23:16:41] or using the puppetmaster::self stuff Andrew linked you to [23:16:56] then you can just change manifests on your local instance.. without having to merge everything [23:17:08] and once it works you can merge things [23:17:17] and it can use wget and apt-get and the like? [23:17:28] the drawback is that there is no easy way to switch back from puppetmaster::self to the external master [23:17:41] you would then just create a fresh instance again and apply your classes [23:17:49] I see [23:18:09] ok, cool enough. a lot of code to write. [23:18:10] matanya: if you define a "package" in puppet like line 724 in icinga.pp ...
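The role-class idea mutante describes above might look roughly like this (a minimal sketch: apart from icinga::monitor::packages, which is named in the chat, the class names are assumptions for illustration, not the real operations/puppet contents):

```puppet
# ./manifests/role/icinga.pp -- hypothetical sketch of a role class
# that bundles the feature classes an Icinga monitoring host needs.
class role::icinga {
    # named in the chat above; installs the Icinga packages
    include icinga::monitor::packages
    # assumed companion classes, for illustration only
    include icinga::monitor::config
    include icinga::monitor::service
}
```

A node definition in site.pp then only needs `include role::icinga`, which is what makes the same role easy to re-apply to a fresh Labs instance.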
[23:18:45] PROBLEM Total processes is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [23:18:52] PROBLEM Current Load is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [23:19:01] then puppet will interpret that as "i have to install this package" and then check out what kind of OS this is on .. and once it figures out this is APT based Ubuntu... the puppet provider will "translate" that for you ..and use apt-get in the background to do the job [23:19:13] this is what's cool about puppet .. it becomes platform-independent [23:19:22] PROBLEM dpkg-check is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [23:19:32] PROBLEM Disk Space is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [23:19:37] Ah! cool! [23:20:14] PROBLEM Free ram is now: CRITICAL on checkwiki-web.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [23:20:31] we only have ubuntu and some SUN's no? [23:20:43] note you have an option between "ensure => latest" and "ensure => present" with packages [23:20:54] present means it just cares that some version of this package is installed [23:21:04] and latest means it will try to update it if there are updates. [23:21:14] that means effectively you are doing unattended automatic upgrades [23:21:18] that is quite self-explanatory [23:21:21] correct about Ubuntu [23:21:41] matanya: I think he's trying to say "warning: latest is risky" [23:21:45] toolserver is Solaris [23:21:55] and that needs to be converted [23:22:05] We generally avoid using latest unless there's a good reason for it [23:22:15] yea [23:22:34] RoanKattouw: I suffered from it just today at work. kernel patch broke firmware of network card...
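The package behaviour described above can be sketched like this (the package names are illustrative, not taken from icinga.pp):

```puppet
# A package resource is declarative: puppet sees "this package must be
# installed", detects the platform, and picks the matching provider
# (apt-get on Ubuntu), so the manifest stays platform-independent.
package { 'nagios-plugins':
    ensure => present,   # any installed version satisfies this
}

# 'latest' additionally upgrades whenever the repo has a newer version:
# effectively unattended automatic upgrades, which is why it is avoided
# unless there's a good reason.
package { 'nagios-nrpe-server':
    ensure => latest,
}
```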
[23:22:55] heh yeah, we don't install kernel updates via puppet [23:23:01] glad I have idrac [23:23:05] that is still a manual thing, it requires reboots anyways [23:23:24] not using puppet at work :) [23:23:39] manual patch for a very good reason [23:24:29] scripting idrac is another thing we might like. heh [23:24:48] add to my to-do list [23:25:24] how would a line in my script: echo 'deb http://linux.dell.com/repo/community/deb/latest /' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list [23:25:31] be puppetized? [23:26:06] you would define /etc/apt/sources.list.d/linux.dell.com.sources.list as a "file" in puppet [23:26:18] oh, now I see the point [23:26:20] and a file has a "source" and that points to some file checked into gerrit [23:26:27] eh git [23:26:43] RECOVERY Total processes is now: OK on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: PROCS OK: 112 processes [23:27:14] matanya: the step up from simple files is using ERB templates.. you can see some in ./templates/ [23:27:34] that is if you want to use variables in a file and have puppet replace them with values on execution [23:28:10] sounds handy [23:28:19] or.. there might also be an existing puppet provider thing to add deb sources, already made for just that [23:28:28] there is "puppet forge" with existing stuff [23:29:19] so can I just create a script and let puppet run it everywhere instead of breaking the script itself? [23:32:03] mutante: thank you very much for all the help. I'm off to bed. might nag you some other time :P [23:34:33] RECOVERY Current Load is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: OK - load average: 0.37, 3.48, 4.99 [23:38:30] matanya: yw. cu [23:56:48] Ryan_Lane: Success !
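Puppetizing the Dell repo line discussed above might look like this (a sketch; the mode/ownership values are assumptions):

```puppet
# Manage the apt source as a plain file instead of echo | tee in a script.
# Puppet will (re)create the file with exactly this content on every run.
file { '/etc/apt/sources.list.d/linux.dell.com.sources.list':
    ensure  => present,
    owner   => 'root',
    group   => 'root',
    mode    => '0444',
    content => "deb http://linux.dell.com/repo/community/deb/latest /\n",
}
```

For a file that needs variables, `content => template('some-module/sources.list.erb')` swaps the literal string for an ERB template rendered when the catalog is compiled; the template path here is hypothetical.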
have a fix for https://bugzilla.wikimedia.org/show_bug.cgi?id=44416 [23:57:00] cool [23:57:06] not a fix, but simply the correct setting [23:57:06] we're having an outage [23:57:15] I'll test things out when I get a chance [23:57:21] uh [23:59:55] works now like with Google: a single server URL wiki/Special:OpenIDServer/id for requesting auth. When correctly authenticated, your OpenID is saved as wiki/User:Name.