[00:39:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[00:41:53] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[00:55:48] [bz] (ASSIGNED - created by: Antoine "hashar" Musso, priority: Highest - normal) [Bug 45084] autoupdate the databases! - https://bugzilla.wikimedia.org/show_bug.cgi?id=45084
[00:59:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[01:07:12] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[03:18:52] PROBLEM dpkg-check is now: CRITICAL on pdbhandler-2.pmtpa.wmflabs 10.4.1.73 output: DPKG CRITICAL dpkg reports broken packages
[03:23:52] RECOVERY dpkg-check is now: OK on pdbhandler-2.pmtpa.wmflabs 10.4.1.73 output: All packages OK
[04:37:13] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[04:39:53] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[04:50:12] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[04:57:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[05:52:42] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Call to popen() failed
[05:55:12] PROBLEM Disk Space is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[05:57:42] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory
[06:00:12] RECOVERY Disk Space is now: OK on aggregator2.pmtpa.wmflabs 10.4.0.193 output: DISK OK
[06:45:16] PROBLEM Free ram is now: CRITICAL on sube.pmtpa.wmflabs 10.4.0.245 output: Critical: 3% free memory
[06:50:14] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[07:32:42] PROBLEM Free ram is now: UNKNOWN on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Call to fork() failed
[07:37:33] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory
[08:31:21] !log integration dist-upgrade on integration-jenkins2
[08:31:22] Logged the message, Master
[08:37:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[08:40:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[08:52:52] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 12% free memory
[09:00:53] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[09:08:13] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[09:17:34] PROBLEM Free ram is now: CRITICAL on aggregator2.pmtpa.wmflabs 10.4.0.193 output: NRPE: Call to popen() failed
[09:22:42] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 9% free memory
[10:58:01] !ping
[10:58:02] pong
[12:37:52] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 21% free memory
[12:38:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[12:40:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[12:42:00] Almost 400 tools to import? Some people will have a lot of work :D
[13:01:17] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[13:03:47] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[13:05:53] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 14% free memory
[13:21:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 159 processes
[13:25:55] @labs-project-users webtools
[13:25:55] Following users are in this project (showing all 18 members): Novaadmin, Platonides, Ryan Lane, Odie5533, Gifti, Dschwen, Russell Blau, Tb, Ireas, Danmichaelo, Petrb, Kentaur, Vacation9, Fox Wilson, Tim Landscheidt, Coren, Darkdadaah, AzaToth,
[13:26:16] I think importing 400 tools is not hard work for 20 people :P
[13:26:22] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 150 processes
[13:34:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 160 processes
[13:34:26] More than half are probably just bots, though.
[13:36:10] I wonder if I have included even half of them in the list...
[13:37:57] It depends on what proportion of them have hardcoded dependencies or will otherwise be difficult to refactor
[13:42:09] Change on mediawiki a page Wikimedia Labs/Toolserver features needed in Tool Labs was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=648778 edit summary: [+42] added links section with link to list of tools
[13:43:08] I think there are three classes of tools: those that will be fairly simple to move over once the software dependencies are there (fast, easy, not much work), those that will need some help to tweak, and those that will need extensive rejiggering.
[13:43:27] I expect the first category is the vast majority of tools.
[13:44:11] That is reassuring!
[13:45:44] I'm going to have to hit the ground running on the 25th though. What worries me is tools that have already migrated or are in progress, since some of the environment and infrastructure is bound to change early
[13:45:57] I'd hate for people to have to adapt things twice.
[13:46:38] I think I need to up the priority of the uid/gid bug - the sooner this is done, the less disruption.
[13:48:45] I'm at a stage where I'm mainly getting used to the new environment. I'll wait until things are more stable before I seriously move my bigger tools.
[13:52:10] Change on mediawiki a page Wikimedia Labs was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=648781 edit summary: [+86] /* Tool Labs */ added link to list of tools on toolserver
[13:53:46] Darkdadaah: Wise indeed, though that shouldn't take very long. The two bigger changes (tool setup/management and scheduling) are my top priorities, and much of the rest is going to be detail work / new functionality
[13:56:58] [bz] (NEW - created by: Marc A. Pelletier, priority: Highest - enhancement) [Bug 45119] Add per-project service/role user accounts and groups - https://bugzilla.wikimedia.org/show_bug.cgi?id=45119
[13:57:37] I don't mind waiting, though.
[13:58:22] There are hundreds of tools to migrate; I'm thinking every bit of delay upfront is going to be paid back tenfold in a few months. :-)
[14:01:37] The toolserver is supposed to last until at least 2014; I hope this will be enough time for most people to move.
[14:02:14] Darkdadaah don't tell me half of the users of webtools are bots :D
[14:02:20] I don't see a single one in that list
[14:03:14] Darkdadaah: I expect it will, but given the number of bugs, kinks and problems that are inevitable along the way, we don't want a race down to the wire. :-)
[14:04:29] petan: the first ~100 tools listed are bots.
[14:04:42] * Coren grumbles.
[14:04:47] I thought you meant a list of those 20 users :P
[14:05:13] Oh!
[14:05:38] I do hope they are not bots ^^
[14:05:45] also who knows how many of these 400 are still working
[14:08:19] Maybe all owners of tools should be contacted to fill in their status.
[14:08:58] tbh I have never ever documented any of my tools on TS on their wiki
[14:08:59] yeah…good luck
[14:09:06] i checked the list yesterday
[14:09:18] and for some reason my bot that i stopped running 3 years ago was on it :/
[14:09:25] heh
[14:09:32] it was compiled from the TS wiki
[14:09:38] oh lol
[14:09:42] which is only being used by a small number of TS users
[14:09:43] i haven't touched that in a while
[14:09:57] the reality is very different
[14:10:18] I guess that only around 50-100 of these tools will actually need to move
[14:14:58] maybe
[14:15:01] but you'll find out
[14:15:06] a few months down the road
[14:15:13] that an obscure tool didn't get moved
[14:29:02] On the other hand, all toolserver users are required to follow the toolserver-l list. If some people have hidden tools and don't add them to Silke_WMDE_'s list, we may assume that they are not interested in moving (or that they plan to move by themselves).
[14:33:06] andrewbogott_afk: Ping!
[14:53:52] PROBLEM Free ram is now: WARNING on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Warning: 19% free memory
[15:25:19] ^^ I am planning something like a mail to all active accounts pointing to the list.
[15:26:00] and probably also to all inactive accounts, to make sure
[15:26:41] Silke_WMDE_: That would be good, yes.
[15:44:12] PROBLEM dpkg-check is now: CRITICAL on integration-contintrefactor.pmtpa.wmflabs 10.4.1.52 output: DPKG CRITICAL dpkg reports broken packages
[15:50:21] Coren: If I wanted to send a mail, shall I use the mail address you use on labs-l? Or wait until you have a dedicated WMF address?
[15:51:37] Silke_WMDE_: use my current address. If at all possible, I'll have the WMF address .forward to my own infrastructure anyways: I have a fancy .procmail setup here to handle my mailing lists and stuff. :-)
[15:51:54] :) ok
[15:54:40] !log deployment-prep applied role::cache::text on deployment-cache-text01
[15:54:42] Logged the message, Master
[16:05:25] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 190 processes
[16:20:24] PROBLEM Total processes is now: CRITICAL on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS CRITICAL: 201 processes
[16:38:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[16:40:23] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 190 processes
[16:40:53] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 22% free memory
[16:41:13] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[16:50:00] !log wikidata-dev wikidata-testrepo: Installed Babel extension (put it into wikidata.pp and orig/LocalSettings.php manually)
[16:50:01] Logged the message, Master
[16:53:53] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 14% free memory
[16:56:34] Do we have a host where one can reasonably build a package?
[16:57:04] Or do I just pick a reasonable instance in the project?
[16:58:00] Coren: I'm trying to remember where I did it last...
[16:58:08] But mostly the answer is: yes, just pick a reasonable instance
[16:58:35] andrewbogott: Oh, and I got a request for you for a change on bastion. Do I open a bug or just bug /you/? :-)
[16:59:21] It depends on the kind of request. If it's just an access request then there's a form on labsconsole for that...
[16:59:29] "X11Forwarding yes" in sshd_config
[16:59:54] Oh -- yeah, best to make a bug for that, probably needs some discussion.
[16:59:59] kk
[17:00:47] Pretty sure none of our image types include X so… I don't know that that would be useful
[17:03:44] [bz] (NEW - created by: Marc A. Pelletier, priority: Unprioritized - normal) [Bug 45157] X11 Forwarding for sshd on bastion* - https://bugzilla.wikimedia.org/show_bug.cgi?id=45157
[17:04:13] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[17:04:59] You don't need an X server installed for it to be useful, just Xlib (and some packages have that as a dependency)
[17:05:17] Case in point: qmon. :-)
[17:06:04] Hmmm. Should that have been an 'enhancement'?
[17:06:52] Probably, but I think those tags barely matter.
[17:11:53] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[17:23:26] !log webtools manual install of {build-essential,debhelper,devscripts} on webtools-login (building a dpkg)
[17:23:27] Logged the message, Master
[17:31:38] Hi andrewbogott! Just a little ping for puppet stuff. Would be cool if you could look at the open reviews and also hoo's changeset (48979).
[17:32:05] Silke_WMDE_: Yep, I will catch up shortly. (Yesterday was a holiday in the US)
[17:32:26] oh, what holiday?
[17:33:32] 'Presidents Day,' the holiday formerly known as George Washington's Birthday
[17:33:36] http://en.wikipedia.org/wiki/Presidents_Day_%28United_States%29
[17:33:43] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 10% free memory
[17:34:46] ah
[17:35:51] And, yes, it is probably the holiday with the weakest premise
[17:37:40] * Silke_WMDE_ is calling it a day... CU all
[17:43:42] PROBLEM Free ram is now: CRITICAL on bots-4.pmtpa.wmflabs 10.4.0.64 output: Critical: 4% free memory
[17:48:42] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 10% free memory
[17:55:23] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 148 processes
[18:03:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 152 processes
[18:15:03] Coren: why do you need x11?
[18:15:53] Ryan_Lane: qmon is my immediate use-case. Honestly, it's not high priority, but it's in the "would be really handy" category.
[18:16:26] Ryan_Lane: I can circumvent it with tunnels, but that's (a) an ugly hack and (b) not secure, since it doesn't allow for proper xauth
[18:16:38] what's qmon?
[18:16:55] ah
[18:16:55] I see
[18:16:56] Monitoring/management interface for gridengine
[18:17:09] is a gui interface necessary?
[18:17:15] do folks use it in the toolserver?
[18:17:24] Ryan_Lane: No. But it's really useful during setup and debugging.
[18:17:29] * Ryan_Lane nods
[18:17:33] Ryan_Lane: Once in production, not so much.
[18:17:52] Ryan_Lane: (but still handy for those who know/want to use it)
[18:17:53] you should be able to install the libs on the end system
[18:18:04] and to do x11 forwarding
[18:18:09] even through the bastion
[18:18:13] if you use proxycommand
[18:20:03] Yes, that works but is really insecure (I need to xhost + on my desktop to allow the connection since there is no xauth cookie). If there's a problem with X11Forwarding, I can live with it though.
[18:20:40] Just out of curiosity though, why is no X11Forwarding allowed? It has limited use, but causes no security exposure that I know of.
[18:20:54] how is that insecure?
[18:21:16] with proxy command you're basically making a direct connection to the instance
[18:21:22] Because the socket talking to my X server on the client can be connected to by anyone on that instance.
[18:21:49] Whereas with X11 forwarding they need to be root to steal my xauth
[18:22:29] Like I said, in practice it's a minor inconvenience at worst. In theory, it's a Bad Thing (tm). :-)
[18:22:44] I'm confused as to why it would need to be done on the bastion, rather than on the target instance
[18:23:02] Ah, because I can't ssh right to the target instance!
[18:23:10] proxycommand?
[18:23:29] oooooh!
[18:23:38] !access
[18:23:38] https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[18:23:49] You mean tunnel the *ssh* through bastion and not the qmon!
[18:23:52] https://labsconsole.wikimedia.org/wiki/Access#Using_ProxyCommand_ssh_option
[18:23:58] yes
[18:24:18] Oh, yes, that'd work and only require X11Forwarding on the actual instance.
[18:24:18] then you enable x11 forwarding on the target instance
[18:24:29] I'm pretty sure there's a config option for this, too
[18:24:47] Yeah, okay, that'd work.
[18:25:49] * Coren didn't think of that.
[18:26:44] ssh_x11_forwarding
[18:26:50] that variable in puppet can be set for this
[18:26:56] let me add it to the global options
[18:27:47] ok. it's available for instances from the "configure" action
[18:27:50] under "ssh"
[18:27:52] ssh_x11_forwarding
[18:28:23] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 148 processes
[18:28:29] value would be: yes
[18:29:14] You are most excellently useful. Thanks. :-)
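For reference, the ProxyCommand-plus-X11 setup worked out above boils down to a single client-side command along these lines. This is only a sketch: the account and host names are placeholders, and it assumes the target instance has the ssh_x11_forwarding option set to "yes" and the xauth utility installed.

    # Run on your own machine: tunnel the ssh session through bastion (ProxyCommand)
    # and request X11 forwarding (-X) so a GUI tool such as qmon displays locally.
    ssh -X -o 'ProxyCommand=ssh yourshellname@bastion.wmflabs.org nc %h %p' yourshellname@some-instance.pmtpa.wmflabs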
[18:30:19] Ryan_Lane: While we're at ssh, could we get rid of the message "If you are having access problems, [...]"? If you can't reach bastion, you don't see it (!). If you can reach bastion but can't reach the instance, you should already know where to look. If you *can* reach the instance, it gets displayed twice. It's rather annoying when you "ssh $INSTANCE $COMMAND".
[18:30:43] does it break anything for you?
[18:31:06] if not, then ignore it
[18:31:20] there's also some way to disable it from the client
[18:31:24] No, it doesn't.
[18:31:41] I'd rather annoy people than get constant complaints in irc about "I can't connect"
[18:32:10] that message dramatically decreased the number of questions I got on that topic
[18:32:54] Ryan_Lane: In light of our happy-hour mess on Friday… do you still want to pursue the switchable shared-volumes thing?
[18:33:04] andrewbogott: yes
[18:33:19] also, it seems that the problem solved itself
[18:33:30] by letting the regeneration finish
[18:33:30] via self-heal?
[18:33:53] oh, I also learned something else
[18:35:08] now to find it again
[18:36:48] andrewbogott: http://joejulian.name/blog/dht-misses-are-expensive/
[18:37:00] seems in this version of gluster that files with 1000 permissions are ok
[18:37:40] I love how they chose something so similar to something in the past that indicated a broken filesystem
[18:38:35] I feel like the reductio ad absurdum of that post is that the file system should instantiate every possible filename
[18:38:38] and /that/ can't be right
[18:39:03] it only does it when it needs to do a lookup against every brick
[18:39:19] Yeah, ok.
[18:39:44] So… on Friday you fixed the problem in the first 10 minutes
[18:39:51] yep
[18:40:10] that script I wrote should make fixing split-brains pretty easy
[18:40:24] * andrewbogott not sure if that's a loss or a win
[18:40:27] heh
[18:40:34] well, I'm going to enable quorums, too
[18:40:39] But, cool, that means we're back to thinking we know how to deal with this.
[18:40:44] so hopefully we won't have any more split-brains
[18:41:02] PROBLEM Total processes is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: PROCS WARNING: 152 processes
[18:41:14] enabling quorums will mean that file access will fail anytime < all four hosts are working, right?
[18:42:49] Do we have the hardware to run six hosts instead of four? Seems like with replication=2 and quorums that would be much more reliable.
[18:43:00] trying to find proper docs on it
[18:44:02] best I can find: "As of commits 76d5e5d and 1b3571d - September 20 and November 21 respectively - you can set the number of bricks that must be reachable in order for a client to write, using the "cluster.quorum-type" option. If the value is "auto" then quorum equals (N+1)/2 out of N bricks, or exactly N/2 if that includes the first brick. If the value is "fixed" then you can specify a particular value with the "cluster.quorum-count"
[18:44:08] I have a feeling that got cut off
[18:44:17] http://community.gluster.org/q/what-glusterfs-do-when-a-split-brain-happens/
[18:44:33] Ryan_Lane: I don't see any way in ssh to disable the banner display; I'll document it on the wiki.
[18:44:44] (N+1)/2
[18:44:57] which is two or three?
[18:45:05] If two then it won't change behavior at all...
[18:45:22] N = 4
[18:45:34] heh
[18:45:37] good point
[18:45:40] I'd hope that means 3
[18:45:54] Is 3 meaningful with replication=2? Seems like our bricks always live or die in pairs.
[18:46:02] yes
[18:46:09] because there's still 3 bricks
[18:46:22] err
[18:46:29] there's 4 bricks
[18:46:45] replication=2 just means there will always be 2 copies
[18:46:57] if one brick dies, gluster should choose another brick as the replica
[18:47:18] OK. If it can still function with 3 then what I said about needing six hosts is wrong, and this should work fine.
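The quorum settings quoted from the docs above are ordinary per-volume options, so enabling them would look roughly like this; the volume name is left as a placeholder since it is not named in the log.

    # "auto" enforces the (N+1)/2 rule discussed above
    gluster volume set <volume> cluster.quorum-type auto
    # or pin an explicit brick count instead of the automatic rule
    gluster volume set <volume> cluster.quorum-type fixed
    gluster volume set <volume> cluster.quorum-count 3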
[18:50:42] Occasionally I note that nova-precise2 can't actually spin up new VMs. Would you guess that that's just lack of CPU/RAM or a misconfiguration thing? (I haven't investigated the failures at all)
[18:51:51] I didn't even realise nova-precise2 was supposed to be able to start new VMs of its own..
[18:51:59] andrewbogott: for sure a lack of ram
[18:52:17] Krenair: if it has enough ram, it'll launch one
[18:52:20] it won't be accessible
[18:52:24] but it'll run
[18:52:32] it uses qemu
[18:52:34] Krenair: I was dumb and built it on a 'small' instance. I hope to rebuild it soon
[18:52:44] oops :P
[18:52:50] let's puppetize it a bit more before then :)
[18:52:58] it was a bitch last time
[18:53:19] also, let's use the default passwords. I had to change the non-default ones back to the default ones
[18:53:25] Yeah, although this time we have the option of rsync, which we didn't have when going from precise1 to precise2
[18:54:41] Yeah, I gave up trying to work out my nova-precise2 wiki/ldap password the other day :p
[18:55:07] andrewbogott: true
[18:55:12] Krenair: it stopped working again?
[18:55:15] oh
[18:55:22] there still may be an issue with keystone
[18:55:24] idk
[18:56:57] I'm updating puppetmaster::self
[18:57:01] on that node
[18:57:11] and purging keystone and re-running puppet
[19:04:34] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 16% free memory
[19:17:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 153 processes
[19:17:45] andrewbogott: so, I think we can/should roll out the interface changes first
[19:17:54] the ldap changes are already pushed out
[19:18:13] I'm going to start disabling project storage today
[19:18:24] ok. I'm still chasing a bug but should be ready soon.
[19:18:39] Wait, when you say 'disabling project storage'...
[19:18:42] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 9% free memory
[19:18:49] You mean for projects that don't use it?
[19:18:49] not all of it
[19:18:52] yep
[19:19:07] which means the next step is to fix the script :)
[19:20:04] first step is to find projects with no instances
[19:22:36] it definitely needs to be fixed for the auth.allow problem
[19:23:52] I will look at that next, if you haven't done it by then
[19:24:19] * andrewbogott wishes that ldap_modify told me a little bit more than 'success' or 'failure'
[19:24:54] what problem are you having?
[19:25:07] I think you'll get to the script before me
[19:25:18] I'm going to go through and unmark projects that should be disabled
[19:25:23] it's likely to take me a while
[19:26:17] Ah, ldap_modify is failing when I write out the changes. Probably I'm misformatting; it's just clumsy to debug.
[19:26:41] nothing appears in the ldap log, so it must not be getting that far
[19:26:59] ah
[19:27:13] setVolumeSettings, here: https://gerrit.wikimedia.org/r/#/c/49375/1/OpenStackNovaProject.php
[19:28:25] oh
[19:28:30] hm
[19:29:40] I'm not clear on the use of modify vs. add for multi-value fields
[19:29:52] Like, does modify fail if there aren't any infos in the record?
[19:29:52] you have to use modify
[19:30:07] you can only use add if the attribute doesn't exist
[19:30:11] otherwise you have to use modify
[19:30:24] right, but will 'info' always exist?
[19:30:27] you aren't adding a field to the attribute, you're modifying the attribute
[19:30:45] and technically, you're modifying the entry
[19:31:03] the ldap library should figure out if the attribute needs to be added or deleted
[19:32:03] So if an entry already has two infos, info: foo and info: bar
[19:32:17] And I call modify with ['baz']
[19:32:24] At the end are there three entries or one?
[19:33:01] one
[19:33:14] you always modify the entire array and pass back the entire array
[19:33:19] Ok, good, that's what my code was expecting
[19:33:42] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 77% free memory
[19:33:55] I don't see a problem with this
[19:34:01] have you looked at opendj's log?
[19:34:19] I see a problem -- 'global $wgAuth;'
[19:34:24] ah
[19:34:24] or, the lack thereof
[19:34:26] right
[19:34:30] that's likely it
[19:34:40] can't believe I didn't notice that :D
[19:34:52] I guess if it's not declared then php just blithely calls a method on 'null' and doesn't warn or fail or anything?
[19:35:10] it should show a warning in the apache log
[19:35:17] apache's error log
[19:35:56] ok, works now :)
[19:36:19] cool
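The point Ryan makes above, that a modify replaces the entire value set rather than appending to it, is standard LDAP behaviour. The same operation expressed with the ldapmodify CLI looks like this; the DNs and attribute values are purely illustrative, not taken from the log.

    # info.ldif -- "replace" swaps in the complete new value set for the multi-valued
    # "info" attribute, so every value you want to keep must be listed again.
    dn: cn=someproject,ou=projects,dc=example,dc=org
    changetype: modify
    replace: info
    info: foo
    info: bar
    info: baz

    # apply it (bind DN is a placeholder):
    ldapmodify -x -D "cn=admin,dc=example,dc=org" -W -f info.ldif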
[19:46:45] wow. we really only have 38 projects without instances?
[19:47:07] that's both surprising and reassuring
[19:47:22] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 150 processes
[19:47:45] !log webtools installed build-dependencies for OGS on webtools-login (csh groff libdb5.1-dev libssl-dev libncurses5-dev libpam0g-dev libxt-dev lesstif2-dev libxpm-dev libxmu-dev quilt default-jdk ant ant-optional junit javacc)
[19:47:46] Logged the message, Master
[19:50:57] Coren: is OGS different from SGE? From a user point of view?
[19:51:44] Darkdadaah: It's a strict superset of SGE 6.2; the chances that you run into a difference unless you're configuring it are epsilon.
[19:52:03] Ok, that's good then :)
[19:52:51] I'm thinking there might be edge cases where the behaviour might be a little different if you use multiprocessing without an Infiniband network -- and I doubt we have many cluster-computing bots. :-)
[19:53:52] PROBLEM Free ram is now: CRITICAL on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Critical: 5% free memory
[19:56:02] RECOVERY Total processes is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: PROCS OK: 148 processes
[19:57:12] RECOVERY Total processes is now: OK on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS OK: 149 processes
[20:04:38] PROBLEM Current Load is now: WARNING on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: WARNING - load average: 9.76, 9.02, 6.77
[20:08:12] Coren: is this not packaged for debian/ubuntu?
[20:11:52] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 8.56, 8.31, 6.30
[20:12:06] hey Coren how are you :-)
[20:13:10] sumanah: Well, and you?
[20:13:23] OK. A bit wrung out; long day already and it's not over yet :-)
[20:13:29] Ryan_Lane: No, deb is stuck on an older pre-fork version
[20:14:55] Ryan_Lane: Raring also.
[20:15:10] Coren: update the package, then? :)
[20:15:23] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 152 processes
[20:17:04] Ryan_Lane: That is exactly what I'm doing. :-)
[20:17:10] ahhh ok
[20:17:10] cool
[20:17:23] PROBLEM Current Load is now: WARNING on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: WARNING - load average: 6.80, 6.69, 5.34
[20:17:30] Coren: do you work from home, or do you have an office or a coworking space?
[20:20:43] sumanah: We have offices in my home (me and my hubby)
[20:20:49] our*
[20:24:27] My partner & I also work from home most of the time
[20:28:52] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 6.52, 6.10, 5.30
[20:31:39] Part of it is my own insistence on a reliable and secure infrastructure. To me, that means "one that I control" :-)
[20:34:52] Heh! yeah
[20:36:34] Anyone want to place bets on whether the debuild will work on the first try?
[20:37:24] ... which would be cool because it means upstream builds cleanly.
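For context, the devscripts-based build being bet on here normally runs along these lines; this is a generic sketch under the assumption of a conventional debian/ packaging layout, not the exact commands used.

    # packaging toolchain, matching the earlier !log on webtools-login
    sudo apt-get install build-essential debhelper devscripts
    # from inside an unpacked source tree that has a debian/ directory:
    debuild -us -uc        # build unsigned packages
    debuild -us -uc -nc    # after a failure, skip the clean step instead of rebuilding everything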
[20:38:53] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 22% free memory
[20:38:54] PROBLEM Free ram is now: WARNING on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Warning: 7% free memory
[20:39:13] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory
[20:39:33] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 21% free memory
[20:40:03] PROBLEM Confidence is now: WARNING on coren 127.0.0.1: output: Warning: 12% realism
[20:41:53] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory
[20:43:24] [bz] (RESOLVED - created by: Nemo, priority: Low - trivial) [Bug 43927] amarok.kde.org listed twice - https://bugzilla.wikimedia.org/show_bug.cgi?id=43927
[20:43:39] oooh pretty colors :-)
[20:45:23] sumanah: I see the tools list is seeing continual updating. That's a Good Thing(tm)
[20:45:31] yes :)
[20:53:52] PROBLEM Current Load is now: CRITICAL on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host
[20:53:53] RECOVERY Free ram is now: OK on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: OK: 40% free memory
[20:54:29] !log wiktionary-tools apt-get install sqlite3
[20:54:31] Logged the message, Master
[20:54:33] PROBLEM Disk Space is now: CRITICAL on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host
[20:55:13] PROBLEM Free ram is now: CRITICAL on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host
[20:57:21] sumanah: I hope you weren't insulted by my having declared your table "unreadable". :-)
[20:58:09] Coren: I had an unreadable table? I obviously haven't looked at a watchlist today, or something
[20:58:10] where?
[20:58:48] The tool list; the single-row format made it scroll over 4-5 screen widths for me. :-)
[20:58:53] RECOVERY Current Load is now: OK on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: OK - load average: 0.14, 0.68, 0.53
[20:59:05] Coren, that was Silke's table, wasn't it?
[20:59:13] Oh, so it was!
[20:59:31] Well, then, sumanah was clearly /not/ insulted. :-)
[20:59:33] RECOVERY Disk Space is now: OK on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: DISK OK
[20:59:53] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory
[21:00:04] ha! I think yeah that was Silke's
[21:00:12] RECOVERY Free ram is now: OK on you-should-delete-this.pmtpa.wmflabs 10.4.1.80 output: OK: 89% free memory
[21:00:15] and I think she is even harder to insult than I am :-)
[21:01:14] "you-should-delete-this?" :D
[21:01:25] did someone really name an instance that?
[21:02:11] Ryan_Lane: Clearly, it should be deleted. :-)
[21:02:26] Ryan_Lane: Me, just a minute ago :)
[21:02:30] :D
[21:02:32] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 17% free memory
[21:02:51] * andrewbogott deletes it
[21:07:13] PROBLEM Total processes is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS WARNING: 152 processes
[21:07:14] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory
[21:11:52] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 13% free memory
[21:16:06] [bz] (RESOLVED - created by: Nemo, priority: Low - trivial) [Bug 43941] A few more duplicates - https://bugzilla.wikimedia.org/show_bug.cgi?id=43941
[21:17:13] RECOVERY Total processes is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS OK: 150 processes
[21:18:35] [bz] (RESOLVED - created by: Nemo, priority: Low - trivial) [Bug 44149] Remove deckle.co.za - https://bugzilla.wikimedia.org/show_bug.cgi?id=44149
[21:25:47] Is it already possible to use databases on the Webtools project?
[21:27:12] Darkdadaah: Not that I know of.
[21:31:44] Darkdadaah: not yet
[21:32:41] Then don't worry, I can wait :)
[21:33:52] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 2.57, 3.66, 4.70
[21:34:09] andrewbogott: so, alas, the quorum feature does indeed look like it requires replica=3
[21:34:39] What kind of hardware is labstore(*)?
[21:34:55] dell
[21:35:17] Ah, don't we have several more of those sitting unused? Like, the old virt1, virt2, etc?
[21:35:33] not the same kind of box
[21:35:43] oh, ok. dang
[21:35:54] these use SATA disks and have a lot more of them
[21:38:13] PROBLEM Total processes is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS WARNING: 154 processes
[21:38:14] PROBLEM Total processes is now: WARNING on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS WARNING: 151 processes
[21:44:02] PROBLEM Total processes is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: PROCS WARNING: 153 processes
[21:51:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 8.12, 7.55, 6.03
[21:59:02] RECOVERY Total processes is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: PROCS OK: 150 processes
[22:03:12] RECOVERY Total processes is now: OK on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS OK: 150 processes
[22:11:13] PROBLEM Total processes is now: WARNING on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS WARNING: 151 processes
[22:17:20] I can't seem to connect through ssh anymore :/
[22:17:53] Darkdadaah: to where?
[22:18:02] bastion
[22:18:08] * gwicke too, to both bastion and parsoid.wmflabs.org
[22:18:08] hm
[22:18:34] internal ssh connections also seemed to fail
[22:18:56] * Ryan_Lane sighs
[22:19:01] let me debug
[22:19:06] I've been deleting some projects
[22:19:26] it's possible that that is causing some issues
[22:19:31] * Damianz waits for Ryan_Lane to realise he deleted bastion :D
[22:19:38] I did not do that ;)
[22:19:51] I can hit bastion, FWIW, but parsoid-spof doesn't work for me
[22:20:10] ssh_exchange_identification: Connection closed by remote host
[22:20:17] from bastion
[22:20:18] yeah
[22:20:50] When I first started typing this, I at least managed to get the intro message, but not anymore.
[22:21:06] I know. I'm taking a look
[22:21:24] ah
[22:21:25] I'm messing with the gluster access-permission script, so it's possible I screwed up access.
[22:21:25] gluster
[22:21:29] Although it doesn't look like I did
[22:22:25] Hm... Is it problematic if a cpu is at 100% "wait"? http://ganglia.wmflabs.org/latest/?c=wiktionary-tools&h=wiktionary-dev&m=cpu_report&r=hour&s=descending&hc=4&mc=2
[22:22:43] I need to restart the glusterd services
[22:23:33] PROBLEM SSH is now: CRITICAL on bastion1.pmtpa.wmflabs 10.4.0.54 output: Server answer:
[22:25:02] !logs
[22:25:02] logs http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-labs
[22:25:03] Ryan_Lane: Once gluster is back (and presuming I didn't break it), a couple of reviews for you: https://gerrit.wikimedia.org/r/49375 and https://gerrit.wikimedia.org/r/49916
[22:26:23] PROBLEM Current Load is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: WARNING - load average: 9.99, 9.21, 6.01
[22:31:54] ok. fixed
[22:32:02] Darkdadaah: mind trying now?
[22:32:14] gwicke, marktraceur: same. please try now
[22:32:45] Seems good
[22:32:46] Ryan_Lane: works for me, thanks!
[22:32:46] now it works
[22:32:49] great
[22:32:52] Ryan_Lane: perfect
[22:33:22] hey, for once a gluster issue didn't result in a long outage
[22:33:22] I hate this filesystem
[22:33:32] RECOVERY SSH is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[22:36:52] PROBLEM Current Load is now: CRITICAL on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: Connection refused by host
[22:41:22] RECOVERY Current Load is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: OK - load average: 0.03, 1.32, 3.68
[22:41:52] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 3.91, 2.31, 0.98
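A rough sketch of the restart-and-verify step Ryan describes above; the exact hosts and volume name are not given in the log and are placeholders here.

    # on each affected gluster storage server
    sudo service glusterd restart
    # then confirm every brick reports online again
    sudo gluster volume status <volume>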
[23:05:17] Whoever created that build system needs to be taken out back and shot.
[23:06:27] Coren: now, now
[23:06:52] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 4.20, 4.21, 4.82
[23:07:17] sumanah: Heh. Sorry, that's a rule of mine: if your software takes several hours to build, you should be able to /resume/ a build after an error, not be forced to start over.
[23:07:47] sure. I just don't much love violent talk, especially when it might be directed at a colleague
[23:08:52] sumanah: It was a promise of hyperbole, not violence. "taken out back and (shot|put out of its misery)", as far as I can tell, has never been seriously applied to humans. :-)
[23:09:42] sumanah: I have strong feelings about proper coding practice, but not /that/ strong. :-P
[23:10:34] sumanah: I can promise you that the most violent I get is a nasty glare.
[23:10:38] OK!
[23:11:08] And even then, I've been said to be about as intimidating as an irate shih-tzu.
[23:11:44] sumanah: Remind me to tell you about LART-1 and LART-2 at the first occasion we have to chat over a $BEVERAGE. :-)
[23:12:21] oh I've heard of LARTs, Mr BOfH
[23:13:25] Heh. I'm referring to to specifically tangible objects in my office at Above Security. It's an amusing anecdote, best laughed at over an intoxicating beverage.
[23:13:33] to two*
[23:17:32] RECOVERY Current Load is now: OK on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 4.27, 4.13, 4.83
[23:20:12] PROBLEM Current Load is now: WARNING on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: WARNING - load average: 12.89, 11.82, 5.45
[23:23:32] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 171 processes
[23:24:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 164 processes
[23:33:32] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 133 processes
[23:46:32] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 171 processes
[23:50:52] PROBLEM Current Load is now: CRITICAL on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: CRITICAL - load average: 64.30, 32.11, 18.04
[23:52:12] PROBLEM Total processes is now: WARNING on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS WARNING: 152 processes
[23:55:52] PROBLEM Current Load is now: WARNING on bots-sql1.pmtpa.wmflabs 10.4.0.52 output: WARNING - load average: 1.53, 13.23, 13.79