[00:11:16] restarted glusterd services. short public key outage
[00:11:38] I hate how glusterd manages that service
[00:13:30] <^demon> s/(how|manages that service)//
[00:13:36] indeed
[00:14:19] RAWR
[00:16:31] god damn it
[00:17:51] if a single glusterd service is down, none of them can export nfs
[00:18:01] what a giant piece of shit gluster is
[00:25:52] :<
[00:25:58] this is really pissing me off
[00:28:35] Ryan_Lane: just to check, will this impact instances that are still on lucid?
[00:28:56] well, this is specifically the ssh keys
[00:29:00] it only affects login
[00:29:16] everything else should continue to work as expected
[00:29:24] Ryan_Lane: When will it be resolved? I was in the middle of a deploy.
[00:29:30] a deploy?
[00:29:32] in labs?
[00:29:34] To a labs instance.
[00:29:37] ah
[00:29:40] hopefully soon
[00:29:48] no fucking clue why nfs won't mount
[00:29:51] it's showing in showmount -e
[00:30:00] and services are running in rpcinfi -p
[00:30:03] rpcinfo -p
[00:30:25] this is what I get for trying to work on gluster
[00:30:29] it only ever fucks me
[00:41:49] hi, how do I log into bastion? (I see the issue has been discussed on labs-l, but there was no fix described)
[00:42:35] ssh bastion2.wmflabs.org
[00:43:03] .. oh, nope. I swear that was working earlier.
[00:43:09] hm
[00:43:12] *now* it's a gluster issue
[00:43:20] Krenair: permission denied for me
[00:43:25] at least it's responding :p
[00:43:27] same. I realised after I sent the message :p
[00:43:38] I restarted glusted services to unlock the daemon and now it's nfs services are fucked
[00:45:06] so tl;dr no way into labs for now?
[00:45:10] correct
[00:45:13] until I fix this
[00:45:24] do we get paid time off?
[00:45:29] :)
[00:57:51] ah ha
[00:57:54] ok
[00:57:58] keys are back up
[00:58:06] had to kill all nfs daemons and restart glusterd again
[00:58:07] however....
[00:58:11] bastion1 is hung
[00:58:18] I think lucid may have an issue with nfs
[00:58:37] bastion2/3 will work?
[00:58:39] yep
[00:58:48] yay
[00:59:05] dschoon: login should work again
[00:59:12] gribeco, ^
[00:59:22] nope
[00:59:22] I think I'm going to rebuild bastion as precise
[00:59:25] not for me :(
[00:59:25] ah, I'll give it a try
[00:59:30] dschoon: which instance?
[00:59:35] legoktm@bastion2:~$ ssh legoktm@bots-3
[00:59:35] If you are having access problems, please see: https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[00:59:35] Permission denied (publickey).
[00:59:35] kripke
[00:59:45] boxofjuice, did you connect to bastion with -A?
[00:59:48] I got into bastion2!
[00:59:59] Also you won't need the legoktm@ bit when you're already legoktm :)
[01:00:02] bots-3 has public/keys
[01:00:03] Krenair: no, lemme try that
[01:00:09] it should definitely work
[01:00:17] Krenair: thanks that worked
[01:00:24] dschoon: try again?
[01:00:24] Ryan_Lane: seems i needed to set -A
[01:00:28] boxofjuice: yeah
[01:00:29] trying
[01:00:40] dschoon: i just ls'd in /public/keys and it's there
[01:00:46] hm didn't know you could forget '.pmtpa.wmflabs' and still have it work
[01:01:15] bastion1.pmtpa.wmflabs gives me conn closed
[01:01:27] oh
[01:01:27] which bastion should i use?
[01:01:28] yeah, as mentioned bastion1 is broken
[01:01:31] one sec
[01:01:33] I'll reboot it
[01:01:37] I'm going to rebuild it as precise
[01:01:40] As I said bastion2 works
[01:01:44] ...while you do that
[01:01:47] is there another?
[01:02:15] ......... Oh well I sort of said that. I realised I was wrong afterwards ;)
[01:02:46] dschoon: bastion2.wmflabs.org and bastion3.wmflabs.org
[01:02:49] okay.
[01:04:11] so, as a positive, it looks like glusterfs isn't crapping itself every few minutes in the dmesg
[01:04:35] and when I rebuild bastion as precise it should recover itself from an nfs outage
[01:08:30] Ryan_Lane: looks good with bastion2
[01:08:30] ty
[01:08:35] yw
[01:08:39] now to solve the issue on bastion1
[01:09:40] Ryan_Lane: is there a reason adding a new hostname to an instance wouldn't be updating DNS right now?
[01:10:01] did you check for the name before you added it?
[01:10:48] what's the hostname/ip?
[01:11:15] usually this ends up being negative cache, but I can check
[01:11:20] dschoon: ^^
[01:12:45] ee-dashboard
[01:12:48] ^^ Ryan_Lane
[01:12:54] .wmflabs.org?
[01:13:12] ;; ANSWER SECTION:
[01:13:12] ee-dashboard.wmflabs.org. 3600 IN A 208.80.153.237
[01:13:53] I'd imagine you're hitting negative cache
[01:14:28] yeah, I'm hitting the metrics dashboard when visiting it
[01:15:21] dschoon: ^^
[01:15:44] weird.
[01:15:48] maybe it's my local DNS caceh.
[01:15:49] *cache
[01:15:51] http://kripke.wmflabs.org/ wfm
[01:15:53] yep
[01:16:02] ty
[01:16:06] either local, or your resolver
[01:16:08] yw
[01:16:22] it's a 1 hour negative
[01:59:35] Die Gluster Die!
[02:01:11] Well, actually, that's what it keeps doing. :-/
[02:03:52] heh
[02:03:55] yep
[02:04:12] !log bastion rebooting bastion-restricted
[02:04:13] Logged the message, Master
[02:04:52] On the positive side, I got a fairly performing DB up. Mariadb >> plain mysql
[02:05:04] * Coren still prefers postgres.
[02:08:29] Ryan_Lane: bastion2 doesn't do forwarding?
[02:08:43] liangent: it does. are you having an issue?
[02:09:34] yes. http://dpaste.com/1014641/
[02:10:14] oh. port forwarding
[02:10:59] debug1: getpeername failed: Bad file descriptor
[02:11:05] liangent: what are you trying to do?
[02:11:39] ssh -a -W instance-proxy.eqiad.wmflabs:22 bastion1.eqiad.wmflabs -vvv
[02:12:11] wait. is .eqiad.wmflabs valid?
[02:12:12] are you trying to do proxycommand?
[02:12:15] nope :)
[02:12:18] it's pmtpa
[02:12:27] we don't have the eqiad zone up yet
[02:12:44] its listed in https://wikitech.wikimedia.org/wiki/Access#Using_ProxyCommand_ssh_option
[02:12:48] yeah
[02:12:52] and listed as bastion2 section
[02:12:55] that will eventually be the bastion there
[02:13:02] ah. whoops
[02:13:13] let me update that
[02:13:13] so I imagine bastion2 is for eqiad
[02:13:45] well, it was going to be
[02:14:00] I guess we'll name it something else
[02:16:17] ok it works now
[02:17:08] so bastion[1-3] are identical currently?
[02:27:18] liangent: yep
[02:47:05] any docs about Special:OATH?
[02:52:35] liangent: not really. I need to add some docs and such
[05:52:35] Ryan_Lane: ping : you have a PM
[06:07:18] petan: can you install "libmysqlclient-dev" on bots-bnr1?
[06:07:31] i need it to compile python-oursql
[06:41:23] Ryan_Lane: you there?
[06:44:31] Krinkle: yep
[06:44:41] aaron and I fixed the search, btw
[06:45:02] Ryan_Lane: apergos is telling my my cron on a vm is spamming all of ops every 15 minutes
[06:45:12] how awesome is that
[06:45:19] heh
[06:45:29] any idea which instance, or what cron?
[06:45:52] most labs instances send a ton of cron spam
[06:45:55] Ryan_Lane: So, aside from the issue that the default for root crons in labs shouldn't go to all of ops (lol!), I've got an idea, based on how things go with MMP at toolserver.
[06:46:04] Ryan_Lane: etc/crontab on cvn-app1
[06:46:10] I just edited it to go to /dev/null instead
[06:46:29] we can't do anything about mail until we have a exim relay for labs
[06:46:33] apparently the default for stdout in cron there from root is to go to ops
[06:46:42] The idea is as follows
[06:46:43] it's been on mark/faidon's radar for ages
[06:46:50] set up an email alias on wmflabs for nova projects
[06:47:02] e.g. cvn@mail.wmflabs.org would go to members of https://wikitech.wikimedia.org/wiki/Nova_Resource:Cvn
[06:47:16] yep.
this is basically how we're going to handle it
[06:47:19] (their LDAP mailaddress, assuming that's mandatory)
[06:47:25] yep
[06:47:33] and secondly, aside from that being a useful thing. set it as default for things like cron spam.
[06:47:36] we're going to send cronspam to project admins
[06:47:41] great
[06:47:42] not regular users
[06:47:57] we'll likely make an alias for regular users too
[06:48:06] Hm.. is project admins new?
[06:48:09] I haven't see that before
[06:48:18] since it would be useful to be able to send email to all members of a project
[06:48:24] we merged netadmin and sysadmin
[06:48:29] I created the cvn project in labs, I'd consider myself an admin (create/delete instances, assign ips etc.)
[06:48:39] but I see that section is empty
[06:48:48] yeah. that makes you a "projectadmin"
[06:48:55] and novaadmin is in there, what's that doing there?
[06:49:14] essex and above of openstack have no concept of global admin
[06:49:31] I think havana, or maybe grizzly will add that back in
[06:49:42] so, we have to have a single user that's in every project
[06:49:46] as projectadmin as well
[06:50:10] we should likely be hiding it from the interface, but it's not a major issue
[06:50:19] ok, I don't mind the "novaadmin" in the list, but I mean Project admins.
[06:50:22] Why is that empty for Cvn?
[06:50:30] it's empty?
[06:50:32] let me see
[06:51:06] it's not empty
[06:51:16] it has four members
[06:51:53] Azariv, Krinkle, novaadmin, Sactage
[06:51:55] oohhhhhhh
[06:52:00] you mean on:
[06:52:03] !resource cvn
[06:52:04] https://labsconsole.wikimedia.org/wiki/Nova_Resource:cvn
[06:52:10] members, yes.
[06:52:14] but the "admins" list is empty?
[06:52:27] yes, because that's currently broken
[06:52:27] I know I'm a project admin (or so I guess)
[06:52:41] k
[06:52:42] I should remove it from the interface till I have it working again
[06:53:41] I wonder if I have an open bug on that
[06:53:46] I'm pretty sure I do
[06:54:14] http://www.mediawiki.org/wiki/Wikimedia_Labs/Account_creation_improvement_project#Current_account_creation_process
[06:54:20] https://bugzilla.wikimedia.org/show_bug.cgi?id=43515
[06:54:29] @notify petan
[06:54:29] This user is now online in #huggle so I will let you know when they show some activity (talk etc)
[06:54:32] it's probably not a ton of work
[06:55:32] and it's even easier now that it's just one group
[06:56:23] Change on mediawiki a page Wikimedia Labs/Account creation improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656319 edit summary: [-17] /* Current account creation process */
[06:56:48] Change on mediawiki a page Wikimedia Labs/Account creation improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656320 edit summary: [-257]
[06:56:58] Hooray, crossed an entire section out!
[06:57:26] lol
[06:58:27] :)
[06:58:41] Ryan_Lane: any chance you can install a package on bots-bnr1 for me?
[06:58:47] Change on mediawiki a page Wikimedia Labs/Account creation improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656321 edit summary: [-182] /* OpenID as a provider */
[06:58:53] and a bunch of bugs crossed off too!
[06:59:11] boxofjuice: which package?
[06:59:23] "libmysqlclient-dev"
[06:59:44] i need it to build oursql
[06:59:54] oursql?
[07:00:10] its a python package
[07:00:20] library for mysql, replacement for MySQLdb
[07:00:29] * Ryan_Lane nods
[07:00:29] ok
[07:00:49] !log bots installed libmysqlclient-dev on bots-bnr1
[07:00:51] Logged the message, Master
[07:01:10] ty :)
[07:01:30] yw
[07:02:41] I really need to finish puppet stuff for bots for this
[07:02:45] * Damianz sigh
[07:03:36] Ryan_Lane: regarding the mail alias to nova project group, open bug for that or want me to create one? and the cron going to root/ops bug, and the enhancement to make it go to project (once the first thing is implemented)
[07:04:36] cron going to roots is just funny... though we need a relay to direct that to project admins :(
[07:04:40] I think there's already an RT for it
[07:04:45] a really, really old one :(
[07:05:11] ok. removed admins for now from the project pages
[07:06:41] [bz] (NEW - created by: Krinkle, priority: Unprioritized - major) [Bug 45827] Labs: Mails from cron should not go to ops-wikimedia - https://bugzilla.wikimedia.org/show_bug.cgi?id=45827
[07:07:57] [bz] (NEW - created by: Krinkle, priority: Unprioritized - normal) [Bug 45828] Labs: Implement mail aliases for projects (@wmflabs.org) - https://bugzilla.wikimedia.org/show_bug.cgi?id=45828
[07:09:21] [bz] (NEW - created by: Krinkle, priority: Unprioritized - enhancement) [Bug 45829] Labs: Let mail (from cron and perhaps other defaults) go to project forwarder by default - https://bugzilla.wikimedia.org/show_bug.cgi?id=45829
[07:09:50] Ryan_Lane: Merge https://gerrit.wikimedia.org/r/#/c/52350/ please :p
[07:10:03] heh
[07:10:05] it scares me
[07:10:08] let me look at it tomorrow
[07:10:43] some things are poorly named Nova right now
[07:10:48] controller, for instance
[07:11:00] though really I probably just need to pull the keystone stuff out of it
[07:11:05] and the rest call stuff
[07:11:17] haha
[07:11:37] I'll probably merge it through tomorrow, then refactor afterwards
[07:12:56] Krinkle: thanks for adding in the bugs
[07:13:04]
yw
[07:13:04] this is quite unique right now this moment
[07:13:11] what is?
[07:13:28] so, Ryan in SF, 11:12PM. Reedy 7.12AM (night, morning? not sure how you feel), Krinkle, 8.12AM (end of night)
[07:13:40] reedy is in SF ;)
[07:13:46] reeeeeally
[07:13:50] yep
[07:13:50] My laptop says it's tomorrow though
[07:13:54] heh
[07:13:56] So does my IRC client
[07:13:58] I found it unusually late indeed, even for Reedy
[07:14:01] ;)
[07:14:08] late? Meh I just got out of bed :P
[07:14:30] Change on mediawiki a page Wikimedia Labs was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656324 edit summary: [+75] /* Proposals */
[07:16:52] Change on mediawiki a page Wikimedia Labs/Interface usability improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656326 edit summary: [-609]
[07:17:13] Change on mediawiki a page Wikimedia Labs/Communication improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=656327 edit summary: [+609]
[07:23:30] Ryan_Lane: So it's final that there isn't going to be a separate "Test/Dev Labs" and "Tool Labs", right?
[07:23:46] I'm about to update stuff to reflect that
[07:23:51] tools will be a labs project
[07:24:01] and "Test/Dev Labs" is a term no longer to be used.
[07:24:28] well, those terms were meant more as projects
[07:24:36] they were always meant to be in the same infrastructure
[07:24:51] sure
[07:25:06] phase 1 was test/dev labs (which is what's currently running) and tool labs is phase 2
[07:25:18] ok
[07:25:19] which Cor en is so unlucky to be doing :)
[07:25:32] I'll leave it as is for now then
[07:25:35] kidding it'll be fun :)
[07:26:00] thanks
[07:26:50] btw, I got 2 small random bots I'd like to migrate from my krinkle@willow.toolserver crontab to labs. Is bots still the place to go or is tools ready for action already?
(I know you don't manage that, but Coren is asleep and I figured you are aware of how it progresses)
[07:27:16] hm. may want to ask coren on progress
[07:27:20] k
[07:27:22] I have a feeling bots project is still the spot
[07:28:15] Ryan_Lane: interesting, looks like does properly display projectadmin
[07:28:25] what does?
[07:28:54] http://cl.ly/image/3Z0t1v1N340H
[07:28:58] Special:NovaProject *
[07:29:20] yeah, it displays properly in manage projects
[07:29:29] but not on project pages
[07:29:48] I didn't want to make project pages dynamic pull from ldap
[07:29:57] as that's just waiting to be abused
[07:30:12] doesn't it already do that for members?
[07:30:17] so, when projects are modified, OpenStackManage updates the project page
[07:30:27] oh, I see.
[07:30:27] it's doing it for members, but not for projectadmins
[07:30:29] it is hardcoded in wikitext
[07:30:32] yeah
[07:30:47] the only thing that updates the memberships is wikitech, so it's not an issue
[07:31:17] if that ever moves to keystone, we'll make a keystone plugin that updates mediawiki with the info
[07:31:22] like we do for nova
[07:34:45] hell, let me see if I can just push this in now
[11:22:40] !log bots enforcing new security rules for sql and passwords storage
[11:22:42] Logged the message, Master
[11:44:31] Damianz ping
[11:44:43] petan: pong
[11:44:55] Damianz when you want to move cluebot to bsql?
[11:45:13] petan: Erm... tomorrow night maybe, I've got to work late tonight
[11:45:16] that would actually allow us to turn off sql2 :P
[11:45:18] mhm ok
[12:26:05] Silke_WMDE: about that tool list... am I expected to list all my bots one by one?
[12:29:53] * addshore dislikes interwiki bots
[12:30:45] liangent: that would be best. how many bots so you run?
[12:33:21] Silke_WMDE: eight. different invocations (with different arguments) of the same script are counted as one
[12:33:44] I would say so, yes
[12:35:30] Silke_WMDE: so one entry per script is ok?
[12:35:48] yes
[12:35:53] Silke_WMDE: hmm I also have a webtool, which consists of some smaller apps. one entry or multiple?
[12:36:10] I already added it as one -- am I expected to expand it now? https://www.mediawiki.org/w/index.php?title=Toolserver%2FList_of_Tools&diff=656432&oldid=656425
[12:36:40] no, i think it's ok
[12:57:32] Silke_WMDE: and among those bots, one does nothing itself, and it depends on all configured "plugins". should I list those plugins?
[12:58:15] yes, please list them as dependencies
[12:58:48] Silke_WMDE: eh the "framework" and "plugins" are all written by myself.
[12:58:57] it's coded in this way for easier maintenance
[12:59:01] ah
[12:59:21] then please list them separately
[12:59:23] each "plugin" does something completely different from others
[14:21:36] Top 10 bad starts to a day: (3) drive failure on your primary desktop
[14:22:43] what's (2) and (1)
[14:22:43] ?
[14:23:34] Not sure, but I expect "being woken up by a phone call announcing the death of a family member" should be up there. :-)
[14:34:14] * Coren fetches caffeine.
[16:07:11] addshore we should restart sqld
[16:07:18] why?
[16:07:23] addshore I found out we really need to restart it for it to read config file
[16:07:32] so as we changed the sizes it never took effect
[16:07:49] I did a sneaky restart the other day :) but if you have made more changes go for it :)
[16:07:51] it eats around 7gb out of 16
[16:07:55] aha
[16:08:02] I configured it to use 10GB for innodb
[16:08:03] I did today increased the buffer to 12gb
[16:08:06] oh lol
[16:08:09] I changed that to 12
[16:08:10] :P
[16:08:12] kk :)
[16:08:13] mhm
[16:08:26] I think I was going to have it at 12 also
[16:08:26] for some reason it uses very little ram right now
[16:08:31] I don't know whty
[16:08:42] no requetss running on it atm
[16:10:13] ok but it should be full anyway
[16:10:20] so that it doesn't need to read from disk
[16:13:27] gribeco generates lot of errors in sql :/
[16:13:31] dunno what's up
[16:15:12] !log tools can haz database (support for user/tool databases in place)
[16:15:13] Logged the message, Master
[16:15:36] petan: if you want to restart it, restart it now :)
[16:15:40] Coren how do you maintain db creation
[16:15:52] addshore idk wouldn't it crash some bots?
[16:15:56] more people use it
[16:16:01] * addshore checks
[16:16:20] petan: At this time, this is part of my create-a-tool script. Eventually (read: real soon) it will be done via the wikitech interface.
[16:16:24] Threads: 8 Questions: 1527517 Slow queries: 3 Opens: 274 Flush tables: 1 Open tables: 165 Queries per second avg: 16.048
[16:16:35] addshore let's wait
[16:16:47] looks like we will have to schedule a restart in a week or so :)
[16:16:48] we will restart it one day...
I believe labs will crash soon again :D
[16:16:50] then it will restart
[16:16:53] yee xD
[16:17:07] just about to run my bot now :)
[16:17:15] 16 queries per second :D
[16:17:20] addshore how you get these stats
[16:17:25] from information schema
[16:17:28] loginto mysql
[16:17:28] or you got a tool
[16:17:33] and type "status" ;p
[16:17:41] yay
[16:19:25] mhm Coren what about that procedure I posted :P or some self-service script
[16:19:31] so that users could create db's using terminal
[16:19:41] * petan hates web interfaces
[16:20:10] also, what if I wanted to create a DB and give permissions to 5 more people to it
[16:20:38] petan: 'with grant option' is given.
[16:20:50] ok, how do you handle database dropping then?
[16:20:57] because when you drop a db, grants remain
[16:21:19] so, if someone recreated a db which existed in past, they might have a security problem
[16:21:23] petan: said interface will also clean up the grants.
[16:21:39] ok but if I have grants
[16:21:43] I don't need to use your interface
[16:21:51] one could just execute drop database
[16:22:19] petan: Yes, you get to shoot yourself in the foot. If you do so, you are welcome to use the hole for any purpose you like.
[16:22:40] petan: The only one you're going to affect is yourself.
[16:22:49] right
[16:24:06] I'm not trying to protect the maintainers from themselves; I'm trying to protect their tools from being broken by others. If you break your own tool, you get to keep the pieces. :-)
[16:24:23] ok
[16:26:04] But there's not going to be a way around having to use the wikitech interface sometimes; it alone holds the right to play in the labs-wide LDAP
[16:26:24] :/
[16:26:32] that's sad
[16:26:35] Although, I would expect ~5 minutes of use over the lifetime of a tool.
[16:26:57] poor people with no graphical interfaces
[16:27:15] I heard about people who don't even run X server
[16:27:16] :D
[16:27:22] on their home computers lol
[16:27:38] I believe Ryan is one of them
[16:27:47]
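
The stale-grant concern raised around 16:20 ("when you drop a db, grants remain") can be sketched roughly as follows. This is only a toy in-memory model, not a MySQL client and not the create-a-tool script or wikitech interface mentioned in the log; `ToyServer`, the database name `tool_a`, and the users `alice` and `bob` are hypothetical names made up for illustration.

```python
class ToyServer:
    """Toy stand-in for a database server (hypothetical, illustration only)."""

    def __init__(self):
        self.databases = set()
        # Maps database name -> set of users holding privileges on it.
        self.grants = {}

    def create_database(self, name):
        self.databases.add(name)

    def grant(self, name, user):
        self.grants.setdefault(name, set()).add(user)

    def drop_database(self, name):
        # Models the behaviour described in the log: the database goes away,
        # but the grants recorded for its name are left in place.
        self.databases.discard(name)

    def drop_database_and_grants(self, name):
        # Models a drop path that also cleans up the grants.
        self.drop_database(name)
        self.grants.pop(name, None)

    def can_access(self, name, user):
        return name in self.databases and user in self.grants.get(name, set())


def demo():
    server = ToyServer()
    server.create_database("tool_a")
    server.grant("tool_a", "alice")
    server.grant("tool_a", "bob")

    # Plain drop, then someone recreates a database with the same name.
    server.drop_database("tool_a")
    server.create_database("tool_a")
    stale = server.can_access("tool_a", "bob")  # old grant still applies

    # Drop with cleanup, then recreate again.
    server.drop_database_and_grants("tool_a")
    server.create_database("tool_a")
    clean = server.can_access("tool_a", "bob")  # no leftover grant

    return stale, clean


if __name__ == "__main__":
    print(demo())
```

In this model the first recreate leaves `bob` with access to the new `tool_a`, which is the scenario petan describes; the second path corresponds to Coren's point that the interface would clean up the grants when dropping.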