[00:57:33] sitematrix on labs apparently can't count
[00:57:36] says there's 459 sites
[00:57:43] only 457 http:// on the page..
[01:28:58] Reedy: when talking about beta, use beta, not labs ;)
[01:29:25] petan: can you please rename that central wiki?
[01:29:36] it makes things ultra confusing
[02:02:11] PROBLEM Puppet freshness is now: CRITICAL on shop-analytics-main1 i-000001d0 output: Puppet has not run in the last 10 hours
[02:20:21] PROBLEM Free ram is now: WARNING on bots-2 i-0000009c output: Warning: 19% free memory
[02:40:17] RECOVERY Free ram is now: OK on bots-2 i-0000009c output: OK: 22% free memory
[03:03:07] RECOVERY Total Processes is now: OK on udp-filter i-000001df output: PROCS OK: 84 processes
[03:03:15] RECOVERY dpkg-check is now: OK on udp-filter i-000001df output: All packages OK
[03:04:27] RECOVERY Current Load is now: OK on udp-filter i-000001df output: OK - load average: 0.19, 0.20, 0.09
[03:04:57] RECOVERY Current Users is now: OK on udp-filter i-000001df output: USERS OK - 1 users currently logged in
[03:05:47] RECOVERY Disk Space is now: OK on udp-filter i-000001df output: DISK OK
[03:06:27] RECOVERY Free ram is now: OK on udp-filter i-000001df output: OK: 86% free memory
[03:41:08] PROBLEM Puppet freshness is now: CRITICAL on hugglewa-1 i-000001dc output: Puppet has not run in the last 10 hours
[03:41:28] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 12% free memory
[03:41:28] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 14% free memory
[03:51:08] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 14% free memory
[03:56:09] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 16% free memory
[03:56:28] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory
[04:01:28] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory
[04:01:28] PROBLEM Free ram
is now: WARNING on test-oneiric i-00000187 output: Warning: 16% free memory
[04:06:28] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 93% free memory
[04:11:09] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 4% free memory
[04:16:08] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 96% free memory
[04:16:08] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 4% free memory
[04:21:18] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 97% free memory
[04:21:18] PROBLEM Free ram is now: CRITICAL on test3 i-00000093 output: Critical: 5% free memory
[04:26:18] RECOVERY Free ram is now: OK on test3 i-00000093 output: OK: 96% free memory
[05:38:49] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 1.56, 5.02, 2.89
[05:43:49] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 0.07, 1.91, 2.11
[06:39:14] RECOVERY Disk Space is now: OK on aggregator1 i-0000010c output: DISK OK
[06:47:14] PROBLEM Disk Space is now: CRITICAL on aggregator1 i-0000010c output: DISK CRITICAL - free space: / 0 MB (0% inode=93%):
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on incubator-web i-00000198 output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on kripke i-000001aa output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on labs-lvs1 i-00000057 output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on nova-daas-1 i-000000e7 output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on orgcharts-dev i-0000018f output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now: CRITICAL on pad1 i-00000066 output: Puppet has not run in the last 10 hours
[07:38:04] PROBLEM Puppet freshness is now:
CRITICAL on patchtest2 i-000000fd output: Puppet has not run in the last 10 hours
[07:38:05] PROBLEM Puppet freshness is now: CRITICAL on pediapress-puppetmaster i-000001af output: Puppet has not run in the last 10 hours
[07:38:05] PROBLEM Puppet freshness is now: CRITICAL on puppet-lucid i-00000080 output: Puppet has not run in the last 10 hours
[07:38:06] PROBLEM Puppet freshness is now: CRITICAL on reportcard1 i-000000a8 output: Puppet has not run in the last 10 hours
[07:38:06] PROBLEM Puppet freshness is now: CRITICAL on search-test i-000000cb output: Puppet has not run in the last 10 hours
[07:38:07] PROBLEM Puppet freshness is now: CRITICAL on test-oneiric i-00000187 output: Puppet has not run in the last 10 hours
[07:43:59] andrewbogott: I created a test acc
[07:44:01] works
[07:52:32] andrewbogott: can you please insert git-core as default for all instances on labs?
[07:59:44] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 67 MB (5% inode=53%):
[09:17:00] !log .
[09:17:00] Message missing. Nothing logged.
[09:40:16] petan: do you have write access to labs LDAP to fix my real name ? :-D
[10:31:50] hashar: try to change it in mediawiki
[10:32:31] mutante has write access to it
[10:32:49] I did change my name in MW
[10:32:56] was just wondering if you add some write access
[10:33:01] will ask later tonight :)
[10:33:03] thanks!
[10:33:04] no I have no access heh
[10:33:08] :P
[10:33:12] don't worry
[10:36:32] lunch time.
++
[10:37:19] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on incubator-web i-00000198 output: Puppet has not run in the last 10 hours
[10:37:34] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on kripke i-000001aa output: Puppet has not run in the last 10 hours
[10:37:49] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on nova-daas-1 i-000000e7 output: Puppet has not run in the last 10 hours
[10:38:04] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on orgcharts-dev i-0000018f output: Puppet has not run in the last 10 hours
[10:38:19] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on pad1 i-00000066 output: Puppet has not run in the last 10 hours
[10:40:25] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on labs-lvs1 i-00000057 output: Puppet has not run in the last 10 hours
[10:40:40] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on patchtest2 i-000000fd output: Puppet has not run in the last 10 hours
[10:40:55] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on reportcard1 i-000000a8 output: Puppet has not run in the last 10 hours
[10:40:55] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on search-test i-000000cb output: Puppet has not run in the last 10 hours
[10:41:10] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on shop-analytics-main1 i-000001d0 output: Puppet has not run in the last 10 hours
[10:41:25] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on test-oneiric i-00000187 output: Puppet has not run in the last 10 hours
[10:41:55] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on pediapress-puppetmaster i-000001af output: Puppet has not run in the last 10 hours
[10:42:10] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on test3 i-00000093 output: Puppet has not run in the last 10 hours
[10:42:25] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on testblog i-00000167 output: Puppet has not run in the last 10 hours
[10:42:25] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on utils-abogott i-00000131 output: Puppet
has not run in the last 10 hours
[11:25:25] PROBLEM Current Load is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:26:05] PROBLEM Current Users is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:26:45] PROBLEM Disk Space is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:27:25] PROBLEM Free ram is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:28:35] PROBLEM Total Processes is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:28:47] @search http
[11:28:47] Results (found 70): morebots, nagios, bot, labs-home-wm, labs-nagios-wm, labs-morebots, gerrit-wm, wiki, labs, extension, wm-bot, gerrit, change, revision, monitor, alert, unicorn, bz, os-change, instancelist, instance-json, amend, queue, sal, info, access, keys, bug, blueprint-dns, bots, rt, pxe, group, pathconflict, terminology, etherpad, nova-resource, pastebin, osm-bug, bastion, initial-login, manage-projects, rights, puppet, projects, quilt, labs-project, openstack-manager, wikitech, load, load-all, wl, docs, instance, address, ssh, documentation, start, link, socks-proxy, requests, gitweb, labsconf, resource, security, project-discuss, putty, git, port-forwarding, logs,
[11:28:54] @search nagios
[11:28:54] Results (found 3): nagios, monitor, alert,
[11:29:04] !alert
[11:29:05] http://nagios.wmflabs.org/cgi-bin/nagios3/history.cgi?host=$1
[11:29:15] PROBLEM dpkg-check is now: CRITICAL on hugglewa-1 i-000001e0 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:38:09] New patchset: Dzahn; "labs - make common syslogs readable by normal users per RT-2712" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4157
[11:38:18] New review: gerrit2; "Change did not pass lint check.
You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (test); V: -1 - https://gerrit.wikimedia.org/r/4157
[11:43:46] New patchset: Dzahn; "labs - make common syslogs readable by normal users per RT-2712" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4157
[11:43:58] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/4157
[11:47:41] !requests
[11:47:41] this is a backlog of all requests needed to be done by ops https://labsconsole.wikimedia.org/wiki/Requests
[12:03:35] PROBLEM Free ram is now: WARNING on mobile-enwp i-000000ce output: Warning: 14% free memory
[12:08:31] mutante: https://labsconsole.wikimedia.org/wiki/Requests
[12:08:35] RECOVERY Free ram is now: OK on mobile-enwp i-000000ce output: OK: 39% free memory
[12:13:06] petan: oh, read it. well that's a bigger one. what's the difference between "system operator" and "sysadmin"?
[12:15:17] also sounds like a good reply to Ryan's most recent mail to the list re sysadmin/netadmin. i guess that made you add this?
[12:40:04] mutante: I wanted to implement it long time ago
[12:40:27] difference between sysadmin and system oper, is that oper has root, while sysadmin can control whole project, remove people etc
[12:40:37] assign groups
[12:40:39] and so
[12:40:51] system oper can have root on instances but can't manage project in labs
[12:41:03] like delete instance / create it
[12:41:47] since Ryan said that labs are going to be used as community production environment too, it's going to be needed
[12:42:12] production like projects should have restrictions for members, for security reasons
[12:42:27] yeah, i see, reasonable.
actually sysadmin is projectadmin then
[12:42:46] for example on production bots project we don't want people to be able to sudo su and read credentials of other bots etc from configs
[12:43:05] and netadmin additionally has the security groups / firewall
[12:43:07] mutante: there is no "project admin" group
[12:43:23] yes netadmin would be something like oper, for firewall
[12:43:31] sysadmin can manage project, assign groups
[12:43:38] members of netadmin can manage firewall
[12:43:54] no, just saying that that kind of name would maybe make the differences more clear
[12:43:59] ah, true
[12:44:05] maybe sysadmin could be root
[12:44:31] if we rename it
[12:44:50] ok I will post it there
[12:45:07] yeah, do that, it needs discussion
[13:00:46] mutante: ping pong are you around still ? :)
[13:01:16] mutante: could you possibly change my real name in LDAP ? It shows as "Hashar", should be "Antoine Musso" :)
[13:06:24] <^demon> I've been bugging Ryan about changing mine from "Demon."
[13:06:38] Demon is a good name
[13:10:06] I will open a bug report :-D
[13:10:17] or should it be an ops ticket
[13:10:42] or maybe mutante as me on ignore list because I keep asking him silly stuff :-D
[13:41:04] hashar: "sn", "cn", "givenName" ... It looks like you mean what is in "cn" = "preferred wiki username." But that will also be used for git/gerrit. "Preferred wiki username. This will also be the user's git username, so legal name would be reasonable "
[13:42:15] when an SVN user was created, there was also --firstname and --lastname ...
[13:46:02] looks like we did not get very creative ;)
[13:46:17] mutante: I have no idea what sn cn givenName are supposed to be :-/
[13:47:07] mutante: looks like it is the "cn"
[13:47:11] well those are values from LDAP on formey, and cn: is the capitalized Hashar
[13:47:15] by looking at brion
[13:47:27] <^demon> mutante: Some of us early users didn't get much choice since we were Ryan's guinea pigs.
[13:47:28] yeah, but what about the "also be used for git" part
[13:47:40] <^demon> Basically, it's annoying to change because we'd have to manually fix gerrit.
[13:48:12] it seems gerrit use the "cn" part as a real username when doing merge commits
[13:48:26] yep, and that made me hesitate
[13:48:31] currently my merges shows as : Hashar
[13:48:38] if it was just the Wiki user real name i would have changed
[13:50:20] <^demon> hashar: Maybe if you pester him he'll do me too ;-)
[13:50:26] <^demon> But I doubt it
[13:50:52] mutante: is there anything you are afraid of ?
[13:51:00] to me it looks like gerrit fetch the data from LDAP
[13:51:04] whenever we login
[13:52:06] i gues,you loosing your gerrit history, needing to upload a new key or something like that
[13:52:20] that would be fun ;-]
[13:52:46] does gerrit use the real name as a login ? :-(
[13:53:15] yes
[13:53:22] ohhhhh
[13:53:25] should use the kid from dn
[13:53:39] should use the uid from dn or the sn
[13:53:41] or something
[13:53:50] so I will just harass Ryan ;-]
[13:53:57] well, the sn: is not capitalized
[13:54:03] it could be that.. but i think not
[13:54:32] when logged in in gerrit i see my own user capitalized next to Settings
[13:55:29] accountPattern = (&(objectClass=person)(cn=${username}))
[13:55:29] accountFullName = cn
[13:55:37] and then my own user has givenName, and our user doesnt have that atrribute..
[13:55:41] in LDAP
[13:55:59] s/our/your
[13:56:36] I guess accountPattern should be (uid=${username})
[13:56:39] or something
[13:56:44] * mutante nods
[13:57:00] submitting a change request
[14:01:00] maybe then can enable users to change "Full Name" in Gerrit ui settings
[14:01:23] not sure if the LDAP connector would allow it
[14:02:24] na we need to fix the ldap rules in gerrit
[14:02:28] change coming
[14:02:33] https://gerrit.wikimedia.org/r/4166
[14:02:37] will ask Ryan to review it
[14:03:14] k, nice
[14:03:16] with something like, Dear lord of Gerrit, Would you possibly help one of your subjects by blessing this change with some of review time ?
[14:04:28] ;)
[14:39:38] petan: git-core is already available as an option for all instances, right?
[14:39:48] no it's not
[14:39:51] or it wasn't
[14:40:11] I had to install it on each instance where I needed
[14:40:31] It is, weirdly, under the 'building' category. do you mind checking? Let me know if you see an instance where it's not available.
[14:40:36] (via the config page, I mean)
[14:40:40] ah
[14:40:45] I thought it's default as svn
[14:42:21] It's somewhat reasonable to have it optional rather than default, since lots of instances are used as servers rather than dev boxes.
[14:42:34] But I will check with ryan about why it's buried in such a funny place.
[14:44:30] I just don't know why svn isn't in puppet then as well
[14:44:42] it's installed to all machines no matter if you want it
[14:45:04] No idea. Unless it's required for mediawiki to operate for some reason.
[14:45:24] It may be historic, since many projects are only now converting from SVN to git.
[14:45:37] the reason was probably that the code lived in svn, while now it lives in git
[14:45:46] so it was quite handy to have it
[14:45:50] yep
[14:45:53] just as now it's handy to have git :)
[14:46:07] You make a good case.
[14:46:32] I am stalling partly because I don't know where default packages come from.
[14:46:39] k
[14:46:49] neither I do
[14:47:02] I think it's part of default image
[14:47:31] there are defaults in puppet which are unfortunatelly not really default, since they are installed after first run of puppet
[14:47:40] which usually happen after some time
[14:47:51] sometimes even days
[14:47:55] sometimes never
[14:48:11] Well... I don't think an instance should really be considered 'up' until after the first puppet run.
[14:48:34] But puppet needs a lot of cleaning up so that a successful run is the norm rather than a shock.
[14:48:38] I created hugglewa-1 hours ago
[14:48:45] I can ssh there it works, puppet never ran there
[14:49:02] it's likely a bug in labs
[14:49:14] there are many instances which are being used for weeks and puppet never ran on
[14:49:24] that's what the nagios check is for
[14:49:30] Yep, I hear you -- I usually force a puppet run on my first login to an instance.
[14:49:38] that's what I do
[14:49:43] question is why I have to do that :)
[14:50:03] maybe the puppet configure itself to run after first run
[14:50:16] because the configuration of puppet is in puppet itself
[14:50:39] Ryan should know
[15:00:19] ACKNOWLEDGEMENT Disk Space is now: CRITICAL on aggregator1 i-0000010c output: DISK CRITICAL - free space: / 0 MB (0% inode=93%):
[15:01:41] hi Ryan_Lane :)
[15:01:48] !requests
[15:01:48] this is a backlog of all requests needed to be done by ops https://labsconsole.wikimedia.org/wiki/Requests
[15:04:18] petan: Once we have a smarter puppet testing setup, I plan to do some puppet cleanup. I hate that you can't setup puppet before instance creation.
[15:05:36] ok
[15:06:05] I reconfigured nagios a bit
[15:06:25] now it reports to irc after 20 hours
[15:06:26] re puppet
[15:06:40] after 10 hours it flag it as warning after 20 critical
[15:06:46] and report to irc
[15:06:51] hopefully we don't be so spammed
[15:07:00] * won't
[15:09:54] Ryan_Lane: u here?
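The reconfigured Puppet freshness thresholds described above (warning after 10 hours, critical after 20, with only the critical state reported to IRC) can be sketched as follows. This is a minimal illustration, not the actual Nagios check: the function names are hypothetical, and treating the thresholds as inclusive is an assumption the log does not settle.

```python
from datetime import timedelta

# Thresholds as stated in the log: warning after 10 hours, critical after 20.
WARNING_AFTER = timedelta(hours=10)
CRITICAL_AFTER = timedelta(hours=20)


def puppet_freshness(age: timedelta) -> str:
    """Classify the time since the last Puppet run.

    Hypothetical helper; inclusive boundaries are an assumption.
    """
    if age >= CRITICAL_AFTER:
        return "CRITICAL"
    if age >= WARNING_AFTER:
        return "WARNING"
    return "OK"


def should_notify_irc(state: str) -> bool:
    """Per the log, only the critical state is reported to IRC."""
    return state == "CRITICAL"


if __name__ == "__main__":
    for hours in (3, 12, 25):
        state = puppet_freshness(timedelta(hours=hours))
        print(hours, state, should_notify_irc(state))
```

Keeping the warning state silent on IRC is what reduces the channel noise mentioned in the log; the warning would still be visible in the Nagios web interface.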
[15:23:49] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on puppet-lucid i-00000080 output: Puppet has not run in the last 10 hours
[16:01:43] LeslieCarr: is ipv6 enabled on network where labs run?
[16:01:52] no :( openstack limitations
[16:02:23] i think a new version supports it ok
[16:02:25] ok but, in theory it could work if I configured it by hand
[16:02:31] on instance
[16:02:43] No - traffic is limited on the nodes
[16:02:46] ah
[16:02:48] that suck
[16:02:57] Well it kinda doesn't
[16:03:01] Means I can't steal your ips
[16:03:01] :P
[16:03:06] heh
[16:03:17] in environment where no hackers are, it sucks
[16:03:43] LeslieCarr: any plans to upgrade?
[16:04:42] Ryan would know when but yes, the latest version has more features
[16:05:08] I hope it's going to be sooner than prod switch to it
[16:06:05] there is a lot of stuff which needs to be done before we can enable ipv6 on production, and having a test site we could patch the tools on before we do that is quite needed
[16:06:36] like almost all vandalism tools we use on wikipedia, are not ready for ipv6, or even if they are, no one ever could test it
[16:07:18] btw Damianz is cluebot working on ipv6? :)
[16:07:30] does it even recognize ipv6 address
[16:07:36] :-/ no idea with everything
[16:07:56] for example huggle gracefully crash on ipv6 user :D
[16:08:19] that's what I know as for now
[16:08:40] it's kind of hard for me to fix it, because I have no ipv6 connectivity in here
[16:09:00] maybe an instance with graphical gui where I could vnc to would be nice :D
[16:09:10] so that I could develop sw in ipv6 environment hehe
[16:15:52] New review: Bhartshorne; "(no comment)" [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/4157
[16:24:22] petan: ?
It doesn't really do ip stuff
[16:24:50] It could be able to detect if user is anon or registered at least
[16:24:59] sometimes the warning templates may differ
[16:25:08] so uploading large images also fails, so must be an issue unrelated to tmh, could it be some other extension that fails with large post data? i.e. this tor extension that throws errors all the time
[16:25:38] j^: is it possible to enable some debug mode in that extension you want to fix
[16:25:49] so that we can check why it fail
[16:26:05] petan: so far i dont know what fails
[16:26:07] I think that problem is in upload wizard itself
[16:26:23] it should produce better output than ERROR
[16:26:32] that doesn't say really much
[16:27:40] true, but could also be that some other part in the stack fails. in any case UW should give better errors
[16:28:06] sadly that too is not the extension i wanted to get tested
[16:28:08] I could temporarily disable other extensions on commons
[16:29:32] petan: could you disable the tor extension?
[16:29:36] sure
[16:31:28] !log deployment-prep petrb: /extensions/TorBlock/TorBlock.php is now removed from config of commons wiki
[16:31:37] oh funny
[16:31:42] that bot crash when I need it
[16:34:15] petan: ok that did not help. was not tor
[16:35:33] !log deployment-prep petrb: undone the change
[16:46:03] so the large post to http://commons.wikimedia.beta.wmflabs.org/w/api.php i get the api usage returned with MediaWiki-API-Error:help
[16:55:15] ah trying to upload via http://commons.wikimedia.beta.wmflabs.org/wiki/Special:Upload it just returns to the upload form for large files
[17:00:39] arg, post_max_size was still 8M in php.ini
[17:13:09] petan: on web4 /etc/apache2 is missing
[17:31:55] New patchset: Dzahn; "labs - make common syslogs readable by normal users per RT-2712" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4157
[17:32:08] New review: gerrit2; "Lint check passed."
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/4157
[17:33:13] New review: Dzahn; "(no comment)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4157
[17:34:46] j^: run patch1 in my home
[17:34:50] that fix the /etc
[17:35:51] petan: its a problem with glusterfs
[17:35:52] ls /data/project/apache2
[17:35:52] ls: cannot access /data/project/apache2: No such file or directory
[17:43:04] wanna get a star.wmflabs.org cert from a CA that is in browsers?
[17:46:04] petan: howdy
[17:46:10] I wasn't there earlier
[17:46:13] ok
[17:46:17] now I am not here a bit :P
[17:46:25] j^: type ls /data/project
[17:46:29] it should automount I hope
[17:47:02] I need to document that at some point
[17:47:03] heh
[17:47:06] ls: cannot open directory /data/project: No such file or directory
[17:47:14] which instance is this?
[17:47:18] web4
[17:47:29] i guess umount/mount should do
[17:47:31] deployment-web4?
[17:47:35] yes
[17:47:42] not sure how that works with gluster
[17:47:51] no. it's an automount
[17:48:04] cd /data/project; ls?
[17:48:16] Wish they would automount so you could see the dir though :(
[17:48:24] ls: cannot open directory .: No such file or directory
[17:49:24] strange
[17:49:28] it says it is and isn't mounted
[17:50:36] gluster daemon needed to be restarted
[17:50:40] it's there now
[17:50:57] Damianz: at some point they will
[17:51:20] Damianz: we're using automounts right now because nova can't mount the sharedfs stuff, even in essex
[17:51:31] in folsem it should support it
[17:52:33] Which is the release in another 6months?
[17:53:03] I was reading about some interesting openstack+ceph stuff earlier... still not convinced ceph has a hugeenough community though.
[17:53:32] there's no documentation for ceph
[17:53:42] and no irc channel
[17:53:59] the ceph people want to work with us, though
[17:54:20] There's a tiny amount of documentation, just about enough to get it working but not fix it when it breaks.
[17:57:27] yeah
[18:15:33] petan: Did you ever follow up with Ryan_Lane about how you think that storing the MW db on gluster was causing corruption?
[18:16:10] any ideas why you think it may be doing so?
[18:16:42] I'm not sure there was a theory, just evidence that it was happening.
[18:16:54] If true it seems like probably a file locking thing...
[18:17:26] yeah
[18:17:33] locking should work properly in gluster 3.2
[18:17:54] * Damianz locks petan in the ether
[18:18:13] I think that while you were away petan rebuilt the beta db and it immediately became corrupt, so he moved it to local storage and it worked properly.
[18:18:39] * Ryan_Lane nods
[18:18:45] I'm asking in #gluster
[18:52:45] Ryan_Lane: yes it was causing lot of troubles
[18:52:54] I had to move it back to local storage
[18:53:07] people in gluster channel are saying its working for them
[18:53:13] they have some share tweaks, though
[18:53:19] it's easy to reproduce on labs
[18:53:32] just move db to gluster heh
[18:53:37] heh
[18:53:41] you won't be able to do anything there
[18:53:44] create table fail
[18:54:01] everything what changes the structure
[18:54:03] is failing
[18:54:09] drop table, alter etc
[18:54:32] !requests
[18:54:32] this is a backlog of all requests needed to be done by ops https://labsconsole.wikimedia.org/wiki/Requests
[18:54:35] can u check it
[18:56:28] Hello Ryan_Lane . We had a discussion with mutante about gerrit using real name as a login.
[18:56:35] patch for the Gerrit lord is https://gerrit.wikimedia.org/r/#change,4166 :-D
[18:56:45] eh?
[18:56:48] which is so simple I am pretty sure it is wrong :-b
[18:56:50] what do you mean using real name?
[18:57:00] the cn: entry
[18:57:07] please, let's not change how authentication works
[18:57:26] web login = wiki user name
[18:57:33] shell login = shell account name
[18:57:46] <^demon> Using CN for that is kind of silly, imho. I'd much prefer to login with my SN.
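The accountPattern quoted earlier in the log, (&(objectClass=person)(cn=${username})), and the proposed uid variant differ only in which LDAP attribute the login name is matched against. A minimal sketch of how such a pattern expands into a search filter is below. It only illustrates the filter text and makes no claim about how Gerrit performs the substitution internally; the helper names and the lowercase shell name "hashar" are hypothetical.

```python
from string import Template

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}

# Pattern quoted in the log, plus the uid variant suggested there.
CN_PATTERN = Template("(&(objectClass=person)(cn=${username}))")
UID_PATTERN = Template("(&(objectClass=person)(uid=${username}))")


def escape_filter_value(value: str) -> str:
    """Escape a user-supplied value for safe use in an LDAP search filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def build_filter(pattern: Template, username: str) -> str:
    """Expand ${username} in the pattern (hypothetical helper)."""
    return pattern.substitute(username=escape_filter_value(username))


if __name__ == "__main__":
    # With the cn pattern the login name is the wiki user name.
    print(build_filter(CN_PATTERN, "Hashar"))
    # With the uid pattern it would be the shell account name (assumed here).
    print(build_filter(UID_PATTERN, "hashar"))
```

Because the same cn value is also configured as accountFullName, the name a user logs in with is the name shown on merges, which is the coupling the discussion is about.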
[18:57:49] if we change gerrit then it's inconsistent with every other web application
[18:58:03] also, we *can't* change this now
[18:58:24] unless we modify the gerrit database
[18:58:28] <^demon> I wish you'd at least fixed my CN when I complained ~6 months ago :(
[18:58:30] it seems to freak out
[18:58:43] anyway the root cause, is that I would like to have my real name to be used whenever gerrit merge a change for me
[18:58:47] ^demon: give me a script to rename users in gerrit and I'll change your cn
[18:58:59] my merges show as being merged by: Hashar
[18:59:05] then we should change your cn
[18:59:09] ohhh
[18:59:12] <^demon> Ryan_Lane: Ugh, making my dig through that 5NF database?
[18:59:21] but again, give me a script to rename gerrit users and I'll do it
[18:59:22] <^demon> I swear, that database is too abstract to be useful :p
[18:59:42] <^demon> Can't delete anything, cuz we've gotta update it in like 6 places :p
[19:00:50] until we have that, I'm not changing anything :)
[19:01:07] but, this is the reason I give people fair warning in the account questions
[19:01:10] !account-questions
[19:01:10] I need the following info from you: 1. Your preferred wiki user name. This will also be your git username, so if you'd prefer this to be your real name, then provide your real name. 2. Your preferred email address. 3. Your SVN account name, or your preferred shell account name, if you do not have SVN access.
[19:01:19] "This will also be your git username, so if you'd prefer this to be your real name, then provide your real name."
[19:01:24] <^demon> Yeah we didn't have the snarky bot when I played guinea pig ;-)
[19:01:27] heh
[19:01:29] true
[19:01:51] so, gerrit rename script :)
[19:02:01] <^demon> Yeah, I'll figure it out eventually.
[19:02:12] if you change the cn of a user, then they try to login, gerrit freaks out
[19:02:12] So the cn: is used as a realname when Gerrit does the commit and it is also used to login. Is that correct?
[19:02:21] since Gerrit config currently has: accountPattern = (&(objectClass=person)(cn=${username}))
[19:02:22] hashar: yep
[19:03:22] also, using uid wouldn't use your real name
[19:03:32] uid is your shell account name
[19:03:35] so if we update my cn to Antoine Musso, I will have to use that to logon wiki and gerrit ?
[19:03:39] yes
[19:03:45] but, we'd need to rename your user in gerrit
[19:04:11] (then I ask myself: WTF aren't we using shell account for everything, then I remember some lines above that Ryan wrote about web login !== shell login)
[19:04:15] * hashar cries
[19:04:20] * hashar stops merging changes :-D
[19:04:37] well, the very first thing someone asked me to do was allow people to separate shell names from wiki names
[19:04:44] hehe
[19:05:00] because people wanted to use their real name or their wikipedia wiki name
[19:05:25] you have no clue how difficult this makes things ;)
[19:05:36] I totally have clue about that
[19:05:37] it means two unique attributes
[19:05:47] <^demon> Yeah my shell name is demon and my wiki name is Demon
[19:05:49] cause I have someone to handle it for me :-]]]]]]
[19:05:49] it also means you need to map between them
[19:05:50] <^demon> Some separation ;-)
[19:06:12] what is the sn: field for?
[19:06:31] basically nothing
[19:06:33] it's surname
[19:06:46] * hashar googles for surname
[19:06:46] it's a required attribute, though
[19:06:52] <^demon> sn is the shell username, I thought?
[19:06:52] surname is family name
[19:07:54] <^demon> How does the LdapAuth extension handle renames?
[19:08:39] ok so our surname should be Lane and Musso. Not going to be any useful for us anyway
[19:10:38] <^demon> Ryan_Lane: On a totally different subject, the bug about putting a proxy in front of gerrit so you can skip port 29418 (https://bugzilla.wikimedia.org/35611) isn't just Tim's case of "it's annoying to retype and hard to remember."
[19:10:57] <^demon> It actually has a practical implication in that I cannot commit from school wifi since the port's blocked :\
[19:11:08] ^demon: ldap auth doesn't
[19:11:30] <^demon> So if you rename User A -> User B in ldap, MediaWiki treats them like 2 different users?
[19:11:30] heh
[19:11:36] well, no
[19:11:45] I rename the user in ldap, then rename it in mediawiki
[19:11:49] <^demon> Ah.
[19:12:01] <^demon> So if we have "Rename in gerrit" we can complete the triad and allow renaming.
[19:12:02] I bet it doesn't have a fucking api
[19:12:25] renameuser has no api, right?
[19:12:36] <^demon> Ah yes
[19:12:48] * Ryan_Lane is full of hate for mediawiki right now
[19:12:58] it's an extension, *why* would it have an api?
[19:13:17] <^demon> Lots of extensions have APIs. This one would be nice.
[19:13:27] and again, I ask, why the hell is rename user an extension?
[19:13:41] <^demon> Because core users couldn't possibly want to rename?
[19:13:45] Ryan_Lane: You try renaming and merging 4 users into 1... out comes the sql queries.
[19:14:00] ^demon: :D
[19:14:13] Damianz: no sql queries needed
[19:14:22] well, some for gerrit. heh
[19:14:30] but that's why I'm asking for a script :)
[19:14:32] <^demon> Damianz: Part of the benefit of writing core code is you can do sql queries ;-)
[19:14:52] <^demon> Ryan_Lane: Ideally we could have some special page you can change your name on, and it hits the ldap, mediawiki and gerrit rename apis.
[19:14:59] The mediawiki core is rather confusing :P
[19:15:07] petan: I'm going to upgrade gluster for project storage to the beta we're using for instance storage
[19:15:22] petan: I was told upgrading to the latest stable should work, the same bug fixes are in the beta
[19:15:39] petan: so, we can test that when I upgrade it
[19:16:17] <^demon> Damianz: It's because we've got about 10-15 active core developers all with a slightly different internal roadmap of what they'd like to see MediaWiki do. Isn't open source fun?
[19:18:20] That's why you have a roadmap and use features and get the community to vote on features :D
[19:18:29] hahaha: "upgrade servers first, then clients. Clients will need to unmount and mount in order to pick up the new version. I'm not entirely sure that the 3.2 client will talk to the 3.3 server though. It's supposed to, but I haven't tried it myself."
[19:18:37] well, that's reassuring :)
[19:18:47] * Damianz gets ready for Ryan_Lane to break everything
[19:18:58] this is for project storage
[19:19:06] it's not being heavily used right now
[19:19:21] so, it's a good time as any
[19:19:40] <^demon> Damianz: Eventually I'd like to formalize our RfC process a bit for introducing big things.
[19:20:01] <^demon> Right now I'd just like to get the git migration over with and go back to coding ;-)
[19:20:02] I can actually test this before I try it live
[19:20:23] lunch. back in a bit
[19:26:39] ^demon: Is there actually a solid date for that now or are we still fixing gerrit/access/people throwing fits?
[19:26:56] <^demon> I sent an e-mail about it over the weekend.
[19:27:03] <^demon> I'm doing another batch of extensions on Friday
[19:27:22] Yeah because I totally keep up with the tech mailing list :P
[19:27:27] * Damianz notes to look after nomnoms
[19:56:56] RECOVERY Current Users is now: OK on nova-gsoc1 i-000001de output: USERS OK - 1 users currently logged in
[19:57:25] suhasmonk: ok. so, andrewbogott has been working on essex, rather than diablo
[19:57:26] RECOVERY Disk Space is now: OK on nova-gsoc1 i-000001de output: DISK OK
[19:57:48] Ryan_Lane, on the nova api wrapper?
[19:57:59] i mean the new openstack api?
[19:58:00] nah, the backend
[19:58:17] I'm the only one working on the extension right now
[19:58:19] I heard andrewbogott has a rather nice backend.
[19:59:03] so I'm told :)
[19:59:08] :D
[19:59:25] suhasmonk: so, likely what we want to do is point the mediawiki instance at another set of instances
[19:59:29] suhasmonk: that has essex installed
[19:59:35] then you have something to target
[19:59:58] andrewbogott: suhasmonk is going to implement the nova api in OpenStackManager
[20:00:00] suhasmonk: You're the SOC person working on Open stack manager?
[20:00:17] Ryan_Lane, andrewbogott yeah
[20:00:24] that's great! welcome.
[20:00:26] Do we actually have a testing setup for that? I was looking at some bugfixes but got fed up installing devstack for it.
[20:00:45] well, a few things need to get done for essex to work properly
[20:00:57] RECOVERY Total Processes is now: OK on nova-gsoc1 i-000001de output: PROCS OK: 122 processes
[20:00:59] suhasmonk: it's likely possible to target diablo at first
[20:01:06] RECOVERY Current Load is now: OK on nova-gsoc1 i-000001de output: OK - load average: 0.17, 1.01, 1.15
[20:01:12] to at least get used to the extension and the api
[20:01:24] Ryan_Lane, yeah. that makes sense
[20:01:33] we'll get essex in a usable state for you, so you can target version 2 of the api
[20:01:36] RECOVERY Free ram is now: OK on nova-gsoc1 i-000001de output: OK: 82% free memory
[20:01:51] Ryan_Lane, okay
[20:01:57] I worked on getting essex to install via puppet. It looks like it works... but I'm not sure all the pieces are actually there.
[20:02:07] so, for now, let me give you another mediawiki install on my instance
[20:02:16] Also hi suhasmonk
[20:02:26] andrewbogott: one thing we need to do is get keystone working with LDAP
[20:02:32] Damianz, hey
[20:02:34] Openstack and puppet is a bit of a bleh, working on a module for w0rk at the moment.
[20:02:44] Keystone refuses to play nicely :(
[20:02:45] Damianz, so you work on the openstack manager too?
[20:02:52] ryan_lane: I'm not entirely clear on whether essex requires keystone. All the old diablo-style calls are still there...
[20:02:58] really?
[20:03:04] I thought they were going to drop that
[20:03:19] if that's true, that's a good thing
[20:03:29] Well, it's not clear. The calls are still there, scheduled for a big purge as soon as essex ships.
[20:03:30] Ryan_Lane: I would like to help with extension too, if you needed more devs
[20:03:35] But I don't know if they actually work.
[20:03:40] suhasmonk: Nah, I just moan at Ryan_Lane for it not working. I was going to tidy up some bits and clear a few bugs but it needs a full devstack+ldap+mw setup to properly develop/test it.
[20:03:45] lemme ask in openstack-dev
[20:04:02] Worth a try, I got nothing but cricket chirps when I asked before.
[20:04:27] * Ryan_Lane nods
[20:04:32] I may get the same. heh
[20:04:50] Would anyone know off the top of their head what port the opendj admin server runs on?
[20:04:57] Damianz: well, we have a production version pre-configured
[20:05:08] just not one targeting the newer version of nova
[20:05:18] I think it runs on 4444
[20:05:42] Ryan_Lane: We do? I thought that was for gluster testing etc.
[20:05:49] 4444 sounds about right.... *tries that*
[20:05:57] I have one for testing changes before I put them into production
[20:06:01] I also do development there
[20:06:06] nova-production1
[20:07:05] suhasmonk: Just for future reference, what timezone are you in?
[20:07:17] I'm also going to be updating OpenStackManager in production soon
[20:07:23] I just pushed a really large change
[20:07:31] and I seriously doubt I'm going to get a code review on it
[20:08:01] I should never make a commit this large: https://www.mediawiki.org/wiki/Special:Code/MediaWiki/114680
[20:08:01] heh
[20:08:24] Can we hire some SOC people to make gerrit usable?
[20:08:24] :D
[20:08:52] they'd need to join the gerrit GSOC project
[20:08:58] if gerrit is even doing it
[20:09:20] Dunno but the web interface could be a lot prettier, also did I see gitreview got dropped?
[20:10:06] eh? it did?
[20:10:21] it's written by the openstack people
[20:12:04] I swear I saw a thing go through our gerrit install but that might have been old stuff
[20:13:08] oh. no clue
[20:14:56] andrewbogott, I'm in IST (GMT +5:30)
[20:15:17] Where's IST, istanbul?
[20:15:25] Damianz, India
[20:15:28] Ah.
[20:16:33] suhasmonk: ok. I have a proxy which makes it look like I'm logged in while I'm sleeping (which, alas, will be much of your work day.) If you ping me on IRC I generally catch things in the scrollback.
[20:17:29] I'm in Minneapolis which means I generally show up a couple of hours earlier than Ryan.
[20:17:51] andrewbogott, cool. I'll ping you if i need anything. But I am also on usually at this time
[20:18:20] ok. Also be warned that I have not written even a single line of php. So my answers about nova will tend to be abstract :)
[20:19:25] andrewbogott, okay. no problem
[20:22:24] if andrewbogott.isAPythonLover(): andrewbogott.cookies += 1
[20:23:27] I actually spent years working in tcl/tk. That nullifies any other street cred I might accrue elsewhere.
[20:23:47] Oh god
[20:23:48] * Damianz runs screaming
[20:23:57] I love working in a language that can tell the difference between code and data!
[20:24:38] I sometimes have to do tcl stuff (expect for rancid). It's like digging your eyes out with a blunt spoon.
[20:26:12] Well, it never fails to surprise!
[20:27:07] heh
[20:27:14] php makes me want to spoon my eyes out too
[20:28:54] php is just yucky
[20:29:02] hm. I manage the apache config via puppet, and i need to add an alias
[20:29:03] * Ryan_Lane sighs
[20:29:05] lack of care for types drives me crazy and the memory bloat is insane.
[20:30:49] Damianz: The main project I worked on for ~5 years was Objective C GUI + tcl backend. Like crapping in a solid gold toilet.
[20:30:56] hahahahahahaha
[20:32:09] ObjC is horrible
[20:32:22] Granted I don't know C that well but gah C++ > ObjC
[20:32:43] Huh.
[20:33:28] As long as I have a widescreen monitor I much prefer objc. It's verbose but... much less time spent chasing pointer errors.
[20:33:40] But I'm not religious about it.
[20:33:46] RECOVERY dpkg-check is now: OK on nova-gsoc1 i-000001de output: All packages OK
[20:35:11] petan: any way to check for a hung file upload process on beta commons? I have one that's been showing "Submitting details and publishing..." for a long time now.
[20:38:54] andrewbogott: You and java would be friends, feel free to work on gerrit :D
[20:39:37] chrismcmahon: yes I know it stuck but I don't really know how to debug it
[20:39:44] there is no useful log for that extension
[20:39:54] perhaps you should implement debug output into it
[20:44:12] Ryan_Lane: I'm a little bit worried that writing to a wiki for every little nova event is going to chew up storage, on account of mw saving revision information.
[20:44:27] Is there an API call to purge revision info? Or do you think I just shouldn't worry about this?
[20:44:34] no need to worry about it
[20:44:49] we store over 1 billion revisions on our projects
[20:44:56] When a page is deleted, its rev history still persists forever, right?
[20:45:02] yes
[20:45:11] <^demon> Ryan_Lane: Have you ever idled in #gerrit? Talk about a quiet channel.
[20:45:19] ^demon: yeah, it's very quiet
[20:45:26] they respond if you ask a question, usually, though
[20:45:34] Hm... ok, well, I guess if it turns out to be a problem we'll deal with it then.
[20:45:47] <^demon> I figured I'd idle and see if I picked up on anything. So far I've been there for ~3 hours and nobody's made a peep.
[20:46:00] andrewbogott: yeah. we can likely ignore certain changes, for instance
[20:46:44] I'm thinking I'll have a config setting to whitelist interesting events.
[20:47:07] yeah, sounds good
[20:47:47] I'll be happy to have this change it. it'll make the documentation a lot more reliable :)
[20:47:50] *in
[20:51:14] Getting a page like this is trivial: http://labs.wikimedia.beta.wmflabs.org/wiki/Andrewtest-bananas1 It requires some formatting of course.
[20:51:24] And I'm not sure yet about live memory/disk usage.
[20:51:57] (that's just a raw dump of a notification's context data.)
[20:56:40] * Ryan_Lane nods
[20:56:45] writing it out as a template is likely best
[20:57:20] but that looks about right
[20:57:34] we may want to be able to massage date strings, too
[20:57:55] currently the date we write out isn't compatible with semantic mediawiki
[20:58:00] so, it isn't usable in queries
[20:58:18] ok, that's what I was about to ask -- the current page is just a table, not a template, right?
[20:58:36] the tenant/user/instance ids don't seem terribly helpful either :D
[20:58:46] I wonder if there's some way to make those the massaged values
[20:59:05] the current page is just paragraphs
[20:59:22] They are uuids. Should be easy to translate the uuids into display names.
[20:59:32] it may be nice to be able to provide a template for the page text
[20:59:41] * andrewbogott nods
[20:59:46] <^demon> Ryan_Lane: Does gerrit have some sort of error log I can tail?
[20:59:51] then we can massage the data however we'd like
[20:59:56] ^demon: yes
[20:59:58] I need to learn how to template.
[21:00:14] <^demon> Ryan_Lane: Ah, logs/error_log
[21:00:23] <^demon> :)
[21:00:30] yep
[21:00:33] also the ssh one
[21:00:58] I wonder how many revisions there are in total across every *.wikipedia.org install.
[21:01:04] That would be an interesting graph.
[21:01:10] Damianz: over 1 billion
[21:01:16] there's a toolserver tool for this somewhere
[21:02:34] over 1 billion is a very imprecise number :P
[21:02:44] well, one year ago it was 1 billion
[21:02:51] wait
[21:02:55] that was two years ago
[21:03:16] <^demon> 2? Already?
[21:05:20] Damianz: http://toolserver.org/~emijrp/wikimediacounter/
[21:05:24] 1.5
[21:06:08] Shiny
[21:06:33] .4 of those new .5 are probably reverts
[21:06:34] :D
[21:06:54] I assume that's a fake number based on the current edit rate? As hitting mysql numerous times a second would be a sure way to kill yourself.
[21:07:01] And yeah... there are a LOT of reverts
[21:07:22] Damianz: it does occasional polling, estimating in between
[21:07:38] mhm
[21:10:13] 04/03/2012 - 21:10:12 - Updating keys for mglaser
[21:14:56] I wonder if there was ever any progress in regards to being able to auth wikipedia users, totally can't find the page the discussion was on.
[21:15:30] for labsconsole?
[21:18:33] or deployment-prep?
[21:21:15] <^demon> Ryan_Lane: Hooks are run as gerrit2, right?
[21:21:20] yep
[21:21:37] <^demon> Ah, that might explain why the hook didn't work :)
[21:22:06] To the magic that is the central wikipedia login thing. IE oauth for wikipedia rather than the labs stuff :P
[21:23:09] New review: Bhartshorne; "(no comment)" [operations/puppet] (test); V: 1 C: 1; - https://gerrit.wikimedia.org/r/4157
[21:23:46] We have normal users!?
[21:24:04] Damianz: no progress
[21:24:13] Sad times.
[21:24:17] Damianz: the new security guy may work on oath/openid support, though
[21:24:22] *oauth
[21:25:04] I want to write a new portal for cb as the current one is beyond horrid but I'd rather use wp logins over storing separate logins locally :( Especially as it's now on labs and lots of people have access to the encrypted passwords potentially.
[21:25:21] you're not allowed to store passwords in labs
[21:25:42] we're taking the same approach as toolserver on this
[21:25:47] we can't do it
[21:26:19] well....
[21:26:21] I say that....
[21:26:24] that's not totally true
[21:26:37] your application needs to display a warning, when allowing people to create accounts
[21:26:47] it's not allowed to act on behalf of production users, though
[21:26:58] Well as we took logins before moving the db here :P
[21:27:12] meaning, you aren't allowed to ask for a user's production username/password
[21:27:21] Yeah which is why I want oauth
[21:27:24] same
[21:28:17] Because currently I make them sign up for another account, mainly just to stop anti-spam stuff being displayed but also to give them higher levels of access.... oh how nice it could be to just auth against teh api and use that but alas that breaks the tos stuff of the api :(
[21:28:51] I'll just wait 45.3 years for some auth system then get around to rewriting the sites into 1 portal :P
[21:29:13] heh
[21:30:58] Sadtimes when you're watching a tv program and see the load on the servers rise that host the website for $person and $person whose site you host bah.
[23:43:02] IT'S ALIVE
[23:43:36] \o/
[23:43:50] Did you fix andrewbogott too? Could have left him ;)
[23:43:51] for once it wasn't my fault!
[23:44:06] there
[23:44:07] fixed
[23:44:09] grrrrrr
[23:44:10] hey ryan: quick git/gerrit question
[23:44:18] yessir
[23:44:29] sorry everyone :(
[23:44:35] Awesome my bot relays connected back fine.
[23:44:40] * Damianz sends LeslieCarr to the corner
[23:44:46] I pushed 4 independent commits, but gerrit treats them as dependent on each other, why?
[23:44:58] independent==no shared files, no nothing
[23:45:32] because they are based on each other
[23:45:45] if you rebase them, they won't be dependent
[23:45:58] are you using git review?
[23:46:03] okay, then I review all 4 but they still don't merge
[23:46:03] yes
[23:46:21] rebase them
[23:47:24] but it is already in gerrit and one commit says 'review in progress' while i approved it (+2)
[23:48:15] I have no clue.
[23:48:24] if you rebase them they'll merge
[23:48:24] https://gerrit.wikimedia.org/r/#change,4227
[23:48:52] Does anyone smoke rewriteengine rules for fun?
[23:49:15] it's best not to submit a bunch of commits like that
[23:49:29] drdee: you should make a branch for each change
[23:49:32] apparently
[23:49:38] it's the recommended approach
[23:49:52] Branching is awesome
[23:50:16] ryan: can you maybe just approve and see if it merges?
[23:50:23] it won't
[23:50:39] i would call this a bug
[23:50:50] wait
[23:50:58] did you just review it?
[23:51:04] and not do review and submit?
[23:51:13] 10 minutes ago
[23:51:18] it merged just fine when I clicked submit
[23:51:49] thx, i am pretty sure i did that
[23:52:16] Gerrit is god of all, including retarded uis.
[23:53:28] heh
[23:53:48] drdee: it seems the one that everything else depended on was never merged
[23:54:25] but anyway, I recommend always making a topic branch
[23:54:30] or doing a commit, then a push
[23:54:37] and merging, then doing a pull
[23:54:44] okay got it
[23:54:45] thx
[23:54:46] or doing a push, then a rebase
[23:54:54] See for extensions I'm more of a test driven development setup because gerrit is a bit bleh but then we end up with things like the kernel driver mailing list and non-noob friendly entry.
[23:56:22] I really don't find gerrit to be so terrible
[23:56:26] * Ryan_Lane shrugs
[23:58:10] I like it for some things, like puppet but then I don't like how in a testing setup making changes takes days
[23:58:50] eh?
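
The topic-branch advice given to drdee above (one branch per change, each started from master rather than stacked on the previous commit) can be sketched in a throwaway local repository. This is only an illustration under assumptions, not the workflow anyone in the channel actually ran: the branch names, file names, identity and commit messages are made up, and the `git review` upload step appears only as a comment because it needs the git-review tool and a Gerrit remote.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local git defaults
git config user.name "Example Dev"
git config user.email "dev@example.org"

echo "base" > README
git add README
git commit -q -m "Initial commit"

# One topic branch per change, each started from master,
# not from the previous change.
git checkout -q -b fix-a master
echo "a" > a.txt
git add a.txt
git commit -q -m "Change A"
# git review   # would upload Change A on its own (assumes git-review and a Gerrit remote)

git checkout -q -b fix-b master
echo "b" > b.txt
git add b.txt
git commit -q -m "Change B"
# git review   # would upload Change B on its own

# Each branch carries exactly one commit on top of master,
# and the two branches only share master as a common ancestor.
ahead_a=$(git rev-list --count master..fix-a)
ahead_b=$(git rev-list --count master..fix-b)
base=$(git merge-base fix-a fix-b)
master_tip=$(git rev-parse master)
echo "fix-a: $ahead_a commit(s) ahead of master, fix-b: $ahead_b commit(s) ahead of master"
```

When several commits are stacked on one branch instead, each commit's parent is the previous commit, which is why Gerrit reported them as dependent even though they touched no shared files.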