[00:05:11] if you do that you'll be my hero [00:05:30] I wanted to fix it for ages but haven't found the time [00:06:31] so this is what turns out undefined <%= scope.lookupvar('puppetmaster::config::gitdir') %> [00:08:01] yeah [00:08:11] the whole cronjob should not be in labs [00:11:01] i wonder why it is on all instances when it's "puppetmaster server side scripts" [00:11:22] puppetmaster::self probably :) [00:11:29] ah, yeah [00:13:33] i guess we want to keep the other cron job though, the one that deletes old puppet reports [00:13:57] seems to make sense in labs as well.. i see report files on an instance with puppetmaster::self [00:20:57] 08/30/2012 - 00:20:57 - Created a home directory for rmoen in project(s): bastion,editor-engagement [00:21:09] Yay :) [00:26:01] 08/30/2012 - 00:26:01 - User rmoen may have been modified in LDAP or locally, updating key in project(s): bastion,editor-engagement [00:26:02] paravoid: yeah, i dont have a better idea than just adding another realm check [00:26:11] !change 21981 | paravoid [00:26:11] paravoid: https://gerrit.wikimedia.org/r/#q,21981,n,z [00:26:29] at least for now to stop the mails [00:26:52] add an "else ensure => absent" and I'm +2 :) [00:27:18] and remind me to buy you a beer or two in ~two weeks [00:31:15] paravoid: howdy [00:31:36] hi Ryan [00:31:40] I was looking for you before [00:39:56] paravoid: patch set 4 [00:40:25] hm, the bot is down probably... [00:40:45] Patch Set 4: Fails [00:41:12] ugh, never mind :p [00:42:24] too many semicolons.. i blame PHP :) [00:42:28] patch set 5 [00:42:54] paravoid: oh? [00:43:05] ah, chat from yesterday [00:45:11] mutante: hahm, there is prettier/more "puppet way" of doing that [01:01:08] paravoid: so, there's a bunch of small tasks we can do to help things out [01:01:10] in labs [01:01:22] we can switch to virtio as the network driver [01:01:34] we can switch the home directories to gluster [01:01:48] it's likely best to switch to virtio before switching the homedirs :) [01:02:02] paravoid: I talked to the piston cloud people today at vmworld [01:02:19] their commercial openstack distro uses ceph [01:02:30] and they are using boot from volume with it [01:02:58] they have a bunch of proprietary changes to openstack to make it work well when using boot from volume [01:03:01] do they have any users? :P [01:03:14] they won "best private cloud solution" at vmworld [01:03:20] at *vmworld* [01:03:37] mutante: see patchset 5 [01:03:57] so, yeah, they do have some customers. heh [01:04:03] isn't vmworld a vmware thing? [01:04:06] yes [01:04:11] which makes that funny [01:04:20] cloud people are crazy, we knew that [01:04:21] :P [01:04:28] an openstack distro won best private cloud solution at a vmware conference [01:04:36] yeah yeah I got that :) [01:04:46] it made me laugh :) [01:05:00] do we have any (better) news regarding ceph from andrew? [01:05:04] and/or asher? [01:05:05] kind of [01:05:14] except they use ssds [01:05:21] paravoid: oh yeah, gotcha, thanks [01:05:21] which isn't out of the question for us [01:05:27] the instance storage is fairly small [01:05:30] I mean, if it's going to be crappy as hell and use proprietary patches... [01:05:39] but you said boot from volume? 
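The realm-guarded fix discussed above (a realm check plus the "else ensure => absent" branch paravoid asks for) follows a standard Puppet pattern; a minimal sketch, with the resource name and command invented for illustration since the log never shows the actual manifest:

    if $::realm == 'production' {
        cron { 'puppetmaster-gitdir-sync':      # hypothetical resource name
            ensure  => present,
            command => '/usr/local/sbin/sync-gitdir',   # hypothetical script
            user    => 'root',
            minute  => '*/30',
        }
    } else {
        # removes the job from labs instances, which also stops the failure mail
        cron { 'puppetmaster-gitdir-sync':
            ensure => absent,
        }
    }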
[01:05:42] ceph isn't using patches [01:05:44] openstack is [01:05:52] I got that, but still [01:06:03] boot from volume works right now [01:06:06] slow + proprietary patches, that's already two minuses [01:06:08] the problem is live migration with it [01:06:16] we don't need the patches [01:06:35] I asked them about performance [01:07:11] if we're going the boot from volume way, I'd prefer if we'd imitate production rather than tell everyone to use /mnt or whatever for their data... [01:07:36] we don't need to have /mnt at all [01:07:50] I'd prefer that we don't have secondary storage on our instances at all [01:07:54] right [01:07:56] agreed [01:07:58] and that people mount volumes when they need them [01:08:30] we can't really do that until we have a volume service [01:08:37] so, okay, I'm *all* for switching away from local storage [01:08:42] well...... [01:08:52] what piston is doing is using the compute node storage [01:08:54] but don't we have other more urgent things to do? [01:08:55] like we are [01:08:57] yes [01:09:01] this is just ideas for the future [01:09:04] as Ceph nodes you mean? [01:09:07] yes [01:09:13] yeah, I've seen that before and I like it [01:09:39] it's good to think about this kind of stuff so that we know where we'd like to go when we have time to do so :) [01:09:42] but... [01:09:52] because if you split your clusters, you have a bunch of compute systems with no disk I/O at all and a bunch of storage systems with no CPU usage at all [01:09:56] if we're considering this, it may be good to replace the disks in eqiad with ssds [01:10:03] yep [01:10:06] replace them with SSDs? [01:10:06] agreed [01:10:13] replace 1TB disks with 100G SSDs? [01:10:22] instance storage can be fairly small [01:10:23] or is it 600G SAS? [01:10:33] 500GB SAS, I believe [01:10:39] there's no 500G SAS afaik [01:10:42] hm [01:10:44] it's 146/300/600 [01:11:22] I think using SSDs for the / of *all* labs instances is a bit of an overkill [01:11:49] well, it would be for the ceph storage [01:11:51] they're 300GB SAS [01:11:54] ah [01:12:07] if it's the only way for us to get decent performance out of ceph... [01:12:29] it would be good to test it [01:12:43] to see how much of a difference it would make, and what the cost difference would be [01:12:59] I'm not a big fan of throwing money at broken software to solve its problems :-) [01:13:00] of course, we don't really have much for hardware budget this year [01:13:11] it's not necessarily broken software [01:13:17] however, people do suggest using SSDs for the ceph *journal* [01:13:27] every write needs to do multiple writes across the network [01:13:28] if you use ceph rather than rados that is [01:15:27] the piston people said: don't use btrfs, and don't use cephfs [01:15:59] I wouldn't use btrfs [01:16:06] but asher said that cephfs was faster than rados [01:17:09] really? [01:17:21] I think so [01:17:24] the shared, nfs style block storage? [01:17:40] I've read the opposite [01:20:05] ignoring that whole thing, though....
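For reference, the SSD-journal setup mentioned above ("people do suggest using SSDs for the ceph *journal*") is plain ceph.conf configuration; a rough sketch, with placeholder device paths and sizes rather than anything sized for this cluster:

    [osd]
        ; data stays on the spinning SAS disks
        osd data = /var/lib/ceph/osd/ceph-$id
        ; journal goes on a small SSD partition so the double-write hits fast media
        osd journal = /dev/disk/by-partlabel/journal-$id
        osd journal size = 10240    ; in MB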
[01:20:12] I'd really like to kill off some of the smaller issues [01:20:17] we need to add a new network in pmtpa [01:20:26] we need to switch to virtio [01:20:36] we need to move the home dirs away from the nfs instance [01:21:00] we need to do network node per compute node (which means I need to apply that bgp patch) [01:21:17] the bgp patch needs exabgp moved to ubuntu, with the patch I sent in applied [01:21:28] the patch I sent into exabgp was accepted upstream [01:21:45] exabgp is already packaged in debian [01:22:20] yeah, those would be nice [01:22:35] add a new network in pmtpa -- also have ipv6 in labs finally [01:22:39] yes [01:22:46] the new network should be added with ipv6 [01:22:55] we need to figure out how to make it apply to the old network too [01:23:00] maybe we can test that in ewiad [01:23:01] eqiad [01:23:10] we need to set up eqiad... [01:23:13] you didn't say that :) [01:23:17] well, it's set up [01:23:24] it can be used for testing anyway [01:23:32] no I mean add it as a region [01:23:35] we can't allow users to access it yet [01:23:35] yeah [01:23:42] I'd like to tackle some of these smaller tasks first [01:23:49] sure, agreed [01:24:05] it'll mean eqiad comes up with these things fixed from the beginning [01:24:16] oh [01:24:21] how difficult is it to switch to virtio? [01:24:24] isn't it just a single setting? [01:24:26] we need to bring up the storage in eqiad too [01:24:34] we'll need to modify the libvirt configs [01:24:46] and change the openstack config so that new instances will get virtio [01:24:49] how about new VMs? [01:24:51] ah [01:24:58] it should be a fairly easy change [01:25:01] okay, I've said this before [01:25:10] but it's hard for me to keep track of all these things [01:25:15] yeah [01:25:17] I mean, we've talked about all of them [01:25:22] should we put these into bugzilla now? [01:25:23] but I keep forgetting some [01:25:36] heh yeah, that was going to be my suggestion [01:25:44] cool. let's do it [01:25:55] and I hate how we do both RT & bugzilla for infrastructure work :/ [01:26:07] but one problem at a time [01:26:08] labs should really only use bugzilla [01:26:15] well, what's "labs"? [01:26:18] Wikimedia Labs project [01:26:34] sorry, that wasn't a response [01:26:45] just mentioning which product to use for bug reports :) [01:26:55] most of this work is in "production" systems [01:26:59] yes [01:27:03] true [01:27:15] and I think Mark will hate you if you e.g. open a bugzilla for the BGP peerings :) [01:27:25] it can still be done by volunteers, though [01:27:39] hm? switch to virtio or add a network? [01:27:45] yes [01:27:51] absolutely [01:27:58] openstack is set up in labs [01:28:05] you can even launch vms inside of it [01:28:15] I don't think it can reasonably, but it's always nice to have tickets in public so that the community is aware [01:28:30] s/it/they/ [01:28:34] * Ryan_Lane nods [01:29:40] so, use bugzilla, or rt? :) [01:30:01] I hate the split in the systems [01:30:10] really we just need a public RT queue for labs [01:30:20] and for RT to use LDAP auth [01:30:27] can't agree more [01:30:33] then this problem goes away [01:30:36] can we do parent tickets with bz? [01:30:41] yep [01:30:49] they are tracking tickets [01:31:06] you use the "Depends on" field for that [01:31:16] right. sorry, not very experienced with bz [01:31:20] * Ryan_Lane nods [01:31:37] I've been a user in multiple projects but never had to do this sort of thing [01:31:41] hm.
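The virtio change sketched above ("modify the libvirt configs ... change the openstack config so that new instances will get virtio") touches two places. Roughly, and assuming a 2012-era nova whose flag name should be verified against the deployed release:

    <!-- in each existing instance's libvirt domain XML: add a model element -->
    <interface type='bridge'>
      <source bridge='br103'/>         <!-- placeholder bridge name -->
      <model type='virtio'/>           <!-- selects the paravirtualized NIC driver -->
    </interface>

    # in nova.conf, so newly created instances default to virtio NICs
    libvirt_use_virtio_for_bridges=true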
depends on may not really be right [01:31:57] let's look at a tracking ticket to see [01:32:21] found one [01:32:33] it's tracking depends on child and child blocks tracking [01:32:41] ah. right [01:32:45] that makes sense [01:33:12] heh [01:33:20] they have a tracking bug for tracking bugs [01:34:05] I really wish we could send irc bug notifications for labs into this channel [01:34:37] also, we probably need a "labsconsole" component [01:34:46] anyway, I'm opening the virtio bug now [01:35:48] ah ok [01:35:57] hey [01:35:59] /dev/vda1 on / type ext4 (rw) [01:36:01] I added the one about the network [01:36:04] ? [01:36:05] we already use virtio [01:36:09] how so? [01:36:11] for disks [01:36:17] not sure if that's the case for network too [01:36:21] right [01:36:25] network is not [01:36:31] disks are [01:36:38] ah [01:37:13] adding the ticket about moving away from nfs instance [01:37:24] I thought disks were not either [01:37:27] that's more important [01:37:49] the virtio network driver is *way* faster [01:37:52] think project storage [01:38:22] oh true [01:38:46] We already have home shares made for each project, byw [01:38:51] *btw [01:39:07] they are created and managed with the project storage [01:39:21] we just need to move the data and change the mounts [01:39:45] changing the mounts will likely be the hard part [01:39:48] oh. right.... [01:39:55] hm? [01:39:56] there's another thing I wanted to do too [01:40:51] rather than mounting each homedir [01:40:52] ok, I'm opening the tracking bug [01:40:55] mount all of home [01:41:04] as "Labs infrastructure work tracking bug" [01:41:09] then use pam to make home dirs [01:41:14] sounds good [01:41:16] yes! [01:41:20] pam_mkhomedir [01:41:22] yep [01:41:29] that's actually a blocker [01:41:31] also, passwordless sudo? [01:41:34] yes [01:41:35] that too [01:41:39] yay for opening bugs [01:41:46] :-) [01:42:03] if only I had more time to work on them :P [01:42:11] that's also my problem [01:43:00] is it possible to mount /home with automount? [01:43:06] * Ryan_Lane surely hopes so [01:43:17] I'm not totally sure it's possible [01:43:32] pam_mount [01:43:42] oh, you mean the whole of /home [01:43:45] yes [01:43:46] why automount? [01:43:53] fstab mounts are fucking evil [01:44:03] we can control autofs from ldap [01:44:10] you just said half of unix is evil :P [01:44:17] sorry [01:44:22] fstab nfs/gluster mounts are evil [01:44:56] it's more difficult to change, since we need to change it on every host [01:45:05] it's nice to have that config centralized [01:47:24] okay [01:47:29] open the bug about it [01:47:37] and I'm opening the network node bug [01:47:38] I have one opened for it [01:47:43] I added the network bug [01:47:48] ah [01:48:09] https://bugzilla.wikimedia.org/show_bug.cgi?id=39781 [01:48:20] ah sorry, I meant the network node [01:48:22] per compute node [01:48:25] ah [01:48:26] right [01:48:26] ok [01:48:36] let me link the ones I made [01:48:39] my bad [01:50:15] I'm adding passwordless sudo [01:52:13] https://bugzilla.wikimedia.org/showdependencytree.cgi?id=39784&hide_resolved=1 [01:52:16] \o/ [01:52:38] I'm adding "add eqiad as a production region" [01:52:43] cool [01:59:45] I think we ordered project storage in eqiad [01:59:48] in fact, I think it's there [02:02:03] paravoid: oh, I had an idea of how to replace the awful manage puppet groups interface [02:02:29] paravoid: add keywords as comments in the puppet manifests [02:02:37] for classes and variables [02:02:48] with group assignments for them.
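The "mount all of home, then use pam to make home dirs" plan above needs only a one-line PAM addition once /home is a single share, and the autofs map Ryan wants controlled from LDAP is a one-line master entry; a sketch with a placeholder LDAP DN:

    # /etc/pam.d/common-session -- create a user's homedir on first login
    session required    pam_mkhomedir.so skel=/etc/skel umask=0022

    # /etc/auto.master -- mount the whole of /home via autofs, map kept in LDAP
    /home  ldap:ou=auto.home,dc=example,dc=org  --timeout=600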
[02:03:03] then we can parse the manifests for them [02:03:14] it would work for per-project branches, too [02:03:26] whenever we have per-project branches [02:03:52] we'd need to come up with a syntax for it, though [02:11:12] paravoid: you should really go to sleep :) [02:11:16] I'm heading home [02:14:21] dammit [02:18:04] this view just 500'd: https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&project=editor-engagement&instanceid=i-000003dd [07:42:20] Damianz wtf is ur irc client [07:42:23] it's weird [08:12:41] !log deployment-prep petrb: /home/wikipedia/common/php git pull [08:12:43] Logged the message, Master [08:30:13] petan: kvirc? And yeah it's a little dodgy on osx [08:30:26] "Hey all, we will be down for about 20 minutes at 8pm PST to shine our unicorn horns. Sorry for the inconvenience." < labs totally needs to start announcing downtime like dropbox [08:30:39] heh [08:39:13] !log nagios changing puppet-FAIL command to `echo "Puppet has not run in the last 10 hours" && exit 2` from `/usr/share/nagios3/puppet_check.sh $HOSTADDRESS$` [08:39:14] Logged the message, Master [08:39:38] Damianz btw there are some things on nagios [08:39:49] I saw you tried to change the command for checks [08:39:55] there is a template file for parser [08:40:10] Yeah I saw, it cats in default then seds it =/ [08:40:11] if you wanted to change anything in config files, you need to change it there or it will be lost [08:40:28] you just change the template, enforce update [08:42:12] It's working for now apart from the hosts that haven't run the latest puppet yet, I'd rather take a clean instance and put everything in puppet and know it works before changing -main too much. [08:42:30] eh [08:42:32] noo [08:42:41] I would rather puppetize stuff in -main and keep it [08:42:51] changing hostname again... [08:42:53] bleh [08:43:01] we still haven't recovered from the last change [08:43:16] Oh I'd rather keep -main because changing that requires changing ips and names all over [08:43:25] But I'd rather try crazy stuff in -dev then have puppet apply it on -main [08:43:30] why we need to change IP because of puppet [08:43:39] nrpe [08:43:42] ok as long as you won't break the current nagios [08:43:44] I don't care [08:44:02] not that it wasn't broken already :P [08:44:12] It was...er... somewhat broken [08:44:15] :D [08:52:48] How do I have 36 emails from last night to this morning o.0 bleh [09:00:53] !log deployment-prep petrb: php multiversion/MWScript.php changePassword.php --wiki enwiki --user Petrb --password needed to change [09:00:54] Logged the message, Master
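Circling back to Ryan's manage-puppet-groups idea above (keywords as comments in the manifests): the syntax is still to be invented, but something as simple as a structured comment per class would already be greppable. A sketch, with the @labs-group tag made up for illustration:

    # in a manifest, tag a class like this:
    #     # @labs-group: webserver
    #     class webserver::apache { ... }
    # then extract "class group" pairs from the tree:
    grep -rnA1 '^# @labs-group:' manifests/ \
      | awk '/@labs-group:/ { group = $NF } /class / { print $2, group }'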
I can try to simplify [16:00:23] sumanah: I just ignore all of them [16:00:34] petan: there are going to be a lot of steps, that's unavoidable, but if I get told about what's confusing then I can at least try to list stuff out in a public wiki page [16:00:34] cause I don't have the resource / time to invest in tracking everything that happens ;-) [16:00:42] hashar: so, you don't know :) [16:00:53] petan: so you get Krinkle and Trevor reviewing it ? [16:00:58] sumanah it's hard to understand if it's scheduled or not. It was definitely approved by the wiki community [16:01:02] petan: and then we will "soon" land it in production? [16:01:07] which should make it scheduled [16:01:24] has either of you read the wiki page about this? how to write an ext for deployment? [16:01:33] hashar I have no idea, I never understood this process, which started almost 2 years ago [16:01:33] petan: well go ahead [16:01:48] petan: but please make sure to add your extension configuration in operations/mediawiki-config [16:01:51] and pass it via Gerrit [16:01:52] sumanah that page changed significantly over the years [16:02:01] ok [16:02:06] you can do your changes on -dbdump and submit them to Gerrit [16:02:08] Yes, I updated it to make it more accurate, petan [16:02:13] the remote is ssh://gerrit something [16:02:15] I think [16:02:17] hashar I don't know how [16:02:26] I still don't know these git tricks [16:02:27] so if you get a ssh key on -dbdump you should be able to commit and then push to gerrit [16:02:30] ohh [16:02:32] aha [16:02:43] if I uploaded an ssh key there, it wouldn't be really safe [16:03:02] create a new key / password protect it ? [16:03:05] hashar: please skim https://www.mediawiki.org/wiki/Writing_an_extension_for_deployment [16:03:10] but yeah indeed will not be safe [16:03:17] hashar: if it is not clear or is inaccurate in some way, I want to know [16:03:23] hashar then someone sudo su and commit as me [16:03:25] :D [16:03:34] sumanah: I am not involved in extensions sorry :-( [16:03:42] petan: hehe [16:03:47] hashar: I know that, but still, your eyes would be useful [16:04:08] petan: so what you can do is add a live hack in the files that would include( …OnlineStatusBar.php ) [16:04:12] and put all your configuration there [16:04:19] that is probably the easiest thing to do [16:04:46] I will commit on my own server where git works [16:04:52] https://www.mediawiki.org/wiki/Review_queue is where the actual list of extensions awaiting review & revision lives, with links to the Bugzilla bugs, and then when something has finally been reviewed and approved for deployment AND there is community consensus for that first deployment, Reedy moves the extension to the "deployment queue" [16:05:39] when someone says "it's confusing" but doesn't tell me WHAT PART is confusing, then I can't improve the process or the documentation. So, tell me what part is confusing, please [16:06:17] I find the entire use of bz confusing :P [16:07:00] Damianz: ok. Tell me more.
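The "live hack" hashar describes is just an include plus settings appended to the wiki's configuration; a minimal sketch (the file placement varies — petan later calls his ext-wmflabs.php — and any real config variable names would come from the extension's documentation):

    // temporary live hack in the beta settings file
    require_once( "$IP/extensions/OnlineStatusBar/OnlineStatusBar.php" );
    // any $wg... overrides the extension needs go right after the include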
[16:07:27] sumanah: what is confusing me is the mass of information we have to handle :-) [16:07:43] I could spend a whole week just reading information and producing nothing [16:07:47] that is what I was ranting about [16:07:56] hashar: that is the case in all complicated systems, I think [16:08:01] not the Review_queue which is, I am sure of it, most probably a great queueing system [16:08:16] so I end up skipping a lot of information to preserve my brain / productivity [16:08:27] and ask the TL;DR team whenever I need information hehe [16:08:28] hashar: there is no way to get code deployed onto the 5th highest-traffic website on the internet without various steps, and describing those steps takes words. [16:08:45] yeah I fully agree [16:08:51] hashar: how is Engineering Community Team supposed to scale if everyone does that, though? [16:08:57] that is the point of docs [16:08:58] again, I am not complaining about having process / bureaucracy [16:09:05] they scale when individual conversations do not [16:09:27] it is just that we have so much stuff to handle that one human can not be aware of all processes much less of everything that happens in each process [16:09:43] as an example, I have close to no idea what the visual editor / mobile teams are doing :-D [16:09:50] hashar: do you at least skim the monthly report? [16:09:58] that has one-paragraph summaries of those [16:10:03] (k I read the monthly engineering report so i know about their job :p) [16:10:15] that is actually my main source of information [16:10:20] together with the Signpost [16:10:53] :) Signpost is useful, I am glad Harry does that weekly tech roundup [16:11:06] hashar I have it but I can't commit it [16:11:11] and Guillaume and the team leads are doing very useful things with project statuses & the monthly report [16:11:12] what is that command to push it [16:11:17] @search push [16:11:17] No results were found, remember, the bot is searching through content of keys and their names [16:11:27] @search config [16:11:27] No results were found, remember, the bot is searching through content of keys and their names [16:11:30] I from an idiot point of view have no idea who works on what, the wikitech list is a mash of complaining, ideas, someone apparently working on something x months ago etc. Bugzilla is partly ignored, not really informative about who is responsible, hard to find bugs at a glance with minimal overall stats etc. Then there's bz/talk pages/ops bug tracker/mailing lists which have duplicate, conflicting [16:11:38] @regsearch [Pp]ush [16:11:38] No results were found, remember, the bot is searching through content of keys and their names [16:11:41] info with no one co-ordinating it. And that's before breakfast [16:11:50] @search git [16:11:50] Results (Found 7): leslie's-reset, damianz's-reset, gitweb, account-questions, git, origin/test, git-puppet, [16:11:58] !origin/test [16:11:58] git checkout -b test origin/test [16:12:36] hashar you know? [16:12:47] I know there was some trick [16:12:56] but I forgot to save the command to push [16:13:45] lol [16:13:50] To ssh://petrb@gerrit.wikimedia.org:29418/operations/mediawiki-config.git [16:13:51] !
[remote rejected] master -> master (can not update the reference as a fast forward) [16:13:52] error: failed to push some refs to 'ssh://petrb@gerrit.wikimedia.org:29418/operations/mediawiki-config.git' [16:13:53] petanb@srv:~/mediawiki-config/wmf-config$ [16:14:37] Damianz: ok, so, if you're interested in investigating these issues to improve processes or help you understand this stuff better, I have time to talk about it. [16:14:44] hashar where are you :D [16:14:46] git pull [16:14:47] I need to go home [16:14:53] master is ahead of your branch [16:14:57] Damianz: when you say "who works on what" are you usually starting off wondering about a particular person, or a particular project? [16:15:11] Damianz what does it mean [16:15:24] Damianz speak svn or english pls [16:15:27] It means some ref in your branch on the remote is newer than the ref you're on [16:15:37] that's even less understandable :D [16:15:43] what do I need to do to fix it [16:15:58] If it's a local clone, git pull or fetch and merge [16:16:10] petanb@srv:~/mediawiki-config/wmf-config$ git pull [16:16:11] Already up-to-date. [16:16:14] Damianz LIES [16:16:28] meh [16:16:31] HELP [16:17:06] Damianz there was a trick to override it, some parameter after push [16:17:12] it never worked [16:17:19] petan: saper, in #mediawiki, is often helpful with this stuff [16:17:19] --force overrides it... but that's bad [16:17:26] nah [16:17:28] that's what that error means [16:17:28] that was another trick [16:17:49] last time it was jeremyb who talk me that secret [16:17:53] * told [16:18:14] git push origin/master (or w/e branch you're on)? [16:18:16] I know it's impossible to commit in wmf-config without using that [16:18:33] petanb@srv:~/mediawiki-config/wmf-config$ git push origin/master [16:18:34] fatal: 'origin/master' does not appear to be a git repository [16:18:35] fatal: The remote end hung up unexpectedly [16:18:46] hmm might be space separated [16:18:56] ok that produces same error as before [16:19:12] ! [remote rejected] master -> master (can not update the reference as a fast [16:19:16] Dunno then [16:19:35] hashar I will commit to pastebin, feel free to move that to git :D [16:19:42] Are you trying to push to a repo that needs reviewing? [16:19:57] I'll just assume it's a gerrit thing if it's not a git thing [16:21:06] petan: commit from your local machine instead [16:21:14] petan: and reread the git workflow article :-) [16:21:25] http://www.mediawiki.org/wiki/Git/Workflow [16:21:43] hashar I reread it like 10 times and still I find it incomplete [16:21:48] there is no git trick in there I need [16:21:55] there should be examples of pushing [16:22:02] like an example of how to push in mediawiki config [16:22:10] which is hardest I know [16:22:18] idiot proof example [16:22:27] like: [16:22:32] 1: open terminal [16:22:35] sumanah: I don't necessarily think it's bad documentation, just hard to find documentation. Most of the time I end up googling for a page rather than knowing where to go and find it. While we're always going to have lots of 'teams' because of the nature of what's being done I feel there's a lack of a general overview and/or goals laid out. [16:22:41] 2: cd in your directory where the repo is [16:22:53] 3: type magical command: SOME MAGIC I DON'T KNOW [16:22:56] 4: Have a beer [16:23:05] Damianz: I do want to know whether you're generally starting with "what is $person working on?" or "who is working on $task?"
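For the record, the rejected pushes above happen because updating refs/heads/master directly needs rights most contributors don't have; changes meant for review go to Gerrit's magic ref instead. From inside the clone:

    git commit -a -m "Enable OnlineStatusBar on beta"   # example commit message
    git push origin HEAD:refs/for/master                # opens a change in Gerrit
    # some Gerrit setups also take HEAD:refs/publish/master, the variant
    # jeremyb suggests to petan later in this log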
We generally suck at documentation for people who don't already know what they're doing because the people that write it know how to do it. Having the people forced to figure it out write it would provide better coverage. [16:23:43] The latter [16:23:43] hashar I am unable to commit from local machine as well [16:23:50] it just doesn't work [16:24:25] ok, so, Damianz, https://www.mediawiki.org/wiki/Wikimedia_Engineering/Project_documentation_howto is at least *how people are supposed to* be documenting what they're doing if they work at Wikimedia Foundation. And that includes "who's working on this project". [16:24:41] Not so interested in what a specific person is doing more than what people are doing generally. Like what ori-l's working on is interesting, I only know about it because he 'overheard' a conversation I was having with someone else that aligned with the same goals for a different purpose. [16:25:07] Damianz: how do you feel about https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals ? [16:25:20] you know what... [16:25:24] hashar: http://test.wikipedia.org/wiki/User:Petrb/config_deploy [16:25:29] Damianz: or about hub pages, for example, like https://www.mediawiki.org/wiki/Wikimedia_Platform_Engineering ? [16:25:30] that's ext-wmflabs.php [16:25:45] I am not strong enough to push it [16:26:12] I will be back in 2 hours [16:26:15] bye petan [16:28:40] Damianz: guillom is the Wikimedia Foundation's person in charge of communications about engineering [16:29:05] Damianz: I asked him to come in [16:29:25] 1 - somewhat useful, though I still have to look at 3 pages, not 100% sure what defines features as surely mobile and platform relies on features and mobile runs on platform, wikipedia terminology is sometimes confusing to me though [16:29:40] (Damian is talking about the Goals doc, I think) [16:29:55] * guillom listens. [16:30:01] project overview page from the how to document stuff page [16:31:20] 2 (goals page) - I like that, seems somewhat signpost/mailing list/tech blog formatted, more useful in knowing what's happening but not totally who's doing it (from the idea of who to talk to about ideas/contributions), probably some wiki page/mailing list around if I knew where [16:33:58] 3 (hub pages) - much more useful for finding out current projects [16:34:31] Damianz: guillom is the person you have to thank for the hub pages! :-) [16:34:39] The main problem I'd have with them is I'd have no idea where to find them heh. Like I'd of thought they'd be on meta or wikitech not mediawiki [16:34:56] Oh also hi guillom [16:35:07] hello [16:35:36] Damianz: I know guillom and Ryan_Lane and others are interested in making wikitech.wikimedia.org (or some similar wiki) encompass Labs stuff, and generally make it into a better informational and working hub for non-MediaWiki engineering stufffff. [16:36:22] but that's, I believe, stalled and waiting on some information assessments regarding quality & freshness of data in some of those wikis -- I could be wrong [16:36:25] That would be useful - there is some good stuff on wikitech, there's also a lot of outdated and no longer useful info. [16:36:47] sumanah: you're right. [16:38:45] Damianz: So, is there information that you would expect to find on the hubs or the activity pages that you can't find?
[16:39:47] !beta updating all extensions and core to their latest master version [16:39:48] !log deployment-prep updating all extensions and core to their latest master version [16:39:50] Logged the message, Master [16:41:10] Mostly yes, though I do think there's too much of a spare betwean that and bugzilla still. Documentation is another matter for prod stuff which seems to vary in 'up-to-dateness'. [16:41:53] Damianz: what does "too much of a spare" mean? [16:42:03] err space and trying [16:42:07] typing* gah [16:42:25] oh ok :-) [16:42:39] (I've seen way worse typos, or even committed them) :) [16:43:35] Ok the avg user isn't going to go check a specific extension/project before raising a bug report, but bug reports are hardly easy to see per project without, from what I can see, writing a report to get them. For some things it's not possible but if there were like an 'easy pickings' list of project/bug work I'd probably do more outside of the things I'd know about, even if it's just floating an [16:43:41] opinion [16:44:37] Damianz: So, if we're talking about concrete improvements: would it help if we added a link to the relevant bugzilla search for all activity pages? [16:44:54] Which is where sometimes bz is used for features, other times it's on a talk page, other times the mailing lists and things get duplicated/spread out. So someone submits a bug for $x, oh we talked about that and someone mentioned it here style things are hard to keep track of. [16:45:29] so, Damianz, when someone shows up and says "I'd like to get started" with MediaWiki, I point them to https://www.mediawiki.org/wiki/How_to_become_a_MediaWiki_hacker which links to https://www.mediawiki.org/wiki/Annoying_Little_Bug . That HOWTO is also linked to from mediawiki.org's front page. But I know mw.org is crowded, and I know that it's just MediaWiki, not other stuff like huggle or labs [16:45:30] Yeah, that would be helpful. Even if it just stops someone duplicating something that needs to be marked as such saving effort to be directed at other things. [16:45:38] guillom: good idea! [16:46:22] That's partly the problem that there's mediawiki and the wikipedia stuff, which are both separated and intertwined at the same time! [16:47:01] IIRC from when I read that, it's aimed at writing php for the wiki/frontend side of things with minimal focus on backend projects/misc stuff (could be wrong). [16:47:26] Damianz: petan started https://meta.wikimedia.org/wiki/Wikimedia_developer_hub partly for this reason, [16:47:51] * Damianz sees his bookmarks getting larger today heh [16:48:14] Damianz: so, I figure, for any given person wanting to know more stuff, there is kind of a pipeline or funnel or flow [16:48:33] a person like you is on the left, and the info they want is on the right, and I have to help connect the pipes so they get it [16:48:41] so part of it depends on the person's mind & habits [16:49:14] where would you expect to find the kind of information you want? what are your usual information sources? Do you read wikitech-l, do you read the weekly Signpost, do you read the monthly engineering report, do you read wikimedia-l [16:49:30] are you on Facebook/Twitter/Google Plus [16:49:31] etc [16:49:36] Yeah... I think the documentation is there, it's just somewhat hard to find sometimes. [16:50:41] I am totally serious about asking these survey-esque questions, btw, Damianz :) [16:50:52] I read wikitech posts if the subject is interesting, 40 posts of lua sucks, why are you using it after it goes out isn't so enthralling.
I didn't know signpost was weekly, thought it was monthly. I read the report/blog, didn't know about wikimedia-l, twitter is useful, g+ less so [16:51:48] Damianz: https://lists.wikimedia.org/mailman/listinfo/wikimediaannounce-l is a useful list for someone like you [16:52:03] Damianz: includes reminders when the signpost comes out [16:52:24] Damianz: low-traffic, links to reports when they come out [16:53:19] guillom: hope I'm not stepping on your toes with my interrogation here ;-) [16:53:38] Subscribed [16:54:52] He probably got bored of my blabbering and fell asleep in the watercooler :D [16:54:57] so, Damianz, within the monthly report/blog, did you notice that for each department and for each project that is summarized, there's a link? those links are to those activity and hub pages [16:55:46] I was eating dessert, and therefore not typing, but I am still reading :) [16:55:54] :) [16:56:33] Damianz: it's ok to say "no I did not notice or think of clicking that" because that is useful info too :) [16:58:00] Actually just realised I hadn't seen last months heh, I don't think I've ever clicked on the links no - assumed the paragraph or 2 was the info out there [16:59:07] ok, so, guillom, I hereby suggest that we experiment with more explicitly marking those links to more info, encouraging people to put stuff on their watchlist or something..... [16:59:39] sumanah: you mean the links to activity pages in the monthly report? [16:59:50] guillom: yeah, the links to activity & hub pages [17:00:10] (it is a suggestion and I am ok with being rebuffed) [17:00:45] sumanah: my gut feeling is that those reports are already very crowded, and that "I'm a blue link" should carry enough information for the reader to mean that more information is available. [17:01:29] Damianz: to go back to Bugzilla, since that is the topic that you mentioned initially: the fragmentation is annoying for sure, as is the information architecture of Bugzilla. And there are also teams using Mingle, Trello, and probably other things :/ [17:01:49] <3 Trello [17:01:53] Ah, we all dream of a single, unified system. [17:01:57] Damianz: I used to work at Fog Creek actually [17:01:58] Yeah the reports are somewhat crowded [17:02:11] Cool, shame they use nodejs though [17:02:24] Damianz: yeah, so you can see that it's difficult, trying to convey "there's more info about this if you want it" while not making the report TOTAL "too long didn't read" [17:02:43] Damianz: (you know the new Visual Editor/Parsoid combo uses node.js right?) [17:03:45] Nope, I keep hearing about the visual editor but have no idea about it. Will probably read somewhat later. 
[17:05:04] Damianz: https://blog.wikimedia.org/2012/06/21/help-us-shape-wikimedias-prototype-visual-editor/ has a link to the place where you can play & test, if you haven't already [17:06:37] ok, and you mentioned that one of the issues that frustrates you is that there's information scattered in a lot of places with no one coordinating [17:06:53] this is one reason that Wikimedia Foundation is starting to use "Product Managers" for this [17:07:13] That's somewhat better than I'd just imagined, wonder how hellish it is to use on mobile though (not that trying to edit on mobile isn't near impossible currently) [17:07:30] so, you see that for the visual editor, James Forrester reduces duplication and cross-references stuff whether it's on a wiki page or mailing list or BZ or whatever [17:08:03] https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Visual_Editor mentions that between now and the end of Sept the VE team is working on the mobile editing issue, Damianz [17:08:36] Yay, I hadn't got that far down the page [17:08:47] And yeah, people like that are generally useful [17:09:27] right. so, for certain projects, it's Oliver (Ironholds), for analytics it's drdee (Diederik van Liere), etc [17:10:22] Damianz: we're also asking volunteers to help out -- Jack Phoenix, for example, is product manager for https://www.mediawiki.org/wiki/Admin_tools_development [17:16:07] Hmm interesting [17:18:32] so, anyway, Damianz, thanks for sharing your experience with me & guillom [17:18:45] I hope this was useful for you, it was definitely useful for me [17:19:12] And feel free to send suggestions my way. [17:24:15] I'd say useful :) [17:24:51] :) [17:29:07] * Damianz goes to find some food [17:43:16] Change abandoned: Demon; "Abandoning since I78df2b89 was abandoned too." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6541 [18:31:52] petan: did you get your magic? [18:32:34] petan: maybe you needed `git push origin HEAD:refs/publish/master` [18:33:35] * Damianz looks at jeremyb's drugs [19:11:25] Hi Ryan, didn't see you sneak in there :P [19:11:32] howdy [19:12:12] I don't know if you saw the bug about sessions, but I can't replicate the behaviour now so I dunno if you made a change/it was related to jobs being stopped. Was somewhat strange at the time [19:16:49] eh? [19:16:56] bug number? [19:17:04] I really don't know what you mean [19:18:30] * Damianz finds it [19:18:46] https://bugzilla.wikimedia.org/show_bug.cgi?id=39792 [19:19:08] Was doing it yesterday/day before. I was just too lazy to appear on irc until after you went to bed [19:20:36] <^demon> Ryan_Lane: When you've got a minute, I'm having trouble getting labs gerrit to talk to LDAP. I'm wondering if I'm using the wrong settings. [19:24:41] ^demon: I keep meaning to ask - is it possible to change the navigation in gerrit? Currently it's rather hard to get to gitweb/the project path as it means going like admin -> projects -> project -> branches -> branch. Or is that a long term plan when the alternate browser has a plugin done? [19:25:00] <^demon> Yes, it's hard to change that. [19:25:07] <^demon> 2.5 is nicer though, I made those things easier to find [19:25:59] Yay, easier is good :) [20:30:25] chrismcmahon: I got the autoupdating script :-] [20:35:11] * hashar loves git [20:35:16] and the fast gerrit [20:38:13] thanks hashar! 
!log deployment-prep trying out {{gerrit|22116}} on deployment-integration (that is the beta auto updater) [20:38:59] Logged the message, Master [20:39:13] F¨¨¨¨¨¨¨K [20:39:15] aeazmel [20:39:18] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ntp::client for i-0000034a.pmtpa.wmflabs at /etc/puppet/manifests/site.pp:57 on node i-0000034a.pmtpa.wmflabs [20:39:49] hashar: I go on vacation for a week myself in about 4 hours, could make the appropriate announcements on wikitech, etc. ? [20:39:55] could you [20:40:12] don't you want to make it on the monday when you come back from vacation? :-D [20:40:34] will hopefully have it polished up tomorrow [20:40:37] I'll be flying to San Francisco then, Mon. 10 Sep :) [20:40:39] but then will have to monitor it for a few [20:40:41] oh [20:40:45] I am landing in SF at noon [20:40:52] will hopefully be in the office on monday afternoon [20:40:59] * chrismcmahon has a very busy 2.5 weeks ahead [20:41:03] Damianz: I've had that session issue forever [20:41:04] so we might be able to write the announcement together :-] [20:41:10] Damianz: use the "remember me" option [20:41:12] ok! [20:41:18] no one seems to be able to figure out what's causing it [20:41:39] Damianz: maybe the mediawiki upgrade will help? [20:42:06] paravoid: Ryan_Lane: do you know if puppet needs a specific configuration to support modules? I got an instance unable to find the ntp::client class [20:42:15] and that uses puppetmaster::self of course [20:42:23] then you need to do a git pull [20:42:24] yes it does, but it's been merged to ::self already [20:42:32] Ryan_Lane: Hopefully [20:42:33] what ryan said [20:42:59] I know you had the getting logged out issue before, I'd never had the being logged out on only some pages before though [20:43:04] Ryan_Lane: remember how we hated the Ciscos? [20:43:10] well my local instance has a fresh update [20:43:12] well, only kind of [20:43:20] right [20:43:24] I've been reading that thread about the dells :) [20:43:29] yeah :) [20:43:54] * Damianz yawn [20:45:20] ok. food time [20:46:41] paravoid: my instance /var/lib/git/operations/puppet has the latest master but still can't find the module :D [20:46:45] any hint ? ;) [20:47:00] maybe I should restart puppet [20:47:06] oh, yeah, this won't work [20:47:23] because now it can't run to fix itself [20:47:31] \O/ [20:47:46] puppet: learn DIY please [20:48:12] ah so I got a module directory [20:48:17] looks like I need to symlink it [20:48:19] hashar: rm /etc/puppet/modules; ln -sf /var/lib/git/operations/puppet/modules /etc/puppet/modules [20:48:40] !!! [20:48:55] paravoid: you are my hero :) [20:49:05] paravoid is secretly catwoman [20:50:42] paravoid: FAQed https://labsconsole.wikimedia.org/w/index.php?title=Help:Self-hosted_puppetmaster&diff=5682&oldid=5077 [20:53:03] heh, now who's hero? :) [20:55:18] !beta applying beta::scripts to deployment-integration [20:55:18] !log deployment-prep applying beta::scripts to deployment-integration [20:55:20] Logged the message, Master [20:57:16] Can someone check out Nova Resource:I-000003e6 ? Did I do something wrong? [20:58:41] Wiki page looks ok - what problem are you having? [20:59:11] Instance state is marked error [21:00:33] Actually weird, it has no ip assigned according to the resource page... hmm has it worked since monday? [21:01:10] No i created the instance today. [21:01:30] Yeah, no Ip.
I'm unable to delete [21:01:37] So not sure what to do with it [21:01:40] Not sure then, maybe paravoid or Ryan_Lane can look at the logs [21:02:11] bah [21:02:14] Thanks for looking [21:02:18] puppet can't find my class :/ [21:12:41] gm [21:12:45] created it today? [21:12:55] likely a scheduler issue [21:13:14] I thought it was a problem with the puppet autoloader or something [21:13:25] trying something else [21:13:36] if the instance shows in an error state, it's a scheduler issue [21:13:50] hashar: what issue are you having? [21:14:08] I created a new misc::beta::scripts puppet class [21:14:12] in manifests/misc/beta.pp [21:14:13] https://gerrit.wikimedia.org/r/#/c/22116/ [21:14:20] added misc::beta::scripts to labs console puppet group [21:14:27] applied it to the deployment-integration instance [21:14:31] which runs puppet master self [21:14:42] I did the magic GIT_SSH=/var/lib/git/ssh git fetch origin refs/changes/16/22116/3 && git checkout -b 22116/3 FETCH_HEAD [21:14:45] the change is there [21:14:54] nice, puppet says err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class misc::beta::scripts for i-0000034a.pmtpa.wmflabs on node i-0000034a.pmtpa.wmflabs [21:16:21] rmoen: seems the scheduler hit some weird condition [21:16:22] restarted puppet / puppetmasterd [21:16:32] rmoen: delete/recreate the instance [21:17:01] all three compute services must have shown as unavailable when it went to schedule it [21:17:09] Ryan_Lane: issue fixed by reloading puppetmaster service :-) [21:17:20] ah. good [21:17:25] notice: /Stage[main]/Misc::Beta::Scripts/Service[wmf-beta-autoupdate]/ensure: ensure changed 'stopped' to 'running' [21:17:26] \O/ [21:17:29] I love my job [21:17:45] :o [21:17:52] 25903 ? SN 0:00 \_ git-merge -q Merge branch 'master' of https://gerrit.wikimedia.org/r/p/mediawiki/core HEAD 687fcf8b466cf947b724b77ada2222b81f6e7ad7 [21:17:53] yeahh [21:18:03] hashar: Heading towards beta CI? [21:18:08] yeah indeed [21:18:13] awesome [21:18:16] I wrote a small lame script [21:18:25] that git pulls everything every 180 seconds [21:19:10] I assume the pull is blocking so something that takes over 3min won't end up with 2 running then 3, 4, 5 etc? [21:20:02] grmblbl [21:20:07] I have no idea where upstart logs stuff [21:22:37] !log deployment-prep Deployed the automatic code updater on beta. It is running on deployment-integration, service is wmf-beta-autoupdate managed by puppet to always run. [21:22:39] Logged the message, Master [21:22:45] Ryan_Lane, tried to delete already. Fail message - The requested host does not exist. [21:22:57] hm [21:23:14] it doesn't say it succeeded in nova, but failed to delete host? [21:23:46] ugh [21:23:47] it doesn't [21:23:50] that's a bug [21:24:30] hm [21:24:44] Connects standard input to /dev/null. Standard output and standard error are connected to one end of a pseudo-terminal such that any job output is automatically logged to a file in directory /var/log/upstart/. This directory can be changed by specifying the --logdir command-line option. [21:24:45] blam [21:24:47] ah. it's just a bad error [21:24:52] looks like upstart does not really log :/ [21:24:55] error message [21:25:27] hm [21:25:33] this is still a bug, though [21:25:45] oooohhhhhhhhh [21:25:49] I see the issue [21:25:58] crap [21:26:05] deleting from the instance page is broken [21:26:16] since the instances don't have the openstack id [21:26:28] That seems likely [21:26:29] seems all of the actions are....
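The always-running updater above ("service is wmf-beta-autoupdate managed by puppet to always run") would be an upstart job on these instances; a rough sketch, with the script path and invocation guessed rather than copied from gerrit change 22116, and assuming an upstart new enough for "console log":

    # /etc/init/wmf-beta-autoupdate.conf
    description "Beta: pull mediawiki/core and extensions every 180 seconds"
    start on runlevel [2345]
    stop on runlevel [!2345]
    respawn                     # restart the pull loop if it dies
    console log                 # stdout/stderr land in /var/log/upstart/
    exec /usr/local/bin/wmf-beta-autoupdate     # hypothetical script path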
well, crap [21:26:46] I need to write a maintenance script to fix tis [21:26:47] *this [21:27:13] rmoen: I'm going to disable that menu for now [21:27:22] use "manage instances" to delete the instances for now [21:29:52] Ryan_Lane ok that time it worked [21:30:00] hm [21:30:18] Ryan_Lane, I'm curious what I had done to cause the error though. [21:30:18] yeah. I'm going to need to update all instance pages to include the new id [21:30:22] maintenance script it is [21:30:30] seems nova still has some bugs [21:30:34] this one is less likely to occur [21:30:44] but, the services have a state of up or down [21:30:46] I created an instance, and checked all three security boxes, named it and hit submit [21:31:02] if all of the compute services show as down, the scheduler says "there's no hosts to launch this instance on" [21:31:11] then it sets an error state [21:31:25] it would be nice if it just waited in the scheduling state until one of them came up [21:31:32] chrismcmahon: well I think the automatic updater is working now :-)) [21:31:35] Ahh I see. [21:39:27] Ryan_Lane: o.0 It really should have a timeout and until then it gets requeued, on timeout it errors [21:39:59] yep [21:40:09] I'm about to write an email to the list asking about this [21:40:24] ideally the fucking services shouldn't go down for no reason either [21:44:14] That too [21:44:51] oh well, at least it's fast and isn't throwing 500s now [21:50:41] I am off for now [21:50:51] bye hashar [22:38:46] Ryan_Lane, what do I need to do to get a DNS record and an IP address ? And do you accept bribes ? [22:44:56] rmoen: what do you need the IP for? i may be able to help you [22:45:40] mutante, so i can have manager type people browse to the instance [22:46:00] ok, which project is this ? [22:46:07] micro-design [22:46:40] !log micro-design raising IP quota to 1 [22:46:40] micro-design is not a valid project. [22:46:49] mutante, http://www.mediawiki.org/wiki/Micro_Design_Improvements is the project page [22:46:59] it's on editor-engagement [22:47:24] ah [22:47:44] !log editor-engagement raising IP quota to 1 [22:47:46] Logged the message, Master [22:50:00] rmoen: alright, if you go to "Manage Addresses" on labsconsole, you should be able to "Allocate IP" to the project, and after that, Associate IP to an instance [22:50:07] given that you are netadmin in your project [22:50:11] mutante, sweet ;) thanks [22:50:29] yw [22:50:53] mutante, hmm-> Failed to allocate new public IP address. [22:50:53] then you probably want to look at "Manage Security Groups" to allow access [22:54:12] rmoen: ugh, you already had a public address in that project?
[22:54:29] i see like 3 [22:54:30] mutante: That project already had like 4 public IPs [22:54:37] mutante, yeah there are 3 [22:54:38] Or 3, I guess [22:54:45] so then i did not really raise it when setting quota to 1 :p [22:54:57] mutante correct [22:55:04] !log editor-engagement raising IP quota to 4 [22:55:05] Logged the message, Master [22:55:22] Allocated new public IP address: 208.80.153.14 [22:55:42] rmoen: there you go [22:55:57] 08/30/2012 - 22:55:57 - Created a home directory for dzahn in project(s): editor-engagement [22:56:03] mutante, awesome [22:56:36] mutante, ty [22:57:49] rmoen: so now it is in your project but not assigned to an instance yet [22:58:15] !log editor-engagement allocated new public IP address: 208.80.153.14 [22:58:16] Logged the message, Master [22:59:15] ah ok, i see it is on micro-design now [23:00:31] and for the DNS name, if you just want something in wmflabs.org, go to "Add host name" in Manage Addresses [23:00:59] 08/30/2012 - 23:00:59 - User dzahn may have been modified in LDAP or locally, updating key in project(s): editor-engagement
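The "Allocate IP" / "Associate IP" buttons mutante walks rmoen through wrap OpenStack's floating-IP machinery; with CLI access and credentials for the project, the 2012-era novaclient equivalent would be roughly (the instance ID below is a placeholder):

    nova floating-ip-create                            # allocate an address to the project
    nova add-floating-ip i-000003e8 208.80.153.14      # attach it to an instance
    # plus a security group rule so "manager type people" can browse to it:
    nova secgroup-add-rule default tcp 80 80 0.0.0.0/0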