[01:14:14] Ryan_Lane: You're saying that I was right as though it was, somehow, in question. :-P [01:57:18] Coren: thanks for your assistance [01:57:40] short of having a database I've got everything in place for phase 1 [01:57:52] greenrosetta: Yeay! [01:58:04] greenrosetta: You are subscribed to labs-l? [01:58:12] no, is that a mail list? [01:58:24] https://lists.wikimedia.org/mailman/listinfo/labs-l [01:58:29] !labs-l is https://lists.wikimedia.org/mailman/listinfo/labs-l [01:58:29] Key was added [01:58:30] btw, here it is [01:58:31] I think you might save yourself a lot of time and aggravation if you spend a few minutes and read [[WP:TRUTH]], which explains how the sourcing policy works. What you ''know'' to be the truth (and I'm not doubting you) is different than being ''verified'' by a reliable source. That this BLP subject was the "center" of major campaigns is certainly not unlikely, but we need a source with editorial oversight to say this for a couple of reasons. The first is we expect them to "fact check" their article for accuracy. The second is we need the reliable sources so we can use that inclusion in our article. [01:58:31] The fact that Peterson may have appeared in many ads and been photographed by many famous photographers is irrelevant to our sourcing policy. We can't use the ads themselves for verification (to an extent) as they are a primary source. We would need an article from something like [[Advertising Age]] or (yes) [[Vogue]] that not only verifies that Peterson did these things, but explains ''why'' it was interesting. [01:58:36] oops [01:58:44] explaining for someone else [01:58:53] http://tools.wmflabs.org/common-interests/cgi.py?sock [01:58:55] that's the link [01:58:58] Yes; it's the labs mailing list; mostly low volume, and where news about DB replication will be posted first. :-) [01:59:53] I subscribed, but will probably give it my normal email treatment...
ignore [02:00:25] I might publish this app as a shell for other devs to get started [02:01:00] it's pretty simple but python makes it EASY to make a UI [02:12:51] greenrosetta: That said, I have true WSGI support in the pipeline, but not for a little while. [02:14:30] ah, so we don't have to do that cgi stuff? [02:14:47] no reason not to run a webserver... not that intensive [02:15:20] greenrosetta: Well, actual usage isn't the problem but the fact that this usage needs to be reserved and used continuously. [02:16:07] greenrosetta: the CGI method will always work, but there /are/ kinds of tools where it would make sense to work with a continuous server; but those are usually doing /something/ all the time, not just waiting for requests. [02:17:47] If your tool was a server, for instance, it would need to reserve its memory usage; and that memory would be unavailable to anything else even when the server is mostly just sitting there waiting for someone to use it. [02:18:15] With the CGI model, it's a little slower to start up but you use the memory only when it's actually /used/. [02:18:59] memory is cheap [02:19:05] :D [02:20:32] Ah.. you used to be on Arbcom [02:20:44] I thought your name sounded familiar [02:20:46] greenrosetta: No it's not; Tool labs is designed for 100% uptime and reliability: there is no memory overcommit, and no swap. If your tool runs, it'll /keep/ running. The price to pay for this, however, is hard memory allocation and enforced limits. A process that reserves a lot of ram and uses it .1% of the time is the most expensive thing. :-) [02:21:48] Well, there is always caching [02:22:24] Over the years I've realized a lot of time has been wasted in over-designing an architecture [02:23:04] Save the net present value of the resources and spend it on future technology that is cheaper/faster [02:23:30] Interesting.
In some 20 years of system administration, I've found that most of the time has been wasted cleaning up after problems caused by /under/-design. :-) [02:24:15] I remember going from 32k to 48k and what a huge difference it made [02:24:49] Heh. [02:25:25] Nevertheless. I find that 99% of the time a problem is solved by throwing more resources at it, it's because the problem was solved wrong in the first place. :-) [02:26:39] most likely. What is the architecture here? Is it VMs? [02:27:24] python is a very good example. I've never yet seen a platform so wasteful and inefficient. It allocates resources it never uses (and generously at that!), seemingly under the philosophy of "meh, it'll get even cheaper next year". The end result is that you're forced to overcommit your resources. [02:27:47] rapid development trade off [02:28:15] maybe you guys can get Microsoft to donate some free software and you could have .NET apps running [02:28:22] Yeah, it's a compute grid over VMs. That makes resources (comparatively) cheap. [02:28:36] greenrosetta: 100% open-source policy. Besides, mono is available. [02:29:01] greenrosetta: And sucks even /more/ than python in its waste. :-) [02:29:07] greenrosetta, you should talk to Bill Gates. [02:29:07] never used mono [02:29:20] I wouldn't trust it for compatibility [02:31:03] Python should only really be used for basic front end stuff anyways [02:31:24] greenrosetta: Still, the Tool Labs's primary requirement is stability. That means no resource overcommit. If your tool needs 500M of ram and the engine lets it start, then it *will* have 500M of ram available regardless of what happens on the other tools. [02:32:05] that of course requires the tool writer to know the upper limit [02:32:38] greenrosetta: Worst case scenario, test it with a generous upper limit and check actual usage.
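The CGI model being debated above (start a process per request instead of keeping a server resident) can be reduced to a minimal sketch. This is an illustration only, not the actual common-interests tool (which is a Python cgi.py script): a CGI program just writes headers, a blank line, and a body to stdout and then exits, so memory is held only while a request is in flight.

```shell
#!/bin/sh
# Minimal CGI-style responder, sketched as a shell function. The process
# starts when a request arrives and exits when done, so its memory is
# only consumed while the request is actually being handled.
cgi_response() {
    printf 'Content-Type: text/plain\r\n\r\n'   # headers, then blank line
    printf 'Hello from a Tool Labs CGI sketch\n' # body
}

cgi_response
```

Run under any CGI-capable webserver, the process lives only for the duration of the request, which is exactly the memory-accounting trade-off Coren describes.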
[02:34:52] greenrosetta: Better yet would be to write your code to actually manage its use of resources, but I've long given up on coders taking the effort to code for sharing resources. Even when the platform allows it, coders today no longer even try. [02:36:05] By "write your code" I didn't mean you specifically, but the generic "you" obviously. :-) [02:36:48] I suppose "write one's code" would have been better, but it always gets awkward when one's trying to keep one's statements person-neutral. :-) [11:18:31] Anybody know what's up with https://en.wikipedia.beta.wmflabs.org being dead? "Error: invalid magic word 'pagesusingpendingchanges'" [11:18:35] https://bugzilla.wikimedia.org/show_bug.cgi?id=47852 [11:18:45] blocks testing. [12:20:04] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 47870] Get rid of GlusterFS dependency on beta - https://bugzilla.wikimedia.org/show_bug.cgi?id=47870 [12:20:05] [bz] (NEW - created by: Chris McMahon, priority: Immediate - critical) [Bug 47852] beta labs down: "invalid magic word 'pagesusingpendingchanges'" - https://bugzilla.wikimedia.org/show_bug.cgi?id=47852 [12:41:51] !log deployment-prep Refreshed most extensions and running mw-update-l10n [12:41:54] Logged the message, Master [12:46:23] Warning: There is 1 user waiting for shell: Yuav (waiting 0 minutes) [12:59:53] Warning: There is 1 user waiting for shell: Yuav (waiting 13 minutes) [13:00:15] does anyone know if there is a command line tool to list instances of a project? [13:13:28] Warning: There is 1 user waiting for shell: Yuav (waiting 27 minutes) [13:26:58] Warning: There is 1 user waiting for shell: Yuav (waiting 40 minutes) [13:34:20] hashar: Hm. It could be jiggered out of ldaplist: ldaplist |sed -n -e '/^dn: cn=\([^,]*\),ou=projects.*/{;s//\1/;p;}' [13:34:33] Coren: ah that is nice [13:34:50] I was looking at collecting the hostnames of nodes that use a specific class :-D [13:34:51] Oh, wait.
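Coren's ldaplist pipeline above can be exercised offline on fabricated output (the DNs below are invented for illustration); the sed program keeps only the cn= value of entries under ou=projects:

```shell
# Feed made-up ldaplist-style output through the exact sed program from
# the log. The address regex matches project DNs, and s//\1/ reuses that
# regex to replace the whole line with just the captured cn= value.
printf '%s\n' \
    'dn: cn=deployment-prep,ou=projects,dc=wikimedia,dc=org' \
    'dn: cn=tools,ou=projects,dc=wikimedia,dc=org' \
    'dn: uid=someuser,ou=people,dc=wikimedia,dc=org' \
  | sed -n -e '/^dn: cn=\([^,]*\),ou=projects.*/{;s//\1/;p;}'
```

It prints deployment-prep and tools, one per line, and drops the non-project entry.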
That gives you a list of /projects/, you want instances /in/ projects. [13:35:11] * Coren ponders. [13:35:26] Huh. /Are/ instances in LDAP? [13:35:33] yup [13:35:52] let me give you the context [13:36:01] I would like to get rid of glusterFS as a dependency on the beta cluster [13:36:10] so I thought I could set up rsyncd on each of the application servers [13:36:40] then from the main work machine I could update the MediaWiki code and then do an rsync from that box to each of the instances that have the role::applicationserver::common applied [13:36:48] but maybe I should use your NFS server instead :-] [13:37:03] That'd also work. :-) [13:40:29] It's up and running, and not that hard to fire up, but it needs some trickery because you have to override the autofs info from LDAP. Lemme make a puppet class for this? [13:40:33] Warning: There is 1 user waiting for shell: Yuav (waiting 54 minutes) [13:41:30] Coren: by override do you mean replacing the GlusterFS entry in /etc/fstab ? [13:43:19] It's not in fstab, it's autofs config pulled from LDAP. But same idea. :-) [13:49:04] hashar: In the meantime, you may wish to mount the NFS filesystems manually and rsync stuff? [13:49:16] that is a good idea [13:49:26] what is the server & path ? [13:49:38] They are at labnfs.pmtpa.wmnet:/$project/{home,project} [13:50:11] Should already be there and ready. [13:53:53] it exports a /exp/deployment-prep [13:54:04] Warning: There is 1 user waiting for shell: Yuav (waiting 67 minutes) [13:54:17] Ah, you don't have the standard filesystem names? :-) [13:54:31] I did showmount -e labnfs.pmtpa.wmnet|grep deployment [13:54:45] then tried mounting using: mount -t nfs labnfs.pmtpa.wmnet:/exp/deployment-prep/project /srv/project [13:54:48] What's the name of your project? [13:54:53] deployment-prep [13:55:40] Oh, sorry, you need mount options.
o nfsvers=4,port=0,hard,rsize=65536,wsize=65536 [13:55:44] ohhh [13:55:46] damn nfs [13:55:46] :D [13:55:48] -o nfsvers=4,port=0,hard,rsize=65536,wsize=65536 [13:56:04] Wait. [13:56:11] You also need one bit of trickery [13:56:20] echo 1 >/sys/module/nfs/parameters/nfs4_disable_idmapping [13:56:26] Before mounting. [13:56:32] (The puppet class will do so automagically) [13:58:04] Coren: mounted!!! :-D [13:58:16] I will be more than happy to +1 / beta test the puppet class! [13:59:03] !log deployment-prep bastion: created NFS mount point thanks to Coren. echo 1 >/sys/module/nfs/parameters/nfs4_disable_idmapping ; mount -t nfs -o nfsvers=4,port=0,hard,rsize=65535,wsize=65536 labnfs.pmtpa.wmnet:/deployment-prep/project /srv/project [13:59:06] Logged the message, Master [14:02:07] I am just wondering how the NFS array will survive all the I/O generated by labs projects :D [14:04:56] is there a labs issue? I am failing to get wsexport.wmflabs.org [14:05:52] sDrewth: Not as far as I know. What symptom are you getting exactly? [14:07:12] Receiving objects: 1% (4373/426076), 13.25 MiB | 2.24 MiB/s [14:07:14] yummm [14:07:38] Warning: There is 1 user waiting for shell: Yuav (waiting 81 minutes) [14:09:24] Coren: The server at wsexport.wmflabs.org is taking too long to respond. [14:09:35] and that is just poking it [14:10:25] I will just hope that Tpt appears and see if he can poke it from the backend [14:18:18] zeljkof: I think beta is back up [14:18:26] zeljkof: the l10n cache was not updated [14:18:33] hashar: will run the tests [14:21:04] Warning: There is 1 user waiting for shell: Yuav (waiting 94 minutes) [14:24:13] [bz] (NEW - created by: Damian Z, priority: Unprioritized - major) [Bug 45945] Gluster on Wikimedia Labs reporting 'Read-only file system' - https://bugzilla.wikimedia.org/show_bug.cgi?id=45945 [14:25:11] hashar: In re performance; honestly, I don't expect the NFS server can be overburdened before the network is.
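For reference, the working mount procedure from the exchange above, gathered into one place; run as root on the instance, substituting your project name for deployment-prep:

```shell
# 1. Disable NFSv4 idmapping before mounting (the puppet class mentioned
#    in the log does this automatically; here it is done by hand).
echo 1 > /sys/module/nfs/parameters/nfs4_disable_idmapping

# 2. Mount the project share with the options Coren gave.
mount -t nfs -o nfsvers=4,port=0,hard,rsize=65536,wsize=65536 \
    labnfs.pmtpa.wmnet:/deployment-prep/project /srv/project
```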
[14:25:37] Well, that and I/O speed but it's not an issue different from any fileserver really. [14:27:01] Coren: good to know :-] [14:27:07] I am doing some file copies [14:34:39] Warning: There is 1 user waiting for shell: Yuav (waiting 108 minutes) [14:48:09] Warning: There is 1 user waiting for shell: Yuav (waiting 122 minutes) [14:55:07] hashar: https://gerrit.wikimedia.org/r/#/c/61578/ [14:56:33] \O/ [14:57:11] Coren: nfs-noidmap.conf is an upstart service, isn't it? [14:57:21] It is. [14:57:29] isn't there a sysctl.d dir or something? [14:57:46] That one isn't available through sysctl. Annoyingly enough. [14:57:53] ahhaha [14:58:26] Wait, why did Jenkins go 'LOST' on the build verification? That class actually /works/. [14:58:56] Coren: Jenkins has been shut down and all connection signals go void/lost because it was accidentally restarted. [14:59:05] Takes about an hour to restart, nothing we can do meanwhile [14:59:11] Ah. [14:59:16] See backscroll in -operations [14:59:18] Well, I can Verify +2 manually. :-) [14:59:26] That's *something* [14:59:39] Or CR+2 and let jenkins merge it in about 20-30 minutes [15:00:06] While you do something else (do you need it to be merged right now?) [15:00:15] jenkins does not merge on operations/puppet :-D [15:00:46] s/merge/test whatever [15:00:46] Krinkle: Did you get a chance to work on moving Intuition? [15:00:53] Coren: not yet [15:01:13] Coren: I am commenting on the change :D [15:01:40] Warning: There is 1 user waiting for shell: Yuav (waiting 135 minutes) [15:01:43] You may have been overly optimistic with your ETA then? :-) [15:05:57] hashar: FYI, autofs is a bit brittle on restarting with new config. Some of my instances I had to reboot after the switch before autofs unwedged; has to do with how successful it is at unmounting the gluster filesystems. [15:06:29] *In theory*, the following should allow a switch without restart: [15:06:30] Coren: is that NFS server replacing projectstorage.pmtpa.wmnet: ?
killall -USR1 automount [15:06:40] killall -HUP automount [15:07:12] hashar: "Slated to replace" is more accurate. Tools is using it now. [15:07:24] ;-] [15:07:32] I did some comments on https://gerrit.wikimedia.org/r/#/c/61578/1 [15:07:45] I am an autofs noob so can't really comment on the erb templates [15:09:13] Coren will you have some time for me at the hackathon [15:09:29] Coren I would like to move forward with the merge of bots and tools projects [15:09:53] first of all, if we really make bots a staging area, we need to make it identical to tools [15:10:10] this is something I can hardly do alone as I don't understand everything in the tools project [15:11:35] petan: Sure thing; although this depends on my /finally/ getting that toollabs puppet module done. [15:11:52] !log deployment-prep syncing upload data from the Gluster share to labnfs server: rsync -avv /data/project/upload7 /srv/project [15:11:55] Logged the message, Master [15:12:00] that is going to be a lonnnnnngggggggg sync [15:12:08] petan: It'd be even better if we could start this /before/ the hackathon. [15:12:24] well, I don't know what exactly you mean [15:12:42] you mean a puppet class for tool labs? [15:12:45] or what you mean [15:12:59] petan: A collection of classes in a module to configure a "Tool Labs-like" project. [15:12:59] I think there needs to be a bunch of classes per role [15:13:08] ah [15:13:29] petan: https://gerrit.wikimedia.org/r/#/c/59969/ [15:13:43] WIP [15:13:49] That's just the skeleton. [15:15:10] Warning: There is 1 user waiting for shell: Yuav (waiting 149 minutes) [15:28:00] !log deployment-prep Copying l10n cache to the new NFS server: rsync -av /home/wikipedia/common/php-master/cache /srv/project/apache/common/php-master [15:28:02] Logged the message, Master [15:28:45] Warning: There is 1 user waiting for shell: Yuav (waiting 162 minutes) [15:41:01] * Betacommand wants replication working [15:41:55] I'm hungry guys. What should I eat?
[15:44:49] Cyberpower678: I recommend food. [15:45:21] Coren, hmm. Yea. it beats eating my chair. :p [15:56:46] toolserver office hour in 5 minutes with Silke_WMDE_ Coren and me in #wikimedia-office [16:01:12] !log deployment-prep Clearing out years old backup from /data/project such as copy of extensions, databases dumps and some old instances backups. [16:01:12] Logged the message, Master [16:01:41] /backup/live/old/extensions/FCKeditor/fckeditor/_samples/lasso/.svn/entrie [16:01:46] from the time we had svn hehe [16:14:28] aharhghggggghg [16:14:42] the beta Apache confs are some hacks [16:14:45] :( [16:21:34] why are instances that are advertised as having a 10G root partition actually created with 3.8G? [16:22:35] Coren: http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help <— mediawiki.org? :D [16:22:49] MaxSem: ugh. are they? [16:23:13] Ryan_Lane, df -h [16:23:13] Filesystem Size Used Avail Use% Mounted on [16:23:13] /dev/vda1 3.8G 3.8G 0 100% / [16:23:14] I wonder if the new image has issues with resizes [16:23:21] is this a new or old image? [16:23:30] created recently [16:23:37] so new I guess [16:24:38] Bastion1 is complaining it needs a restart :P [16:24:55] Matthew_, they always complain [16:25:07] MaxSem: IK, servers are never happy [16:27:10] OK, joking aside... who do I poke to get added to the tools project? [16:36:05] Ryan_Lane: you think that page should be on the wikitech wiki? :P [16:36:17] yes :) [16:37:15] I did think it was a bit of an odd place for it when I started editing it :P [16:37:26] Matthew_, try petan or Coren [16:37:47] Platonides: OK, I will. Thanks :) [16:38:32] import from mw to wikite? [16:45:59] !log deployment Mounted new NFS server on /srv/project on instances: apache32, apache33, video05 and jobrunner08 [16:45:59] deployment is not a valid project.
[16:46:10] !log deployment-prep Mounted new NFS server on /srv/project on instances: apache32, apache33, video05 and jobrunner08 [16:46:13] Logged the message, Master [16:50:15] * addshore pokes Ryan_Lane to import it as I have neither Administrator nor contentadmin flag :) [16:57:03] addshore: import? [16:57:11] oh [16:57:13] the docs? [16:57:52] yee :) [16:59:42] I'll let Coren handle that, since he wrote the doc :) [16:59:49] hehe okay :) [16:59:58] addshore, Wikipedia:Bots/Requests for approval/Cyberbot II 2 [17:01:14] bing [17:02:49] addshore, wow thanks. I thought you'd have questions. :p [17:03:08] not really :P it seems simple enough :P what were you expecting I would have asked? [17:03:32] addshore, the fact that it's not really a simple script. :p [17:04:57] but the task itself is simple ;p [17:06:58] * Matthew_ pokes Coren [17:07:11] * Coren pokes back! Doink! [17:07:24] o.O [17:07:43] Can you add me to the Tools project please? [17:08:10] Your office hours convinced me that my tools can actually work on labs O.o, something I didn't think could happen [17:08:57] Matthew_: Sure. What's your Wikitech account name? [17:09:03] matthewrbowker [17:09:46] Added. There is a fair guide at [17:09:49] !toolsdoc [17:10:00] Hm. Bot sick? [17:10:19] https://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [17:10:25] Matthew_: ^^ [17:10:38] Gives a good overview. I'm always available if you have questions. [17:11:02] Coren: Thank you very much! [17:11:38] I'm going to start looking at migrating :) [17:16:31] I wanted to say nice things about hashar and he's not around :) [17:17:10] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Help was modified, changed by Addshore link https://www.mediawiki.org/w/index.php?diff=681807 edit summary: [-20165] redirect [17:17:49] all moved around :) [17:21:54] Ryan_Lane: I updated the labsnfs class to use the upstart_job define. Want to see if I did it Right? [18:21:20] Coren bot is sick?
o.O [18:21:23] !toolsdocs [18:21:24] http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [18:21:29] eh [18:21:36] Coren try again pls :P [18:21:46] I have no idea why it didn't just respond to you [18:21:55] !toolsdocs [18:21:55] http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [18:22:08] weird... [18:27:35] !log deployment-prep rsync to the NFS server are completed. There are most probably still some tiny files that need to be copied though [18:27:37] Logged the message, Master [18:37:31] uh? [18:38:01] Platonides? [19:06:06] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 47892] internal error for new user creating account and logging in - https://bugzilla.wikimedia.org/show_bug.cgi?id=47892 [19:10:33] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Help was modified, changed by Legoktm link https://www.mediawiki.org/w/index.php?diff=681901 edit summary: [+5] silly addshore [19:13:05] hashar: The puppet class is ready to push as soon as Ryan_Lane checks that I did his suggested change right. :-) [19:13:33] which one? [19:13:59] https://gerrit.wikimedia.org/r/#/c/61578/ nfslabs. The upstart_job class was quite adequate. [19:14:03] ah [19:14:04] right [19:14:06] it's fine [19:14:39] +2'd [19:14:45] Yeays! [19:14:47] * Coren merges. [19:15:04] Coren: I still have to migrate the home dir [19:15:25] hashar: It's not going to switch you over by surprise, you still need to include the role once ready. :-) [19:15:33] oh [19:15:36] that is convenient :-] [19:16:35] "File system being substituted from one storage cluster to another without a positive action" is not a feature. :-) [19:16:52] after => rsync :-D [19:17:02] Heh. [19:17:15] The class is role::labsnfs::client [19:17:29] Ryan_Lane: ty [19:17:59] hmm [19:18:26] maybe copying the existing /home is not a good idea :D [19:21:00] !log deployment-prep Migrating homes to the new NFS server [19:21:02] Logged the message, Master [19:27:00] hashar: How so. Hugeness?
[19:27:22] /home/wikipedia is giant [19:28:55] Ryan_Lane: Is the autofs config in LDAP per-project or global? [19:31:16] global [19:31:57] I don't want to have hundreds of ldap entries for that [19:32:44] Understandably; makes the transition away from gluster an uphill battle though. I was hoping we could consider moving new projects to the NFS server after we have a few manual converts. [19:33:08] /home/wikipedia just has symlinks nowadays [19:33:20] so that was only 157MB [19:33:25] Coren: I think we should just take an outage for most projects [19:33:48] we should send out an email asking which projects would prefer to schedule a time to handle it [19:34:04] do those, then have the rest take the outage while we transfer files [19:34:06] I mean, for projects that are being created; it's a little silly to create new projects on gluster and then require a move. :-) [19:34:14] meh [19:34:26] Either way works in the end. [19:34:34] the amount of storage they'll use will be small [19:34:43] so transferring them will be quick [19:35:46] I also need to find a nice way of allowing projects to see what snapshots they have available. The automount works already, provided you can guess the timestamps; but autofs won't ghost. :-( [19:38:58] Coren: so in theory, if I apply role::labsnfs::client , run puppet and reboot, I get rid of Gluster on the instance? [19:39:19] hashar: That's what I tested on a puppetmaster::self instance with Great Success! [19:39:40] * hashar tries out [19:41:31] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 47893] fully support Wikilove on beta (2 issues) - https://bugzilla.wikimedia.org/show_bug.cgi?id=47893 [19:43:57] !log deployment-prep Upgraded puppet manifests on deployment-integration and running puppet. 
[19:44:00] Logged the message, Master [19:45:29] !log deployment-prep applying the very recent `role::labsnfs::client` class on deployment-integration [19:45:32] Logged the message, Master [19:48:43] babhbhb Gluster takes over [19:49:30] o_O? [19:49:39] so [19:49:40] Can I leap on the instance to see what went wrong? [19:49:43] sure [19:49:48] @deployment-integration [19:50:47] hmm I am not sure where the project dir is supposed to be mounted [19:50:49] o_O my key doesn't let me in as root on it. [19:50:51] I assumed /data/project [19:50:57] Yes, that's it. [19:51:08] maybe you are not a sysadmin :D [19:51:14] I was, last I checked. :-) [19:51:57] hashar: Add me to project and project admins? [19:52:06] yp [19:52:08] Coren ? [19:52:14] I mean is your labs account 'Corren' [19:52:19] grgmb [19:52:20] Coren sudo policies are different there [19:52:23] Just the one R [19:52:28] Coren you need to add yourself to group admins [19:52:30] in sudo [19:52:38] not just sysadmin [19:52:50] Coren: done [19:52:58] petan: Not what I tried; I tried to use my root key to get on directly. [19:53:05] ahh [19:53:18] Coren I want root key too [19:53:20] :D [19:53:40] I am not sure why beta has a specific sudo policy [19:53:45] thanks for the tip petan [19:53:56] because it is not a lame project as others :P [19:53:58] hashar: Did you reboot? [19:54:03] Coren: nop :) [19:58:41] hashar: Ah. If you want to try to switch over without reboot, here is what you need to do: [19:58:41] I was lamely expecting the class to handle everything for me :-D [19:58:41] No, because that's a really disruptive move if you don't reboot. [19:58:41] okkk [19:58:41] let me reboot :-] [19:58:41] sudo killall -USR1 automount [19:58:42] ... or that. :-) [19:58:42] holy shit [19:58:42] that works [19:58:42] hashar: Autofs isn't very good at understanding it wants to unmount something and mount something /else/ on the same mountpoint. [19:58:42] and its fast [19:58:42] ... that sorta was the point. 
:-) [19:58:42] so I guess I want to migrate all the instances like that [19:58:43] and then will it be possible to get that feature automatically for newly created instances? [19:58:43] hashar: So long as you include the class. [19:58:43] hashar: Fun command of the day: showmount -e labnfs.pmtpa.wmnet|grep "$(dig +short $(hostname))"|grep .snapshot|cut -f 1 -d ' '|cut -f 5 -d /|sort -nu [19:58:43] 20130415.1400 [19:58:43] Shows you the available timetravel snapshots. :-) [19:58:43] oh really? [19:58:46] that could be handy [19:58:53] ls /home/.snapshot/$timestamp [19:59:11] or /data/project/.snapshot/$timestamp [19:59:18] how do you take the snapshots ? [19:59:40] They're automatic; every hour, last 3 kept, first of last 3 days, and first of last 2 weeks. [19:59:48] Coren I had somewhere my 1 line command which using awk grep sort gzip and so recreated the english wikipedia statistics page, that one was like 2000 characters long :D [19:59:53] na I mean what is the magic command ? :-] [19:59:57] does it simply cp -r ? [20:00:18] hashar: Ah, no. LVM snapshots on a thin volume with XFS write sync [20:00:35] hashar: So they are consistent point-in-time. [20:01:01] ah so LVM is able to somehow stop any operations while it does the copy? [20:01:08] and that is still a full disk cop [20:01:09] y [20:01:11] it's COW [20:01:24] hashar: COW; it basically just updates a pointer. :-) [20:01:25] like PIG ? [20:01:30] copy on write [20:01:46] !log deployment-prep applying role::labsnfs::client on -bastion [20:01:48] Logged the message, Master [20:02:09] hashar: LVM just signals the filesystem to checkpoint, updates the snapshot indices, and releases the filesystem. Takes ~1ms [20:03:09] I guess that is where I draw the line between me and ops :-] [20:03:14] It's just a few hours till the Wikimania talk deadline [20:03:19] I will have to read some doc about it [20:03:56] sumanah: My two are in, and so is the workshop with you.
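The snapshot mechanism Coren describes (copy-on-write LVM snapshots of a thin volume, with the filesystem checkpointed for the ~1ms it takes) can be sketched roughly. All names here are assumptions, this needs root and an LVM thin pool, and lvcreate normally quiesces a mounted filesystem on its own, so treat this as an illustration of the idea rather than the actual Labs script:

```shell
# Hypothetical names: volume group "labstore", thin LV "project" mounted
# on /srv/project. A snapshot of a thin volume is copy-on-write: it just
# records new metadata, and data blocks stay shared until one side writes.
xfs_freeze -f /srv/project        # checkpoint: flush and pause writes
lvcreate -s -n "project-$(date +%Y%m%d.%H%M)" labstore/project
xfs_freeze -u /srv/project        # release writes; the pause is ~ms
```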
:-) [20:04:18] no wikimania for me [20:06:41] !log deployment-prep root@deployment-bastion:~# /etc/init.d/udp2log stop && /etc/init.d/udp2log-mw start [20:06:44] Logged the message, Master [20:06:54] I need to fix that issue one day [20:07:19] oh man [20:07:21] life changer [20:07:37] Coren: so I can now tail -f the MediaWiki logs in real time! [20:08:41] !log deployment-prep migrating apache32 to new NFS server [20:08:42] Logged the message, Master [20:17:27] Failed to read file '/var/lib/l10nupdate/cache-master/l10nupdate-ruq.cache' [20:17:28] bah [20:17:31] the path keep changing [20:22:17] hashar: [20:22:19] marc@deployment-integration:~$ ls /data/project/.snapshot/20130430.1817/ [20:22:19] apache upload7 [20:22:19] marc@deployment-integration:~$ ls /data/project/.snapshot/20130430.1917/ [20:22:19] apache apache2 logs upload7 [20:22:32] yeah that is really nice [20:27:14] I am migrating one of the apache box [20:35:31] uid=113(mwdeploy) gid=120(mwdeploy) groups=120(mwdeploy) [20:35:32] uid=114(mwdeploy) gid=121(mwdeploy) groups=121(mwdeploy) [20:35:34] I hate our setup [20:49:46] !log deployment-prep Recreated wikiversions.cdb on bastion for the new NFS home dir [20:49:49] Logged the message, Master [20:50:07] Coren: I ended up missing office hours today, any chance you can explain how the "federated tables" are supposed to work? being able to do joins from 'pedia to wikidata is crucial for me [20:51:33] legoktm: The short story, it's basically a symlink to another DB that works over the network. It's fairly efficient for joins that match a single row at a time on the federated table from an indexed column. [20:51:49] I expect most cross-db joins match that scenario. [20:52:35] so tl;dr my queries should still work? 
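The FEDERATED-table mechanism Coren describes can be sketched. This is a hypothetical illustration, not the actual Labs setup: every host, credential, and table name below is invented, and the FEDERATED engine must be enabled on the local MySQL server.

```shell
# All names are hypothetical. A FEDERATED table is a local table definition
# whose rows actually live on a remote server; each matched row is fetched
# over the network.
mysql <<'SQL'
-- Local stand-in for a table that really lives on the wikidata server.
CREATE TABLE wikidatawiki_page (
    page_id    INT UNSIGNED   NOT NULL,
    page_title VARBINARY(255) NOT NULL,
    PRIMARY KEY (page_id),
    KEY (page_title)
) ENGINE=FEDERATED
  CONNECTION='mysql://user:pass@wikidata.db.example:3306/wikidatawiki/page';

-- Cross-database joins then work as usual. Matching one row at a time
-- through the indexed page_title column stays cheap; many-to-many joins
-- get expensive because every candidate row crosses the network.
SELECT p.page_title
FROM enwiki.page AS p
JOIN wikidatawiki_page AS w ON w.page_title = p.page_title
LIMIT 10;
SQL
```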
[20:52:52] hashar: that's normal in systems without a central uid/gid store [20:53:01] http://dev.mysql.com/doc/refman/5.0/en/federated-use.html [20:53:10] unless you specifically assign a uid [20:53:25] legoktm: They will, but they might need to be tweaked to be efficient. [20:53:34] ok [20:55:18] legoktm: It will suck greatly for many-to-many joins though. [20:55:25] hm [20:55:36] does this mean I'll also be able to do joins with enwiki and dewiki, even though they're on different servers? [20:56:12] legoktm: Provided you have a federated table to one on the db of the other, yes. I only plan to have default links to commons and wikidata, but more can be added at need. [20:56:50] ok makes sense [20:57:07] Coren: so the Apache server on beta went from 560ms to 260ms with the new NFS server :-] [20:57:23] Coren: can you add me to the tools project, please? (username is gifti) [20:57:36] giftpflanze: {{done}} [20:57:51] hashar: Yeay! [20:57:53] oh, thanks :) [20:58:43] !log deployment-prep apache-32 running with NFS went from 560ms to 260ms when serving pages \O/ [20:58:46] Logged the message, Master [20:58:55] !log deployment-prep Migrating apache-33 to use the new NFS server [20:58:58] Logged the message, Master [21:02:25] Coren: Thanks for your help with the library forwarder. [21:02:58] JohnMarkOckerblo: My pleasure. With a bit of luck, we'll find some clean way to give you the needed functionality. [21:03:11] Are there particular kinds of tests people here want to do with it? It seems to work like the Penn one does, minus the functionality I asked about. [21:03:37] Coren: the help page says I can create my tool user myself. trying it results in a need for the role projectadmin [21:03:57] (I'll also be bringing in updates periodically, like new data files or updated functionality. I just added "jump to your country/state" links on the Penn side, for instance, since the libraries list was getting long. [21:04:04] giftpflanze: ... it does?
If you click "add service group"? [21:04:33] yes [21:04:34] !log deployment-prep both apaches are now serving content from the NFS cluster. [21:04:37] Logged the message, Master [21:05:15] giftpflanze: Huh. I'll have to ask andrewbogott to look into that. In the meantime, I'd gladly create it for you. What do you want its name to be? [21:05:31] giftbot [21:06:10] Coren, do you want all project members to be able to create service groups? [21:06:16] giftpflanze: Done. You'll have to log off and back on if you are connected before you are put in the group. [21:06:26] ok [21:06:34] andrewbogott: That was the intent, originally. Unless you saw a problem with it? [21:07:02] Should be OK as long as someone doesn't run amok and create 1000 groups [21:07:49] andrewbogott: Hm. Well, we could implement a max number of service groups that needs permission to bust. But it's probably more effort than needed. [21:08:23] andrewbogott: Let's discuss it with Ryan first [21:08:57] That said, creating a service group isn't especially onerous for the admins either. We could live with it. [21:09:05] if someone starts creating a ton, we block them [21:09:07] and delete them [21:09:33] That was my thought. It's a human problem, not a technical one. [21:09:45] Yeah, I think it's fine. I'll change the permission check. [21:10:08] Coren: what was that magical file that added a description for the tool? [21:10:22] giftpflanze: .description in the tool account's home. [21:10:28] ah, thx [21:13:44] somewhere there is a difference between bots and tools. in tools nano complains: Error opening terminal: konsole. [21:14:09] Hm. [21:15:20] giftpflanze: Konsole is a term definition very specific to KDE and won't be there by default unless KDE is installed.
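The nano error above comes from $TERM naming a terminfo entry the host doesn't have. The workaround, as Coren suggests, is to fall back to a universally available entry:

```shell
# "konsole" is a KDE-specific terminfo entry; hosts without KDE lack it,
# so curses programs like nano fail with "Error opening terminal".
# xterm is available essentially everywhere. Add this line to ~/.profile
# to make it stick across logins.
export TERM=xterm
```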
[21:15:23] [bz] (ASSIGNED - created by: Antoine "hashar" Musso, priority: High - normal) [Bug 47870] Get rid of GlusterFS dependency on beta - https://bugzilla.wikimedia.org/show_bug.cgi?id=47870 [21:15:27] giftpflanze: export TERM=xterm [21:15:34] will do [21:15:43] you can put that in your .profile if you want to simplify things. [21:15:54] * Coren is a little surprised KDE would have been installed on bots. [21:16:11] and, coren, you once said, that you could turn tcl8.6/tip-386-impl into a package and install it on tools? [21:16:16] [bz] (NEW - created by: Ryan Lane, priority: Low - normal) [Bug 46823] Replace glusterfs - https://bugzilla.wikimedia.org/show_bug.cgi?id=46823 [21:17:31] giftpflanze: It can be done, but having to support multiple versions of a platform is always problematic. No way you can make your tool work with 8.4? [21:17:51] are you kidding me? 8.4? [21:17:53] ;) [21:18:13] 8.4 is the default supported version on Precise. [21:18:34] atm there's no tcl at afais [21:18:37] +all [21:18:54] giftpflanze: Not yet, but installing it will take me ~2min if 8.4 does the trick. [21:18:55] KDE? :D [21:19:22] giftpflanze: If you really need 8.6, open a bugzilla and I'll backport or package it anew and deploy it. [21:19:57] who don't like KDE? XD [21:20:12] ok [21:20:13] * Coren uses KDE on his desktop, always. [21:20:21] Ryan_Lane, can you explain why the sidebar here doesn't contain the contents of MediaWiki:sidebar? http://openstack-role-dev2.pmtpa.wmflabs/wiki/MediaWiki:Sidebar [21:20:27] :P [21:20:51] Coren I installed lxde and when my pc boot up, it uses some 60mb of ram in total, full desktop [21:20:53] Ryan_Lane: Oh, wait, that's a dumb example, that one is working as designed. Hang on... [21:21:09] compared to win 7 == 2gb [21:21:12] if it's different when going from page to page, it's likely sidebar cache [21:21:19] that is some 1940mb more effective :D [21:21:38] heh. 
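Coren's `TERM` workaround, written up as the `~/.profile` snippet he suggests. Guarding it with a `case` keeps it from clobbering terminal types that do have terminfo entries on the host; this is a sketch, not official Tools documentation:

```shell
# ~/.profile: nano failed with "Error opening terminal: konsole" because
# the Tools hosts have no terminfo entry named "konsole". Fall back to
# the ubiquitous xterm entry, but only for that terminal type.
case "$TERM" in
    konsole*) TERM=xterm; export TERM ;;
esac
```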
we should have a labs logo generator for projects :D [21:21:49] gnome 3 is written partially in java and it eats an incredible amount of resources [21:21:52] More efficient than Windows really isn't a high bar to reach. :-) [21:21:58] heh [21:22:05] gnome 3 was eating over 1 g too [21:22:11] oh my, java [21:23:27] petan: My desktop currently uses a bit over 2g, but I have a chromium with ~20 tabs, thunderbird, kvirc, four konsoles, my music player, and LibreOffice Writer open atm. [21:23:34] Ryan_Lane: OK… /now/ it is exhibiting the weird behavior: http://openstack-role-dev2.pmtpa.wmflabs/wiki/MediaWiki:Sidebar [21:23:42] Coren my laptop has only 2g of ram :D [21:23:49] that forces me to write memory effective stuff [21:24:16] andrewbogott: hm. that's weird [21:24:19] I had to tweak my irc client, so that it uses now 20% of ram that it used before [21:24:22] andrewbogott: how is this page being added? [21:24:27] importDump [21:24:35] It's a dump from nova-precise2 [21:24:41] hence the page looks like you wrote it [21:24:52] I'm betting some maintenance script needs to be run [21:25:02] yeah, ok. I will google. [21:25:18] importDump suggests that I run rebuildrecentchanges but that makes no difference. [21:27:00] hm [21:27:07] I don't see anything that would require it [21:27:11] I'm betting this is a memcache issue [21:27:27] have you tried restarting memcache on the system to see if it fixes it? [21:27:54] d'oh, that was it! [21:28:15] it's annoying that importDump doesn't cause the sidebar cache to be purged when the sidebar is updated [21:28:23] that sounds like a bug to me [21:29:07] Yeah… mostly painless though. [21:29:23] Oh, some of these sidebar links have an absolute URL encoded in them -- /that's/ not going to work [21:30:36] [bz] (NEW - created by: m.p.roppelt, priority: Unprioritized - normal) [Bug 47900] install tcl 8.6 - https://bugzilla.wikimedia.org/show_bug.cgi?id=47900 [21:30:56] What's really annoying is importDump crashes in the MW tip.
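The fix that actually worked above amounts to flushing memcached after the import. A hedged sketch of the whole sequence (paths and the service name are assumptions about a typical MediaWiki install; note that rebuildrecentchanges alone was not enough in this case):

```shell
# Illustrative recovery sequence after importDump leaves a stale sidebar:
php maintenance/importDump.php < dump.xml        # import the pages
php maintenance/rebuildrecentchanges.php         # suggested by importDump,
                                                 # but did not fix the sidebar
sudo service memcached restart                   # this is what actually
                                                 # purged the cached sidebar
```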
:D [21:33:47] Coren: should the term thing be documented somewhere? [21:35:47] !log deployment-prep Fixed the git path in mediawiki/extensions.git local copy of -bastion [21:35:49] Logged the message, Master [21:37:49] giftpflanze: It might be worth mentioning it somewhere on the help page, but you're the first who has had that particular issue to date; by default konsole exports 'xterm' as its terminal type, and not 'konsole' [21:38:03] hm [21:38:24] giftpflanze: Wait, are you on linux at all? [21:38:30] ok, i have it in my settings to export it … [21:38:37] Coren: yes [21:38:52] … though i don't know why … [21:38:53] giftpflanze: I've seen people putting really odd things in terminal strings in putty before is why I asked. :-) [21:39:02] hehe [21:44:38] I am off. thanks again Mark [21:45:01] hashar: That's what they pay me the big^H^H^Hreasonably-sized bucks for. :-) [21:45:49] Coren: i also need tcllib and tclcurl [21:45:58] Coren: :-] [21:46:11] giftpflanze: 8.6 also, I expect? [21:46:20] doesn't really matter [21:46:33] Oh. If 8.4 will do I can install them now! [21:47:03] if you will install 8.6 as well, if not better 8.5 [21:47:59] I can do 8.5 or 8.4 now, or 8.6 in a day or two. :-) [21:48:22] then 8.5 and 8.6 when you have it [21:49:18] Coren: you probably want to get the ssh key stuff going on the nfs server, too ;) [21:49:25] so that you aren't relying on gluster at all [21:50:30] we probably want to do the same for the public datasets [21:50:30] Ryan_Lane: Probably, yes, but I expect that'll require some coordination between us because that'd need wikitech putting 'em in both places, no? [21:50:40] wikitech doesn't do that [21:50:45] it's just a cron [21:50:47] Ah! [21:50:55] it can run as many places as we'd like [21:51:11] if we get all of the public datasets copied over, we can switch the cron on dataset2 as well [21:51:23] then switch both of those in ldap to point to labsnfs [21:51:29] Ryan_Lane: Then yes, clearly we want to do that.
:-) [21:52:46] we can have the cron on both systems and it can check if it needs to run by seeing if the filesystem is mounted [21:52:52] or we can make it a daemon [21:54:27] * Coren needs to go eat soon. [21:56:32] giftpflanze: tcl 8.5 with tclcurl and tcllib are now available. [21:56:42] thank you [22:33:22] i found out why my crontab wasn't working: date +%-H has a % sign … [22:58:08] Warning: There is 1 user waiting for shell: Hkam (waiting 0 minutes) [23:11:39] Warning: There is 1 user waiting for shell: Hkam (waiting 13 minutes) [23:25:12] Warning: There is 1 user waiting for shell: Hkam (waiting 27 minutes) [23:38:42] Warning: There is 1 user waiting for shell: Hkam (waiting 40 minutes) [23:52:12] Warning: There is 1 user waiting for shell: Hkam (waiting 54 minutes)
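The "run the cron on both systems and check if the filesystem is mounted" idea could look like the sketch below. The mount point and the copy command are hypothetical, not the actual Labs setup:

```shell
#!/bin/sh
# Run from cron on both hosts; only the host that actually has the
# filesystem mounted does any work, so the job is safe to duplicate.
MOUNTPOINT=/public/datasets            # hypothetical mount point
if mountpoint -q "$MOUNTPOINT"; then
    # hypothetical copy step, e.g. refreshing the public datasets
    rsync -a /srv/datasets/ "$MOUNTPOINT"/
fi
```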
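giftpflanze's crontab discovery is a classic pitfall: in a crontab entry, an unescaped `%` terminates the command, and everything after the first `%` is handed to the command on stdin. A short illustration (the log path in the crontab comment is made up):

```shell
# In an interactive shell this is fine: prints the current hour without
# a leading zero ("-" padding is a GNU date extension).
date +%-H

# But in a crontab the % must be backslash-escaped, e.g.:
#   0 * * * * /usr/bin/mytool >> /tmp/mytool.$(date +\%-H).log
# Without the backslash, cron cuts the command at the first % and passes
# the remainder on stdin, so date never sees its format string.
```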