[00:00:11] :)
[00:14:59] Ryan_Lane: I've been getting empty notification emails from Wikitech for a while now
[00:15:00] what gives?
[00:34:58] YuviPanda: something changed in echo
[00:35:09] and OpenStackManager needs to be updated for the changes
[00:53:01] !ping
[00:53:01] !pong
[00:53:03] ok
[01:11:32] hi, I would like to query "select count(*) from user where user_registration > YESTERDAY". But I found user_registration is NULL for all users. It is not replicated. How can I access the registration date?
[02:31:39] [bz] (NEW - created by: Mark A. Hershberger, priority: Low - enhancement) [Bug 54427] Create ping.wmflabs.org - https://bugzilla.wikimedia.org/show_bug.cgi?id=54427
[02:59:03] Ryan_Lane: ping?
[02:59:12] ?
[02:59:16] Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/85814/ (just a fyi)
[02:59:21] should be ready to merge when you wake up
[03:00:58] ok, cool
[03:13:33] aaehrhr
[03:13:40] I got browser tests running under jenkins :) https://integration.wikimedia.org/ci/job/qa-browsertests-run/13/console
[03:13:43] and failing
[03:15:32] hm, memcached is now enabled, but https://pinklake.wmflabs.org is still slow. (and i've hit all the pages now, so they *should* be cached)
[03:15:56] cscott: do you have php5-apc as well ?
[03:16:19] cscott: and I think there is a puppet role to setup mediawiki for you with finely tuned parameters
[03:16:22] cscott@towtruck:~$ dpkg -l php-apc
[03:16:22] ii php-apc 3.1.7-1 APC (Alternative PHP Cache) module for PHP 5
[03:16:38] i used the puppet role, and i'm not convinced that the parameters are "finely tuned"
[03:16:42] caching is turned off, for instance
[03:16:58] oh
[03:17:06] restarted apache? :D
[03:17:08] $wgMainCacheType = CACHE_NONE;
[03:17:08] $wgMemCachedServers = array();
[03:17:30] mediawiki_singlenode hasn't been maintained for a while, IIRC
[03:17:32] YuviPanda: i haven't changed anything in apache. what would restarting apache do?
[03:17:47] i was thinking 'installing php5-memcached' would need a restart.
[03:23:38] YuviPanda: hm, restarting apache does seem to have helped. (after i then reloaded all the pages to make sure they were in cache)
[03:23:50] 'turn it off and on' always helps :P
[03:24:06] the secret is knowing exactly what you need to power cycle
[03:24:30] cscott: heh, true :)
[03:50:14] sooo nice https://integration.wikimedia.org/ci/job/qa-browsertests-run/15/artifact/report.html
[03:50:40] nice, hashar
[03:50:55] I am just integrating work by people smarter than me :D
[03:50:58] but yeah, that is nice
[03:51:09] :D
[03:51:41] damn 6am
[03:51:45] hashar: https://gerrit.wikimedia.org/r/#/c/85814/ is a WIP patch that makes it easy to run vagrant type stuff on labs
[03:52:22] while (<>) { if $_ ~= /vagrant/i continue else proceed }
[03:52:29] sorry I haven't looked at vagrant yet
[03:52:54] andrewbogott told me vagrant has a backend for the openstack API
[03:53:05] so potentially we could use vagrant def to boot instances in labs *evil*
[03:55:03] YuviPanda|code: you should put the vagrant stuff under /mnt/vagrant :-D
[03:55:10] that is a different partition in labs
[03:55:23] hashar: it is put on /vagrant
[03:55:30] because the vagrant code has that path hard coded some places :P
[03:55:40] hashar: the OpenStack API thing is a long way off tho :P
[03:56:18] ah
[03:56:28] so you can mount /dev/vdb on /vagrant
[03:56:39] no :P
[03:56:53] hashar: /dev/vda1
[03:56:58] just default root partition
[03:57:01] that is a small partition
[03:57:04] it will get filled up :-D
[03:57:13] whereas /dev/vdb is where all the disk space is
[03:57:15] hashar: no data is stored there
[03:57:28] hashar: i doubt we'll take up ~6G with just config + code
[03:57:36] ah
[03:57:50] I thought there was the image there as well :D
[03:57:56] no images :P
[03:58:03] that's the idea - it runs directly on the labs instance
[03:58:07] no VM on VM :P
[03:58:33] ohhhhhhhhhh
[03:58:36] that is evil :-D
[03:58:55] hashar: why?
[03:59:05] this is just using the puppet code from vagrant
[03:59:11] nothing vagrant specific :P
[04:00:02] that also means anyone with access to the vagrant code has root access on any instance having that class applied :D
[04:00:16] not a big deal though
[04:01:58] hashar: indeed.
[04:08:15] hashar: so jenkins can run browsertests on phantomJS on various jobs? That's fantastic
[04:11:32] spagewmf: for now that is the basic browser tests in qa/browsertests
[04:11:43] will try to get them to run against ULS by the end of the week
[04:13:43] spagewmf: if you are interested, most of the test/qa/ci activity is now on the qa-l mailing list
[04:34:04] [bz] (RESOLVED - created by: Tim Landscheidt, priority: High - major) [Bug 52560] icinga.wmflabs.org is down: "Error: Could not read host and service status information!" - https://bugzilla.wikimedia.org/show_bug.cgi?id=52560
[07:32:15] tool lab is having issues logging in again?
[07:32:46] it does?
[07:32:47] * YuviPanda checks
[07:34:20] liangent: looks like it
[07:34:31] * YuviPanda pings Coren, petan
[07:35:35] ?
[07:36:02] that means nfs is fucked which I don't have access to
[07:36:03] petan: login issues
[07:36:05] heh
[07:36:23] I can't login either
[07:36:52] hashar: is NFS alright for betalabs?
[07:37:17] YuviPanda: tools and beta have been migrated to NFS
[07:37:20] my proxy instances are also fucked.
[07:37:27] hashar: indeed, and NFS is fucked right now :D
[07:37:31] ohh
[07:37:57] can confirm across 3-4 of my projects that are on NFS
[07:38:24] coren was only 20 minutes ago, anyone seen him ?
[07:39:03] hashar: I did ping him here..
[07:42:13] * aude panics
[07:42:19] can't login either
[07:42:44] worked a while ago
[07:42:51] like less than an hour ago
[07:43:02] yeah
[07:43:39] http://ganglia.wmflabs.org/latest/
[07:49:12] Ryan_Lane: there?
[07:49:28] he probably got paged or something
[07:49:37] hope so!
[07:50:13] oh well, no more work on my tools this morning :(
[07:51:31] aww, aude
[07:51:35] it should be back soon, I hope
[07:51:50] can try again in the evening
[07:52:07] zeljkof: yes :-)
[07:52:17] zeljkof: beta is down because NFS is cursed right now
[07:52:25] hashar: thanks
[07:52:46] paravoid: hey. Can you save up the NFS server in labs or should we page Coren?
[08:01:05] aude: hashar springle is looking at it, in -ops
[08:01:09] liangent: ^
[08:27:59] still seems stuck but good it's being looked at
[08:39:45] tools is down, web and ssh both. Any estimate when it's back up again?
[08:52:38] * addshore waves
[08:54:41] * addshore goes to hunt someone
[08:55:19] addshore: i think some ops know
[08:55:33] mhhm, they knew last time too :P
[08:55:51] YuviPanda|course poked springle, hashar and paravoid
[08:55:57] i don't know if they are able to fix
[08:56:06] or summon someone who can fix :)
[08:56:25] 08:30 < springle> !log labstore4 disk issues, /a failed mount in console, ssh key bouncing. awaiting input from ryan or coren
[08:57:57] heh, think springle means labstore3 xD
[09:04:24] I can't fix
[09:04:32] sorry :-(
[09:10:54] Coren: Wake up!
[09:40:38] google is down
[09:41:18] dun dun dun
[09:41:27] you broke it!
[09:41:30] I knew it
[09:41:32] :D
[09:41:33] O;
[09:42:46] :<
[09:44:36] petan: google works for me
[09:44:55] addshore: do you know what's wrong with NFS?
[09:45:16] !log integration created integration-pbuilder , a 4GB RAM instance to replace integration-jobbuilder which dies with out of memory issues with some big packages.
[09:45:18] same thing as last time :) it is being worked on :)
[09:46:52] addshore: someone just rebooted? it
[09:48:32] :(
[09:48:57] addshore: Why is https://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Labs%20NFS%20cluster%20pmtpa&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2&st=1380016076&g=cpu_report&z=large so high?
[09:49:28] because nfs hates us? :)
[09:50:48] because petan hates nts?
[09:50:55] *nfs
[10:00:44] !log integration installing misc::package-builder on integration-pbuilder
[10:04:20] Right then, nfs is back and tools labs will slowly start catching up again
[10:04:47] petan: zhuyifei1999 aude hashar YuviPanda|course zeljkof liangent
[10:04:49] :)
[10:04:58] yay!
[10:05:01] nice :)
[10:05:14] YAY!!!!!
[10:06:00] addshore: :)
[10:10:14] addshore: thanks!
[10:10:20] addshore: http://ganglia.wmflabs.org/latest/graph_all_periods.php?h=tools-login&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1380017366&g=load_report&z=large&c=tools tools-login is still overloaded
[10:10:57] ignore the load :P
[10:11:01] give it 5 mins ;p
[10:11:20] 25 371 701
[10:11:28] 21 364 693
[10:11:43] 18 352 686 (slowly getting better)
[10:12:19] http://ganglia.wmflabs.org/latest/stacked.php?m=load_one&c=tools&r=month&st=0 good thing is that it is better than last time
[10:12:43] that's because last time the outage was for about 6 hours if I remember
[10:13:07] managed to keep this down to about 2
[10:14:17] last time we had to wait for coren
[10:15:09] hopefully the procedures are documented better or will be
[10:15:22] and the bug is fixed :)
[10:24:17] I would like to query "select count(*) from user where user_registration > YESTERDAY". But I found user_registration is NULL for all users. It is not replicated. How can I access the registration date?
[10:26:44] ryuch_: I hope it is public
[10:27:35] i do too. but it seems not.
[10:29:16] yzhuyifei, then you mean that i can't right now?
[10:30:03] ryuch_: idk
[10:30:10] 2013-09-22 15:34:27 done User account data. (private)
[10:30:28] ryuch_: file a bug?
[10:30:33] but I don't have high hopes
[10:30:45] is it publicly visible via the API / Mediawiki User interface?
[10:31:01] user table also includes email address and password
[10:31:07] YuviPanda|course: it should be available
[10:31:21] YuviPanda|course: yes
[10:31:34] if so it should be made available. do file the bug!
[10:31:57] a pointer to the particular API call / user interface location where you can get user_registration would also be helpful, methinks
[10:34:56] YuviPanda|course: http://en.wikipedia.org/w/api.php?action=query&list=allusers&aufrom=Y&auprop=registration&format=jsonfm
[10:35:21] sweet
[10:35:30] file a bug?
[10:35:49] I have no power over any of these things, Coren is the one who has to do these things :)
[10:39:48] odds are they blocked everything and whitelisted the fields that were wanted and were safe
[10:40:20] possible
[10:40:45] I don't see why this field can't be added, considering it is already public information
[10:40:55] YuviPanda|course: you are WMF people, why can't you change that?
[10:41:14] zhuyifei1999: well, I work on the Mobile Apps team. Nothing at all to do with anything labs or ops
[10:41:29] I just hack on labs stuff for fun
[10:41:32] :(
[10:43:58] [bz] (NEW - created by: Ryu, Cheol, priority: Unprioritized - normal) [Bug 54498] Replication request of user_registration on user table - https://bugzilla.wikimedia.org/show_bug.cgi?id=54498
[10:51:40] does anyone know how to get a bot to come log a new channel please ?
[10:51:49] ah wm-bot it is
[10:52:36] @add #wikimedia-qa
[10:52:36] Permission denied
[10:52:40] ...
[10:56:08] Ha.
[11:04:53] @add #wikimedia-qa
[11:04:53] Permission denied
[11:05:06] @add #wikimedia-en
[11:05:09] @add #wikimedia-qa
[11:05:09] This channel is already in db
[11:05:21] @part #wikimedia-en
[11:05:33] christ, I've no idea what it's doing!
[11:05:36] @add #wikimedia-qa
[11:05:37] This channel is already in db
[11:05:43] @join #wikimedia-qa
[11:05:48] hurr durr
[11:22:19] sorry, got it fixed by doing the command in the #wm-bot channel :)
[11:22:26] :P
[11:46:37] !log parsoid - parsoid.wmflabs.org seems down
[11:46:37] parsoid is not a valid project.
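Note on the registration-date question raised at 10:24 and answered with the allusers API link at 10:34: a minimal sketch, not anything posted in the channel, of how the same request could be built and how "registered since yesterday" could be counted from its result. The endpoint and the `list=allusers` / `auprop=registration` parameters come from the URL pasted above; `format=json`, `aulimit`, the helper names and the sample records are assumptions made for illustration, and the response shape should be checked against a real reply before relying on it.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint taken from the URL pasted in the channel.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_allusers_url(start_from, limit=10):
    """Build an allusers query that asks for each account's registration time.

    Hypothetical helper: format=json and aulimit are illustrative choices,
    the channel link used format=jsonfm for a human-readable page.
    """
    params = {
        "action": "query",
        "list": "allusers",
        "aufrom": start_from,
        "auprop": "registration",
        "aulimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def count_registered_since(users, cutoff):
    """Count records whose 'registration' value is later than cutoff.

    Assumes ISO 8601 UTC strings such as '2013-09-24T08:00:00Z'.
    Records without a registration value are skipped.
    """
    count = 0
    for user in users:
        stamp = user.get("registration")
        if not stamp:
            continue
        registered = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        registered = registered.replace(tzinfo=timezone.utc)
        if registered > cutoff:
            count += 1
    return count


# Made-up records standing in for the 'allusers' list of a response.
sample_users = [
    {"name": "ExampleA", "registration": "2013-09-24T08:00:00Z"},
    {"name": "ExampleB", "registration": "2013-09-20T08:00:00Z"},
    {"name": "ExampleC", "registration": ""},
]
cutoff = datetime(2013, 9, 23, tzinfo=timezone.utc)
recent = count_registered_since(sample_users, cutoff)
```

This only covers what the public API exposes; it does not replace the replicated user_registration column requested in Bug 54498, which would still be needed for a single count(*) over the whole user table.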
[11:50:13] mutante: works for me now http://parsoid.wmflabs.org:8001
[11:51:03] hashar: oh, but that's different on :8001
[11:51:10] ahhh
[11:51:13] i had this bookmarked link, i want the HTML converter
[11:51:30] http://parsoid.wmflabs.org/_html/
[11:51:36] that's what used to work and now doesn't
[11:52:08] it's where you can just paste HTML and get nice wiki markup, using actual parsoid
[13:05:24] !log integration hashar@integration-selenium-driver:~$ sudo dpkg -i phantomjs_1.9.0-1_amd64.deb
[13:05:26] Logged the message, Master
[13:30:27] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Normal - minor) [Bug 45868] [OPS] [worked around] let instances access *.beta.wmflabs public IP (NAT issue in labs) - https://bugzilla.wikimedia.org/show_bug.cgi?id=45868
[13:31:37] !log integration integration-selenium-driver : installing iptables and running ` iptables -t nat -I OUTPUT --dest 208.80.153.219 -j DNAT --to-dest 10.4.1.133` to work around NAT issue ({{bug|45868}})
[13:31:40] Logged the message, Master
[13:38:27] !log deployment-prep indices finished rebuilding some time last night.
[13:38:32] Logged the message, Master
[13:43:00] Hey all.
[13:43:08] I'm baa-ack!
[13:44:12] addshore: Yeah, we have to switch servers. That one is pain, the hardware is beating us up. :-(
[13:44:22] * Coren prepares labstore4 for take over.
[13:44:41] * aude waves
[13:45:42] Coren: server crashed again
[13:45:57] petan, I am now pleased to announce that my spambot script is now fully crash resilient.
[13:46:46] Betacommand: Yeah; the hardware is flaky. The controller wedges itself in a way that the kernel is unable to recover from and it just starts spinning hoping for meager bits of data to trickle through.
[13:47:04] * Coren says unkind things about PERC's failure modes.
[13:47:28] Coren, I am now pleased to announce that my spambot script is now fully crash resilient.
[13:47:29] Thankfully, I have another server that is almost ready to take over.
[13:47:43] Cyberpower678: Year robustness.
[13:47:57] WTF?
[13:48:05] Unlike your net connection. :-)
[13:48:16] Coren, something keeps logging in as me.
[13:48:41] Cyberpower678: Looks like a mobile device.
[13:48:48] Agree
[13:49:05] Coren, no it's not. It's Penn State Brandywine
[13:50:05] hi Coren! how are you?
[13:50:21] Coren, there we go.
[13:50:55] Heya sumanah. Rested and back in my junk; but that's marred a bit by the NFS server being unkind again.
[13:51:10] !log
[13:51:12] :/
[13:51:17] !logsearch
[13:51:17] http://bots.wmflabs.org/~wm-bot/searchlog
[13:51:49] Coren: is tomorrow morning an okay time to schedule a "what are all the TODOs for Tool Labs" walkthrough during which I'd help update the roadmap?
[13:51:58] sumanah: I'm going to be switching to different hardware.
[13:52:09] Coren: ok. Fri?
[13:52:24] sumanah: Fri is all cool with me and gives me elbow room.
[13:52:33] okay!
[13:52:38] I shall try to get Silke in as well
[13:52:55] * Silke_WMDE waves
[13:53:48] sumanah: Coren At what time on Friday would that be?
[13:54:23] ah
[13:54:26] Silke_WMDE: Coren - I was thinking 10am NYC time, which would be late afternoon your time
[13:54:30] I can go earlier
[13:55:19] Coren: alexandros did reboot / fix up labstore4 note that the server is in decommissioned :/
[13:55:33] sumanah: Coren 4pm Berlin fits me perfectly
[13:55:38] Coren: and somehow we did not have to wake you up since Alexandros successfully rebooted labstore3 !
[13:56:06] sumanah: Works for me.
[13:56:18] Cool, thank you both
[13:56:27] off for a nap
[13:56:34] hashar: Wait, the server is listed in decommissioned?
[13:56:39] Coren: to prep, I figure I should page through the Tool Labs roadmap onwiki + bugs in BZ - any other place I should look through ahead of time?
[13:56:51] Coren: yup labstore4 is decommissioned to prevent ganglia monitoring apparently
[13:57:15] sumanah: BZ is the primary source indeed.
[13:57:22] hashar: That seems... like an ugly hack to me.
[13:57:44] Coren: https://gerrit.wikimedia.org/r/#/c/84547/
[13:58:02] sumanah: Coren I just met the ts admins and we made some plans that will add to the migration roadmap
[13:58:05] Coren: I guess you want to revert / unmonitor it properly hehe :]
[13:58:06] cool
[13:58:24] hashar: ... I really do. :-)
[13:59:45] Coren: hashar : note how we have manifests/decommissioning.pp (and use it) but then there is also: modules/ganglia_new/manifests/configuration.pp:"decommissioned" => {
[14:00:04] not that i touched it.. but i guess you could just decom it there to prevent ganglia
[14:00:17] but not get the other results of decom.pp, like no nagios
[14:03:09] IMHO if it's decommissioned, it gets powered off, disk wiped and label removed and the box enters a pool
[14:03:21] then either it is reused for something else or it gets unracked / destroyed
[14:03:23] but hmm
[14:03:39] need sleep now. See you in a few
[14:04:13] yea, the thing is that we use it for "temp. decom" as well
[14:04:26] and then sometimes it's reclaimed with the same hostname
[14:18:39] * Coren will bbiab.
[14:19:58] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789782 edit summary: /* Schedule */ update: Render has migrated
[14:20:50] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789783 edit summary: /* Zeitplan */ Aktualisierung: Render ist migriert
[14:30:08] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789791 edit summary: /* Schedule */ data of inactive users
[14:34:20] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789792 edit summary: /* Zeitplan */ Daten inaktiver Accounts
[14:38:46] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789795 edit summary: /* Schedule */ no new TS accounts
[14:40:41] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789798 edit summary: /* Zeitplan */ keine neuen TS-Accounts
[14:45:18] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789800 edit summary: /* Schedule */ inactive accounts 2nd round
[15:06:52] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789812 edit summary: /* Zeitplan */ Runde 2 inaktive Accounts
[15:08:10] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789813 edit summary: /* Schedule */
[15:13:00] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789815 edit summary: /* Schedule */ Account expirations before final date
[15:13:43] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap en was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789816 edit summary: /* Schedule */ typo
[15:16:49] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789823 edit summary: /* Zeitplan */ Account-Expiration
[15:17:22] Change on mediawiki a page Wikimedia Labs/Tool Labs/Roadmap de was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=789824 edit summary: /* Zeitplan */ typo
[15:40:18] [bz] (ASSIGNED - created by: Betacommand, priority: High - major) [Bug 54052] tools.wmflabs.org inaccessible via labs instances - https://bugzilla.wikimedia.org/show_bug.cgi?id=54052
[16:03:00] [bz] (NEW - created by: Ryu, Cheol, priority: Unprioritized - normal) [Bug 54498] Replication request of user_registration on user table - https://bugzilla.wikimedia.org/show_bug.cgi?id=54498
[16:16:46] ㅅ
[16:16:51] ups
[16:18:34] !ping
[16:18:34] !pong
[16:42:55] Coren: any reason nfs died again?
[17:01:33] Ryan_Lane: about to head to sleep in about 10mins, think you can do a quick review of https://gerrit.wikimedia.org/r/#/c/85814/ before that?
[17:02:19] I'm getting on a plane really soone
[17:02:21] *soon
[17:02:24] haha
[17:02:24] but I'll have wifi
[17:02:39] i'll probably wake up in a few hours. i've poked ori too
[17:02:55] * Ryan_Lane nods
[17:08:28] off to sleep for a bit now
[17:12:05] Betacommand: The hardware; she is effed up.
[17:12:16] Betacommand: I'm switching to a different server this week.
[17:13:13] The disk controllers wedge up, and become unresponsive to all but a hard reset.
[21:00:21] Firefox and Chromium are prompting me to trust security certificates for en beta labs and https://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php , should I?
[21:04:23] spagewmf: short answer is "yes" https://bugzilla.wikimedia.org/show_bug.cgi?id=53113
[21:50:18] YuviPanda_zz: are you still sleeping?
[22:02:27] IRC meeting for RFC review about to start in #mediawiki-rfc
[23:20:12] !ping
[23:20:12] !pong
[23:55:08] man, so windy in here!
[23:57:45] andrewbogott: hm
[23:58:17] yeah, that can go