[00:24:43] RECOVERY Free ram is now: OK on bots-3 i-000000e5 output: OK: 46% free memory [01:28:43] PROBLEM Free ram is now: WARNING on mobile-enwp i-000000ce output: Warning: 10% free memory [01:53:37] RECOVERY Free ram is now: OK on mobile-enwp i-000000ce output: OK: 21% free memory [02:58:49] RECOVERY Puppet freshness is now: OK on wikidata-dev-3 i-00000222 output: puppet ran at Thu Apr 26 02:58:35 UTC 2012 [03:08:09] PROBLEM Puppet freshness is now: CRITICAL on nova-production1 i-0000007b output: Puppet has not run in last 20 hours [03:12:09] PROBLEM Puppet freshness is now: CRITICAL on nova-gsoc1 i-000001de output: Puppet has not run in last 20 hours [03:32:09] PROBLEM Current Load is now: WARNING on bots-sql3 i-000000b4 output: WARNING - load average: 6.71, 5.92, 5.28 [03:37:29] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 14% free memory [03:39:19] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 16% free memory [03:55:41] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 16% free memory [03:57:31] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory [03:59:11] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 4% free memory [03:59:21] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 17% free memory [04:04:11] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 96% free memory [04:07:32] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 92% free memory [04:15:42] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 5% free memory [04:19:23] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 4% free memory [04:20:43] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 96% free memory [04:24:23] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free 
memory [04:27:13] PROBLEM Free ram is now: WARNING on test3 i-00000093 output: Warning: 11% free memory [04:32:13] RECOVERY Free ram is now: OK on test3 i-00000093 output: OK: 96% free memory [04:42:13] RECOVERY Current Load is now: OK on bots-sql3 i-000000b4 output: OK - load average: 3.78, 4.57, 4.88 [06:28:35] petan|wk: regarding the translations i had an idea, right now they are only build on update right? is this done for all wikinames? could it be that UploadWizard/TimedMediaHandler is only installed on commons and so its i18n strings are not extracted if the update is done for another wikiname [06:40:12] I know [06:40:19] that's why I tried to do that for commons [06:40:24] latest build is from commons [06:40:57] in fact the commons extensions have cache [06:41:05] it's broken for another reason [07:33:13] PROBLEM Puppet freshness is now: CRITICAL on wikidata-dev-2 i-0000020a output: Puppet has not run in last 20 hours [08:34:59] hi [11:24:48] PROBLEM Free ram is now: WARNING on mobile-enwp i-000000ce output: Warning: 18% free memory [11:29:48] RECOVERY Free ram is now: OK on mobile-enwp i-000000ce output: OK: 23% free memory [13:09:09] PROBLEM Puppet freshness is now: CRITICAL on nova-production1 i-0000007b output: Puppet has not run in last 20 hours [13:09:39] PROBLEM Disk Space is now: CRITICAL on deployment-feed i-00000118 output: DISK CRITICAL - free space: / 0 MB (0% inode=37%): [13:11:39] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding i-00000105 output: DISK CRITICAL - free space: / 39 MB (2% inode=53%): [13:13:09] PROBLEM Puppet freshness is now: CRITICAL on nova-gsoc1 i-000001de output: Puppet has not run in last 20 hours [13:29:34] RECOVERY Disk Space is now: OK on deployment-feed i-00000118 output: DISK OK [13:33:44] PROBLEM Current Load is now: CRITICAL on wikidata-dev-3 i-00000225 output: Connection refused by host [13:34:24] PROBLEM Current Users is now: CRITICAL on wikidata-dev-3 i-00000225 output: Connection refused by host 
[13:35:04] PROBLEM Disk Space is now: CRITICAL on wikidata-dev-3 i-00000225 output: CHECK_NRPE: Error - Could not complete SSL handshake. [13:35:44] PROBLEM Free ram is now: CRITICAL on wikidata-dev-3 i-00000225 output: CHECK_NRPE: Error - Could not complete SSL handshake. [13:36:54] PROBLEM Total Processes is now: CRITICAL on wikidata-dev-3 i-00000225 output: CHECK_NRPE: Error - Could not complete SSL handshake. [13:37:34] PROBLEM dpkg-check is now: CRITICAL on wikidata-dev-3 i-00000225 output: CHECK_NRPE: Error - Could not complete SSL handshake. [14:34:34] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on nova-production1 i-0000007b output: Puppet has not run in last 20 hours [14:34:49] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on nova-gsoc1 i-000001de output: Puppet has not run in last 20 hours [14:35:04] ACKNOWLEDGEMENT Puppet freshness is now: CRITICAL on wikidata-dev-2 i-0000020a output: Puppet has not run in last 20 hours [14:55:14] petan|wk: any reason not to re-run that script to enable interwiki links in beta labs? I think the db may have been altered to prevent that recently [14:55:39] let me try it [14:55:52] thanks tparveen ^^ [15:00:18] petan|wk I also got a message: Sorry! We could not process your edit due to a loss of session data. Please try again. If it still does not work, try logging out and logging back in. when I tried to link video from beta labs [15:00:59] did you try it again? [15:01:36] chrismcmahon: reruned [15:03:11] petan|wk: I may have the syntax wrong but the lower link here http://en.wikipedia.beta.wmflabs.org/wiki/User_talk:Cmcmahon used to show that video and no longer does [15:04:00] chrismcmahon: lower link display video [15:04:11] when I click to play [15:04:26] oh good! [15:05:05] ah, ok, good, the thumb is broken but the media file plays tparveen ^^ [15:05:43] * chrismcmahon wonders why the thumb is broken, but that's maybe outside current scope [15:05:49] yes got it.. 
[15:05:52] I ran rebuild all [15:05:58] that probably broke [15:05:59] that [15:07:27] thanks petan|wk [15:08:19] (I know I complain a lot, but I am really happy to see the labs beta envs improving more and more) [15:09:33] petan|wk do you know why I am getting this error every time I try to save changes? [15:10:01] tparveen: everytime? [15:10:11] Platonides: hi [15:10:12] yes [15:10:15] ok [15:10:21] sec [15:11:00] Platonides: can you somehow make the logs.php in wmf-config write the token to code of of wiki pages? [15:11:22] there was temporary hack using echo which broke some styles and such [15:12:47] tparveen: can you tell me what you did? [15:12:51] so that I can try it too [15:13:51] http://en.wikipedia.beta.wmflabs.org/w/index.php?title=User:Tparveen&action=edit [15:14:30] wait...it did not do that last time... [15:14:51] let me try linking another video [15:16:17] see all I did was add [[commons:File:Electric_sheep.webm]] [15:16:32] hit save changes [15:16:44] or save page - sorry [15:18:36] works to me [15:18:58] Platonides: can you somehow make the logs.php in wmf-config write the token to code of of wiki pages? [15:19:21] so that it's in source code of html [15:19:32] like [15:19:37] on end of each page [15:19:48] or beginning [15:19:57] beginning is better [15:20:14] which token? [15:20:26] if you open logs.php you will see it [15:20:29] $randomHash [15:20:41] petan|wk - do I have the syntax wrong to link the video? is it [[File:Electric sheep.webm]] or [[commons:File:Electric sheep.webm]] [15:20:51] because I don't see any videos anymore... [15:20:56] it's [[File:Electric sheep.webm]] [15:20:59] tparveen, just File: [15:21:01] that's correct [15:21:37] $wgOut->addHTML should work as it's there [15:21:42] it doesn't [15:21:53] when I use it page crashes no idea why [15:22:00] errors are disabled [15:22:08] and I don't even know on which server to look for logs [15:22:56] why I'm no longer in depops group? 
:S [15:23:02] you are [15:23:08] probably logged to wrong instance [15:23:08] no, I am [15:23:19] deployment-dbdump has a read-only copy? [15:23:28] no it has write copy :D [15:23:32] copy of what [15:24:18] platonides@deployment-dbdump:/usr/local/apache/common-local/wmf-config$ touch test [15:24:18] touch: cannot touch `test': Permission denied [15:24:29] Platonides: sudo su wmfdeploy [15:24:45] being in depops group, I shouldn't need that [15:24:50] is the ldap bug back? [15:24:58] no mwdeploy [15:25:05] yes the bug is back [15:25:07] use mwdeploy [15:25:14] it makes stuff easy [15:29:04] php errors go to syslog [15:29:14] what does syslog do with them? [15:29:25] no idea [15:29:31] hashar did that [15:31:13] it is using rsyslog [15:31:19] * Platonides looks at /etc/rsyslog.conf [15:31:51] *.info;mail.none;authpriv.none;cron.none @syslog.pmtpa.wmnet [15:32:41] that's unlikely to be right [15:33:03] ok, I found php errors in /var/log/messages [15:33:14] PHP Warning: array_map() [function.array-map]: Argument #2 should be an array in /usr/local/apache/common-local/php-trunk/includes/User.php on line 1410 [15:33:34] PHP Warning: file(/usr/local/apache/common-local/php-trunk/../wmf-config/mwblocker.log) [function.file]: failed to open stream: No such file or directory in /usr/local/apache/common-local/php-trunk/includes/User.php on line 1410 [15:34:04] hm, mwblocker.log seems to be a proxy list [15:38:07] that should fix those warnings [15:39:23] it still doesn't load :( [15:39:43] Apr 26 15:39:25 i-00000217 apache2[23671]: PHP Fatal error: Allowed memory size of 104857600 bytes exhausted (tried to allocate 71 bytes) in /usr/local/apache/common-local/php-trunk/includes/profiler/Profiler.php on line 135 [15:40:37] Apr 26 15:39:52 i-00000217 apache2[13585]: PHP Fatal error: Allowed memory size of 104857600 bytes exhausted (tried to allocate 71 bytes) in /usr/local/apache/common-local/php-trunk/includes/cache/MessageCache.php on line 718 [15:40:55] that's a 100M
limit [15:44:19] it'd be nice to be able to force use of a server [15:48:53] now it downloaded [15:49:50] that's still slow, though [15:49:52] 28s [15:53:16] who placed Scripting folder in php-trunk? [16:08:52] question about the definition of "interwiki link", I want to validate something I think is true: given that media files are served from the same host (upload.beta.wmflabs.org) for both labs enwiki and labs commons, do we in fact have in place a true interwiki link when linking to a media file from labs enwiki? [16:12:31] chrismcmahon, an interwiki link is not using commons files [16:12:58] an interwiki link is a wiki link which leads you from en to commons/meta/fr... [16:13:39] the feature you're thinking of could probably be named InstantCommons [16:14:21] or "Shared image repository" [16:14:53] now, what do you mean by "true interwiki link"? [16:16:54] thanks Platonides, what I am trying to achieve is to play a media file on labs enwiki (which should not have TMH in place) by way of an interwiki link to a host (commons) that does have TMH in place. [16:17:14] Platonides: sorry, I am lost...could use some explanation...so when we use [[file:filename]] to create an interwikilink, where is the media coming from? [16:17:17] I'm starting to think this may not be possible [16:17:25] or is this not the way to create an interwiki link... [16:17:30] chrismcmahon, it's probably not possible [16:17:43] it's the local wiki that plays it [16:17:52] so I think it would need TimedMediaHandler [16:18:10] also, what does * leads you from en to commons/meta/fr... * mean? [16:18:19] tparveen, [[file:filename]] isn't an interwiki [16:18:49] an interwiki would be [[fr:MediaWiki]] [16:18:58] [[wikipedia:mainpage]] is?
[16:19:10] such a link in enwiki would point you to http://fr.wikipedia.org/wiki/MediaWik [16:19:23] tparveen, that might or might not be one [16:19:35] it's easier to think of them as language links [16:20:16] at wikisource [[wikipedia:mainpage]] would be an interwiki [16:20:37] but in Wikipedia it would be an internal link to the project namespace (which is named Wikipedia there) [16:21:08] back to your question, [[file:filename]] is an inclusion of a file named filename [16:21:18] the steps it does are: [16:21:30] 1) look for filename uploaded in the local wiki [16:21:35] tparveen Platonides what if we make a page on labs commons that links to a media file, and then another page on labs enwiki that links to the commons page with the media file but *not* to the file itself. I think that gets us there. [16:21:44] 2) if there's no local file, it can look at a different repository [16:21:54] which is where wikimedia commons is usually configured [16:22:50] chrismcmahon, I'm not sure what you are wanting... [16:23:21] you can link to a different wiki which does have TMH [16:23:35] but then, that's probably not what you wanted to test... [16:24:43] Platonides/Chris: maybe I am asking dumb questions but I have to ask....when you say *link to a different wiki which does have TMH* give me an e.g. of a wiki that has TMH [16:25:20] I think commons.wikimedia.org has it [16:26:16] but that's our production env...that means http://commons.wikimedia.beta.wmflabs.org/wiki/ also has TMH? [16:26:58] Platonides: I think that is what we want to test. put another way: given that labs commons has TMH in place; AND given that a page on labs commons has a playable media file; WHEN I create an interwiki link to that page on labs commons from a wiki that does not have TMH in place; THEN I can still play the media correctly. [16:26:59] hmm...
I don't see it there [16:27:23] chrismcmahon, that's not an interwiki [16:27:30] otherwise, yes, it should play fine [16:27:39] with the OggHandler [16:29:29] Platonides: for example, http://commons.wikimedia.beta.wmflabs.org/wiki/File:Alice_in_Wonderland_1903.ogv shows at the bottom of the page partial processing by TMH (it had an error on one of the transcodes) [16:30:46] I see [16:32:36] that /mnt/data/ folder it's trying to use doesn't exist [16:33:26] Platonides: for context, tparveen is helping out for as much final testing of TMH as we can get in. labs beta commons is our test env of record. I have done some testing in there, but I also have other things taking my attention. [16:34:03] and I am new so I am having to understand and learn in a very short time, hence all my questions... [16:34:05] Platonides: and part of that testing process is improving the stability/usability of labs commons, so thanks for the help! [16:37:42] that helps reaching two goals at once :) [16:41:07] * chrismcmahon is all about the serendipity :) [17:13:44] PROBLEM Total Processes is now: CRITICAL on pediapress-ocg2 i-00000226 output: Connection refused by host [17:14:24] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg2 i-00000226 output: CHECK_NRPE: Error - Could not complete SSL handshake. [17:15:44] PROBLEM Current Load is now: CRITICAL on pediapress-ocg2 i-00000226 output: CHECK_NRPE: Error - Could not complete SSL handshake. [17:16:27] PROBLEM Current Users is now: CRITICAL on pediapress-ocg2 i-00000226 output: CHECK_NRPE: Error - Could not complete SSL handshake. [17:16:54] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg2 i-00000226 output: CHECK_NRPE: Error - Could not complete SSL handshake. [17:17:34] PROBLEM Free ram is now: CRITICAL on pediapress-ocg2 i-00000226 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
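Editor's sketch of the link handling explained earlier in this log: [[file:filename]] is a file inclusion that is looked up locally and then in a shared repository, [[fr:MediaWiki]] is an interwiki, and [[wikipedia:mainpage]] is an interwiki on wikisource but an internal link on Wikipedia. This is only an illustration under stated assumptions, not MediaWiki's actual code: the prefix table, the function names, and the rule that a local namespace takes priority over an interwiki prefix are all hypothetical simplifications.

```python
# Hypothetical sketch of the link handling described in the chat.
# Nothing here is MediaWiki code; tables and names are made up.

# Assumed set of interwiki prefixes known to the wiki.
INTERWIKI_PREFIXES = {"fr", "commons", "meta", "wikipedia"}


def classify_link(target, local_namespaces):
    """Classify a [[target]] as 'file', 'internal', or 'interwiki'."""
    prefix, sep, _rest = target.partition(":")
    if not sep:
        return "internal"      # no prefix at all, e.g. [[Main Page]]
    prefix = prefix.strip().lower()
    if prefix == "file":
        return "file"          # a file inclusion, not a link to another wiki
    if prefix in local_namespaces:
        return "internal"      # assumed: a local namespace wins over an interwiki
    if prefix in INTERWIKI_PREFIXES:
        return "interwiki"
    return "internal"


def find_file(name, local_files, shared_files):
    """Step 1: look in the local wiki. Step 2: look in the shared repository."""
    if name in local_files:
        return "local"
    if name in shared_files:
        return "shared"
    return None


if __name__ == "__main__":
    print(classify_link("fr:MediaWiki", {"wikipedia"}))
    print(classify_link("wikipedia:mainpage", {"wikipedia"}))
    print(classify_link("wikipedia:mainpage", {"wikisource"}))
    print(find_file("Electric sheep.webm", set(), {"Electric sheep.webm"}))
```

In this sketch a file that exists only in the shared repository is still resolved by the wiki that embeds it, which matches the point made in the chat that the local wiki is the one that plays the media.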
[17:20:24] PROBLEM Current Load is now: WARNING on bots-sql3 i-000000b4 output: WARNING - load average: 6.65, 7.01, 5.76 [17:45:24] RECOVERY Current Load is now: OK on bots-sql3 i-000000b4 output: OK - load average: 1.78, 2.76, 4.33 [18:01:44] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 45 MB (3% inode=53%): [18:06:44] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding i-00000105 output: DISK CRITICAL - free space: / 30 MB (2% inode=53%): [18:14:49] hey [18:17:11] heh [18:17:14] hi there [18:18:20] I'm going to upgrade gluster for project storage today, likely [18:18:27] Ryan_Lane: I was thinking of that welcome message, do we really want to have LQT as default [18:18:31] ok [18:18:34] cool [18:18:39] hm. probably not, actually [18:18:43] we can enable it per-page [18:18:54] right [18:18:58] it's not really stable [18:19:07] well, it runs in production [18:19:13] so, it's stable enough for that [18:19:18] yes I know [18:19:25] but, they are doing a full rewrite of it [18:19:27] so, yeah [18:19:45] I hope although I haven't seen any code so far [18:19:57] it would be really cool if they started to do development in public :P [18:20:57] it's top secret or they didn't start yet :D [18:21:58] is that update going to affect current projects? [18:22:06] no idea [18:22:09] ok [18:22:23] it would be nice to give us a notification if you were to shut down storage etc [18:22:29] hm, how do I disable it by default? [18:22:36] I'm not shutting anything down [18:22:43] sec [18:22:46] it should be unnoticeable [18:22:54] You say that every time :D [18:23:19] Though I don't actually use project storage atm, probably should [18:23:22] eh?
nothing has broken due to an upgrade of something yet [18:23:35] gluster broke all on its own last time [18:23:36] http://noc.wikimedia.org/conf/liquidthreads.php.txt this is prod version [18:24:06] Ryan_Lane: nothing has broken are you sure [18:24:08] :D [18:24:10] hehe [18:24:17] not due to upgrades [18:24:19] I remember some outage imho [18:24:27] again, not due to an upgrade ;) [18:24:31] ok [18:24:44] Though you havn't really done upgrades when we're in a 'not broken' state? :D [18:24:58] not of gluster [18:25:17] I plan to test this in instances first ;) [18:25:22] gluster really need a lot of fixes [18:25:32] it seems to have even worse io than local storage [18:25:34] Gluster is pretty awesome [18:25:48] It has worse write io for sure [18:25:55] yes [18:25:59] But read io should actually be faster me thinks [18:26:01] wait [18:26:03] ah [18:26:06] that's bad [18:26:09] I want write too [18:26:12] :D [18:26:15] there's no way it's slower than local [18:26:27] Local was nfs ontop of fs ontop of gluster ontop of fs right? [18:26:32] no [18:26:35] actually there is maybe it's just loaded [18:26:46] Or you mean vm local not nfs? which was on gluster anyway [18:26:49] local is using different hardware I guess [18:27:05] so even if it's configured worse it may be faster because of load [18:27:17] I guess big gluster has bigger load than local storage [18:27:22] it's ext3 -> qcow2 -> gluster -> ext3 -> lvm -> raid [18:27:23] I've only noticed writing is slower a little and that's in a setup where I have 6 replicas and it sometimes writes thousands of small cache files :( [18:27:25] that's local [18:27:31] there's no way the project storage is slower [18:27:57] the project storage is gluster -> xfs -> lvm -> raid [18:28:05] ok it's like when I type vi /data/project/somefile my shell stuck like for 20 seconds [18:28:08] ah [18:28:11] while local file open in 1 sec [18:28:14] when it isn't mounted yet? 
[18:28:20] mounting takes a while [18:28:30] I need to change the automount settings [18:28:34] when it's mounted it's better but it dismount in a while [18:28:38] yeah it kinda sucks atm [18:28:48] yeah, it dismounts way too quick right now [18:29:05] it would be cool to make it permanent on some instances which have /etc linked to it [18:29:10] * Damianz resists 'can't keep it up' note [18:29:22] apaches have /etc/apache linked to /data/project/etc [18:29:40] so they all share same config [18:29:53] That should be in puppet? [18:29:58] probably [18:30:08] if I could merge stuff in puppet I would be happy to do that [18:30:14] yeah true [18:30:24] but waiting hours for someone to review it is not really what I like to do [18:30:31] maybe local puppet repository for project [18:30:40] where members could approve stuff [18:30:51] that would be nice [18:30:56] petan [18:30:58] err [18:31:08] Btw, have AFAIK gluster docs say to put the mounts in fstab - when I tried that it mounted mounts on mounts on mounts etc, anyone EVER got that to work? using rc.local and grep atm [18:31:09] petan|wk: it's possible to mount it permanently in the fstab [18:31:10] :po [18:31:14] ok [18:31:24] don't put /data/project in there, though [18:31:43] Damianz: it works for me fine [18:31:46] ok [18:32:10] Hmm, so if you run mount -a;mount -a;mount -a;mount -a you don't get 4 mounts ontop of each other? [18:32:18] I hope ubuntu 12 is LTS [18:32:21] :D [18:32:22] It is [18:32:29] I just installed it on my new laptop [18:32:44] nice [18:32:52] Damianz: you shouldn't. no [18:33:15] though, that said, I rarely type mount -a [18:33:32] Weird then, might moan in #gluster later then as when I tested it on some centos boxes it did that and caused really horrid issues. [18:35:50] Home internet is totally being laggy as hell atm :( [19:13:45] hi sumanah [19:13:53] hi petan petan|wk [19:14:05] I wanted to ask about that new team introduced on wikitech [19:14:24] what is it exactly? 
[19:14:40] * Damianz paws sumanah [19:14:46] Damianz: do not do that. [19:14:49] is there going to be some announcement related to that [19:15:02] petan|wk: As Rob said, it's the same as the TL;DR group was. it is just a renaming. [19:15:21] * Damianz notes to retract his claws next time [19:15:24] what was TLDR [19:15:30] never heard of that [19:15:54] petan|wk: then you never looked at our blog or our communications in the monthly engineering report or at the platform eng hub [19:16:12] https://blog.wikimedia.org/2011/11/15/tld/ https://www.mediawiki.org/wiki/Wikimedia_Platform_Engineering https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2012/March [19:16:12] yes I am looking on reports [19:16:20] petan|wk: Then you didn't read the reports. [19:16:31] :) [19:16:45] https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2012/March#Technical_Liaison.3B_Developer_Relations [19:16:57] I don't see any TL DR mentioned in there [19:17:11] petan|wk: do you know how abbreviations work? [19:17:23] eh, depends [19:17:26] Sometimes we refer to things by the first initials of the words [19:17:53] "Technical Liaison; Developer Relations" shortens to the abbreviation "TL;DR". Rob explained this in his announcement email about my promotion. [19:18:00] oh [19:18:04] Damianz: do you have a specific question? [19:18:28] Nope, I'm just bored and avoiding paperwork :D [19:18:38] TL;DR has a whole other meaning also [19:18:41] :D [19:18:46] Yes, that was a deliberate joke. [19:18:47] I think so [19:19:03] heh [19:19:06] petan|wk: So, next time, please try reading the email. [19:19:21] I am trying :) I had to overlook that [19:19:25] "Sumana [19:19:26] has been working with Guillaume Paumier and Mark Hershberger under the [19:19:26] somewhat ad hoc group title of "Technical Liaison; Developer Relations [19:19:26] (tl;dr)", serving as lead of that group since last year. Under the [19:19:26] new "Engineering Community" name, this group will continue...." 
[19:19:53] petan|wk: When you ask for information that's already been given to you, you waste time [19:20:05] probably [19:20:17] sumanah: what this change means for us? [19:20:30] Who is "us"? [19:20:39] the community? [19:20:40] by us, I mean community folks, like me [19:20:41] The Wikimedia technical community? [19:20:55] ok, I thought so, just wanted to check [19:21:15] * Damianz sniffs sumanah and thinks she's in a bad mood [19:21:32] Damianz: you're being irritating with the "pawing" and "sniffing" so please act more appropriately [19:21:43] petan|wk: as you saw in Rob's email, this title change really just reflects what I've been doing already and is more accurate [19:22:35] Totally not even being irritating :D [19:22:36] petan|wk: so, the change you will see is a change in my signature line in my emails. [19:22:52] people still use signature lines? :) [19:22:56] I do. [19:23:15] Be glad I don't clutter it up with disclaimers and inspiring quotes! :) [19:23:25] or quips we have [19:23:38] petan|wk: you may also be interested to read https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals [19:23:40] btw Damianz did you know I quoted you in bugzilla? [19:24:05] Probably not, I don't pay that much attention to bz even for the bugs I personally submit. [19:24:23] sumanah: I don't see ipv6 there [19:24:30] why! [19:24:33] ask Ops. [19:24:40] Ryan_Lane: ^ [19:24:47] OK, sounds like I've answered all the questions for me today [19:25:01] sumanah: sorry for being so annoying [19:25:18] I was just curious what is happening [19:25:20] petan|wk: Just work on your reading comprehension and I'll be happier. [19:25:21] :) [19:25:55] Also, BAH at the person who moved the labs mailing list a while ago [19:26:07] Happy Thursday. [19:26:17] :O [19:26:34] * Damianz eats sumanah [19:26:38] next time she will eat you Damianz [19:26:45] I'm tasty [19:27:11] Damianz: moved the mailing list? [19:27:14] moved it where? 
[19:27:42] Well the List-Id changes, IIRC from a @ to a . [19:27:51] Had to fix my sieve rules the other day to tidy my inbox [19:28:08] Damianz: your client is stupid [19:28:16] :P [19:28:25] it should do it auto [19:29:37] Damianz: you attend the hackathon? [19:29:59] What do you mean auto? Sieve filters mail into nice folders for me [19:30:06] And nope, it was in Europe [19:30:22] Damianz: I mean will you attend it [19:30:29] this year [19:30:37] you don't live in eu [19:30:49] I thought u UK guy [19:31:01] that's eu [19:31:30] Well yeah, too busy to get over the pond I think though [19:31:39] meh [19:32:03] you know there is a highway in pond [19:32:12] Under the pond :D [19:32:16] right [19:34:22] Bleh so I un-subscribed from the bugs mailing list [19:34:29] there is some [19:34:31] :D [19:34:47] I hope we can make bz use rss [19:35:01] The bz interface really sucks for finding stuff imo [19:35:01] custom feed [19:35:06] yes [19:35:07] it does [19:35:27] I know some trackers and all suck [19:35:33] I would say bz is best [19:36:11] when I need to find all bugs I need to search regex .* :D [19:42:30] * Damianz gives up on looking for quotation of self and goes to finish paperw0rk [19:54:37] Damianz: open all quotes [19:54:55] it's not a bug heh [19:56:02] I don't even know where the quotes thing is tbh [19:59:16] Oh THERE they are [19:59:29] Totally had to google for the actual cgi as the link totally doesn't exist [20:18:34] uuuuugggghhh [20:18:43] now I can't log in thanks to NewUserMessage [20:21:07] fixed in trunk, thankfully [20:28:15] paravoid: hm. to run off the production branch by default, or keep the test branch.... [20:29:05] I kind of despise the test branch, overall [20:30:06] it's possible to test gated changes in labs from the production branch [20:30:18] since it's review before merge [20:30:30] and a change is a branch [20:31:43] petan|wk: does this mean anything to you?
http://commons.wikimedia.beta.wmflabs.org/wiki/Chris_page " Sorry, your browser either has JavaScript disabled or does not have any supported player." [20:32:07] It doesn't help the test branch is broken [20:32:15] well, it's not broken [20:32:18] it's just our of sync [20:32:43] Broken in the sense it will never be usable outside of its current use without huge amounts of work to undo the mess that's been made [20:33:01] we cherry-pick changes out of it [20:34:43] I'm thinking we should just use production [20:34:58] and test labs-wide using pre-merge changes [20:35:27] New patchset: Ryan Lane; "Change the automount timeout to 2 hours" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/5928 [20:35:41] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/5928 [20:35:50] I should really switch the home directories to gluster before I change that [20:48:50] Ryan_Lane: how do I see a list of all the projects in labs? [20:49:07] !projects [20:49:08] https://labsconsole.wikimedia.org/wiki/Special:Ask/-5B-5BResource-20Type::project-5D-5D/-3F/-3FMember/-3FDescription/mainlabel%3D-2D [20:49:13] it's linked off the main page [20:51:08] Ryan_Lane: so, we don't have a project for Lua? [20:51:12] nopw [20:51:15] *nope [20:51:27] Ryan_Lane: can you make a Scribunto project? [20:51:30] sure [20:52:00] done [20:52:38] 04/26/2012 - 20:52:38 - Creating a project directory for scribunto [20:52:38] 04/26/2012 - 20:52:38 - Creating a home directory for preilly at /export/home/scribunto/preilly [20:52:51] labs-home-wm: <3 [20:53:01] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 6.21, 13.69, 7.01 [20:53:39] 04/26/2012 - 20:53:38 - Updating keys for preilly [20:54:10] Created instance i-0000022b with image ami-0000001d and hostname i-0000022b.pmtpa.wmflabs. 
[20:54:15] PROBLEM Disk Space is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:54:36] PROBLEM Free ram is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:54:41] PROBLEM Disk Space is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:55:11] PROBLEM Free ram is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:55:51] PROBLEM Total Processes is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:56:01] PROBLEM dpkg-check is now: CRITICAL on gluster-server1 i-00000229 output: Connection refused by host [20:56:31] PROBLEM dpkg-check is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:56:31] PROBLEM Total Processes is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:56:36] PROBLEM Current Load is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [20:57:02] PROBLEM dpkg-check is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:57:02] PROBLEM Current Load is now: CRITICAL on gluster-server1 i-00000229 output: Connection refused by host [20:57:02] PROBLEM Current Users is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [20:57:51] PROBLEM Disk Space is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [20:57:51] PROBLEM Current Users is now: CRITICAL on gluster-server1 i-00000229 output: Connection refused by host [20:57:51] PROBLEM Current Load is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:58:21] PROBLEM Current Load is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:58:21] PROBLEM Current Users is now: CRITICAL on gluster-client2 i-00000228 output: Connection refused by host [20:58:21] PROBLEM Disk Space is now: CRITICAL on gluster-server1 
i-00000229 output: Connection refused by host [20:58:21] PROBLEM Free ram is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [20:59:01] PROBLEM Current Users is now: CRITICAL on gluster-client1 i-00000227 output: Connection refused by host [20:59:31] PROBLEM Total Processes is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [21:00:01] PROBLEM Free ram is now: CRITICAL on gluster-server1 i-00000229 output: Connection refused by host [21:00:14] PROBLEM dpkg-check is now: CRITICAL on gluster-server2 i-0000022a output: Connection refused by host [21:00:14] PROBLEM Total Processes is now: CRITICAL on gluster-server1 i-00000229 output: Connection refused by host [21:02:34] Ryan_Lane: heh [21:02:45] you said [21:02:48] that's what I get for creating 4 instances at once [21:02:48] no outage [21:02:49] spammy [21:02:55] thats new instances [21:02:55] aha [21:03:00] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.70, 2.59, 4.07 [21:03:04] I thought storage is down heh [21:03:13] that would show up in -operations ;) [21:03:17] ok [21:03:35] I should make pings on that [21:03:41] too many channels hehe [21:03:55] PROBLEM Current Load is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:04:18] it would be nice if the nagios server ignored new clients until nrpe was up [21:04:24] PROBLEM Current Users is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:05:04] PROBLEM Disk Space is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:05:44] PROBLEM Free ram is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. 
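Editor's sketch of the wish expressed above, that the monitoring server ignore new clients until NRPE is up. It is a hypothetical filter that a notification relay could apply, not a Nagios feature or configuration option. The 15-minute grace period, the function name, and the idea that the relay knows each instance's age are all assumptions made for illustration.

```python
# Hypothetical notification filter; not part of Nagios or NRPE.

# Outputs seen in this log while a freshly created instance is still booting.
NRPE_NOT_READY = (
    "Connection refused by host",
    "Could not complete SSL handshake",
)


def should_notify(output, instance_age_s, grace_s=900):
    """Return False for 'agent not ready' alerts from very young instances.

    output          the check output text from the notification
    instance_age_s  seconds since the instance was created (assumed known)
    grace_s         assumed grace period, 15 minutes by default
    """
    not_ready = any(marker in output for marker in NRPE_NOT_READY)
    return not (not_ready and instance_age_s < grace_s)


if __name__ == "__main__":
    msg = "CHECK_NRPE: Error - Could not complete SSL handshake."
    print(should_notify(msg, instance_age_s=300))   # young instance
    print(should_notify(msg, instance_age_s=3600))  # established instance
```

Real alerts such as low memory are never suppressed by this filter, because only the two "agent not ready" outputs are matched.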
[21:06:44] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 40 MB (3% inode=53%): [21:06:54] PROBLEM Total Processes is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:07:34] PROBLEM dpkg-check is now: CRITICAL on scribunto i-0000022b output: CHECK_NRPE: Error - Could not complete SSL handshake. [21:12:10] Ryan_Lane: 21:11:58 up 38 days, 16:25, 7 users, load average: 5.01, 2.09, 0.77 [21:12:11] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding i-00000105 output: DISK CRITICAL - free space: / 24 MB (1% inode=53%): [21:12:16] Ryan_Lane: for bastion [21:14:03] PROBLEM Current Load is now: WARNING on bots-sql3 i-000000b4 output: WARNING - load average: 3.12, 5.30, 5.06 [21:17:29] Ryan_Lane: can I get an IP for scribunto? [21:17:34] yep [21:18:01] done [21:18:39] 04/26/2012 - 21:18:38 - Creating a home directory for tstarling at /export/home/scribunto/tstarling [21:19:39] 04/26/2012 - 21:19:39 - Updating keys for tstarling [21:20:52] Ryan_Lane: thanks! [21:21:02] yw [21:21:12] Ryan_Lane: is labs super slow right now? [21:21:20] yes [21:23:49] it should ease up soon [21:24:17] Ryan_Lane: what is the issue? [21:24:22] IO [21:29:03] RECOVERY Current Load is now: OK on bots-sql3 i-000000b4 output: OK - load average: 1.57, 3.58, 4.61 [21:29:37] Ryan_Lane: what do I need to allow port 80 to scribunto? 
[21:29:42] heh [21:29:47] Ryan_Lane: I've got https://labsconsole.wikimedia.org/wiki/Special:NovaSecurityGroup [21:29:50] you should have made another security group first [21:30:00] if you didn't, then you can't add it to the instance now [21:30:06] (this is a really stupid ec2 limitation) [21:31:50] Ryan_Lane: argh [21:31:57] Ryan_Lane: okay, deleting and recreating [21:33:27] well, you could have also just added port 80 to default [21:33:43] there's a crappy side effect of that, which is that all instances in that project then have port 80 open [21:35:44] PROBLEM host: scribunto is DOWN address: i-0000022b CRITICAL - Host Unreachable (i-0000022b) [21:38:12] Ryan_Lane: yeah, I wanted to avoid that [21:38:19] * Ryan_Lane nods [21:38:33] I hope the nova api lets us modify the groups [21:40:45] RECOVERY host: scribunto is UP address: i-0000022c PING OK - Packet loss = 0%, RTA = 0.97 ms [21:42:02] preilly: lemme clear the dns cache for you [21:42:32] Ryan_Lane: thanks [21:42:41] done [21:54:48] paravoid: http://www.meetup.com/openstack/events/62289442/ [21:55:35] another openstack meeting? [21:56:28] it's a meetup [21:56:33] they have them in SF every other week [21:56:41] and in the south bay every other week [22:07:08] PROBLEM Puppet freshness is now: CRITICAL on swift-be4 i-000001ca output: Puppet has not run in last 20 hours [22:56:08] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 6.34, 18.78, 11.12 [23:11:08] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.32, 1.37, 4.50
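Editor's sketch: the service notifications throughout this log share one shape (timestamp, notification type, check name, state, host, instance id, check output), so they can be separated from the conversation with a small parser. The regular expression below is inferred only from the lines in this log and is an assumption, not a documented format. It expects one message per call, and host-level lines such as "PROBLEM host: scribunto is DOWN ..." are deliberately left unmatched.

```python
import re

# Pattern inferred from the service notifications in this log (an assumption).
ALERT_RE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\] "
    r"(?P<kind>PROBLEM|RECOVERY|ACKNOWLEDGEMENT) "
    r"(?P<check>.+?) is now: (?P<state>\w+) "
    r"on (?P<host>\S+) (?P<instance>i-[0-9a-f]{8}) "
    r"output: (?P<output>.*)$"
)


def parse_alert(message):
    """Return the fields of a service notification, or None for anything else."""
    match = ALERT_RE.match(message.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = ("[23:11:08] RECOVERY Current Load is now: OK on bots-cb "
            "i-0000009e output: OK - load average: 0.32, 1.37, 4.50")
    print(parse_alert(line))
    print(parse_alert("[21:21:20] yes"))
```

Returning None for non-matching messages keeps the parser usable as a simple filter over a mixed stream of bot output and chat.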