[02:45:18] I'm getting intermittent 503 Service unavailable from lab bits:
[02:45:22] e.g. Request: GET http://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php?debug=true&lang=en&modules=ext.flow.templating&only=scripts&skin=vector&version=20140627T000400Z&*, from 198.73.209.2 via deployment-cache-bits01 deployment-cache-bits01 ([10.68.16.12]:80), Varnish XID 403561897
[02:45:27] goodnight
[05:18:02] Wikimedia Labs: Puppet is erasing growthdoc.wmflabs.org and proveit.wmflabs.org sites-enabled - https://bugzilla.wikimedia.org/66751#c21 (Matthew Flaschen) PATC>RESO/FIX Thanks, I added symbolic links there, pointing to sites-available, and it's fixed. For the record, sites-available was never bein...
[11:41:08] I'd like to allow a new port for the math group on this page
[11:41:09] https://wikitech.wikimedia.org/wiki/Special:NovaSecurityGroup
[11:41:35] I don't know what to enter to the form that open if I click add rule
[11:41:43] the error message is not very specific
[11:41:51] It says failed to add group
[11:48:06] ok my mistake was that I did not leave the source group field empty
[12:21:02] Wikimedia Labs / tools: Add some of the missing tables in commonswiki_f_p - https://bugzilla.wikimedia.org/59683#c10 (merl) On s5 SELECT * FROM commonswiki_f_p.recentchanges limit 1; fails with an error: ERROR 1296 (HY000): Got error 10000 'Error on remote system: 1054: Unknown column 'rc_moved_to_n...
[13:36:48] Hi all, hi Coren -> have you seen this? https://bugzilla.wikimedia.org/show_bug.cgi?id=59683#c10
[13:39:11] * Coren checks
[13:39:50] Auuugh. Sometimes, the inconsistencies is our prod schemas make me want to scream.
[14:04:09] * Coren tries to figure it out.
[14:08:09] Run update.php. :P
[15:46:16] Nemo_bis: The problem is when creating views from one DB to another; it gets "fun" when the view you have to create depends on which db is the target even though it's ostensibly the same.
[17:54:32] Coren: still around? :)
[18:14:42] YuviPanda: On and off.
[18:16:21] Coren: ah, ok. have a bunch of trivial tools patches that I'll add you as a reviewer to, do +2 when you're 'On' :)
[18:16:43] Linky, I'll clikcy.
[18:17:20] Coren: oh, coming up
[18:17:21] Coren: https://gerrit.wikimedia.org/r/#/c/142631/
[18:17:29] Coren: https://gerrit.wikimedia.org/r/#/c/142732/
[18:17:38] Coren: https://gerrit.wikimedia.org/r/#/c/142812/
[18:17:51] Coren: https://gerrit.wikimedia.org/r/#/c/142790/
[18:17:55] Coren: https://gerrit.wikimedia.org/r/#/c/142792/
[18:17:58] Coren: https://gerrit.wikimedia.org/r/#/c/142793/
[18:18:02] all trivial changes to let us monitor more things
[18:19:59] Coren: :D just that one isn't monitoring :) rest all are.
[18:23:49] Coren: woot! :D ty
[18:24:45] Coren: I'm wondering what else collectors would be useful. mysql for tools-db and the slaves, and then a grid engine one, I think. both need some work, so I'll do them over the next few days
[18:25:04] Coren: how well puppetized is our grid?
[18:26:45] YuviPanda: Almost not at all, except for packages. Most of gridengine is configured runtime and not from the actual nodes. Finding a clean way to puppetize that is on my short-time TODO since ori might have some use for a general gridengine class.
[18:26:56] Coren: ah, cool.
[18:27:16] Without puppet resource collection, though, it's going to be "amusing"
[18:28:05] We can't do puppetdb because security
[18:28:11] rightg
[18:28:12] *right
[18:28:22] Coren: that also was the problem with icinga.
[18:28:29] which is why I did graphite now
[18:30:15] Coren: hmm, interesting. puppet on tools-webproxy has been failing for quite a while.
[18:30:15] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to fetch instance ID at /etc/puppet/modules/base/manifests/init.pp:198 on node i-000000e6.eqiad.wmflabs
[18:30:28] Coren: seems unrelated to merged patches, since log says this has been happening before that as well
[18:30:58] Huh.
[18:31:21] Coren: preceeded by 'Could not retrieve ec2id: private method `chomp' called for nil:NilClass'
[18:31:22] I've seen that before; give me a moment to recall.
[18:31:25] Coren: ok
[18:32:06] IIRC that's caused by a 500 at a bad time, and hoses all future runs until there's a fix by hand, but I can't recall what and where. Lemme look into it.
[18:32:26] ok!
[18:33:33] Ah, right, the intermediate cause is facter not setting ec2id
[18:34:18] For that matter, facter isn't setting most things. hmm.
[18:35:29] puppetversion => 2.7.11
[18:35:38] I wonder if that's the issue.
[18:36:02] At the very least, it means it's been broken since the move to 3
[18:36:14] yeah, that sounds plausible
[19:52:34] Wikimedia Labs / tools: Tools using Tomcat missing in tool list - https://bugzilla.wikimedia.org/67259 (Peter Schlömer (dapete)) NEW p:Unprio s:minor a:Marc A. Pelletier Tools without a public_html directory are not included in the tool list on http://tools.wmflabs.org/. I had deleted it fo...
[20:52:33] Damianz: hey! I'm thinking of renaming charcoal.wmflabs.org (where a graphite lives, and is getting data from all labs nodes, and isn't behind a password wall) to graphite.wmflabs.org (which I don't know what sends events to, and is behind a password wall I can't access). Is that ok with you?
[21:07:54] !log tools removed alias for tools-webproxy and tools.wmflabs.org from /etc/hosts on tools-webproxy
[21:07:56] Logged the message, Master
[21:13:51] Coren: one thing we should eventually do is to kill toolsbeta as it exists, and then run it as an actual clone, with a self-hosted puppetmaster. should be way more useful.
[21:18:57] anyway, off to sleep
[21:18:59] thanks for the merges!