[00:27:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[00:57:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[01:27:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[01:57:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[02:27:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[02:57:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[03:16:04] silly me. just tried to get to a labs host and had a DNS failure so i thought maybe DNS was broken still. turns out my wifi was still disabled. ;P
[03:27:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[03:32:21] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 13% free memory
[03:37:21] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 17% free memory
[03:38:20] jeremyb: I'm refactoring, but I was going to ask you to try to break something for me? http://occupymediawiki.org/apprount/index.php/search=Category:Potawot
[03:40:08] break what? sleeping soon
[03:40:37] just see if there are any obvious security issues?
[03:41:04] (or any other obvious bugs)
[03:42:00] uhhh, ask me monday or tuesday? ;)
[03:42:22] or read some OWASP ;)
[03:42:49] Tuesday mayhap. Thanks!
[03:43:00] (looks up OWASP)
[03:47:21] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 15% free memory
[03:52:21] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 15% free memory
[03:52:21] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory
[03:57:21] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 3% free memory
[03:57:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[04:02:21] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 94% free memory
[04:02:21] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory
[04:07:21] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 5% free memory
[04:07:21] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 3% free memory
[04:12:21] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 97% free memory
[04:12:21] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 94% free memory
[04:22:48] good night ;)
[04:27:51] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[04:35:01] PROBLEM Free ram is now: WARNING on test3 i-00000093 output: Warning: 7% free memory
[04:40:00] RECOVERY Free ram is now: OK on test3 i-00000093 output: OK: 96% free memory
[04:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[05:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[05:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[06:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[06:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[07:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[07:39:54] !nagios.wmflabs.org is 208.80.153.210
[07:39:55] Key was added
[07:40:13] !nagios del
[07:40:13] Successfully removed nagios
[07:40:45] !nagios is http://208.80.153.210/nagios3 http://nagios.wmflabs.org/nagios3
[07:40:45] Key was added
[07:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[08:21:44] !log deployment-prep hashar: updating TimedMediaHandler and MwEmbedSupport to their latest versions
[08:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[08:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[09:00:00] PROBLEM Free ram is now: WARNING on incubator-bot2 i-00000252 output: Warning: 19% free memory
[09:22:28] heh dns outage
[09:22:50] I have 2 dumps-2 entries in NovaInstance currently
[09:25:36] Hydriz: the deleted one isn't gone?
[09:25:42] yeah :(
[09:26:24] hm the services are all showing as up
[09:26:27] what's the id?
[09:26:39] have you tried deleting it again?
[09:26:51] id is i-00000257
[09:27:03] I recreated it already, and the new id is i-000002d8
[09:27:16] * Hydriz tries deleting it again
[09:27:47] Message: "Successfully deleted instance, but failed to remove dumps-2 DNS entry."
[09:28:13] ah
[09:28:14] got it
[09:28:17] finally :P
[09:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[09:32:21] yeah. old deletion requests were dropped
[09:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[10:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[10:44:18] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11261
[10:44:21] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11261
[10:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[11:07:58] Ryan_Lane: Does the DNS change refer to the wmflabs.org domain?
[11:08:32] we changed the NS records for the domain
[11:09:17] you should also resolve bug 36885 in the meantime :)
[11:14:17] which bug is that?
[11:14:26] !bug 36885
[11:14:26] https://bugzilla.wikimedia.org/36885
[11:15:10] I'm not screwing around with dns at all until at minimum this situation is solved
[11:15:23] heh okie :)
[11:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[11:31:22] New review: Faidon; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11359
[11:31:24] Change merged: Faidon; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11359
[11:36:55] New review: Hashar; "For production branch: https://gerrit.wikimedia.org/r/11605" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11359
[11:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[12:25:10] RECOVERY Free ram is now: OK on incubator-bot2 i-00000252 output: OK: 29% free memory
[12:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[12:37:01] New patchset: Hashar; "Explicitly define fonts package for Precise" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11358
[12:37:20] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/11358
[12:38:42] New review: Faidon; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11358
[12:44:51] New patchset: Hashar; "Explicitly define fonts package for Precise" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11358
[12:45:10] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/11358
[12:46:14] New review: Hashar; "Patchset 4 moves fonts related packages from imagescaler::packages to imagescaler::packages::fonts" [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11358
[12:49:40] New patchset: Hashar; "Explicitly define fonts package for Precise" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11358
[12:49:59] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/11358
[12:50:02] New review: Hashar; "Patchset 5 rebase / fix conflict." [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11358
[12:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[13:05:44] New review: Faidon; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11358
[13:05:46] Change merged: Faidon; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11358
[13:15:57] New review: Reedy; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/6541
[13:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[13:28:44] New patchset: Hashar; "RT #3117 phase out wikimedia-fonts package" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11612
[13:29:03] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/11612
[13:30:34] New review: Hashar; "https://rt.wikimedia.org/Ticket/Display.html?id=3117" [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11612
[13:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[14:28:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[14:58:30] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[15:28:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[15:58:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[16:28:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[16:29:31] both labsconsole and virto.wikimedia.org seem to be down for me
[16:30:48] they are listed as the NS resolvers for labs, so labs instances don't resolve currently
[16:31:40] make that virt0 btw
[16:32:54] gwicke, and if you ask to ns0.wikimedia.org ?
[16:33:03] there were some dns errors yesterday
[16:33:16] and some entries would need propagation
[16:34:13] ns0 doesn't return anything, but labs-ns0 does
[16:34:51] neither labs-ns0 nor labs-ns1 are listed in the whois info though
[16:35:38] so an update to that info might be enough to fix the issue
[16:52:20] no
[16:52:37] we had a couple things wrong
[16:52:51] which will hopefully be fixed when the NS records expire in all resolvers
[16:53:19] the resolvers currently are labsconsole.wikimedia.org and virt0.wikimedia.org
[16:54:41] switching to google dns for the time being will solve the issue
[16:56:40] monday we'll be switching to labs-ns0 and labs-ns1
[16:58:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[17:28:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[17:58:32] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[18:28:48] Ryan_Lane: both labsconsole and virt0 are not reachable here- are their IPs incorrectly cached?
[18:29:00] as mentioned, switch to google dns for now
[18:29:03] we're working on it
[18:29:08] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[18:30:41] ok
[18:44:34] PROBLEM Current Load is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:45:14] PROBLEM Current Users is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:45:54] PROBLEM Disk Space is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:46:24] PROBLEM Free ram is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:47:44] PROBLEM Total Processes is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:48:24] PROBLEM dpkg-check is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:00:04] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[19:30:04] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[20:00:04] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[20:30:07] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[20:54:48] Could someone remind me of the procedure for being added on a project?
[20:55:02] (if it amounts to "ask in this channel", could someone add me to the Etherpad project?)
[20:55:14] https://www.mediawiki.org/wiki/Git/Gerrit_project_ownership
[20:55:21] ah, for labs
[20:55:29] *nods*
[20:55:31] I was still thinking in gerrit
[20:55:40] you ask a project admin
[20:55:44] who owns that project?
[20:55:55] 5 people
[20:56:16] probably any of them could
[20:56:24] Ryan_Lane, Wikinaut, Johnduhart, Abartov, and dzahn
[20:56:40] Ryan_Lane, I think you're the winner, could you add me to the Etherpad project on labs?
[20:59:26] xD
[21:00:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[21:14:24] Hi
[21:19:18] I would like to create a new security group for my commons-dev project When I go to https://labsconsole.wikimedia.org/wiki/Special:NovaSecurityGroup I got tables for the two existing groups in bastion and commons-dev
[21:19:36] I can't figure where to create the new set of rules
[21:30:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[21:30:45] Dereckson, on that page, next to the project name, "Add group"? Is that it?
[21:33:43] No, I don't have those links. I wonder if I'm net admin on my project and not only project admin. How could I check that?
[21:35:28] Dereckson, https://labsconsole.wikimedia.org/wiki/Special:NovaProject
[21:35:35] (I think)
[21:42:26] Indeed, that's it, Dzahn is netadmin, I'm sysadmin
[21:52:45] I sent a mail to request the netadmin right.
[22:00:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[22:30:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[23:00:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[23:10:05] andrewbogott, around?
[23:30:19] PROBLEM host: wikistats-archive is DOWN address: i-000002d7 check_ping: Invalid hostname/address - i-000002d7
[23:30:56] Eloquence: I am
[23:31:01] Eloquence: anything I can do?
[23:31:12] paravoid, having trouble getting into the mwreview instance from bastion
[23:31:21] not sure why ..did anything change about labs auth?
[23:31:28] not as far as I know
[23:31:32] let me check
[23:31:46] just getting Permission denied (publickey).
[23:35:46] Eloquence: Did you forward your key?
[23:35:52] yep
[23:36:06] erik@WMF359:~$ ssh -A bastion.wmflabs.org
[23:36:17] ssh-agent is running
[23:36:35] forwarding works fine via fenari
[23:36:46] is this a new machine?
[23:36:52] mwreview?
[23:36:55] yes
[23:37:04] it was set up a couple weeks ago .. I was able to ssh into it before, not sure if andrew changed anything
[23:37:10] that's why I prodded him first
[23:37:19] Is it a precise instance?
[23:37:52] There was an auth bug affecting precise instances only earlier, Ryan said he'd fixed it and it worked for mine
[23:38:09] I believe it's precise, yes
[23:38:09] it is, yes
[23:38:14] I just looked at its console output
[23:38:46] let me look at puppet's git history
[23:39:58] oh, it's probably from when he changed the authorized keys filepath
[23:44:18] I'm trying to add my key to root's authz keys
[23:47:19] ah, found it
[23:47:25] it's on the private repo…
[23:49:22] New patchset: Faidon; "Add faidon to root's authorized keys" [labs/private] (master) - https://gerrit.wikimedia.org/r/11718
[23:49:40] New review: Faidon; "(no comment)" [labs/private] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11718
[23:49:50] New review: Faidon; "(no comment)" [labs/private] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11718
[23:49:52] Change merged: Faidon; [labs/private] (master) - https://gerrit.wikimedia.org/r/11718
[23:50:41] Eloquence: puppet runs every 20', so it will take as much as this for me to able to do a root login
[23:51:08] ok, I'll be around, no urgency - thanks for poking
[23:51:51] that's for me to able to login and troubleshoot, the problem is yet to be fixed
[23:52:35] hm
[23:52:39] on precise, right?
[23:52:42] which instance?
[23:53:19] mwreview?
[23:53:25] oh hi
[23:53:26] yes
[23:54:03] Ryan_Lane: btw, whatever you did on markmonitor it didn't work
[23:54:10] yeah. I see that
[23:54:15] we'll need to try labs-ns0/ns1
[23:54:39] I don't think that it's needed, but it won't hurt either
[23:54:40] I'd kill for salt, or mcollective, or something like that right now
[23:54:46] what for?!
[23:54:57] I guess I can use my dsh python script for this
[23:55:01] for what?
[23:55:09] the precise nodes need autofs restarted
[23:55:21] Eloquence: it should work niw
[23:55:22] *now
[23:55:35] whee
[23:55:38] thanks :)
[23:55:40] yw
[23:55:52] there's something different between precise and lucid for autofs
[23:56:03] reload worked properly for lucid
[23:56:11] restart is needed on the precise nodes
[23:56:31] I guess I can limit the search by image id
[23:58:29] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 14% free memory
[23:59:18] ami-00000026
[23:59:25] easy enough
[23:59:30] lemme fix all of them right now :)
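
Editor's note: the fan-out described in the last few messages (limit the search to one image id, then restart autofs on each matching node) might look roughly like the sketch below. Only the image id `ami-00000026`, the `mwreview` instance being precise, and the restart-versus-reload distinction come from the log. The inventory format, the other hostnames, the second image id, and the exact `service autofs restart` command line are assumptions made for illustration; this is not the dsh script mentioned above.

```python
from typing import Dict, Iterable, List

# Image id quoted in the log for the precise instances.
PRECISE_IMAGE_ID = "ami-00000026"

# Hypothetical inventory: only "mwreview" being a precise instance is
# taken from the log; the other entries and "ami-0000001d" are made up.
INVENTORY: List[Dict[str, str]] = [
    {"host": "mwreview", "image_id": "ami-00000026"},
    {"host": "example-lucid", "image_id": "ami-0000001d"},
    {"host": "example-precise", "image_id": "ami-00000026"},
]


def hosts_for_image(instances: Iterable[Dict[str, str]], image_id: str) -> List[str]:
    """Return the hostnames of instances built from the given image."""
    return [inst["host"] for inst in instances if inst.get("image_id") == image_id]


def restart_commands(hosts: Iterable[str], service: str = "autofs") -> List[List[str]]:
    """Build one ssh argv per host. Nothing is executed here.

    A full restart is used rather than a reload, since the log says
    reload was only sufficient on lucid.
    """
    return [["ssh", host, "sudo", "service", service, "restart"] for host in hosts]


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for argv in restart_commands(hosts_for_image(INVENTORY, PRECISE_IMAGE_ID)):
        print(" ".join(argv))
```

Printing the commands first keeps this a dry run; each argv could then be handed to `subprocess.run` once the host list looks right.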