[00:01:17] <wikibugs>	 Labs, Beta-Cluster, Labs-Infrastructure, operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#1629101 (Dzahn)
[00:15:33] <Krenair>	 andrewbogott, is wikitech behaving badly for you as well?
[00:16:03] <Krenair>	 ah, nope
[00:16:05] <Krenair>	 my bad
[00:16:11] <Krenair>	 thought login was broken
[00:20:46] <leila>	 helloo YuviPanda. I think we are very close to finishing the survey. just sent you an email about it. we can wrap it up tomorrow if you'll be around, or later tonight.
[00:23:32] <YuviPanda>	 leila: looking!
[00:48:35] <wikibugs>	 Labs, Tool-Labs, Patch-For-Review: new labs host sends out "mpt raid status change" emails - https://phabricator.wikimedia.org/T104779#1629216 (scfc) stalled>Resolved a:scfc On a new Trusty instance, there was no `mpt-statusd` process, so I think this is resolved.
[00:51:56] <YuviPanda>	 leila: replied
[00:53:22] <YuviPanda>	 leila: <3 thank you very much!
[01:08:09] <wikibugs>	 Labs, Shinken: Newly created instance is in ERROR state - https://phabricator.wikimedia.org/T111988#1629267 (scfc) Open>Resolved a:Andrew
[01:08:50] <wikibugs>	 Labs, Shinken: Newly created instance is in ERROR state - https://phabricator.wikimedia.org/T111988#1621894 (scfc) (I didn't try to "salvage" the newly created instance, but did just create another one, and that went fine as usual.)
[01:19:50] <tgr>	 andrewbogott: I created sentry-alpha4.sentry.eqiad.wmflabs during the openstack upgrade (sorry about that), and it seems to be stuck in some half-existing state
[01:19:57] <tgr>	 can you remove it?
[01:41:13] <wikibugs>	 Labs, Discovery, Maps: Replacements for a.toolserver.org, b.toolserver.org, c.toolserver.org not available - https://phabricator.wikimedia.org/T103272#1629295 (Krinkle) This is causing SSL certificate warnings on production wikis at three levels:  https://nl.wikipedia.org/wiki/Amsterdam -> "Kaart" (Map...
[03:31:28] <YuviPanda>	 !log ores restart redis server on ores-redis-02 to apply tcp-keepalive
[03:31:33] <labs-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[09:16:42] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Config, Patch-For-Review: Change sid pbuilder image name to 'unstable' - https://phabricator.wikimedia.org/T111097#1629825 (hashar) https://gerrit.wikimedia.org/r/#/c/237604/ creates a symlink for unstable to sid as suggested by @akosiaris above.
[09:18:04] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Config, Patch-For-Review: Change sid pbuilder image name to 'unstable' - https://phabricator.wikimedia.org/T111097#1629829 (hashar) a:akosiaris>hashar
[09:18:18] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Infrastructure, Patch-For-Review: Change sid pbuilder image name to 'unstable' - https://phabricator.wikimedia.org/T111097#1594382 (hashar)
[09:18:37] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Infrastructure, Patch-For-Review: Change sid pbuilder image name to 'unstable' - https://phabricator.wikimedia.org/T111097#1594382 (hashar) p:Triage>Normal
[09:40:55] <wikibugs>	 Labs, Labs-Infrastructure, Continuous-Integration-Scaling: curl http://169.254.169.254/latest/meta-data/public-keys/ is unavailable - https://phabricator.wikimedia.org/T112001#1629847 (hashar) It works now!  On the `contintcloud` project I have generated a ssh key pair `hashar-cloudinit-keypair`.  Boot...
[09:41:06] <wikibugs>	 Labs, Labs-Infrastructure, Continuous-Integration-Scaling: curl http://169.254.169.254/latest/meta-data/public-keys/ is unavailable - https://phabricator.wikimedia.org/T112001#1629848 (hashar) Open>Resolved a:hashar   @andrew I am not sure whether you fixed it over night  / Juno upgrade fixed...
[09:43:56] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Config: Job labs-toollabs-debian-glue is failing for labs/toollabs repository - https://phabricator.wikimedia.org/T110939#1629857 (hashar)
[09:43:58] <wikibugs>	 Labs, Tool-Labs, Continuous-Integration-Infrastructure, Patch-For-Review: Change sid pbuilder image name to 'unstable' - https://phabricator.wikimedia.org/T111097#1629855 (hashar) Open>Resolved Solved by using a symlink from unstable to sid.  Thank you @akosiaris for the suggestion.
[09:48:45] <Cblair91>	 Howdy, can anybody help me. Having issues ssh(ing) to gerrit... Error: Unable to negotiate with 208.80.154.81: no matching key exchange method found. Their offer: diffie-hellman-group1-sha1
[10:40:02] <hashar>	 Cblair91: yeah we have a bug for that
[10:40:24] <hashar>	 Cblair91: https://phabricator.wikimedia.org/T112025 "Wikimedia Gerrit doesn't work if OpenSSH version is higher than 7.0"
[10:40:54] <hashar>	 Cblair91: so in your ~/.ssh/config you want to fall back to an older algo:
[10:40:57] <hashar>	 Host gerrit.wikimedia.org
[10:40:57] <hashar>	     KexAlgorithms +diffie-hellman-group1-sha1
[10:47:47] <Cblair91>	 hashar: Thanks, will try that later. Decided to fall back and use a box server I have to handle my git stuff instead :P
[11:11:04] <wikibugs>	 Labs, Salt, operations: salt does not run reliably for toollabs / labs generally - https://phabricator.wikimedia.org/T99213#1630013 (ArielGlenn) so the reason that keys don't get deleted from salt via this script when the instance is deleted is that (some of) them stay around in ldap.  Is that intentio...
[12:50:38] <wm-bot>	 Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Wikiscan was created, changed by Wikiscan link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Wikiscan edit summary: Created page with "{{Tools Access Request |Justification=Statistics for wikiscan.org |Completed=false |User Name=Wikiscan }}"
[13:07:19] <wikibugs>	 Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109, labs-sprint-113: Setup monitoring and reporting for disk space usage of each project on NFS - https://phabricator.wikimedia.org/T106476#1630316 (coren) It turns out that the scheme I had thought of is considerably less useful than I had init...
[13:23:19] <Nemo_bis>	 What's a reasonable time to wait for a reboot to complete (ordered via Special:NovaInstance)?
[13:24:40] <Coren>	 Nemo_bis: That really depends on a lot of things.  There is a puppet run at boot by default which can add minutes to this - especially if it hasn't been run in a while.  In general, though, it should be a bit below the 5 minute mark at worst.
[13:24:56] <Coren>	 Best case is about 1 minute.
[13:27:32] <Coren>	 If it seems to have been stuck for longer than that, the console output might point at a specific issue.
[13:32:08] <Nemo_bis>	 Coren: that's what worries me, the console output is blank. :) https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&instanceid=20853100-12c6-480d-9eb7-a3d9e1864280&project=pagemigration&region=eqiad
[13:32:41] <Nemo_bis>	 But this instance was supposed to be rebooted in one of the past maintenances IIRC, perhaps it failed back then and is now irrecoverable. Can I just delete it?
[13:33:30] <Coren>	 That points to an issue, although not a very specific one.  :-)  Yeah, you can delete it if it is disposable.  In fact, that's probably the best option unless you have valuable data on it.  If you want, though, I can give a quick try to a forcible manual restart first.
[13:34:28] <Nemo_bis>	 Nah, not worth it.
[13:34:47] <Coren>	 Fair 'nuff.
[13:35:22] <wikibugs>	 Labs, Labs-sprint-112, ToolLabs-Goals-Q4, labs-sprint-113: Fix documentation & puppetization for labs NFS - https://phabricator.wikimedia.org/T88723#1630393 (mark) I think a diagram would be really helpful too...
[14:38:43] <andrewbogott>	 tgr|away: alpha4 is cleaned up now… nova was just waiting for that virt node to come back online.
[14:40:57] <wikibugs>	 Labs, Labs-Infrastructure, Labs-sprint-112, Patch-For-Review, labs-sprint-113: Update remaining virt nodes to kilo - https://phabricator.wikimedia.org/T112200#1630574 (Andrew) labvirt1002 done
[14:49:36] <wikibugs>	 Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109, labs-sprint-113: Setup monitoring and reporting for disk space usage of each project on NFS - https://phabricator.wikimedia.org/T106476#1630590 (scfc) What about not monitoring disk usage after the fact, but instead (always) creating an volu...
[14:51:21] <halfak>	 !log ores removed ores-web-02 from ores-lb-02 pool
[14:51:26] <labs-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[14:57:19] <grrrit-wm>	 (CR) Niedzielski: "@Yuvipanda, please review :)" [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/231697 (https://phabricator.wikimedia.org/T99115) (owner: Niedzielski)
[15:00:53] <wm-bot>	 Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Wikiscan was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=177179 edit summary: 
[15:15:52] <grrrit-wm>	 (CR) Tim Landscheidt: "recheck" [labs/toollabs] - https://gerrit.wikimedia.org/r/234934 (https://phabricator.wikimedia.org/T91231) (owner: Tim Landscheidt)
[16:03:27] <wikibugs>	 Labs, Beta-Cluster, Labs-Infrastructure, operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#1630870 (BBlack)
[16:59:52] <wikibugs>	 Labs, Tool-Labs: role::relic - changes not applied by puppet? on which node or instance is it? - https://phabricator.wikimedia.org/T104537#1631214 (coren) Open>Resolved role::relic is currently enabled on the instance: https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=conf...
[17:02:07] <wikibugs>	 Labs, Shinken: Newly created instance is in ERROR state - https://phabricator.wikimedia.org/T111988#1631240 (scfc) Resolved>Open http://permalink.gmane.org/gmane.org.wikimedia.labs/4039 said that the issue should have been resolved, but it has reappeared for me when I try to create new instances.  I...
[17:15:52] <wikibugs>	 Labs: Bring toolserver.org redirects back - https://phabricator.wikimedia.org/T109488#1631324 (scfc) Open>Resolved a:scfc http://permalink.gmane.org/gmane.org.wikimedia.labs.announce/76:  > As an update: The security team has completed their review, and the > redirects are back online. Thank you for...
[17:16:02] <wikibugs>	 Labs: Bring toolserver.org redirects back - https://phabricator.wikimedia.org/T109488#1631327 (scfc) a:scfc>None
[18:23:47] <hashar>	 YuviPanda: hello! do we have a grafana for labmon1001 statsd ? 
[18:24:36] <YuviPanda>	 No hashar
[18:25:01] <hashar>	 YuviPanda: and I guess servers on the labs host subnet can't reach statsd.eqiad.wmnet but should send to labmon1001 right ?
[18:25:14] <YuviPanda>	 Yup
[18:26:32] <hashar>	 though labmon1001 has 
[18:26:33] <hashar>	 ./puppet/statsd.yaml::statsd_host: 'statsd.eqiad.wmnet'
[18:26:33] <hashar>	 ./puppet/statsd.yaml::statsd_port: 8125
[18:26:55] <hashar>	 and I can't find where its diamond metrics are sent to :/
[18:27:41] <hashar>	 ah to statsd.eqiad.wmnet
[18:27:43] <hashar>	 :-D
[18:28:36] <YuviPanda>	 :)
[18:28:38] <YuviPanda>	 yesss
[18:47:25] <wikibugs>	 Labs, Patch-For-Review: Create a catchpoint check for labs puppetmaster - https://phabricator.wikimedia.org/T107456#1631817 (Andrew) This is done.  Yuvi, check my work?
[19:31:50] <wikibugs>	 Labs: Setup checkpoint check for private DNS - https://phabricator.wikimedia.org/T107453#1631995 (Andrew) I've verified that socket.gethostbyname_ex errors out in absence of upstream dns, even if called on the current host's fqdn.  So this check should be as simple as socket.gethostbyname_ex(socket.getfqdn())
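The check described in T107453 above could look roughly like the sketch below. It only wraps the one call quoted in the ticket, socket.gethostbyname_ex(socket.getfqdn()); the function name private_dns_ok and the injectable resolver parameter are hypothetical, added here for illustration and testing, and are not taken from the actual check.

```python
import socket


def private_dns_ok(hostname=None, resolver=socket.gethostbyname_ex):
    """Return True if the name resolves, False if the resolver errors out.

    hostname defaults to this host's FQDN, as suggested in the ticket.
    resolver is a hypothetical hook so the check can be exercised
    without a real DNS server.
    """
    name = hostname if hostname is not None else socket.getfqdn()
    try:
        resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
    return True


if __name__ == "__main__":
    print("private DNS ok" if private_dns_ok() else "private DNS FAILED")
```

Whether a local /etc/hosts entry for the FQDN would mask an upstream failure depends on the host's resolver configuration; the ticket reports that it did not on the labs hosts.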
[19:45:49] <wikibugs>	 Labs: Have checkpoint check for public labs DNS - https://phabricator.wikimedia.org/T107451#1632077 (Andrew) Open>declined a:Andrew I don't think this is useful... if public dns fails then /all/ of our catchpoint alerts will fail.
[19:45:50] <wikibugs>	 Labs, Labs-Sprint-108, Labs-Sprint-109, labs-sprint-113: Have catchpoint checks for all labs services (Tracking) - https://phabricator.wikimedia.org/T107058#1632080 (Andrew)
[19:59:44] <wikibugs>	 Labs, Labs-Sprint-108, Labs-Sprint-109, labs-sprint-113: Have catchpoint checks for all labs services (Tracking) - https://phabricator.wikimedia.org/T107058#1632152 (Andrew)
[19:59:44] <wikibugs>	 Labs: Have a checkpoint check for labs proxies - https://phabricator.wikimedia.org/T107450#1632150 (Andrew) Open>Resolved a:Andrew
[20:00:02] <wikibugs>	 Labs, Labs-Sprint-108, Labs-Sprint-109, labs-sprint-113: Have catchpoint checks for all labs services (Tracking) - https://phabricator.wikimedia.org/T107058#1485532 (Andrew)
[20:06:55] <wikibugs>	 Labs, Shinken: Newly created instance is in ERROR state - https://phabricator.wikimedia.org/T111988#1632208 (Andrew) I'm in the process of trying to convince the scheduler to fill up virt node harddrives up to 100%.  The disk checks are a bit erratic, though -- it'll run twice in a row, the first time decl...
[20:18:50] <wikibugs>	 Labs: Have checkpoint check for public labs DNS - https://phabricator.wikimedia.org/T107451#1632254 (coren) >>! In T107451#1632077, @Andrew wrote: > I don't think this is useful... if public dns fails then /all/ of our catchpoint alerts will fail.  I think that's a bug - is there a //requirement// to have che...
[20:19:49] <wikibugs>	 Labs: Have checkpoint check for public labs DNS - https://phabricator.wikimedia.org/T107451#1632257 (yuvipanda) If we want to fix that we can make all the checks use IPs instead of DNS from catchpoint's side.
[20:29:19] <wikibugs>	 Labs: Have checkpoint check for public labs DNS - https://phabricator.wikimedia.org/T107451#1632291 (coren) I think that would be a good idea - being able to distinguish between "labs broke" and "DNS has failed" is important to direct recovery efforts.
[20:31:49] <wikibugs>	 Labs, Discovery, Elasticsearch: Replicate production elasticsearch indices to labs - https://phabricator.wikimedia.org/T109715#1632310 (demon) >>! In T109715#1627648, @yuvipanda wrote: > We'd also need to make sure that deleted / revdelled content doesn't show up.  Deleted content disappears from the pr...
[20:42:42] <Gerghwww>	 Hi, what happened to https://tools.wmflabs.org/ ??
[20:43:08] <Cblair91>	 Gerghwww: What're you trying to locate? :)
[20:43:28] <mutante>	 what's wrong with it?
[20:43:41] <mutante>	 i still see the list of tools
[20:43:49] <mutante>	 oh.. or not
[20:43:54] <Gerghwww>	 nothing special... but the page is skipped after "addbot"
[20:44:07] <mutante>	 yea, just scrolled down .. eh
[20:44:11] <Cblair91>	 So it is :P
[20:44:15] <mutante>	 that's odd
[20:44:27] <Gerghwww>	 take a look at the source code. stopping at line 156
[20:47:03] <Cblair91>	 Seems to be an issue somewhere located here: http://git.wikimedia.org/blob/labs%2Ftoollabs.git/f275d97d7010b3bb2709d4a5211e2530df178447/www%2Fcontent%2Flist.php#L101
[20:47:28] <Cblair91>	 Presumably throwing an exception =]
[20:50:02] <mutante>	 reload the page , guys
[20:50:12] <mutante>	 Coren fixed
[20:50:42] <Coren>	 Yeah, the actual webservice was cray-cray.
[20:51:41] <mutante>	 Gerghwww: works again
[20:52:00] <Gerghwww>	 looks good. thx
[21:01:39] <leila>	 hello yuvipanda.
[21:01:47] <leila>	 shall we chat about the survey some time today yuvipanda?
[21:02:51] <YuviPanda>	 leila: yes. are you in the office?
[21:02:55] <YuviPanda>	 leila: I also responded on the etherpad!
[21:03:00] <leila>	 I'm working remotely yuvipanda.
[21:03:08] <leila>	 ah! lemme check your responses first then, yuvipanda.
[21:03:58] <wikibugs>	 Labs, Tool-Labs: Make tools-mail route mail for @tools-*.pmtpa.wmflabs correctly - https://phabricator.wikimedia.org/T63484#1632479 (scfc) I set `/etc/mailname` to `tools.wmflabs.org`, restarted `gridengine-master`, submitted a job on `toolsbeta-master` and `qstat -j 4` still gave:  ``` […] mail_list:...
[23:40:04] <grrrit-wm>	 (CR) Yuvipanda: "Sorry about the delay!" (2 comments) [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/231697 (https://phabricator.wikimedia.org/T99115) (owner: Niedzielski)
[23:57:49] <Krinkle>	 !log cvn Restore localised messages (nl) for CVNBot12
[23:57:54] <labs-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cvn/SAL, Master