[00:14:26] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2485961 (yuvipanda) i think sed and grep aren't installed....
[00:14:34] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2485962 (yuvipanda) neither is jq
[03:41:47] Labs, Tool-Labs, Community-Tech-Tool-Labs, Striker: Deploy "Striker" Tool Labs console to WMF production - https://phabricator.wikimedia.org/T136256#2486115 (bd808) Based on my reading of https://wikitech.wikimedia.org/wiki/MariaDB/misc I think the Striker database should probably live on the m5...
[05:30:07] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486158 (Gorthian) It's still been unavailable intermittently through the day. It's down at the moment.
[05:38:08] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486161 (yuvipanda) Seems to be https://phabricator.wikimedia.org/T140988 again, I restarted that and it's back up.
[05:38:20] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486162 (yuvipanda) Is there some custom code that's attempting to autorestart this tool?
[05:52:39] Labs, Tool-Labs, Operations, Phabricator, and 2 others: Install Arcanist in toollabs::dev_environ - https://phabricator.wikimedia.org/T139738#2486166 (mmodell) FWIW I am pretty sure arcanist works with php 5.3. In fact, it works with 5.2: From [[ https://secure.phabricator.com/book/phabricator/...
[05:53:33] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486167 (yuvipanda) (am asking since no other tools seem to be suffering this right now, so need to figure out what makes this tool special)
[06:05:11] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure, Tracking: Log files on labs instance fill up disk (/var is only 2GB) (tracking) - https://phabricator.wikimedia.org/T71601#2486184 (Joe)
[08:05:06] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486293 (Gorthian) I'm afraid I know nothing about the bot. The nominal maintainer, JaGa, hasn't been active on en.wiki for a month. RussBlau, listed as a maintainer, opened and re-opened this ticket. T...
[08:26:00] Labs, Labs-Infrastructure, Shinken, Patch-For-Review: labmon1001 renderer requires authentication breaking labs Shinken probes - https://phabricator.wikimedia.org/T140976#2486311 (hashar) Awesome thank you @yuvipanda
[09:18:10] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2486373 (Magnus) Ah. I guess they were on gridengine, so I just expected the same config... Will you be installing them?
[09:33:50] Labs, Documentation: Document, explain, diagram labs vlans and network setup - https://phabricator.wikimedia.org/T100529#2486380 (hashar) If you are willing to have the diagram in git puppet.git (together with the hiera file hieradata/common/network.yaml to keep config and doc in sync), there is a neat G...
[09:57:54] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2485077 (tom29739) @Magnus: {T140110} is the tracking task for packages to be installed in Kubernetes containers. You'll need to create a subtask of that.
[10:04:04] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Install jq, sed, grep, sort - https://phabricator.wikimedia.org/T141082#2486419 (Magnus)
[10:04:41] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2486445 (Magnus) Now T141082
[10:06:08] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2486455 (tom29739)
[10:12:58] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2486495 (tom29739)
[10:13:01] Labs, Labs-Kubernetes, Tool-Labs: Install jq, sed, grep, sort - https://phabricator.wikimedia.org/T141082#2486494 (tom29739)
[10:13:09] Labs, Tool-Labs: Webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2486496 (russblau) >>! In T115231#2486162, @yuvipanda wrote: > Is there some custom code that's attempting to autorestart this tool? Not that I am aware of.
[10:51:16] Labs, Operations, Ops-Access-Requests, Patch-For-Review: madhuvishy is moving to operations on 7/18/16 - https://phabricator.wikimedia.org/T140422#2486553 (elukey) We'd need to fix some pwstore issues so this could be a good occasion to add Madhu's key. Best to wait a bit for Moritz in my opinion!
[11:44:36] Wikibugs: Make wikibugs use SSL in IRC - https://phabricator.wikimedia.org/T141089#2486660 (Pokefan95)
[11:48:51] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Drop some Trusty permanent slaves from integration labs project - https://phabricator.wikimedia.org/T139535#2486678 (hashar) Open>Resolved 2 big ones got dropped. That is good enough for now. More will be deleted as jobs are sh...
[12:07:04] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure, Tracking: Log files on labs instance fill up disk (/var is only 2GB) (tracking) - https://phabricator.wikimedia.org/T71601#2486742 (hashar)
[12:07:06] Labs, Labs-Infrastructure, Patch-For-Review: atop (monitoring system) logs fill up instances /var/ partition - https://phabricator.wikimedia.org/T71605#2486740 (hashar) Open>declined The labs instance now have /var in the / partition that is 20GBytes. So that atop logs size is no more of an i...
[12:07:50] MediaWiki-extensions-OpenStackManager: Non-Admin users can't see anything in manage addresses interface - https://phabricator.wikimedia.org/T57897#2486743 (hashar) Open>Resolved a:hashar That works given you have proper rights.
[13:13:24] Guest93561: identify to services?
[13:25:30] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2484363 (chasemp) >>! In T141017#2485695, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL, href=https://tools.wmflabs.org/sal/log/AVYPn9R-pirJUPGy-mkC} [2016-07...
[13:27:32] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2486929 (chasemp) heavy io usage for particular containers? https://graphite-labs.wikimedia.org/render/?width=1051&height=474&_salt=1469130163.848&target=tools.tools-wo...
[13:32:48] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Users can't run EXPLAIN queries to check the theoretical efficiency of their SQL - https://phabricator.wikimedia.org/T141095#2486960 (chasemp)
[13:33:09] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Users can't run EXPLAIN queries to check the theoretical efficiency of their SQL - https://phabricator.wikimedia.org/T141095#2486960 (chasemp) p:Triage>Normal
[13:36:16] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Having lots of accounts with separate grants makes auditing difficult. - https://phabricator.wikimedia.org/T141096#2486986 (chasemp)
[13:36:37] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Having lots of accounts with separate grants makes auditing difficult. - https://phabricator.wikimedia.org/T141096#2486986 (chasemp) p:Triage>Normal
[13:57:26] PROBLEM - High iowait on tools-worker-1015 is CRITICAL: CRITICAL: tools.tools-worker-1015.cpu.total.iowait (>11.11%)
[14:00:31] Labs, Labs-Infrastructure, DBA, Epic, Tracking: labsdb* has no HA solution - https://phabricator.wikimedia.org/T141097#2487045 (chasemp)
[14:00:33] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2469380 (Ivanhercaz) Hi! I'm having this issue too. Last night works fine.
[14:02:45] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487065 (Magnus)
[14:02:47] Labs, Labs-Infrastructure, DBA, Epic, Tracking: labsdb* has no HA solution - https://phabricator.wikimedia.org/T141097#2487079 (jcrespo) I think that, of all things to fix, this one is a hard blocker for T140452.
[14:04:56] !log tools reboot tools-worker-1015 as stuck w/ high iowait warning seconds ago. I cannot ssh in as root.
[14:05:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[14:06:54] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Users can't run EXPLAIN queries to check the theoretical efficiency of their SQL - https://phabricator.wikimedia.org/T141095#2486960 (AlexMonk-WMF) Duplicate of {T50875}?
[14:12:21] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Issues with 'webservice' kubernetes backend - https://phabricator.wikimedia.org/T139107#2487107 (tom29739)
[14:12:57] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487110 (tom29739)
[14:12:59] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Issues with 'webservice' kubernetes backend - https://phabricator.wikimedia.org/T139107#2419652 (tom29739)
[14:17:26] RECOVERY - High iowait on tools-worker-1015 is OK: OK: All targets OK
[14:27:06] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2487137 (chasemp)
[14:29:39] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2487146 (chasemp) so, I just saw a warning from high iowait on worker-1015. I tried to ssh in to no effect. Saw this warning > PROBLEM - High iowait on tools-worker-1...
[14:35:39] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Tool name too long - https://phabricator.wikimedia.org/T141100#2487172 (Magnus)
[14:39:01] Labs, Labs-Kubernetes, Tool-Labs: Tool name too long - https://phabricator.wikimedia.org/T141100#2487204 (Luke081515)
[14:39:40] anyone able to get the page ... https://tools.wmflabs.org/wikidata-todo/creator_from_wikidata.php ?
[14:40:09] sDrewth: times out for you?
[14:41:01] for me too
[14:41:28] k, thx
[15:46:02] Labs, Labs-Infrastructure, DBA, Epic, Tracking: labsdb* has no High Availability solution - https://phabricator.wikimedia.org/T141097#2487451 (bd808)
[15:47:07] Labs, MediaWiki-Vagrant: performance of vagrant in labs instances - https://phabricator.wikimedia.org/T141111#2487460 (Physikerwelt)
[15:48:55] Labs, Labs-Infrastructure, DBA, Epic, Tracking: labsdb* has no High Availability solution - https://phabricator.wikimedia.org/T141097#2487492 (chasemp)
[16:22:14] Labs, Tool-Labs: dplbot webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2487652 (yuvipanda)
[16:22:16] Tool-Labs-tools-Pageviews, Security, Vuln-XSS: Non-persistent XSS - tools.wmflabs.org/pageviews - https://phabricator.wikimedia.org/T140752#2487653 (Bawolff)
[16:25:53] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2487677 (yuvipanda) >>! In T141041#2486373, @Magnus wrote: > Ah. I guess they were on gridengine, so I just expected the same config... > Will you be installing them? I'd like to keep the default containers just have PHP and n...
[16:31:49] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2487686 (yuvipanda) @Ivanhercaz try now? This has been somewhat frustrating to track down since *my* account continues to work fine...
[16:48:06] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2487811 (chasemp) https://graphite-labs.wikimedia.org/render/?width=1600&height=1000&_salt=1469200578.824&target=cactiStyle(log(highestMax(tools.tools-worker-1015.iostat...
[16:56:50] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2487877 (yuvipanda)
[16:56:58] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2487891 (yuvipanda) See T141126
[17:07:10] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2469380 (Framawiki) Hi ! Same, I've got a 504 now.
[17:12:08] Labs, Tool-Labs, Community-Tech-Tool-Labs, Security-Reviews, Striker: Security review of Tool Labs console application - https://phabricator.wikimedia.org/T135784#2487952 (bd808) @dpatrick Can we call this closed now, or are there other issues that you would like to see addressed?
[17:12:40] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487953 (yuvipanda) Putting data in /shared is a pretty big security risk - anyone can read *and* write into it (I just checked by writing arbitrary data as a different tool into it...)
[17:14:08] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487954 (yuvipanda) (I'll probably mount /shared next week)
[17:20:40] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2487958 (Framawiki) I've tryed to clear the cookies. All the connection part work fine (//Sign in with mediawiki// button...), but when i click on the big green button after OAUTH it don't work again.
[17:21:15] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487959 (yuvipanda) @magnus I've mounted it in /data/project/shared, available in both gridengine and kubernetes. Can you switch to using that path or would that be too much work?
[17:25:08] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2487966 (ellery)
[17:25:10] Labs: Request creation of detox labs project - https://phabricator.wikimedia.org/T139864#2487965 (ellery) Invalid>Open
[17:25:25] Labs: Request creation of readmore labs project - https://phabricator.wikimedia.org/T139863#2487967 (ellery) Invalid>Open
[17:25:27] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#1485353 (ellery)
[17:26:32] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2487969 (yuvipanda) @Framawiki try now?
[17:29:41] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487979 (Magnus) Done, thanks!
[17:30:01] Labs-Kubernetes: Kubernetes does not mount shared path - https://phabricator.wikimedia.org/T141098#2487982 (Magnus) Forgot: Yes, it works on k8s!
[17:30:40] !log tools repool tools-worker-1018
[17:30:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[17:36:10] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2488008 (Framawiki) Yes it work for me ! After click on the green button //start the server// the page take 15-20 seconds to come. Actually I am in the terminal and it work fine.
[18:22:41] PAWS: Paws display 502 - Bad gateway error - https://phabricator.wikimedia.org/T140578#2488294 (Ivanhercaz) @yuvipanda Now it works! I had to clean cookies and logout via https://paws.wmflabs.org/paws/hub/logout Then I sign in again and now, it's everything perfect. Thank you for your support! Keep in touch...
[18:22:50] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2488295 (chasemp) also of note from 1015 atop stopped recording at the same time as: * iowait increasing * leading to load increasing * iostat collector failing to repo...
[18:25:20] Labs, Labs-Infrastructure, DBA, Epic, Tracking: labsdb* has no High Availability solution - https://phabricator.wikimedia.org/T141097#2488317 (chasemp)
[18:37:57] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2488477 (chasemp) 1018 displaying similar behavior yesterday and then a bit of interesting activity recently relative link https://graphite-labs.wikimedia.org/render/...
[18:46:12] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Omidfi was created, changed by Omidfi link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Omidfi edit summary: Created page with "{{Tools Access Request |Justification=To help improve Wikipedia |Completed=false |User Name=Omidfi }}"
[19:35:54] My instance is asking for a password when I sudo
[19:36:11] I'm pretty sure it's not supposed to be doing that
[19:36:17] How can I fix it?
[19:39:44] tom29739: that usually means that puppet and/or LDAP broke somehow
[19:40:05] Is it easiest to rebuild the instance?
[19:40:34] possibly :/
[19:40:52] you could try a reboot first
[19:44:54] bd808, that appears to have fixed it
[19:45:18] \o/ Probably just the ldap client that died then
[19:46:57] tom29739: a bit of insurance you can set up for yourself is to add a "passwords::root::extra_keys" section to your Hiera config that will let you ssh in as root. Check out https://wikitech.wikimedia.org/wiki/Hiera:Tools for an example
[19:47:31] after you put your key into Hiera like that and puppet runs you can do "ssh root@instance"
[19:47:51] which can help you get in to fix other things that have broken
[23:34:59] is something broken in labs right now? I created an instance (polestar on maps-team) and it gets stuck on initializing
[23:35:13] log ends with Cloud-init v. 0.7.5 finished at Fri, 22 Jul 2016 23:32:04 +0000. Datasource DataSourceOpenStack [net,ver=2]. Up 13.86 seconds
[23:52:02] SMalyshev: I.. don't..
[23:52:05] Think so
[23:52:36] I created a new instance a few minutes ago and it's worked fine