[00:19:59] !log ores deployed ores-wmflabs-deploy:770d131
[00:20:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[00:20:07] <3 labs-morebots
[01:51:04] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Billcountry was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=803628 edit summary:
[03:37:14] Tool-Labs-tools-Xtools: Unable to login to X!'s Tools - https://phabricator.wikimedia.org/T132855#2504293 (Matthewrbowker) >>! In T132855#2503575, @MusikAnimal wrote: > Do we even need OAuth anymore? I believe it was introduced for XEcho, and cross-wiki notifications are native now. Yes, it's used on the ed...
[04:51:38] (PS1) Legoktm: Ignore @Phabricator_maintenance and clean up code [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/301742 (https://phabricator.wikimedia.org/T141570)
[05:10:14] (CR) Legoktm: [C: 2] Ignore @Phabricator_maintenance and clean up code [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/301742 (https://phabricator.wikimedia.org/T141570) (owner: Legoktm)
[05:10:31] (Merged) jenkins-bot: Ignore @Phabricator_maintenance and clean up code [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/301742 (https://phabricator.wikimedia.org/T141570) (owner: Legoktm)
[05:13:57] !log tools.wikibugs legoktm: Deployed 79d62890d3d4f56a3f69bad4349795bae6746e50 Ignore @Phabricator_maintenance and clean up code wb2-irc
[05:14:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL, Master
[05:14:40] Wikibugs: Don't notify actions by @Phabricator_maintenance - https://phabricator.wikimedia.org/T141570#2504336 (Legoktm) Open>Resolved a:Legoktm
[06:34:45] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2504397 (yuvipanda) p:Normal>High a:yuvipanda
[06:50:22] PROBLEM - Puppet run on tools-grid-shadow is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:58:11] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:09:07] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2504439 (yuvipanda) Step 1: Create a new instance Step 2: ``` # parted /dev/vda unit B print Model: Virtio Block Device (virtblk) Disk /dev/vda: 42949...
[07:30:24] RECOVERY - Puppet run on tools-grid-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[07:38:12] RECOVERY - Puppet run on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:43:11] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2504472 (yuvipanda) That was useful, since that meant that the parted stuff was a red herring. I've a shorter way to get this to work just now: 1. `vg...
[07:53:27] PROBLEM - Host tools-docker-lvm-test-01 is DOWN: CRITICAL - Host Unreachable (10.68.23.111)
[07:55:50] PROBLEM - Host tools-docker-test-03 is DOWN: CRITICAL - Host Unreachable (10.68.19.62)
[08:04:32] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2504511 (yuvipanda) No restart solution! 1. `lvcreate --wipesignatures y -n data vd-l 95%VG` 2. `lvcreate --wipesignatures y -n metadata direct-lvm -l...
[08:39:19] Labs: Please restart WDQ VM - https://phabricator.wikimedia.org/T141606#2504581 (Magnus)
[08:39:48] Labs: Please restart WDQ VM - https://phabricator.wikimedia.org/T141606#2504594 (Magnus) p:Triage>Unbreak!
[08:55:52] Labs: Please restart WDQ VM - https://phabricator.wikimedia.org/T141606#2504638 (Magnus) Hold on, found it in Horizon...
[08:57:45] Labs: Please restart WDQ VM - https://phabricator.wikimedia.org/T141606#2504640 (Magnus) Restarting. I'll keep the ticket open for now, in case I can't get it to work again...
[09:05:06] Labs: Please restart WDQ VM - https://phabricator.wikimedia.org/T141606#2504651 (Magnus) Open>Resolved a:Magnus VM's back, WDQ still struggling, but that's my problem... I also terminated the "backup" instance wdq-mm-02, as it was unused, had ancient data, and was m1.xlarge. Free resources!
[11:27:13] PROBLEM - Puppet run on tools-webgrid-lighttpd-1203 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:02:18] Not again.. (someone changed the topic to spam)
[12:02:20] Again
[12:03:08] Lol
[12:03:10] who did that
[12:04:12] paladox: spammers
[12:04:36] They seem to have attacked this channel in their droves recently
[12:04:48] Probably because it's an easy target
[12:04:51] * ALVAROMOLINA| (~White@177.54.150.173) has joined
[12:04:51] * ALVAROMOLINA| has changed the topic to: JEM DE M13RD4.... ERES UN PT0 DE MI3RD4 Q HACE C4C4 ENCIMA DE LOURDES Y PLATONIDES... HAHAHHAHAHAAH
[12:05:07] Same thing to attack -operations
[12:05:11] the other day
[12:05:25] yuvipanda ^^
[12:05:34] @ban 177.54.150.173
[12:05:34] I doint know how they could change topic
[12:05:37] LOL
[12:05:50] paladox: anyone can change it
[12:06:01] Operator tools were already enabled on this channel
[12:06:01] @optools-on
[12:06:13] How do I ban.. 
[12:06:29] I think the bot needs to be in ChamServ
[12:06:37] *ChanServ
[12:06:55] oh
[12:06:58] Um
[12:07:00] not sure
[12:07:31] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 2.8.0.0 [libirc v. 1.0.3] my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features
[12:07:31] @help
[12:07:54] is it /ban
[12:08:07] Unknown command type @commands for a list of all commands I know
[12:08:07] @help ban
[12:08:12] I know: add, changepass, channel-info, channellist, commands, configure, drop, github-, github+, github-off, github-on, grant, grantrole, help, info, instance, join, language, notify, optools-off, optools-on, optools-permanent-off, optools-permanent-on, part, rc-ping, rc-restart, reauth, recentchanges-bot-off, recentchanges-bot-on, recentchanges-minor-off, recentchanges-minor-on, recentchanges-off, recentchanges-on, reload, restart, revoke, revokerole, seen, seen-host, seen-off, seen-on, seenrx, suppress-off, suppress-on, systeminfo, system-rm, time, traffic-off, traffic-on, translate, trustadd, trustdel, trusted, uptime, verbosity--, verbosity++, wd, whoami
[12:08:12] @commands
[12:08:33] @kb 177.54.150.173
[12:08:33] Sorry but I don't see this user in a channel
[12:08:39] Oh it left
[12:08:41] Damn you wm-bot
[12:08:51] lol
[12:08:59] Obviously I can't op myself
[12:09:26] We really need Sigyn in here
[12:09:49] She stops trolls and spammers in their tracks with k lines
[12:10:28] How do you change the topic back
[12:11:00] paladox_, same way as normal
[12:11:06] With /topic
[12:11:10] Oh
[12:11:22] Anyone can set it
[12:11:28] I like it that way
[12:12:43] Of course, it's a problem when stupids come along
[12:12:48] Yeh
[12:13:09] We really need Sigyn
[12:13:13] Yeh
[12:13:43] Not sure how you get it in channels
[12:13:52] I did try asking in -ops
[12:14:02] Was ignored, as per usual
[12:16:25] oh
[12:18:12] Sigyn is just ignoring 
me
[12:18:24] oh
[12:18:31] paladox: can you ask in #freenode ?
[12:18:49] But i doint think they will get involved with wikimedia channels
[12:18:55] Try asking operations again
[12:19:06] So we have to get approved from -ops
[12:29:03] ok
[12:33:58] tom29739: When did you ask in -ops?
[12:34:23] Ah, found it.
[12:35:07] I'll talk with Alex
[12:36:08] Thanks
[12:36:29] (sorry, really bad connection at the moment)
[12:38:12] np
[16:14:30] (PS1) BryanDavis: Wheels for striker/requirements.txt @a596f12 [labs/striker/wheels] - https://gerrit.wikimedia.org/r/301843
[16:15:53] (CR) BryanDavis: [C: 2 V: 2] Wheels for striker/requirements.txt @a596f12 [labs/striker/wheels] - https://gerrit.wikimedia.org/r/301843 (owner: BryanDavis)
[16:17:58] (PS1) BryanDavis: Add tooling to build wheels and initial wheels [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844
[16:20:03] @seen scfc
[16:20:03] Niharika: I have never seen scfc
[16:20:30] (CR) Yuvipanda: [C: -1] Add tooling to build wheels and initial wheels (1 comment) [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844 (owner: BryanDavis)
[16:25:12] Labs, Tool-Labs: [tracking] Block spider / web crawler on tool labs - https://phabricator.wikimedia.org/T70300#2505540 (Danny_B)
[16:25:26] Labs, Tool-Labs: Block spider / web crawler on tool labs - https://phabricator.wikimedia.org/T70300#2505542 (Danny_B)
[16:27:05] Labs, Labs-Other-Projects, Tool-Labs, WMDE-Analytics-Engineering: Add http://tools.wmflabs.org/grafana-json-datasource as a datasource to labs grafana instance - https://phabricator.wikimedia.org/T141265#2505547 (Addshore) p:Triage>Normal
[16:33:05] (PS1) BryanDavis: Rebuilt wheels without site-packages [labs/striker/wheels] - https://gerrit.wikimedia.org/r/301846
[16:33:50] hi addshore
[16:33:58] hi YuviPanda ! 
[16:34:08] (CR) BryanDavis: [C: 2 V: 2] Rebuilt wheels without site-packages [labs/striker/wheels] - https://gerrit.wikimedia.org/r/301846 (owner: BryanDavis)
[16:34:12] the data source plugin
[16:34:18] do you know how to add it? does it need code?
[16:34:25] to be deployed? how?
[16:34:30] yeh, it needs code, but it may even already be there
[16:34:49] I see.
[16:34:50] But you can't tell unless you are a grafana admin i guess
[16:34:53] addshore yeah, I feel we can deploy it and see how it goes :)
[16:35:36] :D
[16:35:59] so YuviPanda if it needs code then the code is at https://github.com/grafana/simple-json-datasource
[16:36:20] what version of grafana are we on?
[16:37:10] addshore I'm going to give you admin now
[16:37:14] AFAIK this is the directory it needs to go in https://github.com/grafana/grafana/tree/master/public/app/plugins/datasource
[16:37:17] ahh okay!
[16:37:59] addshore yeah, we might have to make a gerrit patch to make that happen, and also have to import that into gerrit.
[16:38:09] yup
[16:39:12] (PS2) BryanDavis: Add tooling to build wheels and initial wheels [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844
[16:39:35] addshore try?
[16:39:44] okay, so it is not a current data source
[16:39:52] I guess we need to add it to https://github.com/wikimedia/operations-software-grafana actually
[16:40:04] (CR) BryanDavis: Add tooling to build wheels and initial wheels (1 comment) [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844 (owner: BryanDavis)
[16:40:04] and once added there it would automatically appear / you would be able to configure datasources
[16:41:00] addshore can I leave you to do it and only act as a +2 stooge? :)
[16:41:16] Reedy: :/ they're back again today
[16:41:38] Freenode really need to be doing something abou tit
[16:41:53] addshore also does this allow arbitrary users to hit arbitrary urls?
[16:42:00] or do you need to be admin to configure what URLs they can hit? 
[16:42:06] YuviPanda: perhaps :) I may poke ori as he did the last build
[16:42:20] So adding the code only then allows admins to configure new datasource
[16:42:25] *datasources
[16:42:46] so, it works in the same way as the graphite data source essentially, admins can make it hit whatever url they want
[16:42:51] addshore right. as long as it's only admins adding new data sources this works great
[16:42:52] right
[16:43:12] addshore sure! that works too. I just wanted to unblock you since I don't have too much bandwidth just now unfortunately
[16:43:20] yup, and then we can have a datasource on labs which conforms to the API, and then send any random data we want into grafana :)
[16:43:32] YuviPanda: thats fine! :D
[16:43:46] :D
[16:43:50] addshore what data are you thinking of?
[16:44:17] well, to start with I want to be able to include the same annotations as dashiki uses, but on grafana
[16:44:43] ok
[16:44:48] but analytics generates a bunch of CSVs, and being able to graph them in grafana would be nice, and any other random data
[16:44:53] ah nice
[16:44:55] but my primary usecase right now is annotations
[16:45:04] right
[16:45:10] be able to edit a wikipage, say something, and then it be on the graphs :)
[16:45:22] could, for example, write a wrapper around the SAL ;)
[16:45:38] addshore can I ask you ping/file a task before adding a data source just so there's a paper trail? no need to wait for anyone to 'approve', just a task + cc me
[16:46:34] sure, do you mean a task for adding the datasource to labs / production / the code repo ?
[16:46:48] or actually configuring a datasource and pointing it at some api on labs?
[16:47:47] addshore I mean, once you have this plugin added you can easily add new datasources on labs right?
[16:47:48] so for that
[16:48:12] yup, so for adding the datasource plugin to grafana!
[16:48:48] addshore and for the latter as well. 
basically anytime a new thing appears under the datasources drop down :D
[16:48:53] Labs, Labs-Other-Projects, Tool-Labs, WMDE-Analytics-Engineering: Add simple-json-datasource plugin to operations-software-grafana - https://phabricator.wikimedia.org/T141636#2505596 (Addshore)
[16:48:57] okay! :D
[16:49:20] YuviPanda: in that case you also have https://phabricator.wikimedia.org/T141265 which is about configuring that datasource pointing to labs
[16:50:07] addshore cool. we should figure a long term way of making sure it is maintainable.
[16:50:26] addshore so I assume you'll create one tool per new data source?
[16:50:46] well, the plugin is maintained by grafana, and I highly doubt the api signature would change!
[16:51:06] there is no reason everything could not be put in the same tool
[16:51:13] addshore no I'm talking about https://github.com/addshore/wmf-grafana-json-datasource
[16:51:27] addshore sure, but dashiki might change, etc.
[16:51:42] addshore and I also don't want you to be spof so need to keep in mind to find additional maintainers :D
[16:52:08] ahh indeed, yes, 1) I should probably move it to gerrit 2) yes, other maintainers ;)
[16:52:18] Wikibugs: Don't notify actions by @Phabricator_maintenance - https://phabricator.wikimedia.org/T141570#2505643 (Danny_B)
[16:52:22] YuviPanda / addshore: dashiki doesn't have any fancy format like limn, it just uses a few conventions for date strings and formatting on top of basic TSV / CSV
[16:52:42] Labs, Labs-Other-Projects, Tool-Labs, WMDE-Analytics-Engineering: Add simple-json-datasource plugin to operations-software-grafana - https://phabricator.wikimedia.org/T141636#2505645 (Addshore)
[16:54:18] milimetric: indeed :) I already have the code for the annotations in https://github.com/addshore/wmf-grafana-json-datasource and also made https://github.com/addshore/grafana-tsv-datasource which can read in CSV and TSVs
[16:54:48] cool, we'll try not to break the format :)
[16:55:09] 
although I decided to stop writing plugins for grafana and simply use the simple json api datasource they provide, as I feel in the long run that will be much more sustainable!
[16:57:26] Earwig: Around?
[16:58:18] addshore anyway, thanks for taking it on :)
[16:58:28] addshore and let me know if you get blocked again
[16:59:00] hehe, YuviPanda, I actually just poked ori about it! as I am sure it will take him far less time than me ;)
[17:10:50] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:48:50] !log tools disable puppet on all tools k8s worker nodes
[17:48:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[17:50:50] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:51:40] (PS1) BryanDavis: Generated staticfiles [labs/striker/staticfiles] - https://gerrit.wikimedia.org/r/301852
[17:52:53] (CR) BryanDavis: [C: 2 V: 2] Generated staticfiles [labs/striker/staticfiles] - https://gerrit.wikimedia.org/r/301852 (owner: BryanDavis)
[17:53:04] Labs, Operations, Project-Admins: Archive old Incident-* projects - https://phabricator.wikimedia.org/T134624#2271591 (greg) per T140202 (and the follow-up in T141493) I think we can en-masse archive these. No one was using them for their workboards and we've got all of the still open tasks now in th...
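The annotation use case discussed above (edit a wiki page, have the note show up on the graphs through the simple JSON datasource) could be sketched roughly as follows. This is only an illustration and not the code in addshore's repositories: the `* YYYY-MM-DD: text` line format, the function names, and the exact response fields (`annotation`, `time` in epoch milliseconds, `title`, `text`) are assumptions about what an annotations endpoint might return.

```python
from datetime import datetime, timezone


def parse_annotation_line(line):
    """Parse one hypothetical wiki line such as '* 2016-07-29: deployed X'.

    Returns a (datetime, text) tuple, or None if the line does not match
    the assumed format.
    """
    line = line.strip()
    if not line.startswith("* ") or ": " not in line:
        return None
    date_part, text = line[2:].split(": ", 1)
    try:
        when = datetime.strptime(date_part, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    except ValueError:
        return None
    return when, text


def build_annotations(wikitext, name="wiki-notes"):
    """Turn wiki text into a list of annotation dicts (assumed field names)."""
    annotations = []
    for line in wikitext.splitlines():
        parsed = parse_annotation_line(line)
        if parsed is None:
            continue  # skip headings, blank lines, malformed entries
        when, text = parsed
        annotations.append({
            "annotation": name,
            "time": int(when.timestamp() * 1000),  # epoch milliseconds
            "title": text,
            "text": text,
        })
    return annotations
```

A small web service would then serialize the result of `build_annotations(...)` as JSON; keeping the parsing separate from the HTTP layer means the same function could also wrap another source, such as the SAL mentioned in the chat.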
[17:56:43] (PS1) BryanDavis: Bump staticfiles submodule and link docroot files [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301854
[17:57:49] (CR) BryanDavis: [C: 2 V: 2] Bump staticfiles submodule and link docroot files [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301854 (owner: BryanDavis)
[18:03:26] Labs, Operations, Project-Admins: Archive old Incident-* projects - https://phabricator.wikimedia.org/T134624#2505879 (Danny_B)
[18:04:01] Labs, Operations, Project-Admins: Archive old Incident-* projects - https://phabricator.wikimedia.org/T134624#2271591 (Danny_B) Open>Resolved All archived.
[18:05:42] PROBLEM - Puppet run on tools-worker-1021 is CRITICAL: CRITICAL: 71.43% of data above the critical threshold [0.0]
[18:40:46] PROBLEM - Host tools-worker-1021 is DOWN: CRITICAL - Host Unreachable (10.68.20.157)
[18:40:54] hi guys i haven't been able to ssh into my labs account for a couple of days now - and it's down
[18:41:01] [ssh pushipedia.eqiad.wmflabs]
[18:42:32] andrewbogott ^
[18:43:08] jdlrobson: the instance is down, you mean?
[18:43:12] jdlrobson have you tried rebooting it?
[18:43:26] RECOVERY - Host tools-worker-1021 is UP: PING OK - Packet loss = 0%, RTA = 0.53 ms
[18:44:31] PROBLEM - SSH on tools-worker-1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:44:32] sorry this is a dumb question but where can i reboot on the horizon interface?
[18:44:53] PROBLEM - Host tools-worker-1021 is DOWN: CRITICAL - Host Unreachable (10.68.20.157)
[18:45:17] Compute->Instances
[18:45:49] i dont see a reboot button there andrewbogott
[18:46:05] there should be a menu to the right of each instance with actions
[18:46:22] RECOVERY - Host tools-worker-1021 is UP: PING OK - Packet loss = 0%, RTA = 0.72 ms
[18:46:48] ah in th drop down
[18:46:51] soft or hard reboot? 
[18:47:11] hard won't hurt
[18:47:24] lets see if this works
[18:50:17] i can ssh in now cool
[18:50:20] wonder what happened
[18:51:57] thanks guys. Sorry for the dumb question :)
[18:52:11] no worries, glad you're back in
[19:06:03] PROBLEM - Puppet run on tools-worker-1020 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:16:03] RECOVERY - Puppet run on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:16:21] PROBLEM - Host tools-worker-1021 is DOWN: CRITICAL - Host Unreachable (10.68.20.157)
[19:27:20] (PS3) BryanDavis: Add tooling to build wheels and initial wheels [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844
[19:27:30] (CR) BryanDavis: [C: 2 V: 2] Add tooling to build wheels and initial wheels [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301844 (owner: BryanDavis)
[19:27:54] Labs: confirm that new base labs base image is adequate for kubernetes &c. - https://phabricator.wikimedia.org/T134944#2506481 (yuvipanda) @Andrew yes: On a fresh host, no cgroups: ``` # cat /proc/cmdline BOOT_IMAGE=/boot/vmlinuz-4.4.0-1-amd64 root=UUID=cb80806d-1de8-43e6-9c70-18a749bd32d2 ro console=ttyS0...
[19:28:33] andrewbogott ^ as well
[19:28:47] thanks
[19:34:50] RECOVERY - Host tools-worker-1021 is UP: PING OK - Packet loss = 0%, RTA = 0.93 ms
[19:37:16] (PS1) BryanDavis: checks/virtualenv.sh: Handle initial venv creation properly [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301872
[19:37:54] (CR) BryanDavis: [C: 2 V: 2] checks/virtualenv.sh: Handle initial venv creation properly [labs/striker/deploy] - https://gerrit.wikimedia.org/r/301872 (owner: BryanDavis)
[19:44:41] bd808 I wrote a small amount of bash, https://gerrit.wikimedia.org/r/#/c/301853/ let me know what you think
[19:49:48] YuviPanda: looks ok. 
I think I would make that file a template so the docker version could be injected via hiera instead of hard coded, but that's a nit
[19:50:41] RECOVERY - Puppet run on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:43] bd808 ah, right. I'm actually going to make it a parameter and have puppet pass that to the exec
[19:50:57] *nod* that will work
[19:51:32] Labs, Horizon, Patch-For-Review: Horizon dashboard for managing instance puppet config - https://phabricator.wikimedia.org/T91990#2506523 (yuvipanda)
[19:54:34] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#2506528 (Dzahn) Thank you very much. That works. No concerns removing ldaplist then.
[19:57:07] bd808 what's the most canonical way for me to test if a file exists?
[19:57:12] to put in an 'unless' in puppet
[19:57:39] use a "creates" in puppet instead?
[19:58:13] bd808 aaah, cool!
[19:58:22] otherwise `/usr/bin/test -e $FILE`
[20:01:17] bd808 thanks! <3
[20:13:35] might be worth +t here too
[20:16:48] Probably just being daft, but how would I go about using WinSCP to upload to a tool's public_html? Permissions error, I swear I sorted it a while back but can't remember the steps
[20:18:26] myrcx: can you upload things to your own $HOME?
[20:19:10] bd808, gimme a sec to confirm, but I'm almost certain I can
[20:19:58] bd808, yes I can :)
[20:20:29] myrcx: what tool are you trying access? 
I'll check the directory permissions on the server side
[20:20:50] guess I could upload there and the mv to the tool - and tool is wmf-task-samtar
[20:21:02] owner is tool, as expected
[20:21:23] I would chown or something but that's restricted
[20:21:31] the public_html dir there is missing the group write bit
[20:21:47] I can fix it for you
[20:21:59] bd808, that'd be great if you could, thank you
[20:23:04] !log tools.wmf-task-samtar chmod -R g+w /data/project/wmf-task-samtar/public_html to fix non-group writable files
[20:23:17] myrcx: ^ should work for you now I think
[20:24:43] bd808, permissions error, lemme log out and back in
[20:25:37] bd808, fixed, thank you a load, owe you one :)
[20:25:50] myrcx: yw
[20:26:17] !log tools built new worker nodes tools-worker-1020 and 21 with direct-lvm storage backend
[20:26:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:26:32] myrcx: if it happens again, you should be able to do that sort of chmod after becoming the tool
[20:28:04] bd808, I did try as the tool, but I imagine that could have just been me being a little rusty with chmod
[20:28:05] :)
[20:29:07] !log tools depool tools-worker-1001, going to recreate with to test new puppet deploying-first-run
[20:29:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:29:17] ooo, but before that, laundry!
[20:41:15] !log tools delete tools-worker-1001
[20:41:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:43:14] PROBLEM - Host tools-worker-1001 is DOWN: CRITICAL - Host Unreachable (10.68.16.76)
[20:47:29] Labs, Horizon, Patch-For-Review: Create puppet backend with REST api for labs instance configuration - https://phabricator.wikimedia.org/T133412#2506811 (Andrew) This server gets/sends hiera json, but presumably the user wants to edit yaml. I can convert back and forth in the client but formatting a...
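The public_html fix logged above at 20:23:04 came down to a missing group write bit. A minimal sketch of checking for that bit and adding it with Python's standard library follows; the helper names are hypothetical, and unlike the logged `chmod -R g+w` it touches a single path rather than recursing.

```python
import os
import stat


def is_group_writable(path):
    """Return True if the group write bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


def add_group_write(path):
    """Add the group write bit, leaving the other permission bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWGRP)
```

As in the chat, this only succeeds when run by the file's owner (here, after becoming the tool) or by a privileged user.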
[20:52:01] PROBLEM - Puppet run on tools-worker-1020 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:56:39] PROBLEM - Puppet run on tools-worker-1021 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[21:04:27] RECOVERY - Host tools-worker-1001 is UP: PING OK - Packet loss = 0%, RTA = 0.93 ms
[21:06:24] addshore btw, can you move your dashik related tool to phabricator instead of gerrit?
[21:16:40] RECOVERY - Puppet run on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:17:01] !log tools depooled tools-worker-1020/21 after fixing them up
[21:17:02] RECOVERY - Puppet run on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:17:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:23:44] PROBLEM - Puppet run on tools-worker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[21:25:37] Labs, Horizon, Patch-For-Review: Create puppet backend with REST api for labs instance configuration - https://phabricator.wikimedia.org/T133412#2506968 (Andrew) a:Andrew>yuvipanda After discussion with yuvi, we're going to try making the API yaml-only. Yuvi will update the API when he has a...
[21:30:20] !log tools depool tools-worker-1003 to be recreated with new docker config, picking this because it's on a non-ssd host
[21:30:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:32:27] PROBLEM - Host tools-worker-1003 is DOWN: CRITICAL - Host Unreachable (10.68.21.21)
[21:33:45] RECOVERY - Puppet run on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:45:18] !log tools repool new tools-worker-1003 with direct-lvm docker storage backend
[21:45:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:48:15] !log tools deleted tools-worker-1006 after depooling+draining
[21:48:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:49:22] PROBLEM - Host tools-worker-1006 is DOWN: CRITICAL - Host Unreachable (10.68.20.17)
[22:03:53] RECOVERY - Host tools-worker-1006 is UP: PING OK - Packet loss = 0%, RTA = 0.76 ms
[22:04:06] !log tools repooled tools-worker-1006
[22:04:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[22:05:53] PROBLEM - Puppet run on tools-worker-1006 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [0.0]
[22:11:53] did... someone just delete all the dashboards?
[22:11:55] ah, nope
[22:15:54] RECOVERY - Puppet run on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:19:37] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2507131 (yuvipanda) The current working hypothesis is: 1. We were running docker with the default storage backend it came with which uses devicemapper over a loopback d...
[22:41:50] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[23:01:50] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:19:49] (PS1) BryanDavis: Set ALLOWED_HOSTS to ['*'] by default [labs/striker] - https://gerrit.wikimedia.org/r/301904
[23:57:20] Labs, Labs-Infrastructure, Tracking: Labs instances sometimes freeze - https://phabricator.wikimedia.org/T124133#2507273 (yuvipanda)
[23:57:22] Labs: Instances locking up randomly - https://phabricator.wikimedia.org/T121998#2507270 (yuvipanda) Open>Resolved a:yuvipanda I'm going to close this, since we are pretty sure this particular type of hang was caused by NFS
[23:57:41] Labs: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2507274 (yuvipanda)