[00:56:44] any tool labs want to help me real quick? will take maybe 1 minute of your time
[00:56:54] *tool labs admins
[00:58:52] nvm
[01:08:26] Damn i was gonna help til u corrected urself musikanimal
[01:56:33] Labs, Tool-Labs: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2804720 (yuvipanda)
[01:58:16] Labs, Tool-Labs, Patch-For-Review: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2804748 (yuvipanda) ^ patch should do the first part. @jcrespo #dba can you grant the labsdbadmin user access from labstore1004 and labstore1005 (failover host)? Thank you!
[01:58:29] Labs, Tool-Labs, DBA, Patch-For-Review: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2804751 (yuvipanda)
[02:00:07] Labs, Tool-Labs, Patch-For-Review: Tool creation fails? - https://phabricator.wikimedia.org/T150946#2804765 (yuvipanda) Hello! I fixed up your own credentials, and this patch should fix it for future workflows. `public_html` is no longer auto created - you've to create it manually (since it's only us...
[06:32:10] PROBLEM - Puppet run on tools-exec-1413 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[06:38:40] PROBLEM - Puppet run on tools-exec-1416 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[07:07:10] RECOVERY - Puppet run on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:13:39] RECOVERY - Puppet run on tools-exec-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:08:00] Labs, Tool-Labs: Several hour replag reported by heartbeat_p - https://phabricator.wikimedia.org/T151026#2805040 (Matthewrbowker)
[08:18:18] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/MnemonicFlow was created, changed by MnemonicFlow link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/MnemonicFlow edit summary: Created page with "{{Tools Access Request |Justification=Create visualizations of Wikipedia/Wikivoyage articles stored in the replica databases that have geo coordinates in them. |Completed=fals..."
[08:27:44] Hello, to get access to the replica databases I need to be a member of the tools project, right ?
[08:31:24] Nevermind, found the info @ https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database
[08:31:38] Just sent the request
[09:46:55] Labs, Tool-Labs: Hashtag tool 500 internal server error - https://phabricator.wikimedia.org/T150984#2805260 (Ciell) Thanks, it's up and running again!
[11:58:12] https://meta.wikimedia.org/wiki/Tech#SQL_query_against_production_database <-- can be done?
[11:59:20] I am not anwering here officially or anything
[11:59:41] but if they are not on labs, it means it contains private data
[12:00:42] "how many pages are deleted per year" may be on the logs, not sure
[12:01:09] PROBLEM - Puppet run on tools-exec-1419 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[12:14:20] Tool-Labs-tools-Xtools: Convert all xtools issues to either Phabricator or GitHub - https://phabricator.wikimedia.org/T134632#2805781 (Aklapper) Open>stalled >>! In T134632#2499082, @Matthewrbowker wrote: > None yet. We will probably hold off until the xTools rewrite is done. Setting task status to...
[12:16:41] Labs, Tool-Labs, DBA, Patch-For-Review: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2804720 (Marostegui) users created in labsdb1001 and labsdb1003 for both, labstore1004 and labstore1005 ``` root@neodymium:~# host labstore1004; host labstore10...
[12:20:29] RECOVERY - Host tools-secgroup-test-102 is UP: PING OK - Packet loss = 0%, RTA = 0.96 ms
[12:24:24] PROBLEM - Host tools-secgroup-test-102 is DOWN: CRITICAL - Host Unreachable (10.68.21.170)
[12:36:11] RECOVERY - Puppet run on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:52:45] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 423.00 ms
[13:12:47] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[13:14:07] Any ideas why I get ERROR 1045 (28000): Access denied for user 'u4507'@'10.68.23.58' (using password: YES) ? when trying to do sql enwiki
[13:15:32] I'm connect to login.tools.wmflabs.org host
[13:44:18] mflow is that a new user?
[13:44:40] there were some temporary problems with new account recently
[13:46:31] mflow is that a new user?
[13:46:34] there were some temporary problems with new account recently
[13:47:39] jynus: there user was created in 2015, and I have a replica.cnf, but the user/pass from it seems invalid
[13:49:38] On my talk page I have two events: 1) Welcome to Tool Labs 6:04, 4 February 2015 (UTC) 2) Your shell access was granted 16:04, 4 February 2015 (UTC)
[13:50:08] what about the tool
[13:50:15] the access is for the tool
[13:51:04] Mar 17 2014 replica.my.cnf
[13:51:26] I may not be part of the 'tools' project?
[13:52:14] I've sent today a request to https://wikitech.wikimedia.org/wiki/Special:FormEdit/Tools_Access_Request , but after reading more on the wiki I thought that shell access equated to tools access
[13:52:44] I am not sure, but I thought only tools got mysql access, not users
[13:53:20] yes, it seems like, shell access is one thing and access to 'tools' project is another
[13:53:32] I'll wait for a response
[13:54:01] I had a previous user before 2014 but I forgot the credentials and I cannot access it anymore, can I request it's deletion?
[14:05:20] mflow, sure, not sure if it can be deleted, but at leasy it should be disabled, if you can demonstrate you were the owner
[14:05:28] please create at ticket on phabricator
[14:05:33] with the details
[14:26:41] RECOVERY - Host secgroup-lag-102 is UP: PING OK - Packet loss = 0%, RTA = 2.46 ms
[14:28:47] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
[14:43:24] RECOVERY - Puppet staleness on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [3600.0]
[14:44:24] RECOVERY - Puppet staleness on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [3600.0]
[14:47:53] RECOVERY - Puppet staleness on tools-puppetmaster-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[15:04:12] RECOVERY - Puppet staleness on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[15:18:42] Labs, Labs-Infrastructure, Continuous-Integration-Scaling: Support dedicating a specific virt node to a specific nova project - https://phabricator.wikimedia.org/T84989#2806223 (hashar) stalled>declined Thanks @Andrew I am thus forgetting about it.
[15:19:54] Labs, Horizon, Continuous-Integration-Scaling: Labs project admin can not delete per project image on Horizon - https://phabricator.wikimedia.org/T110936#2806240 (hashar) Open>declined No more needed, we use the openstack CLI from labnodepool1001 using nodepoolmanager account. I have not rev...
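Regarding the Access denied thread above (13:14 to 14:05): one way to check which user a replica.my.cnf actually carries is to parse it before trying to connect. This is only a minimal sketch. It assumes the file follows the usual MySQL option-file layout, with a [client] section holding user and password keys, which the log itself does not show. The sample text and password are made up, and read_replica_credentials is a hypothetical helper, not a Tool Labs utility.

```python
import configparser

# Hypothetical file contents; the [client]/user/password layout is an assumption.
SAMPLE = """
[client]
user = u4507
password = not-a-real-password
"""


def read_replica_credentials(text):
    """Return (user, password) from MySQL option-file style text."""
    # interpolation=None so a '%' inside a password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    client = parser["client"]
    # Values may be wrapped in quotes in some files; strip them if present.
    return client["user"].strip("'\""), client["password"].strip("'\"")


if __name__ == "__main__":
    user, _password = read_replica_credentials(SAMPLE)
    print("replica user:", user)
```

If the user printed here matches the one in the ERROR 1045 message, the file is being read correctly and the problem is more likely on the grants side, as suggested in the discussion.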
[15:33:20] Labs, Labs-Infrastructure, Continuous-Integration-Scaling: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2806257 (hashar) self note to check the quota: ssh labnodepool1001.eqiad.wmnet sudo -iH -u nodepool nova absolute-limits
[15:39:39] Labs, Labs-Infrastructure, Continuous-Integration-Scaling, Patch-For-Review: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2806265 (hashar) Crafted the puppet patch and poked about it: ------------- > Hello, > > I have created the puppet patc...
[15:44:58] yuvipanda: Hello! Discovery would like to access the mysql DB on stat1002 to generate some stats on maps usage
[15:45:39] yuvipanda: looking at the puppet code, it seems there are already credentials for 'analytics-research' defined, and you seemed to have taken part in that.
[15:46:02] yuvipanda: any idea what the process is to create another user for discovery?
[15:51:22] gehel: The process I went through was a phab ticket requesting access, 3 day wait, and then techops added me to the right access group. It requires prod shell access as well which some people may need to get through a similar process. -- https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Production_access -- https://phabricator.wikimedia.org/T115548
[15:52:11] bd808: Thanks! I'll have a look at that doc and see if I can find my way...
[15:52:11] gehel: you may be able to find more help in the #wikimedia-analytics channel
[15:53:39] gehel: https://wikitech.wikimedia.org/wiki/Production_shell_access has some links and info as well
[15:53:42] bd808: that documentation already looks like a good start.
[15:54:36] bd808: in our case, we want a technical user to get access. The individual team members already have access.
[15:55:16] bd808: we have a cron that needs to publish some stats regularly. But the process is most probably similar.
[15:57:10] gehel: talk to otto :)
[15:57:51] chasemp: he does not seem to be here yet... I'll try to put some order in that request first...
[16:23:19] !log shinken re-enabling and running puppet
[16:23:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Shinken/SAL
[16:31:33] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Salgo60 was created, changed by Salgo60 link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Salgo60 edit summary: Created page with "{{Tools Access Request |Justification=I will look into the possibility to create an Apple Watch application that communicates with Wikdata and request all graves next to me on..."
[16:45:53] I am receiving the following error sometimes when I save files edited through WinSCP: http://pastebin.com/ABfipPVp
[16:51:45] sometimes or all the time?
[16:52:09] all the time*
[16:52:41] PROBLEM - Free space - all mounts on tools-docker-registry-01 is CRITICAL: CRITICAL: tools.tools-docker-registry-01.diskspace.root.byte_percentfree (<100.00%)
[16:53:06] DatGuy: in general you can't login as the servicegroup user at this time, you'll need to edit in your home dir and then login to become, take, and deploy iiuc
[16:54:56] Tool-Labs-tools-Xtools: Convert all xtools issues to either Phabricator or GitHub - https://phabricator.wikimedia.org/T134632#2806495 (Matthewrbowker) >>! In T134632#2805781, @Aklapper wrote: >>>! In T134632#2499082, @Matthewrbowker wrote: >> None yet. We will probably hold off until the xTools rewrite is d...
[16:55:52] I mean, it does update it
[16:56:36] basically, it saves. I don't even know what set time does. Perhaps last-edited date?
[17:38:40] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:03:11] PROBLEM - ToolLabs Home Page on toollabs is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:08:02] RECOVERY - ToolLabs Home Page on toollabs is OK: HTTP OK: HTTP/1.1 200 OK - 3670 bytes in 0.035 second response time
[18:17:20] Labs, Tool-Labs, DBA, Patch-For-Review: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2806678 (yuvipanda) Thanks a lot @Marostegui! I think we should keep the labstore1001 / 2 ones enabled for now, in the unlikely event we have to roll back our mi...
[18:28:45] Labs, Labs-Infrastructure, Operations, Patch-For-Review: labmon1001 graphite instance archiver keeps archiving the same instances - https://phabricator.wikimedia.org/T120377#2806718 (Krinkle)
[18:36:55] is Tool Labs down for anyone else?
[18:37:31] musikanimal: looks ok to me, what issues arey ou having?
[18:37:32] *you
[18:38:00] I thought maybe it was just me, didn't get an email about my tools being down from Uptime Robot
[18:38:05] anyway, everything is timing out
[18:38:10] including SSHing in
[18:38:13] Labs, DBA: labsdbadmin needs to be able to drop users and update passwords - https://phabricator.wikimedia.org/T151076#2806773 (chasemp)
[18:38:26] Labs, DBA: labsdbadmin needs to be able to drop users and update passwords - https://phabricator.wikimedia.org/T151076#2806785 (chasemp) p:Triage>Normal
[18:38:32] I'm on a public WiFi, through Wikimedia VPN, maybe there's some weirdness somewhere in between
[18:39:22] musikanimal: all lgtm tho
[18:39:36] yeah and everything works on my phone
[18:39:41] oh well :/
[18:39:44] thanks for checking!
[18:39:50] yw musikanimal
[18:44:34] Labs, DBA: labsdbadmin needs to be able to drop users and update passwords - https://phabricator.wikimedia.org/T151076#2806809 (jcrespo) Open>Resolved a:jcrespo I believe labsdbadmin can drop users already ok cool reopen the ticket even if it is for "I need db help"...
[19:04:49] Labs, Labs-Infrastructure: Labs: Figure out if disk space quotas can be set per project - https://phabricator.wikimedia.org/T151079#2806850 (Andrew)
[19:54:22] chasemp yuvipanda hi, it seems that pages on wikitech are being deleted
[19:54:31] like https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:Gerrit-mysql.git.eqiad.wmflabs&action=edit&redlink=1
[19:55:07] andrewbogott: ^
[19:56:04] I believe that's from known work happening on SMW things or so
[19:56:08] paladox: yuvipanda: that's on purpose...
[19:56:12] oh
[19:56:19] part of an ongoing (and endless) process of me trying to get those pages to actually be accurate
[19:56:22] and clean up old ones
[19:56:28] ok
[19:56:33] thanks for replying :)
[19:56:37] They're refreshed every few hours so the easiest thing is for me just to delete every damn thing and let nature take its course :)
[19:56:49] Ok
[19:56:50] :)
[20:05:18] Labs, Tool-Labs: Several hour replag reported by heartbeat_p - https://phabricator.wikimedia.org/T151026#2807066 (Urbanecm) Yes, it seems to be fixed now!
[20:10:22] Labs, Operations: Kill the labtest $realm - https://phabricator.wikimedia.org/T148717#2730983 (chasemp) No objection here, {T146150} already existed.
[20:10:41] Labs: Undo labtest realm hacks - https://phabricator.wikimedia.org/T146150#2652209 (chasemp)
[20:10:43] Labs, Operations: Kill the labtest $realm - https://phabricator.wikimedia.org/T148717#2807092 (chasemp)
[20:11:35] yuvipanda chasemp hi, when i run puppet agent -tv i get an error on phab-03
[20:11:47] i get
[20:11:48] Warning: Unable to fetch my node definition, but the agent run will continue:
[20:12:03] let me take a look
[20:12:09] https://phabricator.wikimedia.org/P4476
[20:12:10] thanks
[20:12:19] paladox:
[20:12:21] > The last Puppet run was at Mon Sep 12 20:02:27 UTC 2016 (96483 minutes ago).
[20:12:26] Yep
[20:12:43] paladox: this instance has been broken for a long time and has a super weird /etc/puppet/puppet.conf
[20:12:48] Oh
[20:12:57] I just updated puppet
[20:13:04] which looks like it was manually modified
[20:13:20] paladox: what do you mean by 'updated puppet'
[20:13:27] Well i did sudo apt-get update
[20:13:29] then
[20:13:38] sudo apt-get upgrade and it said there was a puppet upgrade
[20:13:46] so i did y for yes
[20:13:47] did you get a dialog box or something asking about replacing puppet.conf?
[20:13:54] yes
[20:14:14] right, so you've clobbered your existing puppet.conf :)
[20:14:19] It would have been good to lead with that information when reporting it as broken?
[20:14:25] oh sorry
[20:14:41] paladox: you'll probably have to just rebuild that instance
[20:14:52] yuvipanda i doint think that's possible
[20:15:16] you're out of luck then. with root comes responsibilities and we can't really rescue out of this one
[20:15:16] sorry!
[20:15:30] Ok
[20:15:51] yuvipanda i meant i carnt' rebuild because of the new lab resources
[20:16:15] ie they were lower, so if i delete the instance and try to recreate it, i will hit the new limit
[20:18:51] paladox: hmm I thought we can't lower them below what is currently being utilized. what project is this?
let me verify
[20:19:33] phabricator
[20:19:44] yuvipanda i copied over a working puppet.conf
[20:19:51] from the gerrit-test project
[20:19:59] seems to work
[20:20:04] or gets passt the error now
[20:20:15] but i get Error: Could not retrieve catalog from remote server: Error 400 on SERVER: is not an integer, but is used as an index of an array at /etc/puppet/modules/role/manifests/phabricator/main.pp:87 on node phab-03.phabricator.eqiad.wmflab
[20:20:16] now
[20:21:18] as I said
[20:21:31] you're not going to get any support there :) good luck if you want to try to ressurect it
[20:21:43] I'm checking quotas to make sure you have enough to delete and recreate it if needed
[20:22:13] Labs: Rervert: request increased quota (floating ip) for "cvn" labs project - https://phabricator.wikimedia.org/T150209#2807101 (chasemp) p:Triage>Normal
[20:22:19] Ok
[20:22:20] thanks
[20:22:38] Labs, Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#2807104 (chasemp)
[20:22:40] Labs: Rervert: request increased quota (floating ip) for "cvn" labs project - https://phabricator.wikimedia.org/T150209#2777950 (chasemp) Open>stalled
[20:28:15] yuvipanda im wondering do you know how i can remove or add puppet roles, it seems it was removed from wikitech
[20:28:22] please?
[20:29:52] paladox: you can do that in horizon.wikimedia.org
[20:29:59] Oh
[20:30:04] paladox: you should have enough quota - you can also check that in horizon.wikimedia.org
[20:30:05] #####################################################################
[20:30:05] ##### THIS FILE IS MANAGED BY PUPPET
[20:30:06] ##### as template('base/puppet.conf.d/10-main.conf.erb')
[20:30:06] ######################################################################
[20:30:06] [main]
[20:30:07] logdir = /var/log/puppet
[20:30:08] vardir = /var/lib/puppet
[20:30:10] ssldir = /var/lib/puppet/ssl
[20:30:12] rundir = /var/run/puppet
[20:30:14] factpath = $vardir/lib/facter
[20:30:16] [agent]
[20:30:18] server = labs-puppetmaster-eqiad.wikimedia.org
[20:30:20] configtimeout = 960
[20:30:22] usecacheonfailure = false
[20:30:24] splay = true
[20:30:26] prerun_command = /etc/puppet/etckeeper-commit-pre
[20:30:28] postrun_command = /etc/puppet/etckeeper-commit-post
[20:30:30] pluginsync = true
[20:30:34] report = true
[20:30:36] Woops
[20:30:38] Sorry
[20:30:40] I didnt want to paste that
[20:30:42] https://horizon.wikimedia.org/project/puppet/
[20:32:42] paladox: you can also click on an individual instance name under 'instances' and apply puppet roles (or hiera) there
[20:33:40] Oh
[20:33:43] Thanks
[20:35:29] yuvipanda ive fixed phab-03 now
[20:35:41] \o/ congratulations
[20:35:45] i was applying the main phabricator class, which will break on labs.
[20:35:48] :)
[20:36:28] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2807134 (chasemp)
[20:40:15] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2807136 (chasemp)
[20:53:21] Labs, Release-Engineering-Team: Request for CI staging project - https://phabricator.wikimedia.org/T150772#2795973 (Andrew) Sounds ok to me!
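On the staleness figure quoted at 20:12:21 in the phab-03 thread ("Mon Sep 12 20:02:27 UTC 2016 (96483 minutes ago)"): the two numbers can be sanity-checked against each other with plain date arithmetic. The sketch below only adds the quoted minute count to the quoted last-run time; staleness_report_time is a hypothetical helper name, and nothing here is taken from Puppet or the Labs tooling.

```python
from datetime import datetime, timedelta, timezone


def staleness_report_time(last_run, minutes_ago):
    """Return the moment at which 'minutes_ago' was measured from last_run."""
    return last_run + timedelta(minutes=minutes_ago)


# Values copied from the message quoted at 20:12:21.
last_run = datetime(2016, 9, 12, 20, 2, 27, tzinfo=timezone.utc)
reported = staleness_report_time(last_run, 96483)

if __name__ == "__main__":
    print(reported.isoformat())  # 2016-11-18T20:05:27+00:00
```

96483 minutes is 67 days and 3 minutes, so the banner was generated at 20:05:27 UTC on 18 November 2016, a few minutes before it was quoted in the channel, and the instance had gone a little over two months without a successful Puppet run.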
[20:55:12] Labs, Release-Engineering-Team: Request for CI staging project - https://phabricator.wikimedia.org/T150772#2795973 (chasemp) We talked and this seems reasonable. Thanks @thcipriani for the outline. A modest PSA if you guys have new project requests coming down the line before the end of the year (and...
[21:21:00] Labs-Team-Backlog: Investigate moving mwoffliner onto a labs-on-real-hardware machine - https://phabricator.wikimedia.org/T117081#2807244 (Andrew)
[21:26:06] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2807255 (Andrew)
[21:26:09] Labs, Release-Engineering-Team: Request for CI staging project - https://phabricator.wikimedia.org/T150772#2807252 (Andrew) Open>Resolved a:Andrew Done, with @thcipriani as the initial project admin.
[21:27:53] hey bd808 quick favor for you, when/if you get the time to look at the jouncebot change can you do me a favor and email me [[:w:en:Special:EmailUser/Zppix]]
[21:28:00] @link
[21:28:00] https://wikitech.wikimedia.org/wiki/:w:en:Special:EmailUser/Zppix
[21:28:29] Zppix: well you will get the notice from gerrit right?
[21:28:45] bd808 yes but i get tons of notifications from phab and gerrit :P
[21:28:52] i'll either +2 it and deploy or toss it back to you for fixes
[21:29:06] Zppix: then you have chose poorly ;)
[21:29:06] whatever i was just curious if you would or not
[21:29:09] no biggy
[21:31:01] Labs, Operations: Kill the labtest $realm - https://phabricator.wikimedia.org/T148717#2807262 (faidon) Ah! Sorry for missing that! @chasemp, is that something that the #Labs team can and/or will do?
[21:31:26] Labs, Tool-Labs: Hashtag tool 500 internal server error - https://phabricator.wikimedia.org/T150984#2807263 (chasemp) Open>Resolved
[21:46:27] Labs, Operations: Kill the labtest $realm - https://phabricator.wikimedia.org/T148717#2807288 (chasemp) sure yeah, it's been hellfire and brimstone for a bit here recently. Post-thanksgiving I expect? We are still untangling knots from (stage 1) of the storage migration, madhu is gone for a month and e...
[21:47:02] Labs: Revert: request increased quota (floating ip) for "cvn" labs project - https://phabricator.wikimedia.org/T150209#2807289 (Krinkle)
[21:47:29] Labs, Labs-Infrastructure: Labs: Figure out if disk space quotas can be set per project - https://phabricator.wikimedia.org/T151079#2807290 (Andrew) Open>Resolved Looks like this can by quota'd via Cinder but not with nova on its own.
[21:48:06] Labs: Revert: request increased quota (floating ip) for "cvn" labs project - https://phabricator.wikimedia.org/T150209#2777950 (Krinkle) Progress tracked at . We're almost done. I expect to have the last bits done in the next 24-48 hours.
[21:50:32] Labs: Revert: request increased quota (floating ip) for "cvn" labs project - https://phabricator.wikimedia.org/T150209#2807296 (chasemp) thanks @krinkle
[21:55:13] Labs, Operations: Kill the labtest $realm - https://phabricator.wikimedia.org/T148717#2807298 (faidon) Sure, I'm just making sure that we're not waiting for each other :)
[22:36:40] Labs, Tool-Labs, Patch-For-Review: Tools: Migrate puppet roles from ldap to the new puppetbackend - https://phabricator.wikimedia.org/T148683#2807376 (Andrew) a:yuvipanda>Andrew
[22:37:30] Labs, Tool-Labs, Patch-For-Review: Tools: Migrate puppet roles from ldap to the new puppetbackend - https://phabricator.wikimedia.org/T148683#2729915 (Andrew) yuvi says "there's a list of prefixes in tools-clush-generator in ops/puppet"
[23:22:40] Labs, Tool-Labs, Patch-For-Review: Tool creation fails? - https://phabricator.wikimedia.org/T150946#2807442 (Magnus) Open>Resolved a:Magnus replica.my.cnf has arrived. Thanks!