[00:12:43] Labs, Tool-Labs, Community-Tech-Tool-Labs: Collect and display basic metrics for all tools (service groups) - https://phabricator.wikimedia.org/T129630#2229622 (bd808) A big hammer method for checking user/tool database sizes: ``` SELECT table_schema , sum( data_length ) as data_bytes , sum(...
[00:12:55] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Mmr was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=454193 edit summary:
[00:26:32] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Jberkel was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=454224 edit summary:
[00:34:05] (PS1) Matthewrbowker: Fixed the tag for xtools IRC. [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284842
[00:34:38] ^ That's my first gerrit commit in over four years... and no it didn't get any easier.
[00:35:16] (PS2) Legoktm: Update for renamed xTools project [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284842 (https://phabricator.wikimedia.org/T133364) (owner: Matthewrbowker)
[00:35:24] (CR) Legoktm: [C: 2] Update for renamed xTools project [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284842 (https://phabricator.wikimedia.org/T133364) (owner: Matthewrbowker)
[00:36:14] (Merged) jenkins-bot: Update for renamed xTools project [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284842 (https://phabricator.wikimedia.org/T133364) (owner: Matthewrbowker)
[00:38:09] !log tools.wikibugs Updated channels.yaml to: 98b78de2f209713741863c9dbf5a14eba7164ccd Update for renamed xTools project
[00:38:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL, Master
[00:39:11] OK that's cool.
[00:41:54] Matthew_: hehe, we're super lazy so we spent a decent amount of time building a workflow that does everything for us!
[00:42:07] Hehehehehe I can understand that.
[00:42:14] So... does it join automagically too?
[00:42:42] it'll join whenever it needs to send a message to the channel, and then it will idle
[00:42:53] Okay. So I can close the phab task?
[00:42:56] yep
[00:43:17] Sweet.
[00:43:19] Thanks for the info.
[00:43:25] Wikibugs, xTools-on-Labs, Patch-For-Review: Add wikibugs to the Wikimedia XTools IRC Channel - https://phabricator.wikimedia.org/T133364#2229654 (Matthewrbowker) Open>Resolved Resolved with the above changeset.
[00:43:48] Yep. That worked.
[00:43:50] Thank you.
[00:44:04] you know there's probably a relevant xkcd for this, I'm just too lazy to look it up
[00:45:53] Hahahahahaha
[00:47:26] Krenair: https://xkcd.com/1319/
[00:54:21] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#1717673 (AlexMonk-WMF) ```krenair@bastion-01:~$ host 10.68.16.66 ;; Truncated, retrying in TCP mode. 66.16.68.10.in-addr.arpa domain name pointer ci-jessie-wikime...
[00:56:44] Labs, Labs-Infrastructure: Switch to horizon/designate/pdns/mysql for labs public dns - https://phabricator.wikimedia.org/T104520#1419432 (Krenair) Didn't we do this?
[01:33:30] Hi. Is there anyone around willing to help me figure out why a script isn't finding the shared pywikibot when submitted with jstart but running python .py on the command line works? (Please excuse the labs noob.)
[01:37:47] JJMC89: Does this help? https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Developing#Pywikibot
[01:40:11] Matthew_: I did that. When I run python /shared/pywikipedia/core/scripts/version.py I get PYWIKIBOT2_DIR: Not set PYWIKIBOT2_DIR_PWB: Not set PYWIKIBOT2_NO_USER_CONFIG: Not set in the output.
[01:40:34] Okay. I'm sorry that's beyond my area of expertise :/
[01:41:31] My script gives Traceback (most recent call last): File "wikinews-importer.py", line 17, in <module> import pywikibot, simplejson ImportError: No module named pywikibot
[01:41:44] Thanks for trying
[01:42:07] I'm wondering if the grid is running it as a user or in a space that doesn't allow access to the shared pywikibot.
[01:44:09] No idea. I'm new to labs and using the grid.
[01:47:16] I am too, I haven't dealt with the grid. I don't know if anyone is on who has much experience so...
[01:52:37] JJMC89: I'm too jet-lagged to help, but you can try using paws.wmflabs.org (documentation at https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS_walk-through). Easy to use pywikibot terminal on the web, useful for one off scripting (no need to use the grid). No cronjobs or anything though.
[01:52:54] JJMC89: #pywikibot IRC channel also probably has helpful people
[01:54:14] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2229722 (AlexMonk-WMF) So far there's been one instance in testlabs and quite a lot in contintcloud @hashar: please point me to the code that sets these contintcl...
[01:54:55] YuviPanda: Thanks. I will need cron though. The script is for a bot. I'll check the #pywikibot channel.
[04:46:47] Labs, Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2229791 (Krenair) a:mmodell (I went and found the code in puppet...
[04:49:29] Labs, Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2229793 (mmodell) I think I found the race condition: The order of o...
[04:50:13] Labs, Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2229794 (mmodell) I'm gonna cherry pick the patch on beta. We'll see...
[04:52:29] Labs, Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2229795 (mmodell) Actually, come to think of it, I'm not sure if it's...
[05:02:13] Labs, Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2206880 (yuvipanda) See also T120159
[05:56:19] Labs, Tool-Labs, DBA: u2815__old_p (dispenser) database using 20G on labsdb1001 (enwiki) - https://phabricator.wikimedia.org/T133323#2229871 (valhallasw) Thanks!
[08:01:34] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2229924 (hashar) The system is Nodepool which request creation and deletion of instances via the OpenStack API end point. It creates hundreds of instances per day...
[08:34:54] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228)
[08:48:33] PROBLEM - Puppet staleness on tools-bastion-10 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [43200.0]
[10:13:35] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:17:14] Tool-Labs-tools-Other: Migrate http://toolserver.org/~dispenser/* to Tool Labs - https://phabricator.wikimedia.org/T68868#2230288 (jcrespo)
[13:17:16] Labs, Wikimedia-Labs-General, DBA, Operations, Tracking: (Tracking) Database replication services - https://phabricator.wikimedia.org/T50930#2230289 (jcrespo)
[13:17:20] Labs, Tool-Labs, DBA: Provide replication lag as a database function - https://phabricator.wikimedia.org/T50628#2230282 (jcrespo) Open>Resolved a:jcrespo Already provided by T71463 (both on labs and on production). If that needs a web API/programming binding, please have a look at https:/...
[13:23:39] RECOVERY - Puppet staleness on tools-bastion-10 is OK: OK: Less than 1.00% above the threshold [3600.0]
[13:49:04] Labs, Tool-Labs, DBA: u3532__ (=marcmiquel) table using 64G on labsdb1001 - https://phabricator.wikimedia.org/T133322#2230363 (marcmiquel) Done! I cleaned more than 50 GB. I hope it's enough. However, I will clean more in the following weeks. Cheers Marc On Fri, 22 Apr 2016 at 15:33, jcrespo (< n...
[14:16:00] PROBLEM - Puppet run on tools-bastion-10 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[14:30:09] Labs, Labs-Infrastructure: Get labs-ns0 and labs-ns1 service IPs in floating space - https://phabricator.wikimedia.org/T133389#2230419 (Andrew)
[14:36:03] RECOVERY - Puppet run on tools-bastion-10 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:00:35] Labs, Labs-Infrastructure: Switch to horizon/designate/pdns/mysql for labs public dns - https://phabricator.wikimedia.org/T104520#2230533 (Andrew) Open>Resolved a:Andrew We did!
[15:01:52] Labs, Labs-Infrastructure: Switch to horizon/designate/pdns/mysql for labs public dns - https://phabricator.wikimedia.org/T104520#2230539 (Krenair)
[15:01:54] Labs, Patch-For-Review: Switch to using Horizon/Designate for labs public dns - https://phabricator.wikimedia.org/T124184#2230540 (Krenair)
[15:02:23] Labs, Labs-Infrastructure: Support reverse dns for public labs IPs - https://phabricator.wikimedia.org/T104521#2230543 (Krenair)
[15:02:36] Labs, Patch-For-Review: Switch to using Horizon/Designate for labs public dns - https://phabricator.wikimedia.org/T124184#1948403 (Krenair)
[15:02:38] Labs, Labs-Infrastructure: Support reverse dns for public labs IPs - https://phabricator.wikimedia.org/T104521#1419440 (Krenair)
[15:06:07] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:27:58] Labs, Labs-Infrastructure: Get labs-ns0 and labs-ns1 service IPs in floating space - https://phabricator.wikimedia.org/T133389#2230419 (BBlack) About constraints, rationales, and paths forward (some of this is recap from IRC earlier): * Upstream public DNS should have at least 2x NS IPs for wmflabs.org...
[17:01:02] RECOVERY - Puppet run on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:12:26] Labs, DBA: write irc bot to report high replag of s{1,2,3}.labsdb on #wikimedia-labsdb - https://phabricator.wikimedia.org/T106151#2230926 (Krinkle) Yeah, the use case for the bot (and my bot) is three parts. Two of which would presumably be covered by the bot requested in this ticket: * Proactively rep...
[17:18:20] Labs, Tool-Labs, Patch-For-Review: Tools bastions are often unreliable - https://phabricator.wikimedia.org/T131541#2230943 (Luke081515)
[17:19:50] I love horizon.
[17:20:06] I feel infinitely more productive and in control of my instances now.
[17:21:08] Like when you go to the pharmacy outside the US and don't call for assistance to open the shelf to get something. I can do it myself now instead of asking OSM.
[17:24:22] I agree it's pretty nice
[17:38:28] Labs, DBA: write irc bot to report high replag of s{1,2,3}.labsdb on #wikimedia-labsdb - https://phabricator.wikimedia.org/T106151#2230980 (jcrespo) @Krinkle, https://dbtree.wikimedia.org/ already exists- and it is linked from noc- (but it still uses the old replication lag definition). The next step is...
[17:41:19] Labs, Tool-Labs, DBA, ToolLabs-Goals-Q4: Show replication lags in Graphite - https://phabricator.wikimedia.org/T50694#2230983 (jcrespo)
[17:45:00] Labs, DBA: write irc bot to report high replag of s{1,2,3}.labsdb on #wikimedia-labsdb - https://phabricator.wikimedia.org/T106151#2231030 (jcrespo) BTW, lab hosts are not shown on dbtree on purpose, but they can be exposed (on a separate section outside of the coredb servers) with a simple config change...
[17:58:12] Labs, Horizon: Create puppet backend with REST api for labs instance configuration - https://phabricator.wikimedia.org/T133412#2231107 (Andrew)
[18:01:52] Labs, Horizon: Create puppet backend with REST api for labs instance configuration - https://phabricator.wikimedia.org/T133412#2231127 (Andrew)
[18:05:31] (CR) Yuvipanda: [C: 2] jsub: Add a ton of comments [labs/toollabs] - https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[18:09:03] PROBLEM - Puppet run on tools-precise-dev is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[18:13:23] (CR) Rush: [C: 2] jsub: Add a ton of comments [labs/toollabs] - https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[18:15:53] (CR) Yuvipanda: [V: 2] jsub: Add a ton of comments [labs/toollabs] - https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[18:18:42] (PS3) Yuvipanda: jsub: sexier -continuous runner [labs/toollabs] - https://gerrit.wikimedia.org/r/283378 (owner: BryanDavis)
[18:19:05] (CR) Yuvipanda: [C: 2] jsub: sexier -continuous runner [labs/toollabs] - https://gerrit.wikimedia.org/r/283378 (owner: BryanDavis)
[20:50:03] !log rcm.cac Upgrading all repos
[20:50:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm.cac/SAL, Master
[20:51:51] RECOVERY - Puppet run on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0]