[00:04:01] RECOVERY - Puppet run on tools-docker-builder-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:10:58] RECOVERY - Puppet run on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:29:53] PROBLEM - Puppet run on tools-exec-1441 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[01:01:58] !log tools Built instance tools-package-builder-01
[01:02:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[01:09:53] RECOVERY - Puppet run on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:22:15] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[01:33:09] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Programming Geek was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=1756641 edit summary:
[01:38:42] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Rrajasek95 was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=1756643 edit summary:
[01:41:20] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Streetfog was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=1756645 edit summary:
[01:45:23] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Valerio Bozzolan was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=1756647 edit summary:
[01:57:16] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:54:06] PROBLEM - Puppet run on tools-exec-1439 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[03:16:05] PROBLEM - Puppet run on tools-exec-1435 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:19:07] PROBLEM - Puppet run on tools-exec-1432 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:32:09] Labs, Tool-Labs, translatewiki.net: update node.js on tools.telegrambot - https://phabricator.wikimedia.org/T159368#3188794 (bd808)
[03:32:10] Labs, Tool-Labs, User-bd808: Create Updated NodeJS container for Tool Labs - https://phabricator.wikimedia.org/T155063#3188791 (bd808) Open>Resolved a:bd808 The current nodejs version in the Kubernetes images is `v6.9.1`. ``` $ webservice --backend=kubernetes nodejs shell If you don't se...
[03:34:05] RECOVERY - Puppet run on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:36:26] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Snd96 was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=1756649 edit summary:
[03:38:53] Labs, Tool-Labs: Linkwatcher spawns many processes without parent - https://phabricator.wikimedia.org/T123121#3188798 (Beetstra) @valhallasw Do you mind to make sure that linkwatcher is the only bot on 1403? I had to start it this morning, it apparently crashed. Thanks!
[03:42:38] !log tools Made tools-docker-builder-05.tools.eqiad.wmflabs the active docker build host
[03:42:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[03:56:01] RECOVERY - Puppet run on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:59:06] RECOVERY - Puppet run on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:04:11] !log tools Built and pushed new Docker images based on 82a46b4 (Refactor apt-get actions in Dockerfiles)
[04:04:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[04:23:55] !log tools Shutdown tools-docker-builder-04; will wait a bit before deleting
[04:23:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[04:25:46] PROBLEM - Host tools-docker-builder-04 is DOWN: CRITICAL - Host Unreachable (10.68.22.217)
[04:47:50] Labs, Tool-Labs, Tools-Kubernetes, Patch-For-Review: Build toollabs trusty 'catch all' container - https://phabricator.wikimedia.org/T152089#3188851 (bd808) Open>declined
[04:52:56] Labs, Tool-Labs, Tools-Kubernetes: Tools with names longer than 24 characters cannot start kubernetes webservices - https://phabricator.wikimedia.org/T141100#3188854 (bd808)
[04:56:24] Labs, Tool-Labs, Tools-Kubernetes, Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#3188863 (bd808)
[04:56:27] Labs, Tool-Labs, Tools-Kubernetes: Install jq, sed, grep, sort in k8s images - https://phabricator.wikimedia.org/T141082#3188857 (bd808) Open>Resolved a:yuvipanda Resolution noted in T141041#2838338
[05:23:16] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[05:24:19] Tool-Labs-tools-Other: templatecount is displaying very large error messages - https://phabricator.wikimedia.org/T163178#3188870 (Jc86035)
[05:24:51] Tool-Labs-tools-Other: templatecount is displaying very large error messages - https://phabricator.wikimedia.org/T163178#3188882 (Jc86035)
[05:58:18] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:06:58] PROBLEM - Puppet run on tools-exec-1436 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[07:41:58] RECOVERY - Puppet run on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:47:03] PROBLEM - Puppet run on tools-exec-1435 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[08:08:02] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189146 (jcrespo) > according to my logs my bot was failing at the SELECT COUNT(*) FROM $schema.BrokenRedirectDeleter; line for almost all of the time I made it fa...
[08:22:05] RECOVERY - Puppet run on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:35:30] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3189219 (jcrespo)
[08:37:05] (CR) ArielGlenn: Update README.md file, add .env.example & .gitignore (1 comment) [labs/tools/Wikimedia-Emoji-Bot] - https://gerrit.wikimedia.org/r/348010 (owner: D3r1ck01)
[08:52:34] PROBLEM - Puppet run on tools-exec-1442 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:07:55] Labs, Developer-Relations (Apr-Jun 2017), Google-Summer-of-Code (2017), Outreachy (Round-14): Set up a Zulip instance on tool Labs - https://phabricator.wikimedia.org/T163169#3189270 (Aklapper) @srishakatux: Are there any expectations towards the #Labs team to clarify, or why was that tag added t...
[09:44:09] (PS1) Ricordisamoa: Change group 1 item, add alkali-metal class [labs/tools/ptable] - https://gerrit.wikimedia.org/r/348696
[09:57:34] RECOVERY - Puppet run on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:49:32] PROBLEM - Puppet run on tools-exec-1434 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[11:03:45] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Tonitrus was created, changed by Tonitrus link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Tonitrus edit summary: Created page with "{{Tools Access Request |Justification=Converting XML-Files into Mediawiki-Syntax. |Completed=false |User Name=Tonitrus }}"
[11:24:33] RECOVERY - Puppet run on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:54:23] Hi all, I hope I'm in the right place to ask this question. For a data-intensive project I need to scrape data from wikipedia. Before I get started; the way to go is to build a bot in wikimedia labs, right? Or should I use Tool Labs?
[11:55:22] still here btw
[11:56:07] What kind of data?
[12:06:44] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Fisherinformation was created, changed by Fisherinformation link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Fisherinformation edit summary: Created page with "{{Tools Access Request |Justification=Hi, I am a student at the University of Groningen in the Netherlands. For a thesis project, I would like to build a bot that can scrap..."
[12:50:43] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189600 (Anomie) My point is that there's no reason I can think of besides the DB corruption that would cause that SELECT query to consume 40GB of memory. Do you ha...
[12:54:19] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[13:07:09] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Fisherinformation was modified, changed by Fisherinformation link https://wikitech.wikimedia.org/w/index.php?diff=1756715 edit summary: deleted my credentials
[13:20:35] PROBLEM - Puppet run on tools-exec-1434 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[13:21:52] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189677 (jcrespo) > Do you have a reason in mind? MyISAM caching is way worse and simpler than InnoDB, that is why we suggested converting it. It could have been t...
[13:29:18] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:44:30] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189737 (Anomie) You made two requests of me: 1. Switch to InnoDB. 2. Make sure it doesn't take 40GB of memory. I've already done #1, and [[https://phabricator.wi...
[13:47:38] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189741 (jcrespo) > So is there anything else you'd have me change besides switching to InnoDB? Yes, #2 is monitor that graph and if the spikes come back, avoid th...
[13:54:05] Labs: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898#3189755 (chasemp) >>! In T161898#3180656, @bd808 wrote: > I filed https://github.com/wsexport/tool/issues/127 with the wsexport tool. https://github.com/wsexport/tool/issues/127#issuecomme...
[13:55:34] RECOVERY - Puppet run on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:40:31] Labs, Tool-Labs-tools-Other: wsexport tool writing output to $HOME/tool/temp puts load on Tool Labs NFS server - https://phabricator.wikimedia.org/T163208#3189871 (bd808)
[14:44:43] Labs, Monitoring, Shinken: Admin request for user paladox and Luke081515 in the project shinken - https://phabricator.wikimedia.org/T162629#3189890 (Paladox) @bd808 or @chasemp could i make a new task for extending the git project with another instance so i can create a icinga2 class on the puppet co...
[14:45:12] Labs, Tool-Labs, InternetArchiveBot: tools.iabot is overloading the grid by running too many workers in parallel - https://phabricator.wikimedia.org/T161951#3189891 (bd808) Open>Resolved Apparently the tool had to be restarted manually because it required the current working directory to be `...
[14:51:05] PROBLEM - Puppet run on tools-exec-1432 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[14:52:53] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189913 (Anomie) Ok. For the initial run after restarting, I see no deviation outside the norm in [[https://grafana.wikimedia.org/dashboard/file/server-board.json?r...
[15:02:51] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3189944 (jcrespo) > and it completed in under 0.01 seconds Yeah, that looks much better :-) Sorry I cannot provide you as much time as I would love to on labs appl...
[15:08:27] ^ spot checked 1432 and it's the same IO issue
[15:10:13] *grumble*
[15:10:48] I opened a phab task version of the github issue for wsexport
[15:11:34] Labs: Request increased quota for git labs project - https://phabricator.wikimedia.org/T163213#3189988 (Paladox)
[15:11:40] I can make time this afternoon to look at the code there and see if I can figure out how to make it use $TMP instead of NFS
[15:11:43] Labs: Request increased quota for git labs project - https://phabricator.wikimedia.org/T163213#3190004 (Paladox)
[15:11:48] bd808: saw that thanks, I'm debating being a small bit more forgiving on the op that times out within the puppet run
[15:12:09] I'm conflicted as we haven't needed it till now but it's also fairly aggressive too (5s and 10s timeout)
[15:13:06] paladox: how do you think using another project really makes things better on the icinga2 issues? Am I supposed to be too slow to understand that you are doing the same thing to waste resources in another project?
[15:14:04] bd808 i have been using the git project. Using another instance. But i really want to create a puppet class but i am thinking it may break what i am doing on that instance.
[15:14:34] nobody wants the puppet class though. that's kind of my point
[15:14:42] Oh ok.
[15:21:31] Labs, Monitoring, Shinken: Admin request for user paladox and Luke081515 in the project shinken - https://phabricator.wikimedia.org/T162629#3190050 (Paladox) This task could resolve T124185 if the outcome of that task is icinga2.
[15:26:05] RECOVERY - Puppet run on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:27:31] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3190064 (Anomie) I may have misunderstood your instruction to "make sure it doesn't take 40GB" as meaning "find the cause and fix it before you can restart" rather...
[15:29:38] bd808: I just wanted to say thanks for doing T155063 :)
[15:29:38] T155063: Create Updated NodeJS container for Tool Labs - https://phabricator.wikimedia.org/T155063
[15:32:39] tarrow: yw! It actually got done a while ago but apparently we didn't close the task.
[15:34:07] hehe, no worries; I hadn't noticed but I'm excited to use it over my ugly hack
[15:50:19] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:52:02] Labs, Tool-Labs: Linkwatcher spawns many processes without parent - https://phabricator.wikimedia.org/T123121#3190124 (valhallasw) Done!
[15:58:32] ^ Error: /Stage[main]/Toollabs/Exec[ensure-grid-is-on-NFS]/returns: change from notrun to 0 failed: /bin/false returned 1 instead of one of [0]
[15:58:53] happened at Tue Apr 18 15:34:34 UTC 2017
[15:58:57] 1492529674
[16:14:40] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Codeofdusk was created, changed by Codeofdusk link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Codeofdusk edit summary: Created page with "{{Tools Access Request |Justification=I would like to write an [[w:extended essay|IB Extended Essay]] on Wikipedia's page history, using the replica databases for research. I'..."
[16:25:20] Labs: Request increased quota for git labs project - https://phabricator.wikimedia.org/T163213#3190225 (Dzahn) It makes sense to me that gerrit and icinga test setup shouldn't have to share a single instance since they are unrelated.
[16:30:19] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:30:58] Labs, Monitoring, Shinken: Admin request for user paladox and Luke081515 in the project shinken - https://phabricator.wikimedia.org/T162629#3190254 (bd808) Open>stalled >>! In T162629#3190050, @Paladox wrote: > This task could resolve T124185 if the outcome of that task is icinga2. Or better...
[16:35:19] Labs, Monitoring, Shinken: Admin request for user paladox and Luke081515 in the project shinken - https://phabricator.wikimedia.org/T162629#3190261 (Paladox) Some more information. They have abandoned developing the web ui for shinken, see https://github.com/shinken-monitoring/mod-webui which means t...
[16:38:57] Labs: Request increased quota for git labs project - https://phabricator.wikimedia.org/T163213#3190267 (bd808) Open>stalled p:Triage>Low See also: * {T162542} * {T162629} ** T162629#3187911 Opinion shopping is not helpful for advancing your cause.
[16:38:59] Labs, Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#3190272 (bd808)
[16:39:10] Labs, Monitoring, Shinken: Admin request for user paladox and Luke081515 in the project shinken - https://phabricator.wikimedia.org/T162629#3190277 (Paladox) Found the core repo https://github.com/naparuba/shinken last release was in march 2016.
[16:51:51] Labs, Labs-Infrastructure, Patch-For-Review: Monitor dhcp/dnsmasq on labnet - https://phabricator.wikimedia.org/T162956#3190352 (Andrew) Open>Resolved Shinken now monitors this and emails on failure.
[17:07:31] bd808 it looks pretty much like the shinken project has been abandoned. The web ui is no longer getting updates, and the core was last updated in March 2016.
[17:08:20] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Codeofdusk was modified, changed by Codeofdusk link https://wikitech.wikimedia.org/w/index.php?diff=1756765 edit summary:
[17:09:03] paladox: that does nothing to inform me as to why icinga2 is the compelling replacement for Labs
[17:09:37] Well it's not a reason for icinga2. It's a reason why not to use shinken and to find an alternative.
[17:11:13] https://wikitech.wikimedia.org/wiki/Monitoring_package_survey -- this is not a new topic of research
[17:13:03] Oh.
[17:13:14] Never knew that page was there.
[17:17:07] PROBLEM - Puppet run on tools-exec-1432 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:17:25] Labs, Labs-Infrastructure, MediaWiki-extensions-OpenStackManager, MW-1.27-release (WMF-deploy-2016-04-05_(1.27.0-wmf.20)), Patch-For-Review: Clean up after ldap->mysql keystone migration - https://phabricator.wikimedia.org/T126758#3190428 (Andrew) Instead, I am running: ``` ldapdelete -x -r...
[17:18:36] paladox: the system that we have actually been investing in over the last couple of quarters is https://wikitech.wikimedia.org/wiki/Prometheus
[17:20:03] ok
[17:23:49] Labs, Tool-Labs, Tool-Labs-tools-Other: request tool runs secWatch job once per minute - https://phabricator.wikimedia.org/T162979#3190436 (FNDE) Open>Resolved a:FNDE I scaled down the interval to ``` */5 * * * * jsub -once -l release=trusty -N secWatch python $HOME/FNBot/secWatch/bot_l...
[17:28:25] Labs, Labs-Infrastructure, MediaWiki-extensions-OpenStackManager, MW-1.27-release (WMF-deploy-2016-04-05_(1.27.0-wmf.20)), Patch-For-Review: Clean up after ldap->mysql keystone migration - https://phabricator.wikimedia.org/T126758#3190458 (Andrew) ...and now I'm going through and removing all...
[17:38:49] Labs, Labs-Infrastructure, MediaWiki-extensions-OpenStackManager, MW-1.27-release (WMF-deploy-2016-04-05_(1.27.0-wmf.20)), Patch-For-Review: Clean up after ldap->mysql keystone migration - https://phabricator.wikimedia.org/T126758#3190546 (Andrew) And I removed a bunch of other spare role def...
[17:39:02] Labs, Labs-Infrastructure, MediaWiki-extensions-OpenStackManager, MW-1.27-release (WMF-deploy-2016-04-05_(1.27.0-wmf.20)), Patch-For-Review: Clean up after ldap->mysql keystone migration - https://phabricator.wikimedia.org/T126758#3190548 (Andrew)
[17:39:08] Labs, Labs-Infrastructure, MediaWiki-extensions-OpenStackManager, MW-1.27-release (WMF-deploy-2016-04-05_(1.27.0-wmf.20)), Patch-For-Review: Clean up after ldap->mysql keystone migration - https://phabricator.wikimedia.org/T126758#2022698 (Andrew) Open>Resolved
[17:40:32] Labs, Horizon, Patch-For-Review: keystonehooks: Figure out about member role removal - https://phabricator.wikimedia.org/T162615#3190564 (Andrew) Open>Resolved
[17:57:04] RECOVERY - Puppet run on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:08:00] PROBLEM - Puppet run on tools-exec-1436 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:10:07] Tool-Labs-tools-Xtools: Bugs section on articleinfo returns incorrect results - https://phabricator.wikimedia.org/T148046#3190709 (MusikAnimal) a:MusikAnimal
[18:16:14] Tool-Labs-tools-Xtools, Community-Tech: [Epic] Rewrite XTools: Articleinfo - https://phabricator.wikimedia.org/T157602#3190746 (MusikAnimal)
[18:16:16] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Bugs section on articleinfo returns incorrect results - https://phabricator.wikimedia.org/T148046#3190742 (MusikAnimal) Open>Resolved Fixed in the rebirth project with [[ https://github.com/x-tools/xtools-rebirth/commit/cdaab16a6d27bbb5441f5ea1d0845b...
[18:30:53] PROBLEM - Puppet run on tools-exec-1441 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[18:42:56] RECOVERY - Puppet run on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:51:18] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[19:03:22] Labs, Labs-Infrastructure: Create a new labs flavor available to all project: largedisk - https://phabricator.wikimedia.org/T142166#3190980 (hashar) We could use a flavor with larger disk and low ram/cpu, for example for a Docker registry and a Swift cluster. `m1.small` has 20G disk, the partitioning sc...
[19:04:11] Labs, Labs-Infrastructure: Create a new labs flavor available to all project: largedisk - https://phabricator.wikimedia.org/T142166#3190987 (hashar)
[19:04:13] Labs, Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#3190986 (hashar)
[19:05:52] RECOVERY - Puppet run on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:23:26] PROBLEM - Puppet run on tools-exec-1437 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[19:24:03] Labs, Labs-Infrastructure: Create a new labs flavor available to all project: largedisk - https://phabricator.wikimedia.org/T142166#3191112 (chasemp) I think I oppose this being available by default to all projects as we cannot quota disk space usage effectively. I have no problem with it being availabl...
[19:26:17] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:32:17] PROBLEM - Puppet run on tools-exec-1433 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[19:33:58] PROBLEM - Puppet run on tools-exec-1436 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[19:42:17] RECOVERY - Puppet run on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:00] RECOVERY - Puppet run on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:53:06] PROBLEM - Puppet run on tools-exec-1432 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[20:00:22] Labs, Labs-Infrastructure: Create a new labs flavor available to all project: largedisk - https://phabricator.wikimedia.org/T142166#3191379 (hashar) What happens now is that people needing extra disk space ends up creating an m1.xlarge which also consumes 16GB of RAM/8cpu. Though that goes against their...
[20:03:26] RECOVERY - Puppet run on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:28:05] RECOVERY - Puppet run on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:37:08] !log tools Restarted bigbrother on tools-services-02
[20:37:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[20:38:07] (PS3) D3r1ck01: Update README.md file, add .env.example, .gitignore & CREDIT [labs/tools/Wikimedia-Emoji-Bot] - https://gerrit.wikimedia.org/r/348010
[20:39:06] I am pretty sure it won't, but will the datacentre change affect labs/tools?
[20:43:45] Labs, Tool-Labs, Community-Tech: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3191572 (Anomie) Second run, no deviation outside the norm in [[https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&panelId=14&fullscreen&orgI...
[20:49:00] (CR) Dereckson: Update README.md file, add .env.example, .gitignore & CREDIT (5 comments) [labs/tools/Wikimedia-Emoji-Bot] - https://gerrit.wikimedia.org/r/348010 (owner: D3r1ck01)
[20:54:01] Zppix: no, other than api requests may be briefly interrupted as things switch from one datacenter to the other
[20:54:25] bd808: so nothing directly hosted within tools/labs will be affected just api requests
[20:54:37] correct
[21:01:16] bd808: ok thanks!
[21:05:56] (PS4) D3r1ck01: Add configs, docs and credit contributors [labs/tools/Wikimedia-Emoji-Bot] - https://gerrit.wikimedia.org/r/348010
[21:16:18] Labs, Tool-Labs, Tool-Labs-tools-Other: bigbrother not trying to start missing iabot job - https://phabricator.wikimedia.org/T163265#3191684 (bd808) a:Cyberpower678
[21:17:39] Labs, Tool-Labs, Tool-Labs-tools-Other: bigbrother not trying to start missing iabot job - https://phabricator.wikimedia.org/T163265#3191699 (bd808) a:Cyberpower678>None that's a dumb herald rule :/
[21:26:43] Labs, Labs-Infrastructure: Shorter token life for novaobserver/novaadmin - https://phabricator.wikimedia.org/T163259#3191735 (Andrew)
[21:30:12] bd808: i think herald is fighting with you
[21:30:51] yeah. there is a very 'interesting' herald rule firing there.
[21:31:36] Labs, Tool-Labs, Tool-Labs-tools-Other: bigbrother not trying to start missing iabot job - https://phabricator.wikimedia.org/T163265#3191773 (Cyberpower678) a:Cyberpower678>None >>! In T163265#3191699, @bd808 wrote: > that's a dumb herald rule :/ I've removed that bit.
[22:41:07] Hey guys, I'm looking in the archive table, and all of the comments are null. Should they not be the comment of the original revision?
[22:42:30] PROBLEM - Puppet run on tools-exec-1430 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:58:49] (PS5) D3r1ck01: Add configs, docs and credit contributors [labs/tools/Wikimedia-Emoji-Bot] - https://gerrit.wikimedia.org/r/348010
[23:05:00] PROBLEM - Puppet run on tools-exec-1436 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[23:08:01] Tool-Labs-tools-Xtools, Community-Tech: Set up XTools routing to support individual Tool Labs accounts - https://phabricator.wikimedia.org/T163283#3192212 (MusikAnimal)
[23:13:16] Tool-Labs-tools-Xtools, Community-Tech: Optimize edit count queries in XTools - https://phabricator.wikimedia.org/T163284#3192230 (kaldari)
[23:17:29] RECOVERY - Puppet run on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:18:03] PROBLEM - Puppet run on tools-exec-1435 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:19:12] Tool-Labs-tools-Xtools, Community-Tech: Set up XTools routing to support individual Tool Labs accounts - https://phabricator.wikimedia.org/T163283#3192212 (Matthewrbowker) This might require further investigation. From an initial look, an easy way might be to install an individual copy of xtools into ea...
[23:21:14] Tool-Labs-tools-Xtools, Community-Tech: Optimize edit count queries in XTools - https://phabricator.wikimedia.org/T163284#3192230 (kaldari) p:Triage>Normal
[23:27:54] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3192305 (kaldari)
[23:29:03] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3192212 (kaldari) p:Triage>Normal
[23:39:57] RECOVERY - Puppet run on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:45:33] PROBLEM - SSH on tools-exec-1442 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:50:22] RECOVERY - SSH on tools-exec-1442 is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[23:53:02] RECOVERY - Puppet run on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:54:04] PROBLEM - Puppet run on tools-exec-1432 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:58:15] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Srishakatux was created, changed by Srishakatux link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Srishakatux edit summary: Created page with "{{Tools Access Request |Justification=I would like to set up Zulip instance (https://zulip.readthedocs.io/en/latest/prod-install.html) on Labs as part of the Developer Relatio..."