[00:35:42] Labs, Labs-Infrastructure, DBA: labsdb* has no High Availability solution - https://phabricator.wikimedia.org/T141097#2491036 (bd808) >>! In T141097#2489365, @jcrespo wrote: > I think the title could mislead readers that are not in the loop. Please improve the title as needed. I just expanded the "...
[01:42:50] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[01:47:40] PROBLEM - Puppet run on tools-bastion-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[01:49:26] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[01:53:09] Quarry: Mutiple columns with the same name will cause the result to not be shown - https://phabricator.wikimedia.org/T141233#2491058 (Huji)
[02:22:51] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:35:34] PROBLEM - Puppet run on tools-bastion-05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:38:15] PROBLEM - Puppet run on tools-precise-dev is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:43:53] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[03:27:41] RECOVERY - Puppet run on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:29:25] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:40:36] RECOVERY - Puppet run on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:48:15] RECOVERY - Puppet run on tools-precise-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[03:48:51] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:57:41] What's the enwiki_p table I can hit for category membership?
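The category-membership question goes unanswered in the log. As a hedged pointer: in the MediaWiki schema of that period, membership lives in the `categorylinks` table, where `cl_from` is the member's page ID and `cl_to` is the category title in database form (underscores, no "Category:" prefix), and it is joined to `page` to get titles. The sketch below runs that lookup against an in-memory SQLite stand-in. The sample rows and the `members_of` and `demo_connection` helpers are hypothetical, the real tables carry more columns, and later MediaWiki releases have reworked `categorylinks`, so check the current schema before relying on it.

```python
import sqlite3


def demo_connection():
    """Build an in-memory stand-in for the two tables (illustrative rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,   -- page_id of the member page
            cl_to TEXT,        -- category title, database form
            cl_type TEXT       -- 'page', 'subcat' or 'file'
        );
        INSERT INTO page VALUES (1, 0, 'Ada_Example');
        INSERT INTO page VALUES (2, 0, 'Bob_Example');
        INSERT INTO page VALUES (3, 14, 'Living_people');
        INSERT INTO categorylinks VALUES (1, 'Living_people', 'page');
        INSERT INTO categorylinks VALUES (2, 'Living_people', 'page');
        INSERT INTO categorylinks VALUES (1, 'Example_stubs', 'page');
        """
    )
    return conn


def members_of(conn, category):
    """Return (namespace, title) pairs for every page in `category`."""
    rows = conn.execute(
        """
        SELECT page_namespace, page_title
        FROM categorylinks
        JOIN page ON page_id = cl_from
        WHERE cl_to = ?
        ORDER BY page_title
        """,
        (category,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for namespace, title in members_of(demo_connection(), "Living_people"):
        print(namespace, title)
```

On the replicas the same SELECT would be issued against `enwiki_p` through a MySQL client; only the connection setup differs from this stand-in.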
[06:46:07] Labs, Operations, Ops-Access-Requests, Patch-For-Review: madhuvishy is moving to operations on 7/18/16 - https://phabricator.wikimedia.org/T140422#2491222 (MoritzMuehlenhoff) I'll drop those people with an expired PGP key later on, so that we can add her to pwstore.
[07:31:39] !flames BASURAS...... MATARE A A SIMOV...... HAHAHAHAHA
[07:32:25] slow night I guess
[07:50:16] Labs, Operations, Ops-Access-Requests, Patch-For-Review: madhuvishy is moving to operations on 7/18/16 - https://phabricator.wikimedia.org/T140422#2491273 (MoritzMuehlenhoff) @madhuvishy : Please upload your PGP key to the public keyserver network by running gpg --send-key FINGERPRINT_OF_YOUR_KE...
[08:40:52] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[08:43:17] Labs, MediaWiki-Vagrant: performance of vagrant in labs instances - https://phabricator.wikimedia.org/T141111#2491351 (Physikerwelt) p:Triage>Normal
[09:20:54] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:20:56] Labs, Labs-Infrastructure, Operations: investigate slapd memory leak - https://phabricator.wikimedia.org/T130593#2491420 (fgiunchedi) indeed, looks like serpens started leaking memory even with an updated slapd, and as soon as it took over from seaborgium.
[10:04:55] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-api-incomplete-data/en]] i18n issue - https://phabricator.wikimedia.org/T135817#2311923 (Nemo_bis) To clarify: saying that $1 is always plural means nothing for the languages where there are multiple plural rules. If $1 has a limited set of possible...
[10:05:14] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-api-incomplete-data/en]] i18n issue - https://phabricator.wikimedia.org/T135817#2491519 (Nemo_bis) p:Triage>Normal
[10:40:59] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-api-incomplete-data/en]] i18n issue - https://phabricator.wikimedia.org/T135817#2491551 (Purodha) Unless anything is special here, using the usual PLURAL will do.
[11:27:13] PROBLEM - Puppet run on tools-webgrid-lighttpd-1203 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:15:12] Labs, Labs-Infrastructure, Discovery, Maps, and 2 others: Update coastline data in OSM postgres db (osmdb.eqiad.wmnet) - https://phabricator.wikimedia.org/T140296#2491701 (akosiaris) Open>Resolved a:akosiaris So, I've opted for using the already present puppet code to do the update....
[13:30:13] !log tools.heritage Deployed latest from Git: ff61234, 112c14f, 7e98153, b614e78, aafc788 & 3222df9 (T140795), baad88c, ac388ee
[13:30:15] T140795: Map sources using Wikidata to wd_item - https://phabricator.wikimedia.org/T140795
[13:30:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[13:50:05] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2491914 (chasemp)
[13:50:07] Labs: Two small instances: for WikiToLearn development - https://phabricator.wikimedia.org/T115282#2491913 (chasemp) Open>Invalid
[13:58:19] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2491923 (MarcoAurelio)
[13:59:14] Labs, Phabricator: Applying role role::phabricator::main causes errors on instances - https://phabricator.wikimedia.org/T138881#2491935 (chasemp) p:Triage>Low
[13:59:56] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2491923 (Paladox) You can't change your username in gerrit, I don't think. It may be possible as it is ldap but not sure since in gerrit doing it through there test signup warns you that when you set...
[14:00:14] Labs, Tool-Labs, Patch-For-Review: Add appropriate timeouts to maintain-kubeusers - https://phabricator.wikimedia.org/T141203#2491942 (chasemp) p:Triage>Normal a:yuvipanda
[14:00:37] Labs, Tool-Labs: Created tool does not show up, re-creation impossible - https://phabricator.wikimedia.org/T141178#2491944 (chasemp) p:Triage>Normal
[14:00:44] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2491945 (chasemp) p:Triage>Normal
[14:01:14] Labs, Labs-Kubernetes, Tool-Labs: Install jq, sed, grep, sort - https://phabricator.wikimedia.org/T141082#2491946 (chasemp) p:Triage>Normal
[14:01:25] Labs, Labs-Kubernetes, Tool-Labs: Create failover host for docker registry - https://phabricator.wikimedia.org/T141030#2491947 (chasemp) p:Triage>Normal
[14:01:56] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2491960 (chasemp) p:Triage>High
[14:03:47] Labs, Tool-Labs: Write diamond collector for gridengine job count stats - https://phabricator.wikimedia.org/T140999#2491966 (chasemp) p:Triage>Normal a:chasemp
[14:03:55] Labs, Labs-Kubernetes, Tool-Labs: Monitor kube2proxy failures - https://phabricator.wikimedia.org/T140988#2491969 (chasemp) p:Triage>Normal
[14:04:04] Labs, Labs-Infrastructure: Recapture unused floating IPs - https://phabricator.wikimedia.org/T140985#2491970 (chasemp) p:Triage>Normal
[14:04:26] Labs, Tool-Labs, MediaWiki-Interwiki, MediaWiki-extensions-Interwiki, Wikimedia-Interwiki-links: Toollabs interwiki doesn't work when url parameters needed - https://phabricator.wikimedia.org/T140981#2491971 (chasemp) p:Triage>Low
[14:04:39] PROBLEM - SSH on tools-grid-master is CRITICAL: Server answer
[14:04:47] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#2491972 (chasemp) p:Triage>Normal
[14:05:01] Labs, Labs-Infrastructure, Labs-project-extdist: legoktm unable to log into extdist-02.eqiad.wmflabs - https://phabricator.wikimedia.org/T140711#2491973 (chasemp) Open>Resolved
[14:05:14] Labs, Tool-Labs: Normalize kernel on all Debian Jessie nodes in tools - https://phabricator.wikimedia.org/T140611#2491974 (chasemp) p:Triage>Normal
[14:07:13] well I cannot in fact ssh into the master but qstat works for it seems
[14:07:15] confusing atm
[14:08:47] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2491978 (chasemp) note we are upgrading all kernels to be consistent T140611
[14:09:09] Labs, Labs-Infrastructure, DBA: Add pp_propname/pp_value index to Labs replica - https://phabricator.wikimedia.org/T140609#2491981 (chasemp) p:Triage>Normal
[14:09:31] Labs, Labs-Infrastructure, Adminbot: Get a cloak for morebots & labs-morebots - https://phabricator.wikimedia.org/T140547#2491983 (chasemp) p:Triage>Normal
[14:10:18] Labs, WMF-Legal: Potential ambiguities in the Labs Terms of Use - https://phabricator.wikimedia.org/T140486#2491984 (chasemp) p:Triage>High
[14:11:01] Labs, Operations: Create an NFS mount manager - https://phabricator.wikimedia.org/T140483#2491986 (chasemp) a:chasemp
[14:11:06] Labs, Labs-Infrastructure, DBA, Patch-For-Review: Setup and provision labsdb1009, labsdb1010 and labsdb1011 - https://phabricator.wikimedia.org/T140452#2491987 (chasemp) p:Triage>High
[14:11:16] Labs, Horizon: Allow users to edit proxies - https://phabricator.wikimedia.org/T140391#2491988 (chasemp) p:Triage>Normal
[14:11:26] Labs: Review resource usage for projects with quotas over the default. - https://phabricator.wikimedia.org/T140381#2491991 (chasemp) p:Triage>High
[14:12:01] I am going to have to reboot the master node and I'm not sure how that'll work out as it's unresponsive
[14:12:07] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2491998 (MarcoAurelio) >>! In T141261#2491938, @Paladox wrote: > You can't change your username in gerrit, I don't think. > > It may be possible as it is ldap but not sure since in gerrit doing it th...
[14:13:28] Labs, Tool-Labs, Community-Tech-Tool-Labs: Add kubernetes status page to admin tool - https://phabricator.wikimedia.org/T140255#2492006 (chasemp) p:Triage>Normal
[14:13:39] Labs, Labs-Infrastructure: Investigate rabbitmq tcp_listen_options setting (and others) - https://phabricator.wikimedia.org/T140175#2492007 (chasemp) p:Triage>Normal
[14:13:48] Labs, Labs-Kubernetes, Tool-Labs: Install dependencies for python-lxml in python container - https://phabricator.wikimedia.org/T140117#2492008 (chasemp) p:Triage>Normal
[14:13:55] Labs, Labs-Kubernetes, Tool-Labs: Install libmysqlclient-dev in tools python2 kubernetes containers - https://phabricator.wikimedia.org/T140112#2492009 (chasemp) p:Triage>Normal
[14:14:01] Labs, Labs-Kubernetes, Tool-Labs, Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2492010 (chasemp) p:Triage>Normal
[14:14:35] Labs: Creating a instance with precise fails - https://phabricator.wikimedia.org/T140099#2492014 (chasemp) p:Triage>Normal
[14:14:43] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2491923 (matmarex) It can be done, but it's a very manual and fairly error-prone process (several different places need to be changed at the same time) so no one likes doing it often. I had mine chang...
[14:15:30] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2492019 (matmarex) Oh, hm, or are you talking about the shell name? I was talking about the login and display name. No idea about shell.
[14:15:36] Labs, Tool-Labs: No permission after creating a new tool - https://phabricator.wikimedia.org/T140004#2492021 (chasemp) Open>Resolved a:chasemp race condition was resolved
[14:15:44] Labs, Tool-Labs, Community-Tech-Tool-Labs: Add publicly-editable tag system to http://tools.wmflabs.org/?list - https://phabricator.wikimedia.org/T139991#2492024 (chasemp) p:Triage>Normal
[14:16:01] Labs, Operations, Patch-For-Review: access_new_install role vs. Labs vs. the future - https://phabricator.wikimedia.org/T139971#2492025 (chasemp) p:Triage>Normal
[14:17:39] !log tools nova reboot 64f01f90-c805-4a2e-9ed5-f523b909094e (grid master)
[14:17:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[14:18:23] Labs, Labs-Infrastructure, Patch-For-Review: Improvements to cold-migrate script - https://phabricator.wikimedia.org/T139272#2492037 (chasemp) p:Triage>Normal
[14:19:02] Labs: social-tools2 instance in SHUTOFF state - https://phabricator.wikimedia.org/T139265#2492042 (chasemp) Open>Resolved a:chasemp should be this was caused by the under resource issue at the time, resolving as followup in T139272 seems sufficient
[14:23:37] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2492052 (MarcoAurelio) >>! In T141261#2492019, @matmarex wrote: > Oh, hm, or are you talking about the shell name? I was talking about the login and display name. No idea about shell. Hello. Yes, I t...
[14:24:12] PROBLEM - Puppet run on tools-grid-master is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[14:24:42] RECOVERY - SSH on tools-grid-master is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[14:28:20] Labs, Tool-Labs, Community-Tech-Tool-Labs, Developer-Relations, Documentation: Run a documentation sprint for Labs - https://phabricator.wikimedia.org/T101659#2492066 (Aklapper)
[14:28:58] PROBLEM - Puppet staleness on tools-grid-master is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[14:34:01] RECOVERY - Puppet staleness on tools-grid-master is OK: OK: Less than 1.00% above the threshold [3600.0]
[14:38:11] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2492081 (MarcoAurelio) To clarify, I'd like that whenever I upload new patches to gerrit, it shows that the author is MarcoAurelio, not maurelio. I don't mind SSH-ing with maurelio (which in fact I fi...
[14:47:07] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2492102 (matmarex) In that case, I think it's your local config. Run this to view your current author name: git config --global user.name And to change it: git config --global user.name "Marco...
[14:59:11] RECOVERY - Puppet run on tools-grid-master is OK: OK: Less than 1.00% above the threshold [0.0]
[15:05:00] Labs, Gerrit, Operations: Gerrit username change request - https://phabricator.wikimedia.org/T141261#2492127 (MarcoAurelio) Open>Resolved p:Triage>Low a:MarcoAurelio Hi @matmarex -- I did what you said and it's working well: https://gerrit.wikimedia.org/r/#/c/300880 as example. Th...
[15:36:17] Labs, Labs-Infrastructure, Discovery, Maps, and 2 others: Update coastline data in OSM postgres db (osmdb.eqiad.wmnet) - https://phabricator.wikimedia.org/T140296#2492229 (dschwen) Resolved>Open Uuuuaaahhhh, now I'm getting `ERROR: permission denied for relation coastlines`
[15:40:48] Labs, Labs-Infrastructure, Discovery, Maps, and 2 others: Update coastline data in OSM postgres db (osmdb.eqiad.wmnet) - https://phabricator.wikimedia.org/T140296#2492245 (akosiaris) >>! In T140296#2492229, @dschwen wrote: > Uuuuaaahhhh, now I'm getting `ERROR: permission denied for relation coa...
[15:42:15] Labs, Labs-Infrastructure, Operations: investigate slapd memory leak - https://phabricator.wikimedia.org/T130593#2492249 (yuvipanda) p:Normal>High a:MoritzMuehlenhoff>None (moving to high since this caused a couple more outages)
[15:42:26] Labs, Labs-Infrastructure, Operations: investigate slapd memory leak - https://phabricator.wikimedia.org/T130593#2492252 (yuvipanda) Have we considered giving it more RAM?
[15:50:08] Labs, Phabricator: Applying role role::phabricator::main causes errors on instances - https://phabricator.wikimedia.org/T138881#2492310 (mmodell) the production role (`role::phabricator::main`) isn't expected to work on labs instances.
[15:51:37] Labs, Labs-Infrastructure, Operations: investigate slapd memory leak - https://phabricator.wikimedia.org/T130593#2492321 (MoritzMuehlenhoff) >>! In T130593#2492252, @yuvipanda wrote: > Have we considered giving it more RAM? Won't help much, only stretching the interval until it OOMs at some point ag...
[15:51:57] Gerrit-Patch-Uploader, Gerrit: Gerrit-patch-uploader fails under git 1.9 - https://phabricator.wikimedia.org/T86304#2492322 (demon) Is this fixed with new gerrit?
[15:58:23] Labs, Tool-Labs, Community-Tech-Tool-Labs, Striker: Deploy "Striker" Tool Labs console to WMF production - https://phabricator.wikimedia.org/T136256#2492355 (chasemp) @andrew and I talked and we would like to put this on `californium` for now as a similar function to horizon with that node being...
[16:08:15] Labs, Labs-Other-Projects, Tool-Labs, WMDE-Analytics-Engineering: Add http://tools.wmflabs.org/grafana-json-datasource as a datasource to labs grafana instance - https://phabricator.wikimedia.org/T141265#2492394 (Addshore)
[16:16:57] Labs, Labs-Kubernetes, Tool-Labs: Aggregate system logs from kubernetes nodes - https://phabricator.wikimedia.org/T141270#2492444 (yuvipanda)
[16:18:04] Labs, Labs-Kubernetes, Tool-Labs: Aggregate system logs from kubernetes nodes - https://phabricator.wikimedia.org/T141270#2492458 (chasemp) p:Triage>High
[16:18:43] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2492460 (chasemp) because of the log periods missing during the issue (but some procs are still sending info like diamond) we are going to try a temporary log aggregatio...
[16:20:08] Labs, Graphite, Patch-For-Review: Setup "official labs grafana" instance - https://phabricator.wikimedia.org/T120295#2492478 (Addshore) >>! In T120295#2420973, @hashar wrote: > I welcome the idea of a Grafana instance dedicated to labs, on the other hand I would like to keep the labs datasource on th...
[16:22:16] YuviPanda: that is an odd thing with the labs grafana....
[16:23:52] Labs, Graphite, Patch-For-Review: Setup "official labs grafana" instance - https://phabricator.wikimedia.org/T120295#2492508 (yuvipanda) @addshore indeed, it is very odd..
[16:24:33] Labs, Graphite, Patch-For-Review: Setup "official labs grafana" instance - https://phabricator.wikimedia.org/T120295#2492510 (yuvipanda) @hashar the labs grafana also has prod graphite as a data source...
[16:26:36] hmm YuviPanda the response from the api literally reports an age of 0 each request too https://grafana-labs.wikimedia.org/api/search?query=&starred=false ...
[16:27:05] addshore yeah, no caching for grafana
[16:27:23] are both domains getting different dbs or something super odd?
[16:28:25] nope, it's the same backend
[16:53:10] Labs, Graphite, Patch-For-Review: Setup "official labs grafana" instance - https://phabricator.wikimedia.org/T120295#2492673 (hashar) >>! In T120295#2492510, @yuvipanda wrote: > @hashar the labs grafana also has prod graphite as a data source... Yup I am well aware about that and I am making use of...
[16:54:26] Labs, Graphite, Patch-For-Review: Setup "official labs grafana" instance - https://phabricator.wikimedia.org/T120295#2492682 (yuvipanda) @hashar no, I meant the opposite - that the grafana at grafana-labs.wikimedia.org also has access to prod graphite - so you can move dashboards that are using labs...
[17:00:43] Labs, Labs-Infrastructure, Operations: Investigate failover failure of LDAP servers - https://phabricator.wikimedia.org/T141277#2492736 (yuvipanda)
[17:22:49] Is there JDK 8 on tool labs?
[17:25:32] hi OH-
[17:25:38] hai
[17:25:41] there's jdk8 in the new kubernetes setup
[17:25:47] not on gridengine
[17:26:03] Cool, I'll look into it, thanks
[17:26:16] yw. tom29739 has been playing with it as well
[17:27:43] OH-, yeah, it's pretty easy to set up
[17:28:16] * YuviPanda made golang containers yesterday
[17:28:24] I should probably also make perl6 and rust containers
[17:28:41] Tools supports all those languages?
[17:29:14] depends on what you mean by support
[17:29:19] there'll be containers available and people will be ok using them.
[17:29:40] I can't actually help with rust questions :)
[17:30:15] What's nice is that I can configure the settings for kubernetes
[17:30:52] Overall tom29739 likes the kubernetes setup very much :)
[17:31:35] :)
[17:33:37] YuviPanda, the containers are Jessie
[17:33:37] So why do we have a Trusty bastion?
[17:34:11] tom29739 because it needs to run gridengine clients as well
[17:34:28] and not going to port gridengine to jessie :)
[17:34:31] also did you see 'webservice shell'
[17:35:24] I did
[17:35:36] It worked, kinda
[17:35:53] I ended up making a starter script for my bot
[17:36:06] Put some pip commands in it
[17:36:28] It now has auto-updating :)
[17:36:43] what do you mean 'kinda'? the terminal size issues?
[17:36:49] ah, can you write that up? :)
[17:36:55] (auto updating, that is)
[17:37:05] I had some problems with the virtualenv creating
[17:37:39] quick question
[17:37:43] It created it, but then pip complained
[17:38:06] tom29739 did you create it in jessie or trusty?
[17:38:19] if I want to build a play project and run it on tool labs, I can put the build command into the run script, and then tell kubernetes to use that script so it runs in a container with jdk8?
[17:38:20] YuviPanda, jessie
[17:38:26] OH-, yep
[17:38:31] yeah
[17:38:32] oooh cool
[17:38:35] wait
[17:38:37] waaaait
[17:38:45] is this for the WLM stuff?
[17:39:10] You can have post start and pre stop too
[17:39:25] No, I'm working on some scala-based tools so I can learn scala and maybe make something useful at the same time
[17:39:28] you should probably not put your build scripts there though
[17:39:49] since if those take a while your restarts are going to take a while
[17:40:20] there are better ways to do it
[17:41:21] webservice shell is the way to do it IMO but there's no jdk8 support there just yet, so you have to use k8s directly just now
[17:41:26] but i'm adding jdk8 support for it now
[17:44:08] Labs, WMF-Legal: Potential ambiguities in the Labs Terms of Use - https://phabricator.wikimedia.org/T140486#2492952 (ZhouZ) Thanks for creating this task @tom29739. I will try to clarify this in the upcoming draft of the revised Labs Terms of Use. I think in this instance, username might not be categori...
[17:45:38] Labs, Labs-Infrastructure, Operations: Investigate failover failure of LDAP servers - https://phabricator.wikimedia.org/T141277#2492957 (MoritzMuehlenhoff) There is no failover in that sense, various LDAP clients allow to use multiple servers and depending on their configuration they may use round-ro...
[17:45:57] great, thanks
[17:54:24] OH-, there's a yaml file in /shared. It's called /shared/piagetbot/sopelbot.yaml
[17:54:44] OH-, copy it to your tool's home dir, and open it for editing
[17:57:07] use /data/project/shared please. it's the same thing, just a nicer path
[18:03:51] Labs, Labs-Infrastructure: Switch DNS to gdnsd - https://phabricator.wikimedia.org/T48824#526848 (AlexMonk-WMF) @andrew, do we still want to do this? It sounds possible: http://docs.openstack.org/developer/designate/backends/gdnsd_agent.html
[18:04:27] I want to run WPcleaner's code on labs and it needs more than 5g Vmem. how can i increase the memory limitation?
[18:04:28] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2493052 (AlexMonk-WMF) I've been thinking maybe we should try nodepool in labtest (running at a much smaller scale) so we can take a closer look at this...
[18:04:51] tom29739, got it, thanks
[18:05:32] how can i increase the memory limitation?
[18:06:22] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2111479 (AlexMonk-WMF) It's difficult to do anything with the instances now gone... I suggest this be closed
[18:07:17] Labs, Horizon: DNS Domains view in Horizon for Tools project displays only one domain - https://phabricator.wikimedia.org/T131334#2164011 (AlexMonk-WMF) This is designate's domain ownership system working as intended, should probably be closed
[18:09:51] OH-, you'll need to edit it to your needs, e.g. change the image, command, container names, etc
[18:11:58] Reza1615 why does it need more than 5g 'vmem'? also vmem is a gridengine concept that's only very loosely bound to actual memory usage
[18:12:25] YuviPanda: it uses a dump
[18:12:53] can you file a phabricator ticket with more details?
[18:13:01] is it reading all of a dump into memory?
[18:13:41] I think yes. it is not my code, it is the WPcleaning's code
[18:14:15] I see.
[18:14:15] it is java code which makes these tables https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=fawiki&view=all&orderby=id&sort=asc
[18:14:30] can you file a task with the request? thanks.
[18:14:47] Yes now I will do it
[18:18:45] Labs, Tool-Labs: Add SSHFP dns records to bastions - https://phabricator.wikimedia.org/T132225#2191980 (AlexMonk-WMF) This should be trivial for bastion (and tools, which has its own bastion?)
projectadmins to do: ```krenair@bastion-01:~$ ssh-keygen -r bastion.wmflabs.org bastion.wmflabs.org IN SSHFP 1...
[18:20:10] Labs: Request for increasing lab's account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493146 (Yamaha5)
[18:20:20] YuviPanda: https://phabricator.wikimedia.org/T141288
[18:21:00] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493161 (Yamaha5)
[18:21:20] PROBLEM - Puppet run on tools-docker-builder-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:21:20] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493146 (yuvipanda) Can you change 'java -Xmx5g' to 'java -Xmx4g' and see if that works? Also what's the jsub command you are using to submit this? Are you trying to run this on the bastions directly?
[18:23:07] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493178 (Yamaha5) I run it directly to have a test. the dump is more than 4g so -Xmx4g doesn't work
[18:25:08] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493185 (yuvipanda) Ah, I think the bastions in general have stricter memory limits than that, so you don't make it unusable for other people. For something that requires an entire 4g of RAM you'll have t...
[18:32:49] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493216 (Yamaha5) I tried to submit and it shows this error ``` jsub -once -N wpclean -mem 5g java -Xmx4g -cp /data/project/rezabot/WikipediaCleaner.jar org.wikipediacleaner.Bot fa Rezabot yugiyugi Lis...
[18:36:19] Labs, Labs-Infrastructure, Operations: Investigate failover failure of LDAP servers - https://phabricator.wikimedia.org/T141277#2493220 (chasemp) @andrew and I were discussing whether LVS would make sense in front of LDAP with the ability to more intelligently depool/handle complex failure cases.
[18:37:20] Labs: Request for increasing labs' account's Vmem limitation - https://phabricator.wikimedia.org/T141288#2493221 (yuvipanda) Right - in this case I think the grid *did* provide 5G of VMEM (we don't have upper limits on general gridengine usage, only on webservice usage), and JVM still refused to start. This...
[18:38:01] Labs, Operations, Ops-Access-Requests, Patch-For-Review: madhuvishy is moving to operations on 7/18/16 - https://phabricator.wikimedia.org/T140422#2493224 (madhuvishy) @MoritzMuehlenhoff Done! http://keys.gnupg.net/pks/lookup?op=get&search=0xA4D1DAC73B947C4D
[18:45:58] Labs, Labs-Kubernetes, Tool-Labs: Investigate moving docker to use direct-lvm devicemapper storage driver - https://phabricator.wikimedia.org/T141126#2493279 (yuvipanda) I tried this and am at: ``` daemon: error initializing graphdriver: devmapper: Device docker-thinpool is not a thin pool" ``` I f...
[19:09:00] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-select2-max-chars/en]] needs PLURAL support - https://phabricator.wikimedia.org/T129442#2493349 (MusikAnimal) @Macofe @Purodha sorry for the super late reply... For these messages we are using the JavaScript i18n library so PLURAL support should be i...
[19:19:30] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-api-incomplete-data/en]] i18n issue - https://phabricator.wikimedia.org/T135817#2493413 (MusikAnimal) In qqq it states `$1` is "the list of dates lacking data". I have further clarified that this value is not numerical, so I don't think PLURAL will w...
[19:37:15] Tool-Labs-tools-Pageviews, I18n: [[Wikimedia:Pageviews-url-structure-projects/fi]] i18n issue - https://phabricator.wikimedia.org/T139899#2493530 (MusikAnimal) Open>Resolved a:MusikAnimal Thanks! I've clarified with [[ https://github.com/MusikAnimal/pageviews/commit/035c3875f4d05a0dc3b0ae403d...
[20:32:52] Can someone confirm whether https://wikitech.wikimedia.org/wiki/Nova_Resource:Librarybase-reston-01.librarybase.eqiad.wmflabs is turned on?
[20:46:54] | 4bd57688-ab78-43c6-8f6d-a2dd75b5f48f | librarybase-reston-01 | ACTIVE | - | Running | public=10.68.18.95 |
[20:47:02] harej, nova thinks it is
[20:47:26] <|---|> Krenair: lol, your paste pinged my client :
[20:47:26] harej, and I can ping it, so yes
[20:47:27] <|---|> *:D
[20:49:42] harej, why do you ask? problems getting in?
[20:55:04] Krenair: indeed
[20:55:04] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2493811 (hashar) >>! In T115194#2493052, @AlexMonk-WMF wrote: > I've been thinking maybe we should try nodepool in labtest (running at a much smaller scale) so we...
[20:55:24] also, https://librarybase.wmflabs.org wasn't loading, but after you said the server was up I tried SSHing and that doesn't seem to work either
[20:57:02] ok
[20:58:29] harej, are you sure that domain is set up as a proxy?
[20:58:36] I mean it worked before
[20:59:55] well
[21:00:01] thing is the domain points to the proxy
[21:00:45] but the proxy doesn't have a route for it
[21:01:33] ok so I can't get into it with root key
[21:01:59] looking at http://tools.wmflabs.org/nagf/?project=librarybase
[21:02:19] with the month filter on
[21:02:24] can you get in with salt?
[21:02:24] it's been dead since 07/07
[21:02:31] I doubt it
[21:02:32] I'll try
[21:02:55] nope
[21:04:06] [8255281.190387] INFO: task nscd:4103 blocked for more than 120 seconds.
[21:04:07] [8255281.191125] Not tainted 3.16.0-4-amd64 #1
[21:04:08] [8255281.191653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[21:04:10] andrewbogott chasemp ^
[21:04:17] non tools, non k8s project hung with same symptoms
[21:04:30] sqlite> select route.domain from route, project where route.project_id = project.id and project.name = 'librarybase';
[21:04:30] sparql.librarybase.wmflabs.org.
[21:04:31] librarybase-web.wmflabs.org
[21:04:31] sqlite>
[21:04:34] nscd, ntp and others.
[21:04:51] totally different kernel to
[21:04:52] *too
[21:04:55] harej, ^ that's what the proxy has routes for
[21:05:13] andrewbogott also an entirely different labvirt node
[21:05:32] harej so the instance is fucked in a recoverable way.
[21:05:39] :O
[21:05:48] what happened?
[21:06:09] we don't actually know, this has been happening to tool nodes for a while, first time we've caught it in a non-tools-kubernetes node
[21:13:30] harej so the question now is if I recover them immediately, or hold for a day for forensics
[21:14:02] I don't think it needs to be restored immediately. Librarybase isn't exactly a production service. (Yet.) If keeping it on hold for forensics would help you, go ahead.
[21:17:25] harej ok!
[21:17:26] thank you
[21:26:16] RECOVERY - Puppet run on tools-docker-builder-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:27:05] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2493921 (hashar) Open>Resolved a:hashar Agreed. I guess that was a transient issue related to DHCP/DNS. I haven't encountered that...
[21:27:11] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2493924 (hashar) a:hashar>None
[21:28:40] musikanimal around?
[21:51:00] YuviPanda: yes but on mobile. What's up?
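The route lookup pasted at 21:04:30 can be reproduced against a throwaway database to see why https://librarybase.wmflabs.org had nowhere to go. This is only a sketch: the table and column names (route.domain, route.project_id, project.id, project.name) are taken from the query in the log, but the rest of the schema, the extra 'other' project and the helper name domains_for_project are assumptions made for illustration, not the proxy's real layout.

```python
import sqlite3


def domains_for_project(conn, project_name):
    """Run the same join as in the log and return the routed domains."""
    rows = conn.execute(
        "SELECT route.domain FROM route, project "
        "WHERE route.project_id = project.id AND project.name = ?",
        (project_name,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical in-memory stand-in for the proxy's database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE route (
        id INTEGER PRIMARY KEY,
        domain TEXT,
        project_id INTEGER REFERENCES project(id)
    );
    INSERT INTO project (id, name) VALUES (1, 'librarybase');
    INSERT INTO project (id, name) VALUES (2, 'other');
    INSERT INTO route (domain, project_id)
        VALUES ('sparql.librarybase.wmflabs.org.', 1);
    INSERT INTO route (domain, project_id)
        VALUES ('librarybase-web.wmflabs.org', 1);
    INSERT INTO route (domain, project_id)
        VALUES ('example.wmflabs.org', 2);
    """
)

routed = domains_for_project(conn, "librarybase")
print(routed)
# The bare domain is not among the routes, which matches what was said above:
# the domain points at the proxy, but the proxy has no route for it.
print("librarybase.wmflabs.org" in routed)
```

Using a bound parameter for the project name instead of pasting it into the SQL string is a design choice of this sketch, not something shown in the log.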
[21:51:33] musikanimal ah, nothing urgent. wanted to create a proper supported environment for running ruby and wanted to ask for help, can do later
[21:52:00] Cool I'll ping you when I'm back home :)
[21:53:32] cool
[22:35:13] bd808 I have python3 working straight up now! \o/ http://tools.wmflabs.org/lolrrit-wm/ is hello world
[22:35:35] nice
[22:38:35] Labs, Tool-Labs: dplbot webservice on Tools Labs fails repeatedly - https://phabricator.wikimedia.org/T115231#2494138 (Gorthian) This problem is continuing, intermittently, every day. I haven't been chiming in because it would get irritating, yet I want to keep this on the radar.
[22:39:52] madhuvishy what was the thing you were trying to get working on py3?
[22:40:20] the tool? readmore
[22:40:31] madhuvishy did you ever get it working?
[22:40:32] haven't poked it since
[22:40:39] problem wasn't py3
[22:40:58] right
[22:41:09] anyway, I have a much nicer py3 setup on k8s just now
[22:41:27] aah
[22:41:32] i didn't try k8s
[22:41:44] because docs at that point only seemed to support php
[22:41:53] will try with k8s engine
[22:42:29] madhuvishy see https://dpaste.de/HSs7
[22:42:30] :D
[22:42:40] the power of wheeeels
[22:42:48] Naice
[22:43:03] have to try it now
[22:43:07] (not now now)
[22:43:14] right
[22:43:17] I'll write docs now
[22:51:33] YuviPanda: oh sweet. So I can convert my py3 flask app to not use uwsgi-plain?
[22:52:19] legoktm yeah. which one?
[22:52:43] YuviPanda: https://tools.wmflabs.org/reviewers/
[22:52:54] legoktm ok, do you have some time now?
[22:53:04] sure!
[22:54:58] legoktm ok, let's try to follow https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web/Kubernetes#python_.28uwsgi_.2B_python3.4.29
[22:55:46] legoktm since you already have a venv, can you move that out of the way first?
[22:55:48] mv venv venv.old
[22:55:57] and also move away your uwsgi.ini if you had one
[22:55:59] ok
[22:56:46] dalba around?
[22:58:18] legoktm and since you're doing this from gridengine, you'll have to do a 'webservice --backend=gridengine stop' before the last step
[22:58:35] ok
[22:59:32] YuviPanda: why do I need stty cols 100 rows 50 ?
[22:59:48] legoktm otherwise your terminal is stuck at 40 cols or something tiny
[22:59:49] maybe 80?
[22:59:57] and it wraps and it feels weird
[23:00:04] oh heh
[23:00:10] my terminal is 80 cols!
[23:00:27] sure, then set it to stty cols 80 rows whatever
[23:00:47] bd808 is there a way for me to set terminal size via env variables?
[23:01:36] oh yeah, it's still kinda screwy
[23:02:09] legoktm yeah that's why I set it pretty high
[23:02:21] https://github.com/kubernetes/kubernetes/pull/25273 should make it much better but I can't rebase easily
[23:02:33] YuviPanda: hmmm.. by passing them to stty but I don't think any other way
[23:02:47] https://tools.wmflabs.org/reviewers/ 502 :(
[23:03:17] looking
[23:03:30] how would I debug this? does uwsgi.log still get populated?
[23:03:39] YuviPanda: As I recall the problem with the tty size in k8s is that it isn't catching SIGWINCH right?
[23:03:45] hmm, pod is in pending status
[23:04:01] bd808 yeah, but it also doesn't set the original size properly
[23:04:11] bash sets $LINES and $COLUMNS but that's based on what SIGWINCH tells it
[23:04:13] legoktm so I did 'kubectl get pod' and see the pod is in pending status
[23:04:39] so it's not ready yet?
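On the question at 23:00:47 about environment variables: exporting COLUMNS and LINES does not resize the tty the way stty does, but some programs consult those variables before asking the terminal. Python's standard shutil.get_terminal_size is one such case. The sketch below only illustrates that behaviour; the helper name terminal_size_from_env and the 100x50 values (borrowed from the stty command in the log) are assumptions for the example, and nothing here is specific to 'webservice shell'.

```python
import os
import shutil


def terminal_size_from_env(default_cols=80, default_rows=24):
    """Return (columns, rows) as seen by shutil.get_terminal_size.

    shutil checks the COLUMNS and LINES environment variables first and
    only queries the terminal (or uses the fallback) if they are unset
    or invalid.
    """
    size = shutil.get_terminal_size(fallback=(default_cols, default_rows))
    return size.columns, size.lines


# Simulate what `export COLUMNS=100 LINES=50` would do for this process.
os.environ["COLUMNS"] = "100"
os.environ["LINES"] = "50"
print(terminal_size_from_env())
```

Programs that read the size straight from the tty ignore these variables, so this is a workaround for some tools rather than a replacement for stty.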
[23:04:56] then I do a kubectl describe pod to see which node it's on
[23:05:05] legoktm no, it should be working fine, I'm debugging the system now
[23:06:44] legoktm ok, I just deleted that pod and it's now in ContainerCreating, which is where you want it to be (should be active in a few secs)
[23:06:55] legoktm is up now
[23:07:32] eh, https://tools.wmflabs.org/reviewers/?repo=mediawiki%2Fextensions%2FUploadWizard is 502 now :/
[23:08:02] oh, but that looks like an application error
[23:08:06] I see it in uwsgi.log
[23:08:08] thanks YuviPanda :D
[23:08:12] legoktm check uwsgi.log now
[23:08:18] was the pod error a fluke or something?
[23:08:26] legoktm looks like node failure.
[23:08:30] docker is stuck on that node
[23:08:35] legoktm does this require git?
[23:08:38] or any external tools?
[23:08:50] no, just makes API requests to gerrit
[23:08:57] which probably changed with the upgrade
[23:09:01] ah right
[23:09:01] ok
[23:09:02] cool
[23:09:05] debugging it locally now
[23:09:57] legoktm cool
[23:10:32] is the deploy process the same? just git pull and do a webservice restart?
[23:11:55] legoktm yup
[23:12:05] legoktm except your webservice restart will be much faster
[23:12:21] legoktm if you want to interact with the virtualenv you have to do it from inside webservice shell though
[23:12:29] that part kinda sucks...
[23:12:42] legoktm I'm wondering what could be a solution to that
[23:13:02] why does it have to happen inside that shell?
[23:13:04] one is you just pass the command you wanna run as second param, so you just prefix virtualenv stuff with 'webservice shell ~/www/python/venv/bin/pip install -r requirements.txt'
[23:13:14] legoktm because your bastion is trusty and the containers are jessie
[23:13:56] and if we make bastions jessie then we have to port gridengine to jessie, and if we have separate k8s/gridengine bastions I've to maintain a set of containers and an equivalent set of puppet changes...
[23:14:09] legoktm I could probably just put a ton of utilities inside webservice shell
[23:14:15] git, less, vim, emacs, etc
[23:14:31] nano!!!!
[23:14:38] hm but this is all python specific right?
[23:14:39] right, right, all the things
[23:14:45] and then basically you just become, then do webservice shell
[23:14:46] right
[23:14:58] can we make the requirements.txt file processed on every restart?
[23:15:10] so you'll get a base of utilities (git, nano, vim, etc) and then whatever you want on top (python/node)
[23:15:11] so dependency updates happen automatically?
[23:15:18] the problem with that is that that can take a long long time
[23:15:24] and if you need custom stuff in your venv, you have to use webservice shell?
[23:15:53] kubernetes has 'pre-start' stuff so we can technically do it, but I feel somewhat weird about it.
[23:15:54] store and diff the old requirements.txt state? I guess this is getting pretty magical now
[23:15:59] partially it's also veering into PaaS territory
[23:16:00] right
[23:16:05] openshift/deis and others do all these things
[23:16:11] in a nice, well documented, community supported way
[23:16:19] and I don't want to reinvent that wheel.
[23:16:24] that's how we got here in the first place
[23:17:51] legoktm I'm more in favor of putting utilities into webservice shell
[23:18:33] legoktm: help fill in T136265 and get us moving towards finding a real PaaS that will let you make custom images!
[23:18:34] T136265: Develop evaluation criteria for comparing PaaS solutions - https://phabricator.wikimedia.org/T136265
[23:19:10] bd808: should I just dump on the task?
[23:19:17] A PaaS (platform as a service) will almost certainly deal with your requirements.txt file
[23:19:22] sure.
[23:19:37] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2494274 (yuvipanda) I caught 1005 in the act! It got a pod scheduled on it, but was stuck in Pending.
I was able to ssh in, and docker ps was hung. It's still hung (even...
[23:19:43] we can make some cleanup once there is actually some info to organize
[23:20:24] Labs, Labs-Kubernetes, Tool-Labs, Community-Tech-Tool-Labs: Develop evaluation criteria for comparing Platform as a Service (PaaS) solutions - https://phabricator.wikimedia.org/T136265#2494275 (bd808)
[23:20:38] andrewbogott chasemp I found a live about-to-be-stuck instance!
[23:22:19] YuviPanda: what's it doing?
[23:22:30] weird stuff
[23:22:34] docker's a zombie process
[23:22:48] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2494276 (yuvipanda) docker is a zombie process! ``` root 1789 5.1 0.0 0 0 ? Zsl Jul15 774:50 [docker] ```
[23:25:11] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2494289 (yuvipanda)
[23:25:23] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2484363 (yuvipanda) Can confirm, it's got: ``` [656761.057813] INFO: task lighttpd:15169 blocked for more than 120 seconds. [656761.062759] Not tainted 4.4.0-1-am...
[23:26:40] woohoo, tool working now :)
[23:29:45] legoktm :D congratulations
[23:30:21] legoktm restarts are about 10x faster now
[23:36:01] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Kubernetes worker nodes hanging - https://phabricator.wikimedia.org/T141017#2494314 (yuvipanda) I looked at iostat graphs and see big spike, but only for vda and dm-0. You can find out what corresponds to 'dm-0' by doing: ``` dmsetup ls ``` I...
[23:36:08] andrewbogott I have some real hypothesis now
[23:36:34] YuviPanda: like what?
[23:37:01] andrewbogott I wrote that up in the ticket, kinda.
This is still restricted to only the worker nodes (nfi about the etcd ones yet)
[23:37:08] but mostly just the docker storage driver could be causing these
[23:37:14] * andrewbogott reads
[23:38:05] Hm… would be nice to have a theory that explains the non-k8s deaths
[23:38:14] not that I have one
[23:38:28] * YuviPanda nods
[23:40:10] andrewbogott so the good thing about that hypothesis is that we have things we can try to make that better!
[23:40:22] bad thing is that all of the options there kinda suck in one way or other (docker storage drivers)
[23:41:24] also I found this because of legoktm!
[23:41:38] :D
[23:42:46] In Horizon for image sources when you create a new instance there's an option for snapshots (none available)
[23:43:01] Does that mean I can create snapshots of my instances?
[23:43:13] tom29739: nope, we don't support snapshots because they gobble disk space
[23:43:34] Why does it say it in the list?
[23:44:25] Horizon is an upstream project, it's designed for every possible use case
[23:44:42] Oh
[23:45:18] we might be able to hack visibility of snapshots out of the UI
[23:48:42] yeah, could remove it if it upsets people
[23:49:44] It just seemed like something useful.
[23:50:02] I see it often so I thought I'd ask what it was
[23:50:23] how do I make a wikitext link to another section on the same page?
[23:51:39] YuviPanda: [[#Section]] I think
[23:52:04] If that doesn't work try [[page name#section]]
[23:52:10] yup seems to work. thanks tom29739
[23:52:32] legoktm I added a howto for moving existing python3 webservice projects to k8s: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web/Kubernetes#For_new_projects
[23:53:41] YuviPanda: do you want me to write up a guide on how to run arbitrary commands on Kubernetes?
[23:53:52] tom29739 I'd appreciate that yes!
[23:53:53] tom29739: it's a reasonable question… I don't know if we could enable it but quota people so they can only keep 1 or 2 snapshots, if so it might be worth turning on
[23:54:15] andrewbogott -1 :P I think we should just hide it
[23:54:39] andrewbogott: +1 it means I wouldn't have to write a salt config for everything
[23:54:49] Takes up loads of my time that
[23:55:38] And it means I can shelve/delete instances and get them back with minimum effort
[23:58:45] YuviPanda: super silly and lazy question...can we make --backend=k8s work? or add tab complete of some kind?
[23:59:03] I can probably make --backend=k8s work yeah
[23:59:10] kubernetes too long to type?
[23:59:14] legoktm although, you only need to do it once
[23:59:22] for restarts you can just do 'webservice restart'
[23:59:32] you can see the default in service.manifest - webservice will respect that
[23:59:40] I wish kubectl had tab complete
[23:59:57] That's probably my biggest whine about kubernetes