[01:19:15] PROBLEM - Puppet run on tools-exec-1420 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[01:59:13] RECOVERY - Puppet run on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:10:54] 06Labs, 10DBA, 13Patch-For-Review, 07Regression: Tool Labs: Add skin, language, and variant to user_properties_anon - https://phabricator.wikimedia.org/T152043#3165618 (10Krinkle) >>! In T152043#3161912, @chasemp wrote: >>>! In T152043#3152818, @Andrew wrote: >> I merged the puppet change, but maybe this n...
[03:18:49] 06Labs, 10Striker, 10Tool-Labs: Implement Tool Labs membership application and processing in Striker - https://phabricator.wikimedia.org/T162508#3165619 (10bd808)
[03:19:19] 06Labs, 10wikitech.wikimedia.org: Get rid of SemanticMediaWiki/SRF/SF from wikitech.wikimedia.org - https://phabricator.wikimedia.org/T53642#545071 (10bd808)
[03:19:20] 06Labs, 10Striker, 10Tool-Labs: Implement Tool Labs membership application and processing in Striker - https://phabricator.wikimedia.org/T162508#3165635 (10bd808)
[03:49:38] 06Labs, 10DBA, 13Patch-For-Review, 07Regression: Tool Labs: Add skin, language, and variant to user_properties_anon - https://phabricator.wikimedia.org/T152043#3165657 (10bd808) "for now it does" == the script needs to be run by someone with access to labsdb1001/1003/1009/1010/1011.
[06:33:48] PROBLEM - Puppet run on tools-bastion-05 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[07:13:46] RECOVERY - Puppet run on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:15:11] 06Labs, 10DBA, 15User-Urbanecm: Prepare and check storage layer for wbwikimedia - https://phabricator.wikimedia.org/T162513#3165738 (10Urbanecm)
[07:15:21] 06Labs, 10DBA: Prepare and check storage layer for wbwikimedia - https://phabricator.wikimedia.org/T162513#3165752 (10Urbanecm)
[11:21:20] 06Labs, 10Tool-Labs, 10DBA: labsdb1001 and labsdb1003 short on available space - https://phabricator.wikimedia.org/T132431#3165926 (10jcrespo) 05Resolved>03Open We just had a spike on temporary tables being created, causing service disruption to all users.
[11:30:12] 06Labs, 10Tool-Labs, 10DBA: s51362 has been rate limited to 2 concurrent connections for creating hundreds of 1400-second queries to labsdb1001 and labsdb1003 every 10 seconds - https://phabricator.wikimedia.org/T162519#3165938 (10jcrespo)
[11:31:59] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Other, 10DBA: s51362 has been rate limited to 2 concurrent connections for creating hundreds of 1400-second queries to labsdb1001 and labsdb1003 every 10 seconds - https://phabricator.wikimedia.org/T162519#3165938 (10jcrespo) a:05jcrespo>03None It is believed this...
[11:41:53] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Other, 10DBA: s51362 has been rate limited to 2 concurrent connections for creating hundreds of 1400-second queries to labsdb1001 and labsdb1003 every 10 seconds - https://phabricator.wikimedia.org/T162519#3165959 (10jcrespo) For lab admins: This incident https://graf...
[13:14:42] 06Labs, 10Tool-Labs, 10DBA: labsdb1001 and labsdb1003 short on available space - https://phabricator.wikimedia.org/T132431#2198300 (10Marostegui) Do you think it is still worth compressing whatever is on InnoDB (big wikis) not compressed?
[13:23:00] PROBLEM - Puppet run on tools-worker-1008 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[14:03:00] RECOVERY - Puppet run on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:11:44] !log tools.drtrigonbot Killed 7 stale jobs in qwait
[14:11:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.drtrigonbot/SAL
[15:19:09] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Samuele2002 was created, changed by Samuele2002 link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Samuele2002 edit summary: Created page with "{{Tools Access Request |Justification=For Create New Tool |Completed=false |User Name=Samuele2002 }}"
[16:58:56] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166228 (10hashar)
[16:59:46] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166241 (10Paladox) p:05Triage>03High
[17:23:47] PROBLEM - Puppet run on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:48:45] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166249 (10chasemp) ```d35ae797-aa11-4a47-8236-e42e24c96abd dd4a2779-b256-4d71-9059-d7b65380f3d0 de6b9ef9-e531-4567-8090-621c35176b9d d6c32ccb-b68b-4...
[17:52:46] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166250 (10Zppix) a:03chasemp Thanks Chase!
[18:03:46] RECOVERY - Puppet run on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:05:11] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166253 (10Zppix) 05Open>03Resolved Appears to be fixed according to integration.wikimedia.org/ci
[18:06:28] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166255 (10Paladox) 05Resolved>03Open Should leave it open for investigation or until @chasemp or @hasher says it is good.
[18:33:53] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166304 (10Andrew) 05Open>03Resolved This looks resolved to me. The main action item that I can think of is https://phabricator.wikimedia.org/T9...
[18:38:58] 06Labs, 10Labs-Infrastructure, 10Continuous-Integration-Infrastructure: OpenStack instances stuck in deletion state - https://phabricator.wikimedia.org/T162529#3166308 (10hashar) Indeed it is resolved. There are some components in OpenStack that end up being stuck from time to time, that is a known issue an...
[23:00:12] 06Labs, 10WikiApiary: Requesting more disk space - https://phabricator.wikimedia.org/T162534#3166400 (10DeepBlue)
[23:06:49] 06Labs, 10WikiApiary: Requesting more disk space - https://phabricator.wikimedia.org/T162534#3166414 (10DeepBlue) Forgot to mention this but the server is the wikiapiary db1 server currently using ~30gb of memory