[00:14:14] scap, Operations, Performance-Team, Epic: During deployment old servers may populate new cache URIs - https://phabricator.wikimedia.org/T47877#2606312 (Krinkle)
[00:35:43] PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[00:53:47] Yippee, build fixed!
[00:53:47] Project performance-webpagetest-wpt-org build #1861: FIXED in 21 min: https://integration.wikimedia.org/ci/job/performance-webpagetest-wpt-org/1861/
[01:15:42] RECOVERY - Puppet run on deployment-ms-fe01 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:36] PROBLEM - Puppet run on deployment-kafka05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:32:38] RECOVERY - Puppet run on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:06:20] Yippee, build fixed!
[04:06:20] Project selenium-MultimediaViewer » safari,beta,OS X 10.9,contintLabsSlave && UbuntuTrusty build #130: FIXED in 10 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=contintLabsSlave%20&&%20UbuntuTrusty/130/
[04:18:05] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #130: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/130/
[05:15:42] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[05:50:41] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[07:09:58] Beta-Cluster-Infrastructure, Scap3: Fixup beta scap3 keyholder problems - https://phabricator.wikimedia.org/T144647#2606489 (hashar) For ssh host keys there is {T72792}. Could get the puppet agent to collect them on each host and on the deployment server generate the global ssh known_hosts file. That i...
[07:11:45] Release-Engineering-Team, User-greg: Make a table of access levels per service RelEng maintains per person - https://phabricator.wikimedia.org/T135187#2606493 (hashar) Daniel Zahn had the same use case and has landed the script in puppet.git :]
[07:15:55] Release-Engineering-Team, Community-Tech: updateCollation.php on terbium still run code from 1.28.0-wmf.16 against enwiki ( LoadBalancer::reallyOpenConnection: 402+ connections made (master=db1057) LoadBalancer.php line 850 ) - https://phabricator.wikimedia.org/T144580#2606495 (hashar) The too many conne...
[07:16:09] Release-Engineering-Team, Community-Tech: updateCollation.php on terbium still run code from 1.28.0-wmf.16 against enwiki ( LoadBalancer::reallyOpenConnection: 402+ connections made (master=db1057) LoadBalancer.php line 850 ) - https://phabricator.wikimedia.org/T144580#2606500 (hashar)
[07:30:22] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[08:10:23] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:19:26] (PS2) Hashar: Add jenkins job for apertium-ita, apertium-srd, apertium-srd-ita [integration/config] - https://gerrit.wikimedia.org/r/308176 (owner: KartikMistry)
[08:20:47] (CR) Hashar: [C: 2] Add jenkins job for apertium-ita, apertium-srd, apertium-srd-ita [integration/config] - https://gerrit.wikimedia.org/r/308176 (owner: KartikMistry)
[08:21:25] (Merged) jenkins-bot: Add jenkins job for apertium-ita, apertium-srd, apertium-srd-ita [integration/config] - https://gerrit.wikimedia.org/r/308176 (owner: KartikMistry)
[08:29:01] (PS2) Hashar: elasticsearch-tool: adding project to continuous integration [integration/config] - https://gerrit.wikimedia.org/r/308175 (owner: Gehel)
[08:29:05] (CR) Hashar: [C: 2] elasticsearch-tool: adding project to continuous integration [integration/config] - https://gerrit.wikimedia.org/r/308175 (owner: Gehel)
[08:30:04] (Merged) jenkins-bot: elasticsearch-tool: adding project to continuous integration [integration/config] - https://gerrit.wikimedia.org/r/308175 (owner: Gehel)
[08:53:14] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[09:33:14] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0]
[10:02:27] Release-Engineering-Team, DBA, MediaWiki-Maintenance-scripts, Operations, RfC: Add section for long-running tasks on the Deployment page (specially for database maintenance) - https://phabricator.wikimedia.org/T144661#2606542 (jcrespo)
[12:02:23] Project selenium-RelatedArticles » chrome,beta-mobile,Linux,contintLabsSlave && UbuntuTrusty build #133: FAILURE in 1 min 22 sec: https://integration.wikimedia.org/ci/job/selenium-RelatedArticles/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta-mobile,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/133/
[12:21:57] Continuous-Integration-Config: maps/kartotherian/deploy does not go via jenkins - https://phabricator.wikimedia.org/T142740#2606635 (hashar) Added `noop` job which is always a success with 6b38faa08e662a4a923574666fcf81c827beb69e Would be nice to align it with how it is done for Graphoid or Mathoid.
[12:22:17] Continuous-Integration-Config: maps/kartotherian/deploy does not go via jenkins - https://phabricator.wikimedia.org/T142740#2545192 (hashar) p:Triage>Normal
[12:22:25] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2606640 (hashar) p:Triage>Normal
[12:23:19] Continuous-Integration-Config, JavaScript: Ensure that all repos have sufficient .jshintrc spec - https://phabricator.wikimedia.org/T136457#2606641 (hashar) p:Triage>Normal
[12:23:27] Continuous-Integration-Config, BlueSpice, Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2606642 (hashar) p:Triage>Normal
[12:23:39] Continuous-Integration-Config, Operations, Patch-For-Review, Puppet: post build failures for operations/puppet on operations-puppet-doc - https://phabricator.wikimedia.org/T143233#2606643 (hashar) p:Triage>High
[12:23:50] Continuous-Integration-Config: Update jenkins job builder - https://phabricator.wikimedia.org/T143731#2606644 (hashar) p:Triage>Low
[12:24:24] Continuous-Integration-Config, MediaWiki-Unit-tests, Regression: jenkins no longer outputs list and reasons of why tests are being skipped - https://phabricator.wikimedia.org/T141308#2606645 (hashar) Open>declined
[12:25:21] (PS4) Paladox: Add rm -fR "$WORKSPACE/modules/*/bin" to jenkins job operations-puppet-doc [integration/config] - https://gerrit.wikimedia.org/r/307654 (https://phabricator.wikimedia.org/T143233)
[12:29:59] Continuous-Integration-Infrastructure, Epic: Provide infrastructure to store files by project/branch post-merge to compare with pre-merge - https://phabricator.wikimedia.org/T101545#2606652 (hashar)
[12:30:01] Continuous-Integration-Infrastructure: Store Jenkins build output outside Jenkins (e.g. static storage) - https://phabricator.wikimedia.org/T53447#2606653 (hashar)
[12:30:30] Continuous-Integration-Infrastructure, Epic: Provide infrastructure to store files by project/branch post-merge to compare with pre-merge - https://phabricator.wikimedia.org/T101545#1342113 (hashar) Has been done for the beta cluster via {T64835}. Probably want to do the same for the `integration` lab...
[12:30:36] Continuous-Integration-Infrastructure, Epic: Provide (pre-merge) performance reports on patchsets - https://phabricator.wikimedia.org/T101543#2606659 (hashar)
[12:30:38] Release-Engineering-Team, Epic, Tracking: [EPIC] Provide pre-merge reports on patchsets (tracking) - https://phabricator.wikimedia.org/T101542#2606660 (hashar)
[12:30:40] Continuous-Integration-Infrastructure: Preview generated documentation in test pipeline for review - https://phabricator.wikimedia.org/T72945#2606661 (hashar)
[12:30:42] Continuous-Integration-Infrastructure, Testing-Initiative: Jenkins: Set up perceptual diffs (visual regression testing) - https://phabricator.wikimedia.org/T64633#2606662 (hashar)
[12:30:44] Continuous-Integration-Infrastructure, Epic: Provide infrastructure to store files by project/branch post-merge to compare with pre-merge - https://phabricator.wikimedia.org/T101545#2606657 (hashar) stalled>Open
[12:40:02] Release-Engineering-Team, DBA, MediaWiki-Maintenance-scripts, Operations, RfC: Add section for long-running tasks on the Deployment page (specially for database maintenance) - https://phabricator.wikimedia.org/T144661#2606680 (hashar) I am definitely a fan of having a list of operations that...
[12:42:31] (PS1) MarcoAurelio: Archive Extension:ApiSandbox on ZUUL [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012)
[12:44:13] (CR) Paladox: Archive Extension:ApiSandbox on ZUUL (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012) (owner: MarcoAurelio)
[12:46:06] (PS2) MarcoAurelio: Archive Extension:ApiSandbox on ZUUL [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012)
[12:46:37] (CR) Paladox: [C: 1] "Thanks." [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012) (owner: MarcoAurelio)
[12:46:45] :D
[12:46:48] np
[12:46:54] :)
[12:48:40] (CR) Hashar: "The jobs have been removed back in March and the repository is marked as read-only in Gerrit. I would rather not add the project back in " [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012) (owner: MarcoAurelio)
[12:49:08] Continuous-Integration-Config, Patch-For-Review: Add "fail-archived-repositories" to commits to mediawiki/extensions/ApiSandbox in Gerrit - https://phabricator.wikimedia.org/T127012#2029554 (hashar) p:Triage>Low a:MarcoAurelio
[12:49:13] .oO (job in a bottomless pit)
[12:51:24] (CR) MarcoAurelio: "> The jobs have been removed back in March and the repository is" [integration/config] - https://gerrit.wikimedia.org/r/308325 (https://phabricator.wikimedia.org/T127012) (owner: MarcoAurelio)
[12:53:11] hashar isn't here, is he?
[12:54:09] nope
[12:54:53] since it is the weekend, he is doing it on a volunteer basis; he will be back on Monday morning, I believe, but not sure
[13:23:04] Continuous-Integration-Config, Operations, Patch-For-Review, Puppet: post build failures for operations/puppet on operations-puppet-doc - https://phabricator.wikimedia.org/T143233#2606790 (Paladox) @hashar the fix you wrote in T143233#2580487 I uploaded it to https://gerrit.wikimedia.org/r/30765...
[13:27:56] Continuous-Integration-Config, Continuous-Integration-Infrastructure: Update puppet-lint to 2.* - https://phabricator.wikimedia.org/T144667#2606791 (Paladox)
[14:11:08] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:46:16] Yippee, build fixed!
[15:46:17] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #145: FIXED in 24 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/145/
[16:15:04] PROBLEM - Puppet run on deployment-db1 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[16:50:04] RECOVERY - Puppet run on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:51:12] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is OK: OK: Less than 100.00% above the threshold [0.0]
[17:08:01] PROBLEM - Puppet run on deployment-elastic07 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:10:53] PROBLEM - Puppet run on deployment-elastic05 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:16:36] PROBLEM - Puppet run on deployment-elastic08 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:20:10] PROBLEM - Puppet run on deployment-elastic06 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:02:09] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.46 ms
[18:06:44] PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120)
[19:46:41] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[20:21:45] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[21:07:52] PROBLEM - Puppet run on deployment-redis01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[21:16:05] PROBLEM - Puppet run on deployment-conf03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:26:06] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[21:42:53] RECOVERY - Puppet run on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:06:07] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]