[00:00:03] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3446832 (Mholloway)
[00:07:42] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[00:08:04] RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:09:32] RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:14:29] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:18:52] Project selenium-Flow » firefox,beta,Linux,BrowserTests build #455: FAILURE in 2 min 51 sec: https://integration.wikimedia.org/ci/job/selenium-Flow/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/455/
[00:19:39] Project selenium-Flow » chrome,beta,Linux,BrowserTests build #455: FAILURE in 3 min 37 sec: https://integration.wikimedia.org/ci/job/selenium-Flow/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/455/
[00:30:26] Continuous-Integration-Infrastructure, Release-Engineering-Team: Jenkins: Assert no PHP errors (notices, warnings) were raised or exceptions were thrown - https://phabricator.wikimedia.org/T50002#3446881 (Krinkle)
[00:31:30] Continuous-Integration-Config, Release-Engineering-Team, MinervaNeue: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3446884 (Krinkle)
[00:32:24] (CR) Krinkle: [C: 1] test-skins should also run testsuite extensions [integration/config] - https://gerrit.wikimedia.org/r/365796 (https://phabricator.wikimedia.org/T170880) (owner: Jdlrobson)
[00:32:37] (CR) Krinkle: [C: 2] test-skins should also run testsuite extensions [integration/config] - https://gerrit.wikimedia.org/r/365796 (https://phabricator.wikimedia.org/T170880) (owner: Jdlrobson)
[00:34:11] (Merged) jenkins-bot: test-skins should also run testsuite extensions [integration/config] - https://gerrit.wikimedia.org/r/365796 (https://phabricator.wikimedia.org/T170880) (owner: Jdlrobson)
[00:34:18] (CR) Krinkle: [C: 2] "Compiled and deployed updated mw-testskin and mw-testskin-non-voting" [integration/config] - https://gerrit.wikimedia.org/r/365796 (https://phabricator.wikimedia.org/T170880) (owner: Jdlrobson)
[00:39:14] (PS1) Krinkle: Revert "test-skins should also run testsuite extensions" [integration/config] - https://gerrit.wikimedia.org/r/365881
[00:39:17] (CR) Krinkle: [C: 2] Revert "test-skins should also run testsuite extensions" [integration/config] - https://gerrit.wikimedia.org/r/365881 (owner: Krinkle)
[00:39:35] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:40:45] (Merged) jenkins-bot: Revert "test-skins should also run testsuite extensions" [integration/config] - https://gerrit.wikimedia.org/r/365881 (owner: Krinkle)
[01:05:01] twentyafterfour we should maybe backport https://secure.phabricator.com/D18230 ?
[01:16:56] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[01:28:41] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[01:40:31] PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[01:50:56] MediaWiki-Codesniffer, Upstream: PHP_CodeSniffer 3.x breaks when prepend-autoloader: false is set (like it is in MediaWiki core) - https://phabricator.wikimedia.org/T167168#3446962 (Legoktm) a:Legoktm This has been fixed in 3.0.2. Patch incoming.
[01:51:35] (PS1) Legoktm: Update PHP_CodeSniffer to 3.0.2 [tools/codesniffer] - https://gerrit.wikimedia.org/r/365886 (https://phabricator.wikimedia.org/T167168)
[01:51:56] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:53:43] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[02:01:04] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:24:37] Release-Engineering-Team (Kanban), Phabricator (Upstream), Upstream: Add support for task types - https://phabricator.wikimedia.org/T93499#3446984 (mmodell) a:mmodell
[02:32:42] PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:35:30] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:39:42] Beta-Cluster-Infrastructure, Deployment-Systems, Release-Engineering-Team (Next): deployment-imagescaler01 has no mwdeploy user - https://phabricator.wikimedia.org/T166013#3282853 (mmodell) @hashar: Nothing wrong with the patch in gerrit, as far as I can see. I don't know where puppet is trying to ch...
[02:46:03] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:50:15] Beta-Cluster-Infrastructure, Deployment-Systems, Release-Engineering-Team (Next): deployment-imagescaler01 has no mwdeploy user - https://phabricator.wikimedia.org/T166013#3446996 (mmodell) So the user's home directory is set in [[ https://phabricator.wikimedia.org/source/operations-puppet/browse/pro...
[02:52:25] (CR) Jforrester: [C: 2] Update PHP_CodeSniffer to 3.0.2 [tools/codesniffer] - https://gerrit.wikimedia.org/r/365886 (https://phabricator.wikimedia.org/T167168) (owner: Legoktm)
[02:53:09] (Merged) jenkins-bot: Update PHP_CodeSniffer to 3.0.2 [tools/codesniffer] - https://gerrit.wikimedia.org/r/365886 (https://phabricator.wikimedia.org/T167168) (owner: Legoktm)
[02:53:20] \o/
[02:53:55] MediaWiki-Codesniffer, Patch-For-Review, Upstream: PHP_CodeSniffer 3.x breaks when prepend-autoloader: false is set (like it is in MediaWiki core) - https://phabricator.wikimedia.org/T167168#3446998 (Legoktm) Open>Resolved We should be good now. I'm planning to do a bugfix release at the end...
[02:55:40] PROBLEM - Puppet errors on deployment-ores-redis-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[03:02:44] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:27:09] Beta-Cluster-Infrastructure, Deployment-Systems, Release-Engineering-Team (Next), Patch-For-Review: deployment-imagescaler01 has no mwdeploy user - https://phabricator.wikimedia.org/T166013#3447048 (mmodell) cherry picked https://gerrit.wikimedia.org/r/#/c/365891/ on beta puppetmaster. This works...
[03:27:40] Beta-Cluster-Infrastructure, Deployment-Systems, Release-Engineering-Team (Next), Patch-For-Review: deployment-imagescaler01 has no mwdeploy user - https://phabricator.wikimedia.org/T166013#3447050 (mmodell) p:Triage>Normal
[03:28:30] Beta-Cluster-Infrastructure, Multimedia, Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3447053 (mmodell) fixed (or rather, worked around) T166013 by cherry picking https://gerrit.wikimedia.org/r/#/c/365891/
[03:28:43] Beta-Cluster-Infrastructure, Multimedia, Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3447054 (mmodell) p:Triage>Normal
[03:29:38] Beta-Cluster-Infrastructure, Multimedia, Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3431671 (mmodell) @marktraceur: does this resolve the issue for you? If so then we just need to get the patch merged so that it doesn't have to remai...
[03:30:50] PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[03:37:11] RECOVERY - Puppet errors on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:38:27] PROBLEM - Puppet staleness on deployment-eventlogging03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0]
[03:55:42] RECOVERY - Puppet errors on deployment-ores-redis-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:57:59] !log Fixed deployment-imagescaler01 by cherry-picking https://gerrit.wikimedia.org/r/#/c/365891/ on deployment-puppetmaster02
[03:58:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[03:58:27] Yippee, build fixed!
[03:58:28] Project selenium-MultimediaViewer » firefox,mediawiki,Linux,BrowserTests build #456: FIXED in 2 min 26 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=BrowserTests/456/
[04:05:43] PROBLEM - Puppet errors on deployment-salt02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[04:12:22] PROBLEM - Puppet staleness on deployment-kafka01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [43200.0]
[04:34:44] Continuous-Integration-Infrastructure, Release-Engineering-Team (Watching / External), Cloud-VPS, Nodepool, and 2 others: figure out if nodepool is overwhelming rabbitmq and/or nova - https://phabricator.wikimedia.org/T170492#3447095 (bd808)
[04:35:50] RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:45:45] RECOVERY - Puppet errors on deployment-salt02 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:06:01] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[05:15:17] PROBLEM - Puppet errors on deployment-sentry01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[05:22:46] PROBLEM - Puppet errors on deployment-jobrunner02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[05:31:43] PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[05:36:12] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[05:42:40] PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[05:45:23] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:02:41] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:03:33] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:04:18] Continuous-Integration-Config, BlueSpice, Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#3447141 (Osnard) Awesome, thank you. Really the double quotes?
[06:05:18] RECOVERY - Puppet errors on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:11:44] RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:18:33] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:20:23] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:26:00] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[06:32:47] RECOVERY - Puppet errors on deployment-jobrunner02 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:36:11] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:45:48] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Cloud-Services, Cloud-VPS, and 2 others: Lower rate of Nodepool requests to OpenStack API - https://phabricator.wikimedia.org/T167803#3447148 (hashar) And on July 12th RabbitMQ apparently exploded possibly due to the rate c...
[06:52:43] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[07:09:04] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[07:11:50] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[07:12:23] RECOVERY - Puppet staleness on deployment-kafka01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[07:22:44] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:23:24] RECOVERY - Puppet staleness on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [3600.0]
[07:26:15] hashar: hey, is it okay to get back phab and gerrit account? https://phabricator.wikimedia.org/T170801
[07:26:25] I want to review things for my teammates
[07:29:04] Beta-Cluster-Infrastructure, Deployment-Systems, Release-Engineering-Team (Kanban), Patch-For-Review: deployment-imagescaler01 has no mwdeploy user - https://phabricator.wikimedia.org/T166013#3447165 (hashar) a:mmodell
[07:31:22] MediaWiki-Codesniffer, Patch-For-Review, Upstream: PHP_CodeSniffer 3.x breaks when prepend-autoloader: false is set (like it is in MediaWiki core) - https://phabricator.wikimedia.org/T167168#3447170 (hashar) Awesome! Thank you Kunal.
[07:37:54] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[07:40:22] Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3447175 (hashar)
[07:40:48] Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3446038 (hashar)
[07:41:52] Amir1: operations team is taking care of it
[07:42:14] hashar: okay, I thought they are working on the prod access part
[07:42:16] Amir1: they are responsible for granting access to the cluster and checking whatever process :D
[08:02:56] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:06:50] RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:11:18] (CR) Hashar: [C: 2] "\O/" [integration/config] - https://gerrit.wikimedia.org/r/364804 (owner: Ejegg)
[08:12:27] (Merged) jenkins-bot: Get rid of npm test for CiviCRM [integration/config] - https://gerrit.wikimedia.org/r/364804 (owner: Ejegg)
[08:19:02] (CR) Hashar: [C: 1] Add Squiz.Classes.SelfMemberReference to ruleset [tools/codesniffer] - https://gerrit.wikimedia.org/r/363525 (owner: Legoktm)
[08:24:56] PROBLEM - Puppet errors on deployment-apertium02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[08:37:00] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[08:45:37] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[08:49:54] RECOVERY - Puppet errors on deployment-apertium02 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:52:00] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[08:59:20] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3447253 (hashar) Surely we could create a repo in Gerrit `integration/jenkinsci/android-emulator-plugin`, fork from git hub and...
[09:04:31] !log deleted integration-slave-trusty-1006
[09:04:36] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:08:15] PROBLEM - Host integration-slave-trusty-1006 is DOWN: CRITICAL - Host Unreachable (10.68.17.118)
[09:25:55] MediaWiki-Releasing, MediaWiki-Containers, Services (doing), User-mobrovac, Wikimedia-Hackathon-2015: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#3447286 (Pastakhov) Take a look at my solution, maybe you'll find there something useful. https://github.c...
[09:43:27] PROBLEM - Puppet errors on deployment-mediawiki04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:45:17] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, Technical-Debt: Migrate CI labs slaves to use /srv instead of /mnt - https://phabricator.wikimedia.org/T146381#3447319 (hashar) I have quickly talked about it with @Joe . The migration plan is: * mark the...
[09:50:36] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:00:04] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review: Extension unit tests do not run due to not being able to load entry file - https://phabricator.wikimedia.org/T71247#3447329 (hashar) Open>Resolved a:hashar Compared to 3 years ago, most extensions...
[10:03:20] Continuous-Integration-Infrastructure, Goal, Technical-Debt: All repositories should pass jshint test - https://phabricator.wikimedia.org/T62619#3447340 (hashar)
[10:03:22] Continuous-Integration-Config, ArticleFeedbackv5, Brickimedia: ArticleFeedbackv5 should pass jshint - https://phabricator.wikimedia.org/T63588#3447335 (hashar) Open>Resolved a:SamanthaNguyen>Fomafix @Fomafix added eslint support 3 months ago via 6c0ebf2f6e18ab69e01792a6697bb384b6b7b20e
[10:03:25] RECOVERY - Puppet errors on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:04:50] Continuous-Integration-Config: Prevent the addition of files with names that aren't supported on Windows - https://phabricator.wikimedia.org/T67140#3447341 (hashar) Open>declined I am declining this one, mostly because I have no clue how we can enforce it on all repositories.
[10:05:50] Continuous-Integration-Config, Documentation: Jenkins: Set up generated php documentation for MediaWiki extensions - https://phabricator.wikimedia.org/T27978#3447344 (hashar)
[10:05:52] Continuous-Integration-Config, Collaboration-Team-Triage, Flow, Documentation: Generate Doxygen documentation for Flow PHP classes to doc.wikimedia.org - https://phabricator.wikimedia.org/T93107#3447343 (hashar) Open>stalled
[10:13:38] Continuous-Integration-Config, BlueSpice, Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#3447351 (Paladox) your welcome and yep :)
[10:23:43] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[10:29:59] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3447386 (hashar) From upstream developer (orrc): > The SdkConstants file doesn't actually need to be up-to-date in order to use...
[10:37:41] (PS1) Hashar: Add unstable status to browser tests jobs [integration/config] - https://gerrit.wikimedia.org/r/365933 (https://phabricator.wikimedia.org/T94684)
[10:38:01] Browser-Tests-Infrastructure, Continuous-Integration-Config, Release-Engineering-Team (Kanban), Patch-For-Review: Browser test jobs should use xUnit publisher instead of Junit - https://phabricator.wikimedia.org/T94684#3447406 (hashar) a:hashar Cleaning up the attic!
[10:39:06] Release-Engineering-Team (Watching / External), Wikidata, Story: [Story] Use composer-merge-plugin to include Wikidata components in mediawiki-vendor - https://phabricator.wikimedia.org/T95663#3447410 (Addshore) >>! In T95663#3444616, @Legoktm wrote: > I think we should start this by doing it one ext...
[10:39:49] (CR) Hashar: "There is some additional details on the parent task T94212 "accommodate for flappy tests"" [integration/config] - https://gerrit.wikimedia.org/r/365933 (https://phabricator.wikimedia.org/T94684) (owner: Hashar)
[10:41:14] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3447411 (hashar) Another point mentionned by orrc: we already use `android-24` which the plugin does not know about :]
[10:45:10] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3447425 (hashar) So our job https://integration.wikimedia.org/ci/job/apps-android-wikipedia-periodic-test/ actually got changed...
[10:58:42] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[11:04:16] (CR) Zfilipin: "I am still against introducing a new state to a job. It either passes or fails. If it fails it should be fixed or deleted. There is no add" [integration/config] - https://gerrit.wikimedia.org/r/365933 (https://phabricator.wikimedia.org/T94684) (owner: Hashar)
[11:04:29] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[11:21:35] PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:31:39] RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:34:29] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:34:50] (PS1) Aude: Update Wikidata to wmf/1.30.0-wmf.10 [tools/release] - https://gerrit.wikimedia.org/r/365945
[11:36:08] (CR) Aude: [C: 2] Update Wikidata to wmf/1.30.0-wmf.10 [tools/release] - https://gerrit.wikimedia.org/r/365945 (owner: Aude)
[11:38:36] (Merged) jenkins-bot: Update Wikidata to wmf/1.30.0-wmf.10 [tools/release] - https://gerrit.wikimedia.org/r/365945 (owner: Aude)
[12:11:33] PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[12:21:06] Continuous-Integration-Config: Allow running old tests for old branches that don't support npm - https://phabricator.wikimedia.org/T106790#3447658 (hashar) Open>declined By now, most of the extensions/skins should have a npm packages in almost all branches. If there is some backports patch blocked o...
[12:24:43] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[12:25:15] Continuous-Integration-Config, commit-message-validator, Pywikibot-core: Check the style of the commit message - https://phabricator.wikimedia.org/T109119#3447667 (hashar) Ops asked for some kind of commit message validation for `operations/puppet`. I went with adding a new env in tox which installs...
[12:29:42] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[12:32:08] Gerrit, Release-Engineering-Team (Backlog), Patch-For-Review: Update gerrit to 2.14.1 - https://phabricator.wikimedia.org/T156120#3447670 (Paladox) 2.14.2 will be released tomorrow. Will update the description tomorrow. :)
[12:32:12] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[12:32:35] Continuous-Integration-Config, Release-Engineering-Team (Kanban), MathSearch: mwext-testextension-zend should load extension mathsearch after math - https://phabricator.wikimedia.org/T117659#3447672 (hashar) Open>Resolved a:hashar That is quite an old bug. The PHPUnit job seems to pass ju...
[12:34:20] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Enable CI jobs on the new CiviCRM repos - https://phabricator.wikimedia.org/T118604#3447682 (hashar) Open>Resolved a:hashar This task is from November 2015. @Eileenmcn...
[12:38:38] Continuous-Integration-Config, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Continuous integration for wikimedia/fundraising/civicrm-buildkit - https://phabricator.wikimedia.org/T120044#3447698 (hashar) Open>declined Seems to have been related to {T118604} The repo never had Jenkins j...
[12:41:29] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:43:54] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:47:26] Continuous-Integration-Config, Patch-For-Review: Set up composer-test for all MW extensions where it isn't broken - https://phabricator.wikimedia.org/T124342#1953205 (hashar) Note that now composer test is run from the job running PHPUnit whenever there is a composer.json file. So I guess it is all abou...
[12:50:47] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), MW-1.30-release-notes (WMF-deploy-2017-07-11_(1.30.0-wmf.9)), Patch-For-Review, User-zeljkofilipin: Run WebdriverIO tests in CI for extensions - https://phabricator.wikimedia.org/T164721#3447711 (zeljkofilipin) Big progress, Rel...
[12:57:14] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:00:29] Continuous-Integration-Config: generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#3447740 (hashar)
[13:00:31] Continuous-Integration-Config, MediaWiki-extensions-Other: LinkSuggest2 test failing due to missing files located in sub repo - https://phabricator.wikimedia.org/T155773#3447739 (hashar)
[13:02:53] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-extensions-Other: GooglePlaces tests failing due to missing files located in sub repo - https://phabricator.wikimedia.org/T154848#3447747 (hashar)
[13:02:56] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-extensions-Other: PagesList tests failing due to missing files located in sub repo - https://phabricator.wikimedia.org/T154930#3447746 (hashar)
[13:02:59] Continuous-Integration-Config, MediaWiki-extensions-Other: FlickrAPI test failing due to missing files located in sub repo - https://phabricator.wikimedia.org/T154847#3447748 (hashar)
[13:03:02] Continuous-Integration-Config, Brickimedia, MediaWiki-Core-Tests, Refreshed: Skin Refreshed sub repo does not handled in test config - https://phabricator.wikimedia.org/T154806#3447749 (hashar)
[13:03:03] Continuous-Integration-Config: generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#3447744 (hashar)
[13:03:38] Continuous-Integration-Config: generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#2152304 (hashar)
[13:03:40] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Librarization, WorkType-NewFunctionality, Zuul: Zuul-cloner should check out submodules - https://phabricator.wikimedia.org/T84942#3447752 (hashar)
[13:04:43] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[13:07:12] Continuous-Integration-Config: generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#2152304 (hashar) We had T84942 to add submodule support in zuul-cloner. We do process the submodule for mediawiki/extensions/VisualEditor via a hack in the Jenkins job configuration: ``` lang=y...
[13:07:22] Continuous-Integration-Config, Release-Engineering-Team (Kanban): generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#3447775 (hashar)
[13:07:59] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[13:11:37] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[13:11:46] Continuous-Integration-Config, Release-Engineering-Team (Kanban): generalize extension submodule handling - https://phabricator.wikimedia.org/T130966#3447779 (hashar)
[13:19:44] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[13:23:00] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[13:23:44] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[13:43:41] PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[13:46:38] Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #463: FAILURE in 2 min 37 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/463/
[13:54:38] Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3448050 (fgiunchedi)
[13:56:39] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[13:58:43] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:05:42] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[14:05:44] PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[14:13:44] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:23:27] PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:28:35] Hi, is this the right place to ask about git-ssh.wmo keys?
[14:30:43] awight phabricator?
[14:30:47] yes
[14:31:46] :D exactly. paladox do you happen to know how I would add myself a second public key?
[14:32:18] yes
[14:32:26] awight go to your settings in phabricator
[14:32:44] then click on ssh public key
[14:33:03] paladox: lol thank you for saving my sanity.
[14:33:03] then click on the dropdown on the right.
[14:33:08] your welcome :)
[14:33:38] Project selenium-WikiLove » firefox,beta,Linux,BrowserTests build #458: FAILURE in 1 min 37 sec: https://integration.wikimedia.org/ci/job/selenium-WikiLove/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/458/
[14:40:45] RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:42:37] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[14:48:21] Beta-Cluster-Infrastructure, Multimedia, Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3448269 (MarkTraceur) @mmodell it looks like https://commons.wikimedia.beta.wmflabs.org/wiki/File:Crystal-38.stl still has a broken thumbnail...
[14:49:55] Finally! Difficulties worth asking about in here. I'm getting this error when trying to scap ORES from tin.wmflabs:
[14:49:58] > 14:47:54 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'ores/deploy', '-g', 'worker', 'fetch', '--refresh-config'] on deployment-sca03.deployment-prep.eqiad.wmflabs returned [255]: Agent admitted failure to sign using the key.
[14:50:42] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[14:51:27] awight try scap deploy-log -v?
[14:51:36] nice, ty
[14:51:59] your welcome :). It will tell us the exact error. thcipriani taught me that one :)
[14:52:12] Same messages, sad to report.
[14:52:23] does it show any thing else?
[14:52:28] it should be verbose
[14:52:36] PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[14:52:38] ie like a mist match key.
[14:52:43] awight: agent admitted failure to sign key means you're not allowed to use the key in keyholder
[14:52:46] It does show a few more lines, but no new info
[14:53:14] thcipriani|afk: Is that a permission I need to request?
[14:53:18] try: SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -l deploy-service [some-host-to-which-you-are-trying-to-deploy]
[14:53:21] just to verify
[14:53:38] * awight is impressed by these afk skills :p
[14:54:07] http://www.theonion.com/article/man-invisible-on-gchat-observes-world-from-impregn-32701
[14:54:50] thcipriani: Verified same error
[14:55:23] gotcha, ok...lemme see if I can fix this...
[14:55:56] Continuous-Integration-Config, Analytics-Kanban, Analytics-Wikistats: Set up continuos integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448318 (fdans)
[14:56:00] rad, thanks!
[14:57:41] \o/
[14:58:20] awight: try the command: SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -l deploy-service deployment-sca03.deployment-prep.eqiad.wmflabs
[14:58:31] if that works now you should be able to deploy in beta
[14:59:48] thcipriani: It now works--well played :D
[14:59:53] :D
[15:00:20] fwiw, I had the same error on real tin. Does your mystery fix happen to cover both environments?
[15:00:45] my mystery fix is just: usermod -a -G deploy-service awight [15:00:54] since deploy-service is a local group on deployment-tin [15:01:14] you'll need to add yourself to the deploy-service group via admin.yaml in puppet to get it working in production [15:01:49] awight: https://github.com/wikimedia/puppet/blob/production/modules/admin/data/data.yaml#L410-L417 [15:01:54] aha! nice, k I can do that [15:02:06] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [15:05:45] 10Continuous-Integration-Config, 10Analytics-Kanban, 10Analytics-Wikistats: Set up continuos integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448332 (10fdans) As discussed with @hashar, we'd like to add Continuous Integration to the UI part of the new Wikistats, which is hosted i... [15:12:36] RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0] [15:13:32] 10Continuous-Integration-Config, 10Analytics-Kanban, 10Analytics-Wikistats: Set up continuous integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448343 (10Milimetric) [15:14:26] 10Beta-Cluster-Infrastructure, 10Analytics-Kanban: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3434437 (10Milimetric) a:03Milimetric [15:16:16] Question: if we use diffusion do commits not show in phabricator tickets? [15:16:26] for any releng folk [15:17:41] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0] [15:19:52] nuria_: if they're associated they do. See eg https://phabricator.wikimedia.org/D600 and https://phabricator.wikimedia.org/T125629 [15:21:32] greg-g: k [15:21:33] I think what nuria may be asking is if there's something analogous to writing "Bug: T123456" in the commit message (without using arc). I think the answer is you have to use arc, right? 
[15:21:34] T123456: Special:CentralAuth reports account attachment, which - being standalone - is confusing, report accout creation as well - https://phabricator.wikimedia.org/T123456 [15:21:42] hi thanks for the CR hashar! oughtta speed things up for CRM patches [15:22:35] milimetric: are you using Differential for code review? [15:23:19] we're on our first repo, my plan is to use Differential, yeah, but right now it's early enough in the project that we're just pushing [15:23:30] we should use Differential though [15:24:30] (the code reviews so far were just informal and in-person) [15:30:33] milimetric: right, we need more formal crs going forward, let's do those for 2nd release [15:33:05] PROBLEM - Puppet errors on integration-slave-docker-1000 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:42:04] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0] [15:42:29] PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [15:59:32] 10Continuous-Integration-Infrastructure, 10Wikipedia-Android-App-Backlog: Create a Wikimedia fork of the Jenkins android emulator plugin - https://phabricator.wikimedia.org/T170904#3448514 (10Mholloway) >>! In T170904#3447425, @hashar wrote: > So our job https://integration.wikimedia.org/ci/job/apps-android-wi... [16:01:36] ejegg: you are welcome :] [16:03:34] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Analytics-Kanban, 10Analytics-Wikistats: Set up continuous integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3432047 (10hashar) [16:06:18] hashar! Could you get me a copy of the LocalSettings used during a job run? 
[16:06:37] For mwext-testextension-hhvm-jessie for https://gerrit.wikimedia.org/r/#/c/364253/ [16:07:05] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10MathSearch: mwext-testextension-zend should load extension mathsearch after math - https://phabricator.wikimedia.org/T117659#3448544 (10Physikerwelt) 05Resolved>03Open No, currently the tests for mathsearch do not run at all. cf. htt... [16:11:25] addshore: I thought the jobs were made to capture the generated LocalSettings.php! [16:11:41] I thought so too, but I dont see them in the list of stuff that is archived [16:12:29] As can be seen @ https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm-jessie/11172/ [16:13:11] PROBLEM - Puppet errors on deployment-ms-be03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:17:02] addshore: it used to but I found the issue :] [16:17:10] :D [16:19:59] !log manually installed "aspell-el" on deployment-sca03 (work around for ongoing puppet issues) [16:20:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [16:20:32] PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:20:33] addshore: so that is not so trivial to do in the limited time I have [16:20:49] addshore: but in short the LocalSettings.php is generated by mediawiki install.php [16:20:56] oh really? [16:20:57] hmmm [16:21:00] okay [16:21:02] addshore: then some code append to it the snippets from https://github.com/wikimedia/integration-jenkins/tree/master/mediawiki/conf.d [16:21:17] ack, that should be enough for what I need to look into now [16:21:18] addshore: should we capture the one from https://gerrit.wikimedia.org/r/#/c/364253/ ? [16:21:32] If you could do a one off capture that would be great! 
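The generation step hashar describes (install.php writes a base LocalSettings.php, then the conf.d snippets are appended to it) can be sketched roughly as below. This is only an illustration and not the actual integration-jenkins code: the function name `build_local_settings`, the sorted-filename ordering, and the stripping of each snippet's leading `<?php` tag are all assumptions.

```python
from pathlib import Path


def build_local_settings(base_file, snippet_dir, output_file):
    """Write base_file plus every *.php snippet from snippet_dir to output_file.

    Hypothetical helper: the ordering and tag handling are assumptions,
    not taken from the real CI scripts.
    """
    parts = [Path(base_file).read_text(encoding="utf-8")]
    # Sorted filename order (assumed) keeps the result deterministic.
    for snippet in sorted(Path(snippet_dir).glob("*.php")):
        body = snippet.read_text(encoding="utf-8")
        # Drop the snippet's opening tag so the combined file has only one.
        if body.startswith("<?php"):
            body = body[len("<?php"):]
        parts.append(f"\n// --- appended from {snippet.name} ---{body}")
    Path(output_file).write_text("".join(parts), encoding="utf-8")
    return output_file
```

Archiving the resulting file with the build, as was done by hand for the one-off capture, is what would make a later comparison of settings between runs possible.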
[16:22:24] doing [16:22:33] I am waiting for an instance to be available :D [16:26:10] addshore: ok I am on the instance [16:26:14] :D [16:26:48] !log manually restarted uwsgi-ores and celery-ores-worker on deployment-sca03 [16:26:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [16:27:54] addshore: I did a cp manually and the file is thus attached to https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm-jessie/11181/ [16:28:00] Thanks! [16:28:31] addshore: you also have debug logs captured to mw-debug-cli.log [16:28:45] so you can add a bunch of poor wfDebug() in the code to dump stuff :d [16:30:46] I am off, dinner time [16:30:51] See ya! [16:32:31] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0] [16:44:25] PROBLEM - Puppet errors on deployment-mediawiki04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [16:49:37] 10Beta-Cluster-Infrastructure, 10Multimedia, 10Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3448704 (10Gilles) Thumbor has no support for STL yet, and it can't get it until 3d2png is available. The first step is getting 3d2png to be on the ser... [16:58:17] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:04:06] 10Beta-Cluster-Infrastructure, 10Multimedia, 10Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3448768 (10Gilles) I think I've found the actual command in mediawiki-config, but xvfb-run is missing: ``` gilles@deployment-imagescaler01:~$ /usr/bin... 
[17:11:15] 10Beta-Cluster-Infrastructure, 10Multimedia, 10Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3448798 (10Gilles) OK, finally found the right command to invoke it: ``` gilles@deployment-imagescaler01:~$ DISPLAY=:99 /usr/bin/xvfb-run -a -s "-ac -... [17:13:27] 10Beta-Cluster-Infrastructure, 10Multimedia, 10Thumbor: On beta commons, thumbnailing of 3D files is broken still - https://phabricator.wikimedia.org/T170444#3448809 (10Gilles) [17:14:24] PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:19:24] RECOVERY - Puppet errors on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0] [17:20:13] * yuvipanda waves [17:31:56] jdlrobson: I just put wmf.10 on testwiki. Wanna give it a look and make sure MinervaNeue is all sorted properly? [17:32:59] so I've been setting up a brand new, super standard kubernetes cluster inside tools for PAWS [17:33:16] and I wanted to have Jenkins do builds and deploys for me [17:33:29] so I've set up a Jenkins installation on the cluster [17:33:33] ! [17:33:47] https://jenkins.paws.tools.wmflabs.org/blue - you can create accounts or just use mine (yuvipanda pwd: plasmafury) [17:33:50] it has the new blueocean UI too! [17:35:14] (can't tell if Dan or Tyler are here from Matrix) [17:36:20] * RainbowSprinkles chuckles [17:37:10] RainbowSprinkles is Chad right? I forget :) [17:38:14] Matrix doesn't have /whois either? [17:38:30] :p [17:39:19] I forgot I came here to be made fun of. [17:39:26] Hey, it's me :) [17:39:51] I think thcipriani is about tho, to answer your earlier question :) [17:40:09] * thcipriani waves [17:41:08] RainbowSprinkles: :) [17:41:32] yuvipanda: that's kind of awesome :) [17:42:14] thcipriani: :D yeah! 
am happy and surprised at how easy it was to set up [17:43:16] RECOVERY - Puppet errors on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:43:26] (am installing more plugins and restarting and what not) [17:46:22] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:51:20] RainbowSprinkles: hey and sorry [17:51:21] looking [17:51:53] No worries. I'm not rolling to the rest of group0 until noon :) [17:52:07] Just wanted to double check, since wmf.10 was supposed to be pristine here [17:52:59] RainbowSprinkles: just catching up. wmf10 is looking good on test 1001 [17:53:40] Cool cool [17:53:41] :) [18:00:24] RainbowSprinkles: so is the wmf9 fix synced / testable yet? [18:01:33] I went afk while jenkins was going, gonna pull to tin now [18:02:49] RECOVERY - Puppet errors on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:03:40] Pulled to mwdebug1001 [18:08:13] RainbowSprinkles: tessstttinngggg [18:08:35] RainbowSprinkles: verified! [18:15:00] RainbowSprinkles: any chance we can get https://gerrit.wikimedia.org/r/#/c/366019/ and https://gerrit.wikimedia.org/r/#/c/366020/ merged too? [18:15:03] it's minor but fugly [18:15:39] Yeah. 
I already started scap, but that'd just be a quick sync-file [18:15:43] <3 [18:15:51] merge & backport [18:16:00] (go ahead and press +2 on the backports I'll pull it when ready) [18:17:06] 10Release-Engineering-Team (Kanban): Create code health mailing list - https://phabricator.wikimedia.org/T170963#3449127 (10Jrbranaa) [18:18:08] 10Release-Engineering-Team (Kanban): Create code health mailing list - https://phabricator.wikimedia.org/T170963#3449146 (10Jrbranaa) p:05Triage>03Normal [18:19:28] RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [18:21:15] 10Release-Engineering-Team (Kanban): Create code health mailing list - https://phabricator.wikimedia.org/T170963#3449179 (10greg) [18:21:22] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:25:44] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:26:54] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 2 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3449204 (10Jdlrobson) I'm pulling this into the sprint... [18:27:09] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 2 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3445966 (10Jdlrobson) p:05Triage>03Normal [18:30:04] jdlrobson: Ok scap done, so messages everywhere now. Did you merge + backport the last extension.json bits? [18:34:01] Ah, merging now [18:34:48] RainbowSprinkles: thanks [18:35:44] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [19:13:58] Yippee, build fixed! 
[19:13:59] Project selenium-MinervaNeue » chrome,beta,Linux,BrowserTests build #14: 09FIXED in 24 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/14/ [19:14:27] PROBLEM - Puppet staleness on deployment-eventlogging03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0] [19:15:33] PROBLEM - Puppet errors on deployment-pdfrender02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [19:22:33] Yippee, build fixed! [19:22:33] Project selenium-MinervaNeue » firefox,beta,Linux,BrowserTests build #14: 09FIXED in 33 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/14/ [19:38:06] 10Release-Engineering-Team, 10Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Find way to exclude php 5.4 files from vendor lint task - https://phabricator.wikimedia.org/T170641#3449655 (10Ejegg) [19:45:05] 10Continuous-Integration-Config, 10Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, 10FR-Email, and 8 others: Get CI working on Omnimail - https://phabricator.wikimedia.org/T169593#3449714 (10Ejegg) 05Open>03Resolved [19:51:44] (03CR) 10Jdlrobson: [C: 031] "> I am still against introducing a new state to a job. It either passes or fails. If it fails it should be fixed or deleted. There is no a" [integration/config] - 10https://gerrit.wikimedia.org/r/365933 (https://phabricator.wikimedia.org/T94684) (owner: 10Hashar) [19:55:33] RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0] [19:56:45] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [20:04:34] (03CR) 10Hashar: [C: 032] "That is a bit hacky and not perfect but good enough for an experiment. Will polish up as we find issues on it." 
[integration/config] - 10https://gerrit.wikimedia.org/r/362309 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [20:06:29] (03Merged) 10jenkins-bot: R based job for wikimedia/discovery/ortiz [integration/config] - 10https://gerrit.wikimedia.org/r/362309 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [20:11:13] 10Release-Engineering-Team, 10MediaWiki-Containers, 10Operations, 10Epic, and 3 others: FY2017/18 Program 6 - Outcome 2 - Objective 3: Integrated, container-based development environment - https://phabricator.wikimedia.org/T170456#3449809 (10GWicke) a:03mobrovac [20:16:29] (03PS1) 10Umherirrender: [UserExport] Add npm job [integration/config] - 10https://gerrit.wikimedia.org/r/366042 [20:22:43] (03PS1) 10Hashar: Rename ortiz job to remove -jessie suffix [integration/config] - 10https://gerrit.wikimedia.org/r/366045 (https://phabricator.wikimedia.org/T153856) [20:24:20] (03CR) 10Hashar: [C: 032] "I am renaming the job due to some oddity in CI :D" [integration/config] - 10https://gerrit.wikimedia.org/r/366045 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [20:26:20] (03Merged) 10jenkins-bot: Rename ortiz job to remove -jessie suffix [integration/config] - 10https://gerrit.wikimedia.org/r/366045 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [20:31:45] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [20:35:45] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 2 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3445966 (10pmiazga) I'm looking at it now, as a first s... 
[20:47:19] 10Release-Engineering-Team (Kanban), 10Phabricator, 10Reading-Web-Backlog (Tracking): Create Phame blog for Readers department - https://phabricator.wikimedia.org/T170793#3449963 (10mmodell) 05Open>03Resolved a:03mmodell {icon check} [[ /phame/blog/manage/9/ | Leave it to the prose ]] [21:00:41] 10Beta-Cluster-Infrastructure, 10Analytics-Kanban: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3449984 (10Nuria) 05Open>03Resolved [21:02:54] RainbowSprinkles i've added translation support to polygerrit https://gerrit-review.googlesource.com/#/c/114092/ :) [21:04:08] Storing languages in the HTML like that looks like it'd get a little hard to manage [21:04:13] Once you get more than a few strings and languages [21:04:17] But cool proof of concept! [21:04:17] yeh [21:04:23] thinking of putting it into a file [21:05:00] it took me all day to do that (lol). I experimented with a few librarys which turns out made it even more complicated to support this. [21:05:01] Like json or something [21:05:05] yeh [21:05:09] Makes sense if you're using JS to load it [21:05:15] yeh [21:05:20] i will look into that now [21:05:42] the question is how can you recursivly load files from a folder in js. [21:06:02] You'd want to only load the languages on demand as needed [21:06:22] And then use fallback behavior (basically, all things eventually become english if they're not translated) [21:06:31] ah i see [21:06:57] Also, you'll need support for adding parameters to strings. So like "New user: $1" [21:06:58] etc. [21:07:04] It's a rabbit hole :) [21:07:17] lol yeh [21:07:28] I wonder what libraries you could use here. Building your own is pretty complicated. There's a *lot* of code in MW just for this [21:07:45] theres i18next [21:07:59] I built an i18n system in python! It probably sucks [21:08:53] i18n is very hard [21:08:57] though i tryed a couple today, i18next looks even more complicated. 
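The two requirements raised above (fall back until everything eventually becomes English, and support parameters such as "New user: $1") can be sketched in a few lines. This is a minimal illustration, not PolyGerrit's or MediaWiki's implementation: the `MESSAGES` and `FALLBACKS` tables and the `msg` function are hypothetical, and the in-memory dictionary stands in for per-language files that would be loaded on demand.

```python
import re

# Hypothetical message catalogue; in practice each language would live in
# its own file (e.g. JSON) and be loaded only when requested.
MESSAGES = {
    "en": {"new-user": "New user: $1", "greeting": "Hello"},
    "de": {"new-user": "Neuer Benutzer: $1"},
    "de-at": {},
}

# Hypothetical fallback table: de-at falls back to de before English.
FALLBACKS = {"de-at": ["de"]}


def fallback_chain(lang):
    """Return the languages to try, always ending in English."""
    chain = [lang] + FALLBACKS.get(lang, [])
    if "en" not in chain:
        chain.append("en")
    return chain


def msg(key, lang, *params):
    """Look up key for lang, walk the fallback chain, fill in $1, $2, ..."""
    for code in fallback_chain(lang):
        text = MESSAGES.get(code, {}).get(key)
        if text is not None:
            break
    else:
        # No language has the key: return a visible marker instead of failing.
        return f"<{key}>"

    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched when no parameter was supplied.
        return str(params[index]) if 0 <= index < len(params) else match.group(0)

    return re.sub(r"\$(\d+)", substitute, text)
```

Even this toy version leaves out plurals, grammatical gender and text direction, which is where most of the edge cases mentioned in the discussion come from.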
[21:09:07] The entire problem domain is nothing but edge cases and exceptions [21:09:28] yeh [21:10:13] paladox: Anyway, I don't wanna discourage you but this is a pretty hard problem to solve "properly." I'm thinking it's something bigger than upstream is willing to take on [21:10:30] upstream plan to do this in 2018 [21:10:41] according to one of the maintainers of polygerrit [21:10:51] i just did this a year earlier [21:10:52] lol [21:10:54] Oh. [21:10:58] Well then why bother? [21:11:03] Just let them do the work hah [21:11:11] I filled it and began working on it [21:11:17] before they said it's on there plan [21:11:45] RainbowSprinkles: but I ended up building my own because the alternative was this thing that involved compiled message files and I was just like, no. [21:11:45] https://bugs.chromium.org/p/gerrit/issues/detail?id=6765#c7 [21:12:18] harej: In python? What was it using, gettext? [21:12:24] (gettext is cool, but kinda a pain to work with) [21:12:28] I think so [21:12:38] Some really oldschool software package where you have to compile the message files [21:12:44] Yep, that's gettext [21:12:45] .po files [21:12:47] and Python was just an abstraction layer on top of it [21:13:01] My own library, Worldly, uses YAML files. It's nifty. [21:13:02] There's probably something better out there in python land [21:13:18] harej: Yaml is always the answer. [21:13:18] I looked! 
[21:13:42] Oh I don't doubt it, I just don't believe the only implementation of l10n in the python ecosystem is gettext :) [21:13:53] https://github.com/harej/worldly [21:17:22] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:21:11] thunder and lighting outside heh [21:26:26] 10MediaWiki-Releasing, 10Release-Engineering-Team (Watching / External), 10Architecture, 10Parsoid, and 2 others: Evaluate and decide on a distribution strategy targeted at VMs - https://phabricator.wikimedia.org/T87774#3450150 (10GWicke) [21:26:31] 10MediaWiki-Releasing, 10Architecture, 10Parsoid, 10Service-Architecture, 10Services (later): Distribution strategy option: Use Vagrant puppet modules - https://phabricator.wikimedia.org/T88151#3450149 (10GWicke) 05Open>03declined [21:27:21] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:28:07] RainbowSprinkles if i introduce a basic version, more likly someone will come and make it advanced. Basically kicking off translation support :) [21:29:22] 10MediaWiki-Releasing, 10Release-Engineering-Team (Watching / External), 10Architecture, 10Parsoid, and 2 others: Evaluate and decide on a distribution strategy targeted at VMs - https://phabricator.wikimedia.org/T87774#3450155 (10GWicke) Current work focused on Docker containers and Kubernetes is happenin... [21:29:58] greg-g: for my life .... i cannot free space on this machine deployment-eventlogging03.eqiad.wmflabs [21:30:10] greg-g: i have deleted logs but size still displays as same [21:30:29] greg-g: any ideas or rather advice in swaping this machine by a different one? [21:32:03] nuria_: Tried doing `apt-get clean` and `apt-get autoclean` ? 
[21:32:15] Reedy: no, will try [21:32:31] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Watching / External), 10Discovery, 10Operations, 10Discovery-Analysis (Current work): Setup a mirror for R language dependencies (CRAN) - https://phabricator.wikimedia.org/T170995#3450163 (10hashar) [21:32:42] Reedy: trying now, thanks [21:33:00] If a machine is doing unattended upgrades... It can end up caching a lot of crappy packages [21:33:23] Reedy: we should cron an "apt-get clean" :D [21:33:32] Probably [21:33:37] nuria_: It's broken beta machines a few times :( [21:33:50] then it runs out of disk but nothing left for a quick fix :P [21:34:57] nuria_: a somehow helpful command: sudo du -m -d 1 -x / [21:35:13] Reedy: nothing still /dev/vda1 is full ( iam guessing mysql) but cannot even get to drop db [21:35:25] crawls directories from / , -x to stay on the same disk, then it spurts the size in MBytes [21:35:31] and I usually then crawl from dir to dir [21:36:01] hashar:nice, let me write that one in my secret stash of useful stuff [21:36:13] nuria_: du is super helpful :] [21:36:36] looks like madhuvishy has a 2GBytes home on deployment-eventlogging03 , probably something can be cleaned up there [21:37:14] and /var/log/upstart has 2.2G of logs [21:37:33] hashar: which i tried to clean to no avail [21:38:01] hashar: let me retry ( i just rebooted) [21:38:06] well I guess something is erroring out in a loop :/ [21:38:21] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:38:55] hashar: ok, now we are getting somewhere, one of those fixed something [21:39:23] and /var/lib/myql has 6300 MBytes used [21:40:01] hashar: but that one you cannot ahem delete just like that no? 
[21:40:09] the issue in short is the instance has 40G, 20G are allocated to / and 20G to /srv [21:40:10] hashar: i know issue stems from mysql so it makes sense [21:40:29] so usually we point software to write their data under /srv to save disk space on the / partition [21:40:40] has mysql been unhappy there? [21:40:42] -rw-rw---- 1 mysql mysql 33554432 Jul 9 12:34 tokudb.rollback [21:40:56] nuria_: I avoid touching mysql files :D [21:41:01] it is full due to some huge table after some GOOD testing [21:41:41] Sun Jul 9 18:29:05 2017 TokuFT No space when writing 48461 bytes to /var/lib/mysql/log000000000316.tokulog27 retry in 1 second [21:41:41] Sun Jul 9 18:30:05 2017 To [21:41:43] lol [21:41:49] so i was trying to drop tables on mysql but machine was all halted, now * i think* thanks to your guanderful cmds i think it will be able to free space [21:42:04] \O/ [21:42:33] try starting mysql [21:42:38] let it do whatever it wants for a while [21:43:08] I am off to bed [21:43:24] hashar: gracias senior [21:43:25] have fun cleaning out the disk! [21:43:28] Reedy: will do! [21:44:26] RECOVERY - Puppet staleness on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [3600.0] [21:47:32] There's another 722M in /var/lib/mysql.5.5.46-0ubuntu0.14.04.2 [21:47:45] Which hasn't been modified since 2015 [21:47:47] So I bet that can go [21:47:50] 10Deployment-Systems, 10Release-Engineering-Team (Kanban), 10Scap (Scap3-Adoption-Phase1), 10scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#3450232 (10Krinkle) [21:49:47] 10Deployment-Systems, 10Release-Engineering-Team (Kanban), 10Scap (Scap3-Adoption-Phase1), 10scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#3450240 (10hashar) We also have to update the deployment section on https://wikitech.wiki...
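For anyone who wants the `du -m -d 1 -x /` tip from this discussion in script form, here is a rough Python sketch of the same idea: total up each first-level directory and stay on one filesystem. It is an approximation and not a replacement for du: it sums apparent file sizes (`st_size`) rather than allocated blocks, reports bytes rather than megabytes, and skips anything it cannot read. The function names are made up for illustration.

```python
import os


def _tree_bytes(path, root_dev):
    """Sum apparent file sizes under path without leaving device root_dev."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path, onerror=lambda err: None):
        # Prune subdirectories on other filesystems (the "-x" behaviour).
        keep = []
        for name in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, name)).st_dev == root_dev:
                    keep.append(name)
            except OSError:
                pass
        dirnames[:] = keep
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def usage_by_child(root):
    """Map each first-level directory of root to its total size in bytes."""
    root_dev = os.lstat(root).st_dev
    totals = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_dev != root_dev:
                continue  # a different mount, e.g. /srv on its own partition
            totals[entry.name] = _tree_bytes(entry.path, root_dev)
    return totals
```

Sorting the result by size, for example `sorted(usage_by_child("/").items(), key=lambda kv: kv[1], reverse=True)`, and then repeating the call on the largest directory mirrors the crawl from dir to dir approach described above.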
[21:51:31] 10Deployment-Systems, 10Release-Engineering-Team (Kanban), 10Scap (Scap3-Adoption-Phase1), 10scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#2096570 (10Krinkle) Just now I tried to deploy an update to `mediawiki/services/jobrunner... [21:52:17] 10Beta-Cluster-Infrastructure, 10Analytics-Kanban: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (10Nuria) Thanks to help from some releng folks looks like this are getting better: root@deployment-eventlogging03:~# df -h Filesystem Size... [21:52:59] RECOVERY - Free space - all mounts on deployment-eventlogging03 is OK: OK: All targets OK [21:53:49] 10Beta-Cluster-Infrastructure, 10Analytics-Kanban: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (10Reedy) There's around 700MB in `/var/lib/mysql.5.5.46-0ubuntu0.14.04.2` that hasn't been touched since 2015 [21:58:33] 10Release-Engineering-Team, 10Operations, 10Epic, 10Services (watching): FY2017/18 Program 6 - Outcome 2: Developers are able to develop and test their applications through a unified pipeline towards production deployment. 
- https://phabricator.wikimedia.org/T170480#3450280 (10GWicke) [21:58:34] 10Release-Engineering-Team, 10MediaWiki-Containers, 10Operations, 10Epic, and 3 others: FY2017/18 Program 6 - Outcome 2 - Objective 3: Integrated, container-based development environment - https://phabricator.wikimedia.org/T170456#3450281 (10GWicke) [21:58:41] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Watching / External), 10Services (watching), 10services-tooling: RFC: Streamline NodeJS testing+deployment - https://phabricator.wikimedia.org/T147581#3450279 (10GWicke) [22:07:39] feel free to nuke whatever I have on deployment-eventlogging03 [22:13:22] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:19:00] 10Beta-Cluster-Infrastructure, 10Analytics-Kanban: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3450342 (10Nuria) Dropped tables and logs and run apt-get clean and apt-get autoclean, we are now at 80% space. Restarted eventloging again. [22:19:56] madhuvishy: i did nuke things left and right and finally box is alive again, thanks [22:20:13] cool, thanks! [23:05:23] RainbowSprinkles lol https://google.qualtrics.com/jfe/form/SV_0JKI9MBgY4eY4x7 [23:05:27] gerrit feedback form [23:05:33] google's interested in input heh [23:07:22] "What are your biggest pain points related to Gerrit code review?" lol [23:07:29] not sure what that's meant to refer too [23:12:11] "Actions on the dashboard (e.g. apply label, add reviewer, email reviewers)" that would be a nice one. 
Wouldn't have to leave the dashboard to +2 heh [23:14:59] 10Continuous-Integration-Config, 10Wikipedia-Android-App-Backlog, 10Wikipedia-App-MobileApp-extension: CI: mwext-testextension-hhvm-jessie job is suddenly failing - https://phabricator.wikimedia.org/T171006#3450585 (10Mholloway) [23:15:56] 10Continuous-Integration-Config, 10Wikipedia-Android-App-Backlog, 10Wikipedia-App-MobileApp-extension: CI: mwext-testextension-hhvm-jessie job is suddenly failing - https://phabricator.wikimedia.org/T171006#3450585 (10Mholloway) [23:16:06] lol phabricator's on the list at the end [23:16:07] 10Continuous-Integration-Config, 10Wikipedia-Android-App-Backlog, 10Wikipedia-App-MobileApp-extension: CI: mwext-testextension-hhvm-jessie job is suddenly failing - https://phabricator.wikimedia.org/T171006#3450585 (10Mholloway) [23:16:10] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 3 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3450608 (10Jdlrobson) [23:18:07] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 3 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3445966 (10Jdlrobson) [23:18:09] 10Continuous-Integration-Config, 10Wikipedia-Android-App-Backlog, 10Wikipedia-App-MobileApp-extension: CI: mwext-testextension-hhvm-jessie job is suddenly failing - https://phabricator.wikimedia.org/T171006#3450646 (10Jdlrobson) [23:18:17] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10MW-1.30-release-notes, 10MinervaNeue, and 3 others: Not possible for a skin to have PHPUnit tests (PHPunit tests for skin do not seem to be run) - https://phabricator.wikimedia.org/T170880#3450648 (10Jdlrobson) p:05Normal>03High This is imp...
[23:23:16] "What are your biggest pain points related to Gerrit code review?" Answer: "Gerrit code review." [23:26:45] harej: Shush. Gerrit might get angry with you [23:31:30] Wouldn't want to make him angry, oh no.