[00:04:58] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review: mediawiki-extensions-qunit-jessie tests are failing - https://phabricator.wikimedia.org/T153597#2891639 (Paladox) @tgr was right, this https://gerrit.wikimedia.org/r/#/c/328443/2 does fix it.
[00:05:01] tgr thanks for fixing the tests.
[00:05:23] it's working now https://gerrit.wikimedia.org/r/#/c/328441/
[00:05:41] now needs someone to do an emergency merge and then backport to wikidata.
[00:54:34] (Abandoned) Paladox: Temporarily make mediawiki-extensions-qunit-jessie non voting [integration/config] - https://gerrit.wikimedia.org/r/328238 (https://phabricator.wikimedia.org/T153597) (owner: Paladox)
[00:55:10] (PS2) Paladox: Rename npm-node-4 test to npm-node-4-jessie [integration/config] - https://gerrit.wikimedia.org/r/325635 (https://phabricator.wikimedia.org/T152552)
[01:03:01] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review: mediawiki-extensions-qunit-jessie tests are failing - https://phabricator.wikimedia.org/T153597#2891697 (Tgr) Open>Resolved a:Tgr Seems fixed. Not sure about the `mediawiki-core-qunit-jessie` tests men...
[01:08:20] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review: mediawiki-extensions-qunit-jessie tests are failing - https://phabricator.wikimedia.org/T153597#2891707 (Paladox) It seems this still fails in https://gerrit.wikimedia.org/r/#/c/328113/ But passes in other patches.
[01:08:42] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review: mediawiki-extensions-qunit-jessie tests are failing - https://phabricator.wikimedia.org/T153597#2891708 (Tgr) (I still don't understand why https://gerrit.wikimedia.org/r/#/c/328327/ passed the tests, even though...
[02:55:49] Gerrit, Release-Engineering-Team: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2891825 (Paladox) I think this may be fixed by https://gerrit-review.googlesource.com/#/c/91583/ If that is the case we should get it backported into stable...
[02:56:06] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2891826 (Paladox)
[03:19:44] Scap3: Make scap plugins generally useful - https://phabricator.wikimedia.org/T151470#2891862 (mmodell)
[03:19:47] Deployment-Systems, Release-Engineering-Team (Long-Lived-Branches), releng-201617-q1, Epic: Merge to deployed branches instead of cutting a new deployment branch every week. - https://phabricator.wikimedia.org/T89945#2891863 (mmodell)
[03:19:50] Release-Engineering-Team (Long-Lived-Branches), Scap3 (Scap3-MediaWiki-MVP), Patch-For-Review: Create `scap swat` command to automate patch merging & testing during a swat deployment - https://phabricator.wikimedia.org/T142880#2891861 (mmodell) Open>Resolved
[03:20:17] Release-Engineering-Team (Long-Lived-Branches), Scap3 (Scap3-MediaWiki-MVP), Patch-For-Review: Create `scap swat` command to automate patch merging & testing during a swat deployment - https://phabricator.wikimedia.org/T142880#2549692 (mmodell)
[03:20:19] Deployment-Systems, Release-Engineering-Team (Long-Lived-Branches), Patch-For-Review: create `scap branch` command (the successor to make-wmf-branch) - https://phabricator.wikimedia.org/T140918#2891864 (mmodell) Open>Resolved
[05:57:24] !log Jenkins "Collapsing Console Sections" for PHPUnit was broken since "-d zend.enable_gc=0" was added to phpunit.php invocation. Updated pattern in Jenkins system configuration.
[05:57:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[06:31:12] Gerrit: Remove "WYSIWYG" extension from Gerrit - https://phabricator.wikimedia.org/T71378#2892163 (Aklapper) Open>Resolved a:Aklapper I'd call this sufficiently resolved, even though it was "only" marked as read-only.
[07:51:07] (PS3) Phedenskog: Run WebPageTest jobs every hour [integration/config] - https://gerrit.wikimedia.org/r/328159 (https://phabricator.wikimedia.org/T151197)
[07:53:06] (CR) jerkins-bot: [V: -1] Run WebPageTest jobs every hour [integration/config] - https://gerrit.wikimedia.org/r/328159 (https://phabricator.wikimedia.org/T151197) (owner: Phedenskog)
[07:58:36] (PS1) Aklapper: Archive OAI extension [integration/config] - https://gerrit.wikimedia.org/r/328475 (https://phabricator.wikimedia.org/T129864)
[08:48:11] (PS1) Aklapper: Archive PrefStats extension [integration/config] - https://gerrit.wikimedia.org/r/328477 (https://phabricator.wikimedia.org/T134441)
[08:49:37] (CR) jerkins-bot: [V: -1] Archive PrefStats extension [integration/config] - https://gerrit.wikimedia.org/r/328477 (https://phabricator.wikimedia.org/T134441) (owner: Aklapper)
[10:00:43] Release-Engineering-Team, Security-General: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842#2892714 (hashar)
[10:06:07] Release-Engineering-Team, Security-General: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842#2892714 (Lokal_Profil) I'm reaching out to inform the host
[10:11:41] Release-Engineering-Team, Security-General: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842#2892810 (JeanFred) Makes sense. Thanks @hashar for flagging this and @Lokal_Profil for tackling this :)
[10:14:39] Release-Engineering-Team, Security-General: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842#2892814 (hashar) I have mailed the author to the email listed on http://cweiske.de/feedback.htm
[10:58:48] (CR) Tobias Gritschacher: [C: -1] Add functionality to email Wikidata mailing list upon build failure (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/328315 (https://phabricator.wikimedia.org/T152495) (owner: Zppix)
[10:59:00] Continuous-Integration-Config, MediaWiki-Unit-tests, Patch-For-Review: Karma qunit proxy fails setting Host: header - https://phabricator.wikimedia.org/T153757#2892912 (hashar) a:hashar
[11:01:50] Release-Engineering-Team, Security-General: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842#2892930 (hashar) @Lokal_Profil I filed this task and then sent him an email. It seems you did the same, at worst he will get two emails :-}
[12:49:41] Continuous-Integration-Config, Release-Engineering-Team, Discovery: Add CI to all wikimedia/discovery repositories that are active - https://phabricator.wikimedia.org/T153856#2893206 (hashar)
[13:07:45] Continuous-Integration-Config, Release-Engineering-Team, Discovery: Add CI to all wikimedia/discovery repositories that are active - https://phabricator.wikimedia.org/T153856#2893252 (hashar)
[13:23:32] Continuous-Integration-Config, Release-Engineering-Team, Discovery: Add CI to all wikimedia/discovery repositories that are active - https://phabricator.wikimedia.org/T153856#2893294 (hashar) Updated the task description to better describe each of the repositories. Most are `R` applications without...
[13:46:50] Yippee, build fixed!
[13:46:51] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #250: FIXED in 2 min 49 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/250/
[14:04:28] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[14:39:28] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[14:55:43] Gerrit, BlueSpice: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2893482 (Osnard) I've no idea :smile:
[15:09:40] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2886349 (hashar) p:Triage>Low
[15:10:29] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2886349 (hashar) Might be :-} At least we have a reproducible test case. I am not sure it is serious enough to ask for a backport to 2.13.
[15:12:16] Release-Engineering-Team: Change notification email from jenkins-bot@wikimedia.org to releng internal list - https://phabricator.wikimedia.org/T151642#2893547 (hashar) Nothing has been done. It is a random idea I had a couple months ago which I filed as a task to make sure it is not forgotten. I haven't e...
[15:17:54] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2893564 (Paladox) @hashar the patch it gets stuck on is https://gerrit.wikimedia.org/r/#/c/52603/ and it seems patchset 1 is not even that change.
[15:18:58] as mediawiki/core works with that command
[15:19:15] it seems it is because patchset one is not that change instead it is another change.
[15:19:19] hashar ^^
[15:23:29] Gerrit, Release-Engineering-Team: Support redis as a cache store - https://phabricator.wikimedia.org/T152802#2893589 (hashar)
[15:24:11] Gerrit, BlueSpice: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2893592 (Paladox) The only thing I see we could try is to git mirror everything from GitHub to gerrit to s...
[15:24:15] Gerrit, Release-Engineering-Team: Gerrit lacks a 'change' cache - https://phabricator.wikimedia.org/T153645#2886392 (hashar) I have merged the tasks :} Mine being about lack of changes cache and T152802 about adding such cache :-}
[15:32:58] Gerrit, BlueSpice: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2893600 (Osnard) If it helps, we can backup the outstanding changes, remove `REL1_27` and recreate it...
[15:33:37] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893601 (hashar) From Zuul log on contint1001.wikimedia.org: ``` $ grep -c 'Gerrit error.* --message "Post-merge' /var/log/zuul/error.l...
[15:38:00] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2893603 (Paladox) Cherry picked it here https://gerrit-review.googlesource.com/#/c/93282/ for the stable-2.13 branch.
[15:38:59] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893620 (Paladox) @hashar but it works for me on gerrit.git.wmflabs.org which uses the same version of gerrit as prod.
[15:39:47] PROBLEM - Puppet run on deployment-phab02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:40:00] Gerrit, BlueSpice: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2893622 (Paladox) @Osnard that may help, could you do a backup please and wait for @demon to agree as it i...
[15:47:24] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2893631 (Paladox) Found the patch https://gerrit.wikimedia.org/r/#/c/52603/ which the command fails on.
[15:49:27] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2893633 (Paladox) Could this be stale cache in the cache folder? Gerrit 2.13 supports deleting stale cache now.
[15:59:40] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893672 (hashar) The Zuul post-merge pipeline always passes `--verified 0` ``` lang=yaml trigger: gerrit: - event: change-merged...
[16:00:16] Gerrit, Release-Engineering-Team, Upstream: Gerrit java.lang.ArrayIndexOutOfBoundsException when querying changes - https://phabricator.wikimedia.org/T153642#2893674 (Paladox) ah, it's this change ddaecdda39627570f7d5cdac1aa71781ae125637 (the parent returns an internal error 500)
[16:00:58] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893677 (hashar) T52300 shows that --force-message has been removed from Gerrit.
[16:01:14] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893679 (Paladox) @hashar but I did that on gerrit.git.wmflabs.org and it worked perfectly
[16:03:01] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893682 (hashar) @paladox did what? Can you share the postmerge pipeline configuration you are using? I only have assumption right no...
[16:05:12] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893683 (Paladox) @hashar it is how Wikimedia uses it. I have not changed it. The only thing I did was added a new project to zuul as I...
[16:07:33] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893690 (Paladox) Uh, strange I can reproduce this on gerrit-test (instance) but doing it on gerrit-test3 works.
[16:12:08] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893705 (Paladox) Oh, I was doing it on an open change on gerrit.git.wmflabs.org but postmerge worked on http://gerrit-test.wmflabs.org...
[16:12:11] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893706 (Paladox) Oh, I was doing it on an open change on gerrit.git.wmflabs.org but postmerge worked on http://gerrit-test.wmflabs.org...
[16:16:16] Continuous-Integration-Infrastructure, Gerrit, Patch-For-Review, Zuul: jenkins-bot not able to post postmerge comment - https://phabricator.wikimedia.org/T153737#2893711 (Paladox) @hashar yep remove --verified.
[16:18:40] (PS1) Paladox: Jenkins do not report verified 0 on postmerge changes [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737)
[16:18:43] hashar ^^
[16:18:57] (PS2) Paladox: Jenkins do not report verified 0 on postmerge changes [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737)
[16:19:56] (CR) jerkins-bot: [V: -1] Jenkins do not report verified 0 on postmerge changes [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737) (owner: Paladox)
[16:23:38] (PS1) Hashar: (WIP) Ensure postmerge reports to Gerrit even with no votes [integration/config] - https://gerrit.wikimedia.org/r/328535 (https://phabricator.wikimedia.org/T153737)
[16:24:28] (CR) jerkins-bot: [V: -1] (WIP) Ensure postmerge reports to Gerrit even with no votes [integration/config] - https://gerrit.wikimedia.org/r/328535 (https://phabricator.wikimedia.org/T153737) (owner: Hashar)
[16:25:18] (CR) Paladox: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737) (owner: Paladox)
[16:26:13] (CR) jerkins-bot: [V: -1] Jenkins do not report verified 0 on postmerge changes [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737) (owner: Paladox)
[16:28:01] hashar oh the tests started failing, other patches have the same problem
[16:28:02] https://integration.wikimedia.org/ci/job/integration-zuul-layoutvalidation/9099/console
[16:30:44] paladox: verified: [] yeah maybe that will work
[16:30:59] experimental pipeline is doing that
[16:31:17] then there is no test at https://gerrit.wikimedia.org/r/#/c/328535/
[16:31:20] so really I am not sure :}
[16:31:35] Gerrit, Release-Engineering-Team, Wikimedia-Logstash, Patch-For-Review, Technical-Debt: Look into shoving gerrit logs into logstash - https://phabricator.wikimedia.org/T141324#2893742 (Paladox) Even with logstash down gerrit still works.
[16:31:41] hashar it could be a pbr bug?
[16:31:42] OHHH
[16:31:45] experimental has the same thing
[16:31:47] neat
[16:31:52] yep
[16:32:00] so the test failures are unrelated
[16:32:05] as it is happening for others
[16:32:24] Like https://gerrit.wikimedia.org/r/#/c/328477/
[16:32:35] (CR) Paladox: [C: 1] "Test failures unrelated." [integration/config] - https://gerrit.wikimedia.org/r/328477 (https://phabricator.wikimedia.org/T134441) (owner: Aklapper)
[16:33:09] hashar this looks like https://review.openstack.org/#/c/116775/ a fix.
[16:33:36] hurra
[16:33:36] can you report on task ?
[16:33:43] I am on audio right now then go out for shopping
[16:33:51] will be back later this evening :D
[16:34:00] Ok
[16:34:05] I will create a task
[16:38:04] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893745 (Paladox)
[16:38:34] hashar ^^
[16:38:35] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893758 (Paladox) p:Triage>Unbreak! Setting to unbreak since the problem just happened and it could happen anytime in...
[16:42:03] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893771 (Paladox) upstream zuul uses pbr 1.1.0 https://phabricator.wikimedia.org/diffusion/CIZU/browse/debian%252Fjessie-wiki...
[16:42:04] PROBLEM - Puppet run on integration-slave-precise-1002 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[16:45:20] hashar: i just did an apt-get update and upgrade and found there is a python-pbr update waiting
[16:45:28] from 0.8 to 1.10
[16:47:57] (Draft1) Paladox: Update python-pbr to 1.10.0 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328538 (https://phabricator.wikimedia.org/T153877)
[16:50:23] (PS2) Paladox: Update python-pbr to 1.10.0 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328538 (https://phabricator.wikimedia.org/T153877)
[16:52:14] hashar tests are failing totally now, it's spreading https://integration.wikimedia.org/ci/job/debian-glue-non-voting/475/console all tests that do zuul --version will fail.
[16:53:12] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893791 (Paladox) The failures are spreading, now https://integration.wikimedia.org/ci/job/debian-glue-...
[16:54:12] on jessie
[16:55:51] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893797 (Paladox) strange nodepool unaffected but instances are.
[16:59:21] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893800 (greg) @Andrew this is probably due to the python package upgrades, fyi
[16:59:28] nice
[16:59:56] andrewbogott: ^ :)
[17:00:02] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on the integration/config repo - https://phabricator.wikimedia.org/T153877#2893803 (Paladox) Yep, but strange as I did an update on one of my test instances and zuul --version wo...
[17:02:07] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893814 (Paladox)
[17:04:03] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893745 (Paladox) Um, I managed to reproduce on a test slave. root@jenkins-slave-01:/home/pala...
[17:06:11] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893827 (Paladox) root@jenkins-slave-01:/home/paladox# pbr -v 1.10.0
[17:09:56] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893830 (Paladox) Why do we have two files like this -rw-r--r-- 1 root root 26708 Sep 21 14:...
[17:11:16] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893835 (Paladox) at 06:44 am this update was installed for pbr.
[17:11:32] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893837 (hashar) @Andrew pbr 1.10.0 is broken it fails to recognize a version such as the Zuul one `2...
[17:13:42] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2893846 (Paladox) @hashar but it works on another test instance for me. gerrit-test. root@gerrit-te...
[17:14:10] hashar: can you tell me more about what broke?
[17:14:25] I wouldn't have expected any change, since I didn't actually explicitly upgrade things anyplace
[17:16:10] ah, unattended upgrades
[17:16:26] andrewbogott: pbr 1.10.0 fails to recognize the version string Zuul is using
[17:16:37] and yeah on the CI permanent slaves, we have unattended upgrades
[17:16:45] it was too much of a hassle to manually catch up with all the upgrades :D
[17:16:49] unattended upgrades always do this
[17:16:55] there must be a fix in pbr
[17:17:00] I am looking at the code right now
[17:17:07] hashar: ok, let me know what I can do
[17:17:20] (I'm at the doctor and distracted but should be back home in not all that long)
[17:22:04] RECOVERY - Puppet run on integration-slave-precise-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:39:58] ah
[17:40:22] hashar i know why i could not reproduce it on gerrit-test, it's still using 0.8 (pbr) in pip
[17:40:56] also pbr 1.1.0 is broken too, tested with that
[17:41:08] too
[17:41:47] hashar: where can I log in to see the issue? (I'm back but haven't caught up on the ticket yet)
[17:42:31] andrewbogott: integration-slave-jessie-1001.integration.eqiad.wmflabs
[17:42:52] trying to reproduce on my local machine
[17:43:12] hashar: ok, and is there a trivial test to show the problem?
[17:43:21] zuul --version
[17:44:31] whoah, that's graceless
[17:45:07] AHHH
[17:45:08] I reproduce :}
[17:45:26] hashar https://phabricator.wikimedia.org/T153877#2893846
[17:45:28] woops
[17:45:32] https://github.com/openstack-dev/pbr/commit/2465a4cac7570cbf8e61456faaf44ce67a8bbc0b
[17:45:53] "All versions have been made PEP-440 compatible, because of our deep roots in Python. Pre-release versions are now separated by . not -, and use a/b/c rather than alpha/beta etc."
[17:46:16] so by changing it to . and not - should hopefully fix it.
[17:48:34] hashar, looks like there's a fix in newer pbr
[17:48:35] https://git.openstack.org/cgit/openstack-dev/pbr/commit/?id=85ba9600b2009f47552ce35a7e7de02cda0c179e
[17:48:51] hm, maybe, I can't tell if that's the same bug
[17:49:19] andrewbogott that is included in pbr 0.11.0+
[17:49:20] https://github.com/openstack-dev/pbr/commit/85ba9600b2009f47552ce35a7e7de02cda0c179e
[17:49:31] yeah, I'm wrong, if anything that introduced the issue
[17:49:39] oh
[17:51:11] hashar: did you build that zuul package? Or did I?
[17:51:51] I did
[17:52:06] so maybe just a zuul rebuild with a different version string?
[17:52:22] It's idiotic that pbr broke so much backwards compatibility stuff… but I don't see anyone offering to fix it in pbr :(
[17:54:30] the thing is 1.8.0 SemanticVersion yields the same error locally
[17:54:30] :(
[17:55:22] …I don't know what that means
[17:55:36] is SemanticVersion an alternative to pbr, or another tool we're using in CI?
[17:55:57] it is included in pbr.version
[17:57:34] ok, so changing the zuul version string should address that part as well?
[17:59:12] potentially
[17:59:21] but I am not going to rebuild a zuul package this week for sure
[17:59:32] there are too many moving parts and god knows what it is going to break
[17:59:57] You don't have the build env it came from?
[18:00:14] (I was imagining, like, change a string in setup.py, run the build command again)
[18:00:19] yeah ideally
[18:00:30] the thing is zuul depends on python modules that are not available in apt.wikimedia.org
[18:00:52] so we ended up having the deb packaging to craft a virtualenv, download missing deps from pypi and embed them in the .deb
[18:01:08] the list of packages are not pinned, so there are moving parts
[18:01:12] ah, ok
[18:01:25] and then, we got to upgrade the zuul scheduler on contint1001 which sounds like a bad idea :D
[18:01:33] so we need to pin an older version of pbr on those hosts
[18:01:55] really I would prefer to revert
[18:02:27] we should rethink how we manage the .deb package. I don't think it is still sustainable to upgrade the whole cluster
[18:04:36] what is the base puppet role used for those CI hosts?
[18:05:48] depends on the host I guess
[18:06:18] but the zuul install should be handled by modules/zuul/manifests/init.pp
[18:06:19] There's not a shared base role or class for all CI slaves?
[18:06:24] na
[18:06:27] we have multiple roles
[18:06:37] and even multiple puppet repos
[18:06:45] do you want to pin pbr?
[18:07:08] probably, I'm going to see what I can figure out
[18:07:17] Deployment-Systems: Investigate what changes are needed to deploy MW+Extensions by percentage of users (instead of by domain/wiki) - https://phabricator.wikimedia.org/T104398#2894031 (mmodell) https://etherpad.wikimedia.org/p/rolling-deployments
[18:08:03] oh hey, look at that old task from me
[18:08:18] I guess when os_version is jessie, we could get an apt::pin for python-pbr to prefer jessie-backports/main over jessie-wikimedia/backports
[18:08:21] the
[18:08:57] then there is an unattended upgrade to prefer *-wikimedia repos
[18:09:11] $ cat /etc/apt/apt.conf.d/51unattended-upgrades-wikimedia
[18:09:11] Unattended-Upgrade::Origins-Pattern:: "origin=Wikimedia,codename=${distro_codename}-wikimedia";
[18:09:49] Hm, if possible we should limit it to the one package
[18:09:53] but, let's see what works...
[18:10:36] Deployment-Systems: Deploy MW+Extensions by percentage of users (instead of by domain/wiki) - https://phabricator.wikimedia.org/T104398#2894034 (thcipriani)
[18:14:57] (Abandoned) Paladox: Update python-pbr to 1.10.0 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328538 (https://phabricator.wikimedia.org/T153877) (owner: Paladox)
[18:15:14] (Draft1) Paladox: test [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554
[18:15:52] (PS2) Paladox: test [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554
[18:16:09] Deployment-Systems: Deploy MW+Extensions by percentage of users (instead of by domain/wiki) - https://phabricator.wikimedia.org/T104398#1415863 (thcipriani) I think the main points from a recent discussion were: in order to deploy to a percentage of traffic we need to do so quickly. In order to be able to mo...
[18:17:30] OH my god I am tired
[18:17:41] so pbr 1.10.0 has the issue
[18:17:47] I was trying to figure out why 1.8.0 had it as well
[18:17:57] but we used 0.8.2 :D
[18:25:27] (Abandoned) Paladox: test [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554 (owner: Paladox)
[18:27:27] pinning drives me crazy
[18:27:49] https://www.irccloud.com/pastebin/Zbs7zT3a/
[18:27:59] so clearly I want the version from jessie/main...
[18:28:22] and yet this does nothing
[18:28:24] https://www.irccloud.com/pastebin/MLwtdpCa/
[18:28:49] ah
[18:29:02] oh
[18:29:05] (Restored) Paladox: test [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554 (owner: Paladox)
[18:29:17] (PS3) Paladox: test [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554
[18:30:59] oh
[18:31:10] guess changing - to . still doesn't work
[18:31:30] kind of looks like all 3 have priority 1002 maybe
[18:31:38] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2894082 (hashar) We had version 0.8.2 The CI instance have an unattended-upgrade for repositories *-...
[18:31:39] in that version table
[18:31:51] mutante: yeah :(
[18:32:08] I don't understand how pinning works at all, clearly
[18:32:47] andrewbogott: are you using the apt::pin stuff from puppet ?
[18:32:51] yep
[18:33:29] pinning + unattended upgrade rules, that is a lot of moving parts unfortunately :(
[18:33:58] andrewbogott: and in your apt::pin code did you use priority => '1002' then?
[18:34:26] try 1001 so that the prio is lower than the others
[18:34:31] for the one you want
[18:34:34] mutante: https://gerrit.wikimedia.org/r/#/c/328555/
[18:34:40] Gerrit, LDAP: Gerrit username rename request - https://phabricator.wikimedia.org/T153884#2894086 (SamanthaNguyen)
[18:34:43] oh, lower priority is higher priority?
[18:34:46] yea
[18:34:56] i think 1002 is like default
[18:35:01] and you want 1001 for your choice
[18:35:06] looking at another one:
[18:35:25] https://www.irccloud.com/pastebin/pF20iNZ1/
[18:35:34] yea, the rest looks good
[18:35:40] just make it 1001 instead of 1002 and try again
[18:36:12] but the standard wmf repo has priority 1001
[18:36:53] does it? then even lower than that i guess
[18:36:59] mutante: if you want, log into pbrpinning-1.testlabs.eqiad.wmflabs and see what you can get
[18:37:06] modules/systemtap/manifests/runtime.pp has an example
[18:37:07] I've tried a priority of 8, 1001, 1002, and 2000
[18:37:09] all the same behavior
[18:37:44] ok, but i need to finish a server install first, so in a little while
[18:39:38] andrewbogott: you can probably try hacking the apt.conf directly on integration-slave-jessie-1001.integration.eqiad.wmflabs ?
[18:39:59] hashar: how is that different from what I'm doing now?
[18:40:07] not sure :)
[18:40:48] meanwhile I am trying to find a semantic version for zuul that pleases pbr
[18:41:25] i'll hurry and take a look, it's just that i was touching the installserver, brb
[18:41:55] Continuous-Integration-Config, Release-Engineering-Team, Discovery: Add CI to all wikimedia/discovery repositories that are active - https://phabricator.wikimedia.org/T153856#2894103 (Deskana) Quite a few of these are related to analysis, @mpopov and @chelsyx should comment on those.
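On the priority question above: apt_preferences(5) documents that apt installs the version with the highest pin priority and, when priorities tie, the most recent version. A toy sketch of that rule follows. The `Candidate` type, the `pick` helper, the tuple-based version comparison, and the pairing of python-pbr versions with archives are illustrative assumptions, and the sketch ignores the currently installed version and apt's separate downgrade rules.

```python
from typing import NamedTuple, Tuple


class Candidate(NamedTuple):
    version: Tuple[int, ...]  # simplified stand-in for a Debian version string
    priority: int             # Pin-Priority assigned to this version
    archive: str              # illustrative label only


def pick(candidates):
    """Highest priority wins; equal priorities fall back to the newest version."""
    return max(candidates, key=lambda c: (c.priority, c.version))


# Assumed pairing of python-pbr versions and archives, for illustration only.
tied = [
    Candidate((0, 8, 2), 1002, "jessie/main"),
    Candidate((1, 10, 0), 1002, "jessie-wikimedia/backports"),
]
pinned = [
    Candidate((0, 8, 2), 1002, "jessie/main"),
    Candidate((1, 10, 0), 1001, "jessie-wikimedia/backports"),
]

print(pick(tied).version)    # newest version, since the priorities are equal
print(pick(pinned).version)  # the version carrying the higher priority
```

Under this rule, equal priorities across all sources would leave the newer 1.10 as the candidate, which is consistent with the "all 3 have priority 1002" observation in the log.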
[18:55:08] apt-get install python-pbr=0.8.2-1 [18:55:11] this worked [18:55:16] on pbrpinning-1 [18:55:25] ii python-pbr 0.8.2-1 [18:55:51] which instances do we need to downgrade [18:55:51] hashar: [18:59:38] mutante: yeah hmm [18:59:44] is the pinning working? [19:00:39] on integration-slave-jessie-1001 python-pbr is now at 0.8.2 [19:00:56] but the candidate is still 1.10 [19:01:16] it is being debugged in security channel [19:01:20] while we talk [19:01:31] ok ok [19:01:59] I think the best place to add the pinning would be in the modules/zuul/manifests/init.pp which installs the package [19:02:07] with an os_version == jessie guard [19:03:38] that sounds right [19:16:08] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2894226 (hashar) Cherry picked https://gerrit.wikimedia.org/r/328555 on the CI... [19:17:16] hashar i got zuul working with 2.5.0.dev8 [19:17:34] i don't know how it came up with that one as i did not do it that way [19:25:36] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, Patch-For-Review, Zuul: Zuul has started failing on some repo's in gerrit.wikimedia.org - https://phabricator.wikimedia.org/T153877#2894252 (hashar) Open>Resolved a:Andrew Ran puppet on contint1001 /... [19:29:41] paladox: magic? Maybe you run zuul from git [19:29:48] Yep [19:29:50] and thus it would get the version string from `git describe` [19:29:56] Oh.
[19:30:00] so 2.5.0-8-g120391209 [19:30:08] gets recognized as 2.5.0.8 somehow [19:30:15] oh [19:30:16] gets recognized as 2.5.0.dev8 somehow [19:30:26] oh [19:30:28] ah [19:30:53] i wonder how we should change 2.5.0.8-wmf4jessie1 [19:31:02] i mean 2.5.0-8-gcbc7f62-wmf3jessie1 [19:31:08] 2.5.0-8-gcbc7f62-wmf4jessie1 [19:31:31] yeah maybe [19:31:33] to be more compatible with pbr, as they removed support for - but building in debian requires - otherwise it fails for me. [19:32:07] I don't even know how pbr gets the version string when the package is installed [19:32:58] Oh [19:33:07] anyway gotta prepare some dinner [19:33:12] hashar it all broke in https://github.com/openstack-dev/pbr/commit/2465a4cac7570cbf8e61456faaf44ce67a8bbc0b [19:34:57] paladox: the issue is the introduction of semantic versioning by https://github.com/openstack-dev/pbr/commit/5957364887da51d1133370b82d1d7d137ce85631 [19:35:12] that is version 0.11.0 [19:35:23] and others mentioned they had the same oddity on our infra [19:35:26] yep [19:36:24] hashar i think i can come up with a fix [19:36:35] just tested changing - to . and adding - on the end [19:37:06] It will be a temp solution as pbr will need to be fixed to support our semver [19:38:44] ah [19:40:55] (PS4) Paladox: Make zuul semver compatible with pbr 1.10.0 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/328554 (https://phabricator.wikimedia.org/T153877) [19:41:00] hashar ^^ [19:42:52] that ^^ worked for me; the package will be 2.5.0.8.gcbc7f62-wmf4jessie1 but zuul --version will produce Zuul version: 2.5.0.dev8 [19:46:39] Continuous-Integration-Config, Release-Engineering-Team, Discovery, Discovery-Analysis: Add CI to all wikimedia/discovery repositories that are active - https://phabricator.wikimedia.org/T153856#2894303 (debt) [19:49:39] hashar: All good for now?
I might go get lunch [19:51:02] We just need to get backward compatibility back in pbr, so if the semantic version does not work then it should fall back to trying it from the package. [19:51:23] would be nice [19:51:56] andrewbogott: yeah all good [19:51:59] having dinner now [19:52:01] cool [19:52:12] paladox: if you are curious, pbr version system is https://www.python.org/dev/peps/pep-0440/#public-version-identifiers [19:52:17] I'm going on vacation tomorrow, so let me know if you have last-minute requests :) [19:53:25] Semantic versions containing a hyphen (pre-releases - clause 10) or a plus sign (builds - clause 11) are not compatible with this PEP and are not permitted in the public version field. [19:53:26] :D [19:53:46] oh thanks [19:53:48] andrewbogott: guess I will revisit tomorrow. And if there are some actions needed I can always poke eu ops :} [19:54:03] yep! [19:54:08] ok, have a good evening [19:57:02] and you too [20:06:25] Continuous-Integration-Infrastructure, Jenkins, Upstream, WorkType-NewFunctionality: Jenkins trilead-ssh2 doesn't support our MAC/KEX algorithms - https://phabricator.wikimedia.org/T103351#2894365 (Paladox) @hashar someone added support for this in trilead-ssh2 here https://github.com/jenkinsci/t...
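The version mismatch discussed above comes down to PEP 440 not permitting hyphens in public version identifiers. A minimal Python sketch, assuming a `git describe`-style string of the form `<release>-<commits>-g<sha>`; `describe_to_pep440` is a hypothetical helper that only reproduces the `2.5.0-8-g...` to `2.5.0.dev8` mapping seen in the log and is not pbr's actual implementation, and the regex is a simplified canonical-form check modeled on the one in PEP 440's appendix:

```python
import re
from typing import Optional

# Simplified canonical-form check for PEP 440 public versions (no local part).
PEP440_PUBLIC = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$"
)

# Assumed shape of `git describe` output: release, commits since tag, short sha.
GIT_DESCRIBE = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)$"
)


def is_pep440_public(version: str) -> bool:
    """Return True if the string looks like a canonical PEP 440 public version."""
    return PEP440_PUBLIC.match(version) is not None


def describe_to_pep440(describe: str) -> Optional[str]:
    """Map '<release>-<n>-g<sha>' to '<release>.dev<n>'; None if the shape differs."""
    match = GIT_DESCRIBE.match(describe)
    if match is None:
        return None
    return f"{match.group('release')}.dev{match.group('commits')}"


for candidate in ("2.5.0-8-gcbc7f62", "2.5.0.8", "2.5.0.dev8"):
    print(candidate, is_pep440_public(candidate))
print(describe_to_pep440("2.5.0-8-gcbc7f62"))
```

Under these assumptions the hyphenated form is rejected while both dotted forms pass the check, which is consistent with the fix above that swaps `-` for `.` in the upstream part of the version.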
[20:08:55] (CR) Paladox: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/328534 (https://phabricator.wikimedia.org/T153737) (owner: Paladox) [20:09:47] (CR) Paladox: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/328475 (https://phabricator.wikimedia.org/T129864) (owner: Aklapper) [20:09:53] (CR) Paladox: [C: 1] "recheck" [integration/config] - https://gerrit.wikimedia.org/r/328477 (https://phabricator.wikimedia.org/T134441) (owner: Aklapper) [20:10:02] (CR) Paladox: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/328535 (https://phabricator.wikimedia.org/T153737) (owner: Hashar) [20:29:16] Gerrit, Upstream: Gerrit Internal Server Error when trying to cherry-pick patch from master to master - https://phabricator.wikimedia.org/T149878#2894478 (Paladox) I think this is now fixed in gerrit 2.13. [21:12:29] Gerrit, Upstream: Gerrit shows HTTP 500 error when pasting an emoji - https://phabricator.wikimedia.org/T145885#2894684 (Paladox) [21:13:11] Gerrit, Upstream: Gerrit shows HTTP 500 error when pasting an emoji - https://phabricator.wikimedia.org/T145885#2644059 (Paladox) Should hopefully be fixed if we do T153899. [22:17:01] Gerrit, Upstream: Gerrit shows HTTP 500 error when pasting an emoji - https://phabricator.wikimedia.org/T145885#2894803 (Paladox) The db requires converting to utf8mb4 to support emojis in the db see http://andy-carter.com/blog/saving-emoticons-unicode-from-twitter-to-a-mysql-database [23:02:25] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]