[01:15:12] PROBLEM - zuul_gearman_service on gallium is CRITICAL: Connection refused
[01:15:32] PROBLEM - zuul_service_running on gallium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[01:16:04] ACKNOWLEDGEMENT - zuul_gearman_service on gallium is CRITICAL: Connection refused daniel_zahn gerrit migration ongoing
[01:16:04] ACKNOWLEDGEMENT - zuul_service_running on gallium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server daniel_zahn gerrit migration ongoing
[01:31:45] ^^ under maintenance due to Gerrit switch
[01:44:33] Project beta-code-update-eqiad build #114244: FAILURE in 1 min 32 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114244/
[01:45:32] PROBLEM - Puppet run on integration-slave-trusty-1003 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[01:46:00] PROBLEM - Puppet run on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[01:46:34] PROBLEM - Puppet run on integration-raita is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[01:48:52] PROBLEM - Puppet run on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[01:51:21] hashar: Host key issues ^?
[01:51:32] PROBLEM - Puppet run on integration-slave-trusty-1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[01:52:15] PROBLEM - Puppet run on integration-slave-trusty-1016 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[01:53:33] RECOVERY - zuul_gearman_service on gallium is OK: TCP OK - 0.000 second response time on port 4730
[01:53:53] RECOVERY - zuul_service_running on gallium is OK: PROCS OK: 2 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[01:54:34] Yippee, build fixed!
[01:54:35] Project beta-code-update-eqiad build #114245: FIXED in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114245/
[01:55:38] ostriches: yeah or simply that the repo is not available
[01:55:55] they are sorting out by themselves
[01:57:05] the same happens in prod
[01:57:17] they are failing for one run ..then works again
[01:58:13] It probably "Failed" in that it returned a non-zero exit code, but actually "worked"
[01:58:20] And then after it fixes itself, it moves forwrd find.
[01:58:21] *fine
[01:58:49] did we ever get a reply about the TTL for reverse entry btw?
[01:58:59] we did not touch them
[02:00:08] Oh yeah. Brandon said we could lower the TTL, but nbd if we don't.
[02:00:17] It should just update within the hour
[02:00:25] ok
[02:00:31] RECOVERY - Puppet run on integration-slave-trusty-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:14:33] Project beta-code-update-eqiad build #114247: FAILURE in 1 min 33 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114247/
[02:16:33] PROBLEM - Puppet run on integration-slave-trusty-1003 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:24:11] PROBLEM - Puppet run on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:24:31] Project beta-code-update-eqiad build #114248: STILL FAILING in 1 min 31 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114248/
[02:26:49] PROBLEM - Puppet run on integration-slave-trusty-1004 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:26:54] PROBLEM - Puppet run on integration-slave-trusty-1017 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:26:56] PROBLEM - Puppet run on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:27:14] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:31:28] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:32:02] PROBLEM - Puppet run on deployment-sentry01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:32:10] PROBLEM - Puppet run on integration-slave-trusty-1023 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:33:22] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:34:31] Project beta-code-update-eqiad build #114249: STILL FAILING in 1 min 30 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114249/
[02:35:18] PROBLEM - Puppet run on deployment-eventlogging04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:35:50] PROBLEM - Puppet run on integration-slave-trusty-1014 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:36:09] PROBLEM - Puppet run on integration-slave-trusty-1018 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:40:09] PROBLEM - Puppet run on integration-slave-trusty-1011 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:44:28] Project beta-code-update-eqiad build #114250: STILL FAILING in 1 min 28 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114250/
[02:49:01] PROBLEM - zuul_service_running on gallium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[02:49:54] ACKNOWLEDGEMENT - zuul_gearman_service on gallium is CRITICAL: Connection refused daniel_zahn gerrit migration
[02:49:54] ACKNOWLEDGEMENT - zuul_service_running on gallium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server daniel_zahn gerrit migration
[02:51:02] RECOVERY - zuul_service_running on gallium is OK: PROCS OK: 2 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[02:54:33] Project beta-code-update-eqiad build #114251: STILL FAILING in 1 min 33 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114251/
[03:00:02] Project mediawiki-core-code-coverage build #2158: FAILURE in 2.1 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/2158/
[03:04:34] Project beta-code-update-eqiad build #114252: STILL FAILING in 1 min 33 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114252/
[03:14:31] Yippee, build fixed!
[03:14:32] Project beta-code-update-eqiad build #114253: FIXED in 1 min 31 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/114253/
[03:21:34] RECOVERY - Puppet run on integration-slave-trusty-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:23:53] RECOVERY - Puppet run on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:26:01] RECOVERY - Puppet run on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:26:35] RECOVERY - Puppet run on integration-raita is OK: OK: Less than 1.00% above the threshold [0.0]
[03:31:31] RECOVERY - Puppet run on integration-slave-trusty-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:31:55] RECOVERY - Puppet run on integration-slave-trusty-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:31:55] RECOVERY - Puppet run on integration-slave-trusty-1013 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:32:13] RECOVERY - Puppet run on integration-slave-trusty-1016 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:34:14] RECOVERY - Puppet run on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:36:28] RECOVERY - Puppet run on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:36:50] RECOVERY - Puppet run on integration-slave-trusty-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:37:02] RECOVERY - Puppet run on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:37:10] RECOVERY - Puppet run on integration-slave-trusty-1023 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:37:14] RECOVERY - Puppet run on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:38:20] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:38:58] Yay recoveries!
[03:40:48] RECOVERY - Puppet run on integration-slave-trusty-1014 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:45:09] RECOVERY - Puppet run on integration-slave-trusty-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:45:19] RECOVERY - Puppet run on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:46:11] RECOVERY - Puppet run on integration-slave-trusty-1018 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:59:41] PROBLEM - zuul_gearman_service on gallium is CRITICAL: Connection refused
[04:00:03] PROBLEM - zuul_service_running on gallium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[04:01:51] RECOVERY - zuul_gearman_service on gallium is OK: TCP OK - 0.000 second response time on port 4730
[04:02:14] RECOVERY - zuul_service_running on gallium is OK: PROCS OK: 2 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server
[04:17:58] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #84: FAILURE in 21 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/84/
[04:36:20] PROBLEM - Puppet run on deployment-eventlogging04 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[05:04:19] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[05:11:18] RECOVERY - Puppet run on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:39:23] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:32:41] Release-Engineering-Team, Gerrit, Operations, Patch-For-Review: replace gerrit server (ytterbium) with jessie server (lead) - https://phabricator.wikimedia.org/T125018#2491177 (Dzahn) a:Dzahn let me close this when all ytterbium remnants are actually gone (decom etc)
[06:58:17] PROBLEM - Puppet run on phab-beta is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:28:14] (CR) Paladox: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/300532 (owner: Hashar)
[08:28:39] hashar it seems zuul is faster with gerrit 2.12.
[08:29:16] :)
[08:36:33] paladox: Gerrit is certainly faster yeah
[08:36:40] it is on a stronger box
[08:36:42] Yep :)
[08:36:55] its internals are probably faster as well
[08:36:57] and
[08:36:58] Meaning it should be able to handle more load now.
[08:37:01] it got restarted :]
[08:37:03] oh
[08:37:04] :)
[08:37:13] and the cache was all cleaned
[08:37:19] making it regenerate
[08:37:20] :)
[08:37:28] and thanks for your testing of Gerrit/Zuul etc!
[08:37:32] plus i think cloning will be faster
[08:37:35] and your welcome
[08:37:39] I got debs to built
[08:37:39] too
[08:37:54] we had a few unexpected issue during the upgrade
[08:37:58] See
[08:37:59] http://gerrit-jenkins.wmflabs.org/job/debian-glue/
[08:38:02] oh
[08:38:02] but nothing I think we could have caught beforehand
[08:38:09] Nope
[08:38:17] But now i can edit in browser :) :)
[08:38:23] :D
[08:38:31] I am taking a quick break, need a coffeee
[08:38:37] hashar ive submitted the patch
[08:38:48] for getting it to work with debian-glue on jenkins
[08:38:49] :)
[08:39:04] and shoulden affect any other test.
[08:39:19] https://gerrit.wikimedia.org/r/#/c/300790/
[08:39:20] and ok
[08:39:46] (CR) Paladox: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/300790 (owner: Paladox)
[08:43:13] paladox: yeah so in theory :D
[08:43:20] yep
[08:44:04] I should copy paste our week-end discussion to a task
[08:44:31] Oh
[08:44:33] :)
[08:44:45] so your patch might do the job
[08:44:58] but it is a terrrrible hack :]
[08:45:31] Yay
[08:45:34] and yep
[08:45:38] it is a horrible hack
[08:45:43] But even changing the path
[08:45:50] causes it to go unstable
[08:46:09] I tryed changing where it stores the debs and it caused the job to report it as unstable
[08:48:01] Continuous-Integration-Infrastructure: debian-glue fails to find generated binaries packages (.deb) - https://phabricator.wikimedia.org/T141246#2491357 (hashar)
[08:48:02] paladox: https://phabricator.wikimedia.org/T141246
[08:48:03] hashar i kept my pc on all night, i woke up at 3am, i saw irc that the upgrade was happening but it was so earley in the morning i coulden watch, :)
[08:48:07] thanks
[08:48:18] ah yeah 3am is terrible!
[08:48:30] I had a 2 hours or so nap yesterday afternoon
[08:48:36] and went to bed at 10pm
[08:48:55] Yep
[08:48:55] Continuous-Integration-Infrastructure: debian-glue fails to find generated binaries packages (.deb) - https://phabricator.wikimedia.org/T141246#2491369 (Paladox) It seems changing BUILDRESULT to another location causes the build to be reported as unstable. I am not sure why it does that.
[08:48:56] oh
[08:48:56] so I have had almost slept a normal "night" when I woke up
[08:49:02] Oh
[08:49:03] went to bed at 7 and woke up at 10am
[08:49:10] Oh
[08:49:30] But i saw the gerrit mantenance page
[08:49:36] :)
[08:49:48] You were up at 4 am (3am my time)
[08:49:50] very earley
[08:51:01] I initially stayed up until 1:20am.
[08:51:14] watching tv, but then it was too late.
[08:53:19] It made the dev like http://gerrit-jenkins.wmflabs.org/job/debian-glue/lastSuccessfulBuild/artifact/zuul_2.1.0-391-gbc58ea3-wmf1precise1+0~20160724215145.103~1.gbp4c0776_amd64.deb
[08:53:20] :)
[08:53:24] dev = deb
[08:54:18] Ive also left a comment on https://gerrit.wikimedia.org/r/#/c/300567/1 otherwise it is good to go with that minor change.
[08:54:32] Ive been using it with the change i noted in the inline comment
[08:54:35] hashar ^^
[08:56:53] https://gerrit.wikimedia.org/r/300830 package_builder: do not override BUILDRESULT
[08:56:55] going to apply that
[08:57:26] Ok
[08:57:29] Yay
[08:57:30] :)
[08:57:32] thanks
[08:59:30] (CR) Hashar: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/300532 (owner: Hashar)
[09:00:14] /mnt/home/jenkins-deploy/.pbuilderrc does not exist
[09:00:15] pfff
[09:00:37] Oh
[09:00:41] Yep
[09:00:45] I just did echo
[09:01:43] E: failed creating buildresult dir:
[09:02:02] Oh
[09:02:18] because the package builder pbuidlder is invoked with --buildresult ''
[09:02:30] =echo USENETWORK=yes >> ~/.pbuilderrc
[09:02:30] echo PBUILDER_USENETWORK=yes >> ~/.pbuilderrc
[09:02:37] oh
[09:04:52] PROBLEM - Puppet run on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0]
[09:05:40] my shell is lame
[09:05:49] ${BUILDRESULT%-something}
[09:05:51] the % is wrong
[09:06:14] Oh
[09:06:30] Which shell do you use
[09:07:13] (CR) Hashar: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/300532 (owner: Hashar)
[09:11:16] fails still https://integration.wikimedia.org/ci/job/debian-glue/208/console
[09:11:17] bah
[09:11:22] but with an almost proper path
[09:11:28] 00:01:42.782 dpkg-deb: error: failed to read archive `/mnt/jenkins-workspace/workspace/debian-glue/*.deb': No such file or directory
[09:11:39] I guess piuparts is wrong
[09:12:29] Oh
[09:12:32] Yep
[09:12:36] Actually nope
[09:12:41] It needs the debs in there
[09:12:58] Which is where https://gerrit.wikimedia.org/r/#/c/300790/1 will fix it for us
[09:13:06] bah it got build to /mnt/pbuilder/result/precise-amd64
[09:13:07] :(
[09:13:38] Yep
[09:13:54] With my patch it should be a workaround but also deletes the prevous zuul debs
[09:13:55] :)
[09:19:54] RECOVERY - Puppet run on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:23:35] (CR) Hashar: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/300532 (owner: Hashar)
[09:33:28] Browser-Tests, MobileFrontend, Reading-Web-Backlog, Reading-Web-Sprint-77-Segmentation-fault, and 4 others: Spike [2hrs] Wikidata description browser tests do not run anywhere - https://phabricator.wikimedia.org/T137756#2377830 (phuedx) @dr0ptp4kt @Jhernandez: Given that this is blocked – right?...
[09:46:33] I've been getting lots of "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)" jenkins failure
[09:46:34] s
[09:46:39] Could someone take a look?
[09:46:53] hashar: ^ ?
[09:47:09] Glaisher: yeah mysql dead
[09:47:21] guess we will have to look into that
[09:47:38] it does not start on boot apparently
[09:48:20] Glaisher: solved
[09:48:44] hashar: Wow, that was fast!
[09:48:46] Thanks.
[09:51:09] Glaisher: will be better once I figure out why mysql does not spawn :D
[09:51:20] :-)
[09:51:42] hashar try the patch i uploaded
[09:51:44] for the debs
[09:51:46] :)
[09:51:59] please :).
[09:57:03] (PS5) Lethexie: Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - https://gerrit.wikimedia.org/r/296395
[09:58:24] (PS2) Lethexie: Add the SpaceBeforeClassBraceSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/297355
[10:27:35] Browser-Tests, MobileFrontend, Reading-Web-Backlog, Reading-Web-Sprint-77-Segmentation-fault, and 4 others: Spike [2hrs] Wikidata description browser tests do not run anywhere - https://phabricator.wikimedia.org/T137756#2491547 (Jhernandez) Makes sense. There's still work on preparing the languag...
[11:08:52] (PS14) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[11:14:11] (PS15) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[11:23:51] Project language-screenshots-VisualEditor » chrome,en,Linux,ci-jessie-wikimedia build #20: SUCCESS in 5 min 36 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=en,PLATFORM=Linux,label=ci-jessie-wikimedia/20/
[11:35:03] (PS16) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[11:41:04] paladox: I think I managed to fix up the debian-glue job https://integration.wikimedia.org/ci/job/debian-glue/221/
[11:41:22] Oh
[11:41:23] :)
[11:41:31] yay
[11:41:42] But it is orange
[11:41:47] meaning unstable
[11:41:49] and falls
[11:41:52] hashar ^^
[11:42:17] (CR) jenkins-bot: [V: -1] WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613) (owner: Zfilipin)
[11:42:17] That happened to me
[11:42:22] when i changed the path
[11:42:30] So i had to change it back
[11:42:44] and do the workaround.
[11:46:16] (PS17) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[11:49:32] hashar yay you managed to build deps in jenkins. :)
[11:50:09] Oh wow it is hard to find the docs for its-phabricator but i found it.
[11:50:18] Looks like the plugin will need updating.
[11:53:51] (PS18) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[11:55:09] Project language-screenshots-VisualEditor » chrome,hr,Linux,ci-jessie-wikimedia build #21: SUCCESS in 6 min 26 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=hr,PLATFORM=Linux,label=ci-jessie-wikimedia/21/
[12:01:55] Project language-screenshots-VisualEditor » chrome,bcl,Linux,ci-jessie-wikimedia build #22: FAILURE in 6 min 43 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=bcl,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:02:46] Project language-screenshots-VisualEditor » chrome,ar,Linux,ci-jessie-wikimedia build #22: SUCCESS in 7 min 34 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=ar,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:15] Project language-screenshots-VisualEditor » chrome,as,Linux,ci-jessie-wikimedia build #22: SUCCESS in 9 min 3 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=as,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:19] Project language-screenshots-VisualEditor » chrome,ast,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 7 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=ast,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:22] Project language-screenshots-VisualEditor » chrome,az,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 10 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=az,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:23] Project language-screenshots-VisualEditor » chrome,azb,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 11 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=azb,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:26] Project language-screenshots-VisualEditor » chrome,it,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 14 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=it,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:27] Project language-screenshots-VisualEditor » chrome,fy,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 15 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=fy,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:27] Project language-screenshots-VisualEditor » chrome,fi,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 16 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=fi,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:04:29] Project language-screenshots-VisualEditor » chrome,ka,Linux,ci-jessie-wikimedia build #22: ABORTED in 9 min 17 sec: https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,LANGUAGE_SCREENSHOT_CODE=ka,PLATFORM=Linux,label=ci-jessie-wikimedia/22/
[12:06:57] hashar http://www.neowin.net/news/eu-data-protection-chief-encryption-should-be-promoted-backdoors-should-be-illegal :)
[12:08:29] (PS1) Hashar: debian-glue: set BUILDRESULT [integration/config] - https://gerrit.wikimedia.org/r/300858
[12:09:21] (CR) Paladox: [C: 1] ":)" [integration/config] - https://gerrit.wikimedia.org/r/300858 (owner: Hashar)
[12:10:48] (Draft2) Paladox: Testing [integration/config] - https://gerrit.wikimedia.org/r/300859
[12:12:40] (CR) Paladox: "Testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing testing te" [integration/config] - https://gerrit.wikimedia.org/r/300859 (owner: Paladox)
[12:12:50] (Abandoned) Paladox: Testing [integration/config] - https://gerrit.wikimedia.org/r/300859 (owner: Paladox)
[12:56:39] (CR) Hashar: [C: 2] debian-glue: set BUILDRESULT [integration/config] - https://gerrit.wikimedia.org/r/300858 (owner: Hashar)
[12:57:17] ^^ :)
[12:57:27] (Merged) jenkins-bot: debian-glue: set BUILDRESULT [integration/config] - https://gerrit.wikimedia.org/r/300858 (owner: Hashar)
[13:12:25] (CR) Hashar: (DO NOT MERGE) fetch proper repo/heads (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869) (owner: Hashar)
[13:12:39] paladox: can you elaborate on https://gerrit.wikimedia.org/r/#/c/300567/1/jjb/operations-debs.yaml ? :)
[13:12:51] hi, any objections to me cherry-picking https://gerrit.wikimedia.org/r/#/c/300827/ in beta and provision it on deployment-imagescaler01 ? cc gilles
[13:12:52] the aim is to have the job to clone from the real repository (ie gerrit)
[13:13:03] so upstream / master / debian branches are the proper one.
[13:13:12] hashar oh, that broke it for me on the test install i have
[13:13:20] That's why i suggested that
[13:13:27] and oh
[13:13:29] godog: no objections from me. Make sure to !log it there though :]
[13:13:45] godog: beta being the perfect place to play test such change imho
[13:13:56] hashar: yup, do I need the !log deployment-prep or simply !log ?
[13:14:03] simply !log
[13:14:10] the bot listening in this channel does all the magic
[13:14:22] and that ultimately ends up in https://tools.wmflabs.org/sal/releng/
[13:14:37] which is really just an audit trail. I dont think anyone watch that log carefully
[13:14:42] unless needed
[13:15:12] hehe, the trailing slash actually makes it a 404
[13:15:30] bah
[13:15:39] sorry :(
[13:17:07] !log cherry-pick https://gerrit.wikimedia.org/r/#/c/300827/ on deployment-puppetmaster
[13:17:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[13:17:12] hashar but anyways, doing it that way broke it on the test instance, but keeping that $ZUUL_URL bit worked for me,
[13:17:20] But anyways you can merge the patch
[13:17:22] as it is
[13:17:33] I will see if i can get it working
[13:17:40] (CR) Paladox: [C: 1] (DO NOT MERGE) fetch proper repo/heads [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869) (owner: Hashar)
[13:17:50] paladox: another way would have to inject in the environement something like GIT_BASE_URL=https://gerrit.wikimedia.org and have Zuul set it
[13:18:02] Oh
[13:18:04] paladox: but really gerrit.wm.o is hardcoded in a lot of jobs
[13:18:08] Yep
[13:20:25] (PS2) Hashar: (DO NOT MERGE) fetch proper repo/heads [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869)
[13:20:53] Continuous-Integration-Infrastructure: debian-glue fails to find generated binaries packages (.deb) - https://phabricator.wikimedia.org/T141246#2491856 (hashar) Open>Resolved a:hashar
[13:21:40] Continuous-Integration-Infrastructure: debian-glue fails to find generated binaries packages (.deb) - https://phabricator.wikimedia.org/T141246#2491357 (hashar) Fixed by having the jenkins job to export BUILDRESULT before the build and provide package step + an extra patch in the puppet package_builder that...
[13:22:28] hashar it seems that the debs arnt being copied over see https://integration.wikimedia.org/ci/job/debian-glue/
[13:22:49] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[13:22:50] paladox: give me more details? :D
[13:22:51] (CR) Paladox: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/300532 (owner: Hashar)
[13:23:07] oh
[13:23:29] hashar the debs doint look like they are being saved to the job
[13:23:38] ie allowing us to download or view them.
[13:23:40] lets wait for the rebuild and check
[13:23:47] Ok
[13:24:27] Continuous-Integration-Config: Frequent "No space left on device" failures for debian-glue jobs on integration-slave-jessie-1001 - https://phabricator.wikimedia.org/T124746#1964900 (hashar) Partly related to T141246. Each build ended up saving the resulting .deb inside the global /mnt/ I think I also got th...
[13:25:01] Yep works now
[13:25:02] https://integration.wikimedia.org/ci/job/debian-glue/
[13:25:07] but shows it as unstable
[13:25:08] ?
[13:25:33] there is a few test failures
[13:25:44] which happened in the previous build
[13:26:06] so Jenkins assume that the issues have been introduced in the previous build and are not a problem of the current build
[13:26:10] should disable that really
[13:26:11] Ih
[13:26:17] Oh
[13:27:32] hashar you that reindexing problem ostriches had this mornning, that he had to take gerrit offline.
[13:27:39] It is fixed in gerrit 2.12.3
[13:27:53] He said he had problems with locks with is fixed in gerrit 2.12.3
[13:28:16] great!
[13:28:58] • Fix internal server error when loading submit rules.
[13:29:04] Oh wrong one
[13:29:53] • Fix error reindexing changes when a change no longer exists.
[13:29:58] I think it was that one ^^
[13:30:02] https://gerrit-documentation.storage.googleapis.com/ReleaseNotes/ReleaseNotes-2.11.9.html
[13:32:16] https://gerrit.googlesource.com/gerrit/+/3d35012d512a8d54382848379b798027f9b0ceda%5E%21/#F0
[13:37:47] RECOVERY - Puppet run on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:46:55] hashar we can make the debian-glue test non-voting for integration/zuul
[13:47:08] and promote it to actually test it without us needing to recheck it.
[13:47:11] :)
[13:48:48] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[13:49:09] (CR) Paladox: "check experimental" [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/299869 (owner: Hashar)
[13:49:53] hashar it seems to have failed https://integration.wikimedia.org/ci/job/debian-glue/230/console
[13:53:31] (CR) Paladox: (DO NOT MERGE) fetch proper repo/heads (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869) (owner: Hashar)
[13:57:34] paladox: I am hacking it
[13:57:44] Oh ah :) :)
[14:03:04] (CR) Paladox: "check experimental" [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/299869 (owner: Hashar)
[14:03:10] hashar yay it worked.
[14:04:20] (PS3) Hashar: debian-glue now uses zuul-cloner [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869)
[14:04:44] paladox: ^^:)
[14:04:51] Thanks
[14:04:52] :)
[14:04:52] it is a bit hacky
[14:04:54] Yep
[14:04:59] but fix a long standing bug
[14:05:09] (CR) Hashar: [C: 2] debian-glue now uses zuul-cloner [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869) (owner: Hashar)
[14:05:11] Yep
[14:05:21] hashar what is the long standing bug?
[14:05:53] (Merged) jenkins-bot: debian-glue now uses zuul-cloner [integration/config] - https://gerrit.wikimedia.org/r/300567 (https://phabricator.wikimedia.org/T117869) (owner: Hashar)
[14:06:08] Continuous-Integration-Config, Patch-For-Review: Make debian-glue to use Gerrit as the upstream repository - https://phabricator.wikimedia.org/T117869#2491975 (hashar) `debian-glue` job is fixed. Still have to review the other jobs based on the JJB template `{name}-debian-glue` but I plan to overhaul th...
[14:06:29] paladox: https://phabricator.wikimedia.org/T117869
[14:06:30] Yay
[14:06:35] it would fetch from the zuul merger
[14:06:43] which has local branches NOT matching the ones in Gerrit
[14:06:55] Oh
[14:07:14] We can make it and promote it to actual test
[14:07:14] on integration/zuul
[14:07:35] shoulden we use the name-debian-glue test for that to allow us to make it non voting for integration/zuul only.
[14:07:39] or should we make it voting
[14:07:40] ?
[14:08:43] none voting for now
[14:08:47] it fails :)
[14:08:48] RECOVERY - Puppet run on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:09:12] Ok
[14:09:34] hashar does that mean we can duplicate what we done for debian-glue to name-debian-glue
[14:09:38] or are there diffrences
[14:09:39] ?
[14:09:49] I wanna see whether the job can build the debian package for mediawiki
[14:09:58] Yay
[14:09:59] in mediawiki/debian.git :)
[14:10:00] :)
[14:10:07] trying it out with a dummy commit at https://gerrit.wikimedia.org/r/#/c/262746/
[14:10:08] Are you doing that now>
[14:10:10] ?
[14:10:11] and https://integration.wikimedia.org/ci/job/debian-glue/238/console [14:10:16] yeah [14:10:21] :) [14:10:50] hashar you just saved alot of time now, people who are inexperenced in building can do it through jenkins [14:10:51] :) [14:10:55] :) [14:10:59] yeah that is the rough idea [14:11:06] though it always have been a side project [14:11:14] Oh [14:11:56] :) [14:12:20] 00:01:10.606 ERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored. [14:12:20] bah [14:13:33] Oh [14:13:35] Yep [14:13:39] But you can ignore that [14:13:43] it does it for zuul too [14:14:45] hashar yay it built the deb [14:16:20] 10Continuous-Integration-Config, 10MediaWiki-Debian: Set up CI auto-building for mediawiki/debian repository - https://phabricator.wikimedia.org/T122978#2492027 (10hashar) I have been sprinting out some change to the Jenkins job `debian-glue` and slightly enhanced it. It builds in a cowbuilder image based on... [14:16:34] 10Continuous-Integration-Config: Make debian-glue to use Gerrit as the upstream repository - https://phabricator.wikimedia.org/T117869#2492028 (10hashar) [14:17:18] 10Continuous-Integration-Config: Make debian-glue to use Gerrit as the upstream repository - https://phabricator.wikimedia.org/T117869#1785557 (10hashar) a:03hashar [14:17:21] hashar oh it failed for jessie for zuul https://integration.wikimedia.org/ci/job/debian-glue/237/console [14:18:14] 10Deployment-Systems, 10scap, 07WorkType-NewFunctionality: Create canary deploy process for MediaWiki - https://phabricator.wikimedia.org/T136883#2492036 (10thcipriani) [14:18:16] 10Deployment-Systems, 10scap, 07WorkType-NewFunctionality: Create ability to deploy-to and run-checks-on canary for MediaWiki deploys - https://phabricator.wikimedia.org/T136886#2492034 (10thcipriani) 05Open>03Resolved [14:19:06] (03CR) 10Paladox: "check experimental" [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/299869 
(owner: 10Hashar) [14:22:25] mhh I'm getting a strange error in deployment-prep whereas the same works when pushed to another jessie instance in labs, Error: Could not retrieve catalog from remote server: Error 400 on SERVER: undefined method `match' for 8801:Fixnum at /etc/puppet/modules/thumbor/manifests/init.pp:64 on node deployment-imagescaler01.deployment-prep.eqiad.wmflabs [14:22:34] ok for me to bounce puppetmaster ? [14:22:42] 10Continuous-Integration-Config, 10Analytics-Wikimetrics: tox runs all tests (including manual ones) - https://phabricator.wikimedia.org/T71183#2492051 (10Milimetric) @hashar, yes wikimetrics is maintained but it lost most of its stakeholders in reorganizations. So it's not very active. It has value for a fe... [14:22:53] godog: yeah I often restart it [14:23:03] godog: at least once a week. Presumably due to some memory leak [14:23:31] godog: and somehow it sometimes does not recognize new defines/classes that are introduced by cherry picks [14:23:42] ah?
the master on deployment-puppetmaster says it was started on may 31st [14:23:46] anyways [14:23:47] oh [14:23:51] so not that often :) [14:23:59] or maybe that is the integration-puppetmaster I reboot more often [14:24:08] !log bounce puppetmaster on deployment-puppetmaster [14:24:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [14:24:52] I deluded myself into thinking that was the problem, but yeah I've seen some caching behaviour too [14:28:08] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:29:50] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:44:51] 10Continuous-Integration-Config, 10Analytics-Wikimetrics: tox runs all tests (including manual ones) - https://phabricator.wikimedia.org/T71183#2492094 (10hashar) p:05Triage>03Low Gave a try again and tox/nosetests fail on my local machine: ``` IOError: [Errno 2] No such file or directory: '/srv/wikimetric... [14:52:46] (03PS1) 10Hashar: debian-glue: pass distribution to piuparts [integration/config] - 10https://gerrit.wikimedia.org/r/300877 [14:56:21] (03CR) 10Hashar: [C: 032] debian-glue: pass distribution to piuparts [integration/config] - 10https://gerrit.wikimedia.org/r/300877 (owner: 10Hashar) [14:57:04] (03Merged) 10jenkins-bot: debian-glue: pass distribution to piuparts [integration/config] - 10https://gerrit.wikimedia.org/r/300877 (owner: 10Hashar) [14:59:42] hashar :) [14:59:49] brb setting up wifi extender. [14:59:51] RECOVERY - Puppet run on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [15:04:21] RECOVERY - Puppet run on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0] [15:06:46] I give up with wifi extender [15:07:02] apple extender dosent work correctly when i try and bridge in windows. 
[15:08:06] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [15:09:12] (03PS19) 10Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - 10https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613) [15:20:12] ostriches with gerrit 2.12 does that mean you can run gc to reduce repo sizes? [15:20:53] (03CR) 10Paladox: "check experimental" [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/299869 (owner: 10Hashar) [15:23:49] paladox: We could already, and we used to have a cron that did it weekly. Nice thing now is I can schedule it internally so no need to run the cron [15:24:00] (which I already configured. it runs saturdays I think) [15:24:02] Oh [15:24:03] :) [15:24:25] Plus zuul is running perfectly with gerrit [15:24:26] :) [15:24:52] ostriches hasharAway got debs being built through jenkins today :) zuul and mediawiki [15:25:03] making it easy for us to do the same with gerrit. [15:25:40] 10Continuous-Integration-Config, 10Analytics-Wikimetrics: tox runs all tests (including manual ones) - https://phabricator.wikimedia.org/T71183#2492178 (10Nuria) @harshar: tests cannot be run from depo alone as they require a wikimetrics instance running, there are a few unit tests but mostly they are integrat... [15:26:57] ostriches also the bug you hit this morning with reindexing, was it because of changes? because there is a fix in gerrit 2.12.3 to do with reindexing [15:27:05] but not sure if it fixes the problem you hit [15:27:28] https://gerrit-documentation.storage.googleapis.com/ReleaseNotes/ReleaseNotes-2.11.9.html [15:27:34] Fix error reindexing changes when a change no longer exists. [15:28:02] Nah. It's actually pretty predictable what happens and I don't think there's really a fix that can be done.
[15:28:02] Which is part of the gerrit 2.12.3 change but they don't repeat the same changes in the release notes if they are shared across releases [15:28:08] oh [15:28:37] Basically I tried to force reindex while the online reindexer was going and they were competing for lock files [15:28:42] It...kinda makes sense [15:28:47] Oh [15:28:48] ah [15:36:04] ostriches: Are there any known issues with ACLs into the new gerrit? I'm trying to work out what's going on, but I might just be forgetting what the config was beforehand. [15:36:39] Known issues? What's going on? [15:37:02] ostriches: https://gerrit.wikimedia.org/r/#/admin/projects/VisualEditor/VisualEditor,access [15:37:24] ostriches: VE core has V+2 rights for the VisualEditor group but not (?) for jenkinsbot or i18n-bot. [15:37:45] ostriches: Whereas I'm pretty sure we locked the repo down so that only the bots could merge. Unless we unset it. [15:38:06] ...not to my knowledge? [15:38:21] JenkinsBot looks like it has access on the parent repo... [15:38:25] Should inherit. [15:38:32] Hmm. Also there are no members listed in the VE group. [15:38:42] Oh, wait, no, they're there now. [15:38:52] Odd. [15:39:41] OK, so VE-MW has the set-up I was expecting; maybe I was just wrong: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/VisualEditor,access [15:39:43] Never mind! [15:39:55] James_F https://phabricator.wikimedia.org/diffusion/GVED/browse/master/;53a0e99c83e7833e6cfc5da5d2efb654b85b8bc4 [15:40:23] Hmm. [15:41:45] ostriches gerrit 2.12 now makes it easy for us to view refs/meta/config changes in phabricator [15:42:15] workaround is to create a change through web editor and edit it in refs/meta/config and then you get a diffusion link for parent project [15:42:30] it's better than before when it didn't do that. [15:42:43] :) [15:53:45] ostriches, i think we can do https://phabricator.wikimedia.org/T103990 now.
Since gerrit 2.12 include jgit 4.1 [15:53:49] https://bugs.chromium.org/p/gerrit/issues/detail?id=175 [16:10:19] (03PS2) 10Jforrester: GeoCrumbs no longer depends on CustomData in master [integration/config] - 10https://gerrit.wikimedia.org/r/300068 [16:11:29] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 07Epic: Migrate base image to Debian Jessie (epic) - https://phabricator.wikimedia.org/T136429#2492409 (10dduvall) [16:12:31] 10Continuous-Integration-Infrastructure: zuul-cloner fails mediawiki-extensions-hhvm job with "error: object file .git/objects/30 is empty" - https://phabricator.wikimedia.org/T141269#2492417 (10Krinkle) [16:12:34] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 07Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2334744 (10dduvall) p:05Triage>03Normal [16:15:25] 10Continuous-Integration-Infrastructure, 05Continuous-Integration-Scaling, 06Release-Engineering-Team: Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI - https://phabricator.wikimedia.org/T139771#2492434 (10chasemp) @hashar where on that is th... [16:20:30] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492480 (10Luke081515) [16:20:35] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492492 (10Luke081515) p:05Triage>03Unbreak! [16:20:46] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492495 (10Luke081515) [16:20:48] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T139214#2492494 (10Luke081515) [16:21:00] start the drama... 
[16:21:12] greg-g: ^ FYI, for deployment of wmf.12 [16:21:31] Need the server side error [16:22:06] The code just throws a 503. [16:22:18] Reedy: for example take a look at: http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Selenium_language_test_page&curid=19717&diff=343626&oldid=343621 [16:22:29] Look at the error log? [16:23:15] where can I find it? [16:24:09] check the hhvm/apache log on the machine serving it? [16:24:18] Or fatal.log on fluorine [16:24:35] I didn't do that before [16:24:39] maybe you can? :) [16:25:49] PROBLEM - Puppet run on deployment-kafka04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [16:27:49] can't login to deployment-fluorine... [16:28:06] no access or a problem at that instance? [16:28:54] ssh isn't working from tools-login [16:30:04] I feel this is dejavu and I should be using another bastion [16:30:06] Reedy: tools-login? [16:30:19] 'beta' and 'tools' are separate labs projects [16:30:37] beta goes through the main labs bastion [16:30:37] Yup [16:30:40] it doesn't have its own [16:30:49] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:30:52] reedy@tools-bastion-03:~$ ssh -A deployment-tin [16:30:52] Linux deployment-tin 3.13.0-91-generic #138-Ubuntu SMP Fri Jun 24 17:00:34 UTC 2016 x86_64 [16:33:42] are you using agent forwarding Reedy? [16:33:50] Nope [16:33:55] I'm just using -A for the hell of it [16:34:34] anyway deployment-fluorine works for me [16:34:52] It does for me, from another source [16:34:56] Luke081515: I have you a stack trace [16:35:24] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492480 (10Reedy) ``` 2016-07-25 16:34:28 [V5Y-lApEEH8AAFjzP48AAAAJ] deployment-mediawiki02 enwiki 1.28.0-alpha fatal ERROR: [15b5d9cc] PHP Fatal... 
[16:35:48] RECOVERY - Puppet run on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0] [16:35:58] 2016-07-25 16:34:28 [V5Y-lApEEH8AAFjzP48AAAAJ] deployment-mediawiki02 enwiki 1.28.0-alpha fatal ERROR: [15b5d9cc] PHP Fatal Error: Couldn't find constant DifferenceEngine::MW_DIFF_VERSION {"exception_id":"15b5d9cc"} [16:36:07] shouldn't be MW_DIFF_VERSION [16:36:08] * Reedy fixes [16:37:29] https://gerrit.wikimedia.org/r/300899 [16:38:02] bad ostriches [16:38:33] tsts :D [16:38:53] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492563 (10Reedy) Caused by https://gerrit.wikimedia.org/r/#/c/300576 [16:39:02] Reedy: Blame Aaron, he reviewed my shit code :p [16:39:06] lolol [16:39:32] He should know better than to let me loose on MediaWiki :p [16:39:40] LOL [16:39:41] xD [16:39:52] Another issue that static analysis would've picked up on ;) [16:42:30] Luke081515: anyway, nice catch [16:42:49] Reedy: I was not the first user, who saw it ;) (see the task desc) [16:42:56] meh [16:43:02] claim credit when it's offered :P [16:43:20] but anyway, otherwise I guess we would have some big problems tomorow ;) [16:43:52] ostriches just wanted a new t-shirt [16:44:13] :D [16:44:21] "I broke Wikipedia"? [16:44:28] Been there done that [16:44:36] Got the pile of t-shirts. [16:44:41] I broke wikipedia before it was cool [16:44:45] ostriches i belive this is resolved [16:44:46] https://phabricator.wikimedia.org/T85002 [16:45:09] I did some testing and searching by for example T1 will result in the same as bug:. [16:45:10] T1: Get puppet runs into logstash - https://phabricator.wikimedia.org/T1 [16:45:26] ha ^^ lol the first ever task [16:46:31] Real first task: https://phabricator.wikimedia.org/T2001 [16:46:36] Always the first. [16:46:40] And Most Important! 
[16:46:56] Oh [16:47:01] Is that really the first task [16:47:18] It was bug 1 in BZ [16:47:22] Oh [16:47:35] ostriches: I think we had bugs at sourceforge before? :O [16:47:41] Those all got migrated. [16:47:44] But it seems with this approche it will grab all references for T1. [16:47:45] T1: Get puppet runs into logstash - https://phabricator.wikimedia.org/T1 [16:47:45] And lost their bug numbers. [16:47:47] :) [16:47:50] oh [16:47:55] https://phabricator.wikimedia.org/T25223 - also one of the most important. cc Reedy :p [16:48:09] Arguably one of the best bugs ever, tbh. [16:48:24] I thought there is wikimedia in london too [16:48:24] ? [16:48:41] ostriches: beeeeeer [16:49:08] There was much beer that week. [16:49:55] Double clicking that link opens microsoft edge ha. [16:51:52] * paladox_ really needs the bt smart hub, the netgear wifi extender is rubish when lots of people are on it [16:53:08] Luckly i get a filter in the bt smart hub to prevent interferenced. [16:54:40] ostriches i belive comcast share peoples internet, bt do the same but are totally seperate from the network the home user is on and is opt in. [16:54:50] :) [16:55:43] It's just a VLAN [16:55:47] Still eats your internet connection [16:55:54] VLAN + isloated SSID [16:56:13] Actually dosent with me [16:56:20] as i am on unlimited [16:56:45] Plus i think i go through like 1tb a month, i doint use that much but the other family members do like there gaming [16:57:19] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T139214#2492703 (10Luke081515) [16:57:22] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible, 15User-Luke081515: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492700 (10Luke081515) 05Open>03Resolved a:03Luke081515 Checked, works. 
[16:57:26] The xbox causes a huge slowdown on the extender; it causes interference, making it slow for everyone. Even connecting to the bt home hub 5 is slow due to interference [16:57:32] Reedy ^^ [16:57:35] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible, 15User-Luke081515: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492706 (10Luke081515) a:05Luke081515>03Reedy [16:57:51] 06Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 07Beta-Cluster-reproducible: Viewing diffs throws 503@beta - https://phabricator.wikimedia.org/T141272#2492480 (10Luke081515) [17:01:03] 06Release-Engineering-Team (Long-Lived-Branches), 10ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2492749 (10mmodell) [17:04:47] 06Release-Engineering-Team (Long-Lived-Branches), 10ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2492749 (10mmodell) [17:18:05] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T139212#2492835 (10demon) 05Open>03Resolved [17:20:56] weird thing with the gerrit changeover, i'm not getting valid json responses from the /r/changes api anymore, there is some junk prefixed to the response: https://gerrit.wikimedia.org/r/changes/?q=project:mediawiki/extensions/CirrusSearch [17:21:03] filed a ticket, but perhaps someone knows what's going on? [17:37:37] ebernhardson: Replied on the ticket. [17:37:45] That's by design. For security. [17:37:52] ostriches: intentional. interesting [17:38:22] realized the problem i had was unrelated to that anyways, it was just the first thing that jumped out when debugging. [17:38:24] Something about not wanting to inject a json payload somewhere without the developer being aware?
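Note on the /r/changes exchange above: the "junk" in front of the response is the line `)]}'`, which Gerrit's REST API puts ahead of JSON bodies as a guard against cross-site script inclusion. A client is expected to drop it before parsing. A minimal sketch in Python, assuming the response body has already been fetched as text; the helper name `parse_gerrit_json` and the sample payload are made up for illustration and are not part of any Gerrit client library:

```python
import json

# Gerrit puts this "magic prefix" line in front of JSON bodies (XSSI guard).
GERRIT_MAGIC_PREFIX = ")]}'"


def parse_gerrit_json(body: str):
    """Parse a Gerrit REST response body, dropping the prefix if present.

    Hypothetical helper for illustration only.
    """
    if body.startswith(GERRIT_MAGIC_PREFIX):
        # Discard only the prefix; the JSON that follows is left untouched.
        body = body[len(GERRIT_MAGIC_PREFIX):]
    return json.loads(body)


# Made-up sample shaped like a /r/changes/ response.
sample = ")]}'\n[{\"project\": \"mediawiki/extensions/CirrusSearch\", \"_number\": 1}]"
changes = parse_gerrit_json(sample)
print(changes[0]["project"])
```

Bodies without the prefix are parsed unchanged, so the same helper should work whether or not the prefix is present.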
[18:06:23] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2111479 (10AlexMonk-WMF) It's difficult to do anything with the instances now gone... I suggest this be closed [18:17:11] Yippee, build fixed! [18:17:11] Project mediawiki-core-code-coverage build #2159: 09FIXED in 3 hr 17 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/2159/ [18:19:52] 06Release-Engineering-Team, 15User-greg: Institute quarterly(?) review of incident reports and follow-up - https://phabricator.wikimedia.org/T141287#2493130 (10greg) [18:20:24] PROBLEM - Host deployment-upload is DOWN: CRITICAL - Host Unreachable (10.68.16.189) [18:21:18] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493162 (10Reedy) [18:29:34] 10Continuous-Integration-Infrastructure: debian-glue fails to find generated binaries packages (.deb) - https://phabricator.wikimedia.org/T141246#2493191 (10hashar) pbuilderrc change has been reviewed by @fgiunchedi / @akosiaris thank you! [18:31:25] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493211 (10greg) >>! In T140207#2490321, @Tgr wrote: >>>! In T140207#2487988, @greg wrote... 
[18:31:51] 06Release-Engineering-Team, 15User-greg: Identify "first responders" for "all" "components" deployed on Wikimedia servers - https://phabricator.wikimedia.org/T141066#2493213 (10greg) [18:32:59] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493217 (10greg) p:05Triage>03High [18:34:07] greg-g: Your epiphany sounds good to me. :-) [18:35:14] James_F: it was awesome [18:35:28] I literally bounced in my chair [18:35:32] * James_F grins. [18:35:43] ACTION: GG to secure self to chair. [18:36:18] I really just want to buy the $190 Ikea adjustable standing desk for home and throw away this old horrible chair, but yes :) [18:36:54] James_F: can you +1 on task for historical reasons? :) [18:37:04] Meh. [18:37:05] Sure. [18:37:09] I like validation [18:37:13] "in writing" [18:37:21] it looks good in my promotion packet :P [18:37:52] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493223 (10Jdforrester-WMF) >>! In T140207#2493211, @greg wrote: > `#Wikimedia-Incident`... [18:38:01] * James_F laughs. [18:38:37] :) :) [18:39:31] ok, looking at the UBN!s now: https://phabricator.wikimedia.org/maniphest/query/RL5AY7Erj31i/ [18:39:46] this seems legit: https://phabricator.wikimedia.org/T138725 [18:39:49] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493231 (10JAufrecht) Is there consensus that UBN! has a common, global meaning, rather t... 
[18:41:41] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493234 (10greg) >>! In T140207#2493231, @JAufrecht wrote: > Is there consensus that UBN!... [18:43:50] 06Release-Engineering-Team, 06ArchCom, 06Developer-Relations, 10Phabricator: Consider alternative processes for Unbreak Now bugs, especially those which cross-cut components - https://phabricator.wikimedia.org/T140207#2493259 (10greg) (In other words: People/teams/projects are free to use UBN! as they see... [18:50:45] back to UBN! reviewing... [18:53:32] that one I said looks legit I just commented on asking for status etc [18:54:38] the rest in the query are either known/working on (https://phabricator.wikimedia.org/T138673) or Fundraising (I'm not going to mess with them), iOS (also not going to mess with) or pywikibot (also not going to mess with) [18:54:54] * greg-g ends review [18:55:34] * paladox uses iOS. [18:55:55] many people do, but their backlog is more actively managed than most and it is being addressed [18:56:04] iow: I don't need to worry about it [18:57:55] 06Release-Engineering-Team, 06Operations, 15User-greg: Institute a weekly review of all UBN! tasks - https://phabricator.wikimedia.org/T141130#2493314 (10greg) I didn't do this at the time I said I would but for today: ```lang=irc 18:39 <+ greg-g> ok, looking at the UBN!s now: https://phabricator.wikimed... [19:00:50] 06Release-Engineering-Team, 06Operations, 15User-greg: Institute a weekly review of all UBN! tasks - https://phabricator.wikimedia.org/T141130#2493327 (10greg) [19:04:48] Oh FFS [19:04:56] Why does Gerrit hijack Ctrl+C?! 
[19:05:03] I want to copy commit titles, not add reviewers [19:05:24] Oh it's just because I pressed c [19:05:34] * RoanKattouw heads to the upstream bug tracker to file some bugs [19:07:30] 11:25 < paladox> mutante ostriches it seems gerrit 2.12.2 breaks keybored shortkeys for non us keybords on the sidebyside diff [19:07:33] 11:25 < paladox> https://gerrit-documentation.storage.googleapis.com/ReleaseNotes/ReleaseNotes-2.11.8.html [19:07:36] 11:25 < paladox> gerrit 2.12.3 fixes it [19:07:38] RoanKattouw: ^ you are not the first [19:07:47] RoanKattouw: you can confirm if it doesnt on gerrit-test.wmflabs [19:07:48] https://bugs.chromium.org/p/gerrit/issues/detail?id=4072 [19:08:01] Oh interestin [19:08:02] on 2.12.3 [19:08:22] i noticed that too with Ctrl + R for me [19:08:25] I think that sounds like the reverse of my issue [19:08:41] As in, the fix for that issue could cause my issue [19:08:46] Will try gerrit-test [19:09:05] Yup you're right it's fixed there [19:09:43] https://bugs.chromium.org/p/gerrit/issues/detail?id=1207 [19:09:48] minor upgrade to come soon [19:11:24] Yeah, letting the dust settle :) [19:11:29] But asap. [19:28:50] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493162 (10hashar) Is the issue that nothing validate the schema of extension.json? If such a thing, it should be run via the MediaWiki core `structure` testsuite which runs... [19:29:35] Reedy: is there something to validate extension.json / skin.json files ? [19:29:45] Not sure really [19:29:51] We check its valid json [19:30:00] Lego has written schema files [19:30:14] I thought we did validate the schema? [19:30:32] It might not work fully for v2 [19:30:46] As it let me submit something that didn't validate to v2 [19:30:58] Ah. [19:31:04] given TorBlock managed to sneak in an invalid schema... [19:31:09] I2fd1caaa50c288821ab6847dc29d60e6554d9df5 [19:31:10] Yeah. 
[19:31:11] Maybe the file used by CI just needs a bump [19:31:21] File name that is [19:31:26] Wait [19:31:38] No, no version number file name is newest [19:31:47] CI is really lame [19:31:48] Older are copied to version numbers [19:31:58] * James_F nods. [19:32:04] There's something askew anyway [19:32:10] beside cloning/setting up the patches, it just run something like cd tests/phpunit ; php phpunit.php --testsuite extensions [19:32:11] afaik [19:32:15] So it's more "something doesn't work in the existing validation" than "there is none". [19:32:22] Should be an easy fix [19:32:30] I think. But I'm not sure [19:32:31] Or a really hard to spot subtle bug. [19:32:47] I guess I made a test case :p [19:32:53] Hmm [19:33:10] Does Jenkins automatically use extension.json if it exists? [19:33:19] And hence, know to validate if [19:33:24] 06Release-Engineering-Team, 07Documentation, 15User-greg: Document tech leads for RelEng projects - https://phabricator.wikimedia.org/T139539#2493517 (10greg) [19:33:26] 06Release-Engineering-Team, 15User-greg: Identify RelEng projects 'worthy' of a tech lead - https://phabricator.wikimedia.org/T139540#2493514 (10greg) 05Open>03Resolved a:03greg We did, they're the quarterly goals. For anything else, see the skill matrix. [19:44:19] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493554 (10Reedy) [20:29:35] Reedy: is there something to validate extension.json / skin.json files ? [20:29:45] Not sure really [20:29:50] We ch... [19:52:01] (03PS4) 10Awight: Quit testing DonationInterface against REL1_25 [integration/config] - 10https://gerrit.wikimedia.org/r/299940 (owner: 10Ejegg) [19:52:19] Reedy: let me check [19:52:40] thanks! 
[19:52:42] that is integration/jenkins.git there is some mediawiki.d which has a bunch of php snippets injected at bottom of LocalSettings.php [19:52:47] one of them has the logic to load extensions [19:53:10] mediawiki/conf.d/50_mw_ext_loader.php [19:53:42] Reedy: https://github.com/wikimedia/integration-jenkins/blob/master/mediawiki/conf.d/50_mw_ext_loader.php [19:54:06] where the text file extensions_load.txt is crafted by Zuul/Jenkins and has an explicit list of extensions to load [19:54:15] that is because the jenkins job do not clear their workspace [19:54:21] so yeah https://github.com/wikimedia/integration-jenkins/blob/master/mediawiki/conf.d/50_mw_ext_loader.php#L62 [19:54:22] and you can have random extensions present in there [19:54:27] extension.json takes precedence [19:54:36] yeah so it loads it [19:54:55] not sure why MediaWiki does not bail out [19:56:25] Need to work out why https://gerrit.wikimedia.org/r/#/c/300895/ apparently breaks lots of the unit tests... Even though the config doesn't change :( [19:58:29] Reedy: first validate that tests passed fine previously? I usually go to the lastest merged change (usually a l10nbot one) and hit recheck [19:58:39] They did [19:58:41] just to confirm the repos at the tip of the branch works fine [19:58:44] Locally and on jenkins [19:58:50] We've had a few trivial, unrelated patches go on fine [19:59:11] :( [19:59:30] I've got a feeling it's not breaking anything, it's just exposing something that's already broken/suppressed somehow [19:59:44] https://integration.wikimedia.org/ci/job/mwext-testextension-php55/17182/artifact/log/mw-debug-cli.log/*view*/ [19:59:49] that is the debug log [19:59:59] there is a bunch of Tiff metadata is invalid, missing or has errors. 
[20:00:05] not sure if that is related though [20:00:08] 06Release-Engineering-Team, 07Documentation, 15User-greg: Document tech leads for RelEng projects - https://phabricator.wikimedia.org/T139539#2493605 (10greg) 05Open>03Resolved a:03greg yup [20:00:18] I suspect some of them are expected [20:00:41] if you compare the debug output with the one of a passing build that can confirm/infirm it [20:03:28] Hmm. No sign of them in https://integration.wikimedia.org/ci/job/mwext-testextension-php55/17183/artifact/log/mw-debug-cli.log/*view*/ [20:05:41] I wonder if it's those messages [20:05:41] Reedy: that is usually at that point I download both console [20:05:44] then do a word diff [20:05:50] wdiff good.txt bad.txt | colordiff [20:11:44] Oh, got it [20:11:47] 06Release-Engineering-Team, 06Operations, 15User-greg: Institute quarterly(?) review of incident reports and follow-up - https://phabricator.wikimedia.org/T141287#2493671 (10faidon) [20:12:22] $wgMediaHandlers['image/tiff'] [20:12:28] string(16) "PagedTiffHandler" [20:12:30] old ^ [20:12:38] string(11) "TiffHandler" [20:12:39] new [20:12:46] something has gone weird there [20:14:07] "MediaHandlers": { [20:14:07] "image/tiff": "PagedTiffHandler" [20:14:07] }, [20:14:13] Why is it dropping Paged? [20:14:42] Why isn't it in config? [20:16:51] Doesn't need to be it seems [20:17:58] 06Release-Engineering-Team, 06Operations, 15User-greg: Institute quarterly(?) review of incident reports and follow-up - https://phabricator.wikimedia.org/T141287#2493130 (10faidon) I'd like us (#operations) to be involved in those discussions. We are de facto and de jure the primary incident responders as w... [20:20:05] There's some bug here [20:20:14] and not with the extension [20:23:04] 06Release-Engineering-Team, 06Operations, 15User-greg: Institute quarterly(?) 
review of incident reports and follow-up - https://phabricator.wikimedia.org/T141287#2493729 (10greg) +1 :) I think this first meeting I'm having with TPG is just "is someone willing to help brainstorm" :) so, yeah, I'll loop you i... [20:32:50] why is the word paged being eaten!? [20:47:10] Oh, it needs to override [21:14:17] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493873 (10Legoktm) Eh wtf. Is ExtensionJsonValidationTest not running? It's a structure test which was written for this exact purpose. [21:14:30] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493874 (10Legoktm) p:05Triage>03Unbreak! [21:14:59] hashar: ^ lol [21:15:32] oh no [21:16:10] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493879 (10Legoktm) It ran the test... https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm/18770/testReport/(root)/ExtensionJsonValidationTest__testPassesValida... [21:16:26] Reedy: Phabricator knows 'irc' as a markdown language :D so https://phabricator.wikimedia.org/T141289#2493554 could have: lang=irc [21:16:35] but that make it colorful :D [21:17:30] 10Continuous-Integration-Config, 10MediaWiki-Configuration: extension.json schema validation - https://phabricator.wikimedia.org/T141289#2493880 (10Reedy) >15:36:58 Warning: in_array() expects parameter 2 to be an array or collection in /mnt/jenkins-workspace/workspace/mwext-testextension-hhvm/src/extensions/T... 
[21:18:39] Reedy: maybe PHPUnit is able to convert warnings/noticed etc to exceptions
[21:18:43] that might be good thing on ci
[21:19:13] convertErrorsToExceptions="true"
[21:19:13] convertNoticesToExceptions="true"
[21:19:13] convertWarningsToExceptions="true"
[21:19:14] oh
[21:19:36] which we have
[21:20:30] heh
[21:20:46] I guess the PHPUnit one is overriden by MediaWiki
[21:22:12] anyway sleep()
[21:27:05] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2493921 (hashar) Open>Resolved a:hashar Agreed. I guess that was a transient issue related to DHCP/DNS. I havent encountered that...
[21:27:12] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: Cannot SSH to a few CI slaves due to DNS failure - https://phabricator.wikimedia.org/T129640#2493924 (hashar) a:hashar>None
[21:27:31] I thought he went to sleep? :P
[21:30:39] PROBLEM - Puppet staleness on integration-puppetmaster is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [43200.0]
[21:30:41] yes
[21:30:49] Reedy * hashar has quit (Remote host closed the connection)
[21:38:34] I have some remedial questions about CI on an extension: How are composer libraries updated during testing? I'm seeing phpunit failures due to the job not loading dependencies from my extension's composer.json as it was changed in a recent patch. For example, https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm-non-voting/1187/console
[21:39:03] Reading https://github.com/wikimedia/integration-jenkins/blob/master/bin/mw-fetch-composer-dev.sh hasn't enlightened me yet...
[21:40:00] * awight plays with xargs composer locally...
[21:41:13] awight: hashar might be the person to ask, but he's gone for hte night
[21:41:31] awight: but, we don't run composer IIRC on jenkins
[21:41:32] Reedy: Or one can hope!
[21:41:44] So, adding it to the mediawiki-vendor repo is the usual workaround
[21:41:47] he's always awake
[21:42:10] hrm, thx that helps narrow down the cause at least!
[21:42:23] add the vendor change, Depends-On: IWHATEVER
[21:42:47] Right, I just realized that the mw-fetch-composer-dev script only does anything in MW_INSTALL_PATH
[21:48:20] Continuous-Integration-Config, MediaWiki-Unit-tests, Regression: jenkins no longer outputs list and reasons of why tests are being skipped - https://phabricator.wikimedia.org/T141308#2493962 (Legoktm)
[21:50:41] Continuous-Integration-Infrastructure, Continuous-Integration-Scaling, Release-Engineering-Team: Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI - https://phabricator.wikimedia.org/T139771#2493977 (hashar) Inspired by the graph above,...
[21:50:59] Continuous-Integration-Infrastructure, Continuous-Integration-Scaling, Release-Engineering-Team: Identify metric (or metrics) that gives a useful indication of user-perceived (Wikimedia developer) service of CI - https://phabricator.wikimedia.org/T139771#2493979 (hashar) Another thing we are missing...
[21:55:19] Continuous-Integration-Config, Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Fundraising Sprint Nitpicking, Unplanned-Sprint-Work: Continuous integration mw-ext composer behavior is not predictable - https://phabricator.wikimedia.org/T141309#2493985 (awight)
[21:55:23] (CR) 20after4: [C: 1] "I'd really like to refactor the way debs are built and get this stuff working on nodepool slaves." [integration/config] - https://gerrit.wikimedia.org/r/300790 (owner: Paladox)
[21:56:36] OIC! Thanks for pointing me to Depends-On:, that changes everything :D
[21:57:01] (CR) Paladox: "It seems hashar fixed the problem. but some distro seem to not work, ie Jessie, but precise-Wikimedia does. Tested on zuul" [integration/config] - https://gerrit.wikimedia.org/r/300790 (owner: Paladox)
[21:57:06] Continuous-Integration-Config, MediaWiki-Unit-tests, Regression: jenkins no longer outputs list and reasons of why tests are being skipped - https://phabricator.wikimedia.org/T141308#2493962 (hashar) The skips are no more reported to the console log since https://gerrit.wikimedia.org/r/#/c/289629/ A...
[22:00:07] looks like gerrit upgrade broke the changeid links in phabricator... (by changing the url format)
[22:00:20] twentyafterfour oh where
[22:00:28] I think i may know a fix
[22:00:37] But im not sure what you mean
[22:01:56] ?
[22:01:58] Continuous-Integration-Config, MediaWiki-Unit-tests, Regression: jenkins no longer outputs list and reasons of why tests are being skipped - https://phabricator.wikimedia.org/T141308#2494038 (Legoktm) Apparently this was intentional? {c15ba5fc20f458de2967cdad8fdfe8ba5adda341} I commented on the patc...
[22:02:42] twentyafterfour ^^
[22:06:33] twentyafterfour yep i know a fi
[22:06:35] fix
[22:07:41] I will upload patch now
[22:11:11] twentyafterfour please could you review https://phabricator.wikimedia.org/D296
[22:11:16] which includes the fix for changid
[22:17:08] twentyafterfour :) :)
[22:22:15] Continuous-Integration-Config, Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Fundraising Sprint Nitpicking, Unplanned-Sprint-Work: Continuous integration mw-ext composer behavior is not predictable - https://phabricator.wikimedia.org/T141309#2494084 (awight) Reedy suggested I t...
[22:30:36] "Depends-On" may have killed zuul. iono
[22:32:37] Circular dependancy? :P
[22:33:46] Reedy is that in phabricator or gerrit?
[22:36:49] Reedy I belive that is fixed in zuul now.
[22:36:57] Supposed to be
[22:37:03] Wouldn't be the first time
[22:37:16] Reedy: It's possible--I only used the Depends-On header in one place, but who knows what implicit assumptions gerrit makes...
[22:37:34] The usual problem was making 2 commits depend on each other
[22:37:52] Reedy that should be fixed
[22:37:52] paladox: thanks
[22:38:01] awight: uh, that's not going to work
[22:38:02] your welcome twentyafterfour
[22:38:13] legoktm it should, since it is fixed in zuul
[22:38:28] paladox: welcome to software
[22:38:31] regressions do happen
[22:38:37] What is jenkins doing? apart from not merging things
[22:38:42] Oh yep, but it was broken before
[22:38:51] and fixed in the latest zuul update
[22:38:54] no, I mean the way awight edited his commit message
[22:39:09] Reedy see https://phabricator.wikimedia.org/T129938
[22:39:11] Oh
[22:40:58] twentyafterfour i hope it fixes the problem, i didnt remove the '' part just in case that could havr caused it to fail.
[22:41:00] legoktm: I was kinda surprised too--you mean I should have put the Depends-On with the other Bug and Change-Id headers at the bottom?
[22:41:10] Yeh
[22:41:12] But that part seems to have worked, Zuul sees the dependency...
[22:41:57] ok thx for the fix
[22:41:59] https://integration.wikimedia.org/zuul/
[22:42:03] graphs don't look good
[22:42:45] The tests froze
[22:42:49] sigh
[22:42:50] legoktm ostriches ^^
[22:42:58] yes I'm looking
[22:43:07] I know why
[22:43:11] Oh
[22:43:16] is it nodepool again?
[22:43:18] I'm happy to get my depends-on experiment outta there if this is the problem?
[22:43:56] yes that was the problem
[22:44:03] the vendor repo is not controlled by zuul
[22:44:11] aha. well damn
[22:44:12] but now I think zuul is just stuck in a loop
[22:44:35] sorry for the maintenance!
[22:45:11] I'm gonna restart zuul
[22:45:14] reloading it didn't work
[22:45:53] !log restarting zuul due to depends-on lockup
[22:45:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[22:48:04] I re+2'd everything
[22:49:14] Did you restart or stop & start?
[22:49:17] Restarting keeps the queue
[22:49:55] Oh, i think the upstream bug needs reopening
[22:50:02] plus the phab task
[22:51:00] legoktm i thought that https://phabricator.wikimedia.org/rCIZU92464a2291a032b9253fd02fbdd1f61569a4522c would fix problems.
[22:51:13] Could it be the zuul-merger needing updating too to support that?
[22:52:04] ostriches: restart wouldn't have worked, because that waits for all the jobs to finish and that wasn't going to happen
[22:52:08] so I did stop/start
[22:52:30] paladox: it's not a dependency cycle. it's when the depends-on project isn't controlled by zuul, zuul freaks out because it doesn't know it even exists
[22:52:30] Continuous-Integration-Infrastructure, Zuul, Upstream: Circular dependencies break Zuul - https://phabricator.wikimedia.org/T129938#2494211 (Paladox) Resolved>Open This seems to still break. We found zuul stop working because of this. !log restarted zuul due to depends-on lockup
[22:52:43] legoktm oh
[22:53:02] Continuous-Integration-Infrastructure, Zuul, Upstream: Circular dependencies break Zuul - https://phabricator.wikimedia.org/T129938#2494214 (Paladox) Open>Resolved
[22:53:33] Continuous-Integration-Infrastructure: zuul-cloner fails mediawiki-extensions-hhvm job with "error: object file .git/objects/30 is empty" - https://phabricator.wikimedia.org/T141269#2494217 (Krinkle) And again: legoktm could you open a task for that please?
[22:54:05] I'm not sure if a bug was ever filed for it, but we've seen it before
[22:54:07] yeah
[22:54:12] so that we can get upstream to try and fix it please.
[22:54:13] thanks
[22:54:14] :)
[22:54:17] Continuous-Integration-Config, Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Fundraising Sprint Nitpicking, Unplanned-Sprint-Work: Continuous integration mw-ext composer behavior is not predictable - https://phabricator.wikimedia.org/T141309#2494218 (awight) *Don't* try what I...
[22:56:14] legoktm: only if you happen to know--I'm trying to understand how CI requires libs from mw-ext-NNN/composer.json while testing an extension patchset.
[22:56:31] it doesn't
[22:56:43] you have to use a -composer variant job for that
[22:56:49] a few of the Wikibase extensions have it set up
[22:56:51] oho
[22:56:57] or you put your dependencies into mediawiki/vendor
[22:57:02] which is preferred for deployment purposes
[22:57:11] because that way you can't merge when they're mismatched
[22:57:22] okay ty.
[22:58:06] the mw/vendor thing would be tricky for us, cos this is DonationInterface and the main cluster doesn't depend on that extension's runtime
[22:58:16] But I can look into a -composer job.
[22:58:30] twentyafterfour, ive updated the diff now with your feedback :)
[22:58:53] We've been living on borrowed time, I guess, cos the ext/vendor libraries are often populated correctly. That's the part that really baffles me.
[23:01:37] Actually, there's a (cd mw-ext-NNN; composer install) in the jobs we're using, but perhaps it's misbehaving
[23:04:26] Anyone feel like helping me deploy https://gerrit.wikimedia.org/r/#/c/299940/ ?
[23:04:44] That would mostly unblock us...
[23:12:26] Continuous-Integration-Infrastructure, Zuul, Upstream: Zuul-cloner failing to acquire .git lock sometimes - https://phabricator.wikimedia.org/T86730#2494268 (Krinkle) >>! In T86730#2466354, @Paladox wrote: > I belive this is fixed in https://phabricator.wikimedia.org/rCIZU0a6a0c422cdffe668bec2a9420d9...
[23:15:46] awight: I can help
[23:16:48] (CR) 20after4: [C: 2] Quit testing DonationInterface against REL1_25 [integration/config] - https://gerrit.wikimedia.org/r/299940 (owner: Ejegg)
[23:17:13] twentyafterfour: thank you. I also wouldn't mind having gallium deployment privs at some point, if that sounds sane...
[23:17:19] twentyafterfour could you land https://phabricator.wikimedia.org/D296 please.
[23:17:20] Although I should probably not wish that upon myself.
[23:17:31] (Merged) jenkins-bot: Quit testing DonationInterface against REL1_25 [integration/config] - https://gerrit.wikimedia.org/r/299940 (owner: Ejegg)
[23:19:05] ostriches: I've found a pretty bad Gerrit bug
[23:19:45] If you use the side-by-side diff view and your window is narrow enough, the last few characters of the line get cut off and cannot be viewed even with the lame super thin horizontal scrollbars
[23:20:15] I can scroll in those windows....
[23:21:11] e.g. view https://gerrit.wikimedia.org/r/#/c/298419/17/includes/DiscussionParser.php@49 in a Chrome window that's 1000px wide
[23:21:19] It cuts off the last ");" for me
[23:21:48] Yeah it cuts off, but I can scroll sideways to see it
[23:22:01] Yes, I can, but the last ); of the line is cut off for me
[23:22:18] I see it....
[23:22:33] https://usercontent.irccloud-cdn.com/file/qMkdggFf/gerrit-cutoff.png
[23:22:43] Maybe just because it's punctuation? Or maybe something to do with line length?
[23:22:48] ostriches i have a css change for https://phabricator.wikimedia.org/T141286
[23:22:52] :)
[23:23:00] paladox: I know. I won't be reviewing it today
[23:23:01] If I drag the window to my external monitor, it fixes itself because the whole line fits without scrolling
[23:23:04] ok
[23:23:31] https://usercontent.irccloud-cdn.com/file/6l5A7Sns/Screen%20Shot%202016-07-25%20at%2016.23.16.png
[23:23:38] RoanKattouw: ^
[23:23:54] * RoanKattouw rants about Gerrit devs feeling so special-snowflakey that they built their own freaking *scrollbars*
[23:24:07] come on, it's a technology that's been built into browsers since hte 90s
[23:24:20] That's probably CodeMirror, the diff/editfrombrowser thingie they use
[23:24:20] Hmm maybe we have different diff view prefs
[23:24:22] But yeah, it's lame
[23:24:32] Possibly!
[23:24:58] https://usercontent.irccloud-cdn.com/file/UCsuSH0h/Diff%20prefs
[23:25:15] Btw, the fact that there's a "Fast" vs "Slow" render mode cracks me up
[23:25:20] Yeah, WTF
[23:25:28] I just changed columns from 200 to 300 but no dice
[23:25:33] "Hmmm, I like my diffs like I love a good roast, nice and slow!"
[23:25:52] https://usercontent.irccloud-cdn.com/file/sOC0jZQ3/gerrit-diff-prefs.png
[23:26:18] could it be https://github.com/codemirror/CodeMirror/issues/3781
[23:26:38] ostriches RoanKattouw ^^
[23:27:56] RoanKattouw do you use firefox?
[23:29:20] RoanKattouw: Anyway, file an issue and drop your screenshots in it. I'm dipping out a bit early since I worked last night.
[23:29:32] Will have a look later
[23:29:42] ostriches it seems to be
[23:29:43] https://github.com/codemirror/CodeMirror/issues/3821
[23:30:35] paladox: I use both FF and Chrome, found this bug in Chrome
[23:30:39] Oh
[23:30:47] Does the above look like your problem
[23:30:48] https://github.com/codemirror/CodeMirror/issues/3821
[23:38:36] RoanKattouw ostriches this seems like it is related https://github.com/codemirror/CodeMirror/commit/352e34ca49b7da33add8b4b85599271323cfba8e
[23:40:22] Holy crap it's even worse in my test case on gerrit-test: http://gerrit-test.wmflabs.org/gerrit/#/c/32/2/tests/fixtures/layout-cloner.yaml
[23:42:08] (PS1) Awight: Use composer in DonationInterface hhvm tests [integration/config] - https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309)
[23:42:52] (CR) jenkins-bot: [V: -1] Use composer in DonationInterface hhvm tests [integration/config] - https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: Awight)
[23:45:53] Yep i can reproduce thay
[23:45:55] thay
[23:45:56] that
[23:46:25] RoanKattouw, would you be able to file a bug at https://bugs.chromium.org/p/gerrit/issues/list
[23:46:26] please
[23:46:29] against 2.12.
[23:46:49] And ask for any fixes to be backported to 2.12 please.
[23:46:57] Yup already doing that
[23:46:59] Against 2.12.3
[23:47:07] Thanks
[23:47:58] * paladox i would have thought that google has millions not even trillions pices of code, i doint know how they missed that
[23:48:00] lol
[23:51:23] (PS2) Awight: Use composer in DonationInterface hhvm tests [integration/config] - https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309)
[23:51:26] RoanKattouw i found afix
[23:51:27] a fix
[23:51:37] paladox: https://bugs.chromium.org/p/gerrit/issues/detail?id=4292
[23:51:39] Oh?
[23:51:47] I belive it was fixed in the css file we have but due to css id changes
[23:51:51] it also requires changes
[23:51:58] Interesting
[23:52:04] I doint belive .commentPanelMessage { works any more
[23:52:35] We set .com-google-gerrit-client-diff-DiffTable_BinderImpl_GenCss_style-difftable .CodeMirror pre span now
[23:52:44] Im going to upload a patch now.
[23:52:46] :)
[23:53:49] * RoanKattouw eagerly awaits the patch so he can try the CSS change in his browser tools
[23:55:21] RoanKattouw https://gerrit.wikimedia.org/r/301027
[23:55:30] I am going to apply it to gerri-test.
[23:55:36] Hmm
[23:55:40] That'll just break the line though :(
[23:56:11] Yeh by putting as it did in gerrit 2.8
[23:56:24] I tested and seems to put the rest of it on a seperate line
[23:56:39] Yeah which is not actually what I want :/
[23:56:53] Also that rule seems to have been targeted at commit messages or comments, not diffs
[23:57:56] See http://gerrit-test.wmflabs.org/gerrit/#/c/32/2/tests/fixtures/layout-cloner.yaml
[23:58:04] oh
[23:58:24] Hmm I guess it doesn't scroll at all now, so that is better
[23:58:39] (I had tried applying the change on the fly, that looked worse)
[23:59:29] Oh
[23:59:39] Ive applied it now, it deffintly fixed it
[23:59:46] at a cost of it's uggleness
[23:59:55] and thanks for filling the bug
[23:59:56] :)