[00:00:58] and, bonus, no fiddling with your local checkout :)
[00:01:29] Wait, what local checkout?! ;-)
[00:06:24] huh, I added a TOTAL_CHANGES thing to this script. wmf.22 -> wmf.23 was 546 changes, this train was 278.
[00:10:10] nice
[00:10:15] I feel safer already
[00:13:20] fwiw: wmf.21 -> wmf.22 was 659
[00:14:18] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T183962#4029855 (greg) Open>Resolved
[00:14:35] wmf.22 had a major blocker, wmf.23 had "issues"
[00:14:38] I see a pattern :P
[00:15:32] * thcipriani pushes up my dumb patch that tracks this
[00:15:38] s/my/his/
[00:15:44] * thcipriani doesn't /me
[00:16:06] no_justification: https://groups.google.com/forum/m/#!topic/repo-discuss/-V68LXajdj4 heh :)
[00:16:14] /me'ing ain't easy
[00:31:18] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:34:36] (PS1) Thcipriani: Add totals to deploy-notes [tools/release] - https://gerrit.wikimedia.org/r/416873
[00:35:34] (CR) Greg Grossmeier: [C: 1] "Yes please." [tools/release] - https://gerrit.wikimedia.org/r/416873 (owner: Thcipriani)
[00:39:50] legoktm: btw, given the php7 job is planned to become voting soon, we need to fix the bug where the job fails on certain repos with a "variable EXT_NAME is not set" error.
[00:40:09] https://integration.wikimedia.org/ci/job/mwext-testextension-php70-jessie-non-voting/1776/console
[00:40:14] Is that reported/known?
[00:44:53] (PS2) Thcipriani: Add totals to deploy-notes [tools/release] - https://gerrit.wikimedia.org/r/416873
[00:45:29] oo, unique committers too
[01:24:32] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[01:45:02] Krinkle: oh, that's my fault. Of course $EXT_NAME won't be set when it's running against mediawiki/core patches
[01:46:45] WOW
[01:46:51] https://pingback.wmflabs.org/ IS AWESOME
[01:48:07] legoktm: we should send ori the link :)
[01:54:44] * legoktm does
[02:28:43] !log manually triggering jenkins jobs
[02:28:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[03:39:15] Continuous-Integration-Infrastructure: castor rsync's taking 3-5 minutes for mwgate-npm jobs - https://phabricator.wikimedia.org/T188375#4030597 (Legoktm) Could we set up a second castor instance, and split the traffic? (either randomly or some consistent hash based on job and repo names).
[03:42:21] (PS1) Legoktm: Update extension dependencies with reality [integration/config] - https://gerrit.wikimedia.org/r/416886
[03:44:16] (CR) Legoktm: [C: 2] Update extension dependencies with reality [integration/config] - https://gerrit.wikimedia.org/r/416886 (owner: Legoktm)
[03:45:38] (Merged) jenkins-bot: Update extension dependencies with reality [integration/config] - https://gerrit.wikimedia.org/r/416886 (owner: Legoktm)
[03:48:55] * legoktm angrily shakes fist at hashar
[03:50:02] (CR) Legoktm: "The zuul change for this was left undeployed, I've deployed it now." [integration/config] - https://gerrit.wikimedia.org/r/415588 (https://phabricator.wikimedia.org/T115755) (owner: Hashar)
[03:50:16] !log deployed https://gerrit.wikimedia.org/r/416886 https://gerrit.wikimedia.org/r/415588
[03:50:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[04:12:55] (CR) Legoktm: [C: 2] "..." [tools/codesniffer] - https://gerrit.wikimedia.org/r/416693 (owner: Thiemo Kreuz (WMDE))
[04:13:29] (Merged) jenkins-bot: Remove not needed "@return void" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416693 (owner: Thiemo Kreuz (WMDE))
[04:13:41] (CR) jenkins-bot: Remove not needed "@return void" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416693 (owner: Thiemo Kreuz (WMDE))
[04:14:07] (CR) Legoktm: [C: 2] "..." [tools/codesniffer] - https://gerrit.wikimedia.org/r/416694 (owner: Thiemo Kreuz (WMDE))
[04:15:04] (Merged) jenkins-bot: Specify "@return void" on all Sniff::process() methods [tools/codesniffer] - https://gerrit.wikimedia.org/r/416694 (owner: Thiemo Kreuz (WMDE))
[04:16:09] (CR) jenkins-bot: Specify "@return void" on all Sniff::process() methods [tools/codesniffer] - https://gerrit.wikimedia.org/r/416694 (owner: Thiemo Kreuz (WMDE))
[04:51:30] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[04:55:58] Continuous-Integration-Infrastructure: castor caching model seems to be broken for non-voting jobs - https://phabricator.wikimedia.org/T189077#4030921 (Legoktm)
[05:21:32] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[07:21:17] Continuous-Integration-Infrastructure (shipyard), MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), PHP 7.0 support, Patch-For-Review: Run MediaWiki tests on PHP 7 - https://phabricator.wikimedia.org/T144962#4031073 (Legoktm)
[07:51:42] (PS1) Hashar: [PoolCounter] tweak rake file filter [integration/config] - https://gerrit.wikimedia.org/r/416904 (https://phabricator.wikimedia.org/T187797)
[08:22:14] (PS2) Hashar: Add more specific rake job for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416904 (https://phabricator.wikimedia.org/T187797)
[08:22:16] (PS1) Hashar: docker: add rake image with libevent-dev for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416910
[08:22:57] (CR) Hashar: [C: 2] "Tested locally with https://gerrit.wikimedia.org/r/#/c/416907/ :]" [integration/config] - https://gerrit.wikimedia.org/r/416910 (owner: Hashar)
[08:24:19] (Merged) jenkins-bot: docker: add rake image with libevent-dev for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416910 (owner: Hashar)
[08:42:13] (PS3) Hashar: Add more specific rake job for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416904 (https://phabricator.wikimedia.org/T187797)
[08:42:24] (CR) Hashar: [C: 2] Add more specific rake job for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416904 (https://phabricator.wikimedia.org/T187797) (owner: Hashar)
[08:43:45] (Merged) jenkins-bot: Add more specific rake job for PoolCounter [integration/config] - https://gerrit.wikimedia.org/r/416904 (https://phabricator.wikimedia.org/T187797) (owner: Hashar)
[08:56:22] (PS1) Hashar: Drop mwext-PoolCounter-build-jessie [integration/config] - https://gerrit.wikimedia.org/r/416913 (https://phabricator.wikimedia.org/T187797)
[08:57:21] (CR) jerkins-bot: [V: -1] Drop mwext-PoolCounter-build-jessie [integration/config] - https://gerrit.wikimedia.org/r/416913 (https://phabricator.wikimedia.org/T187797) (owner: Hashar)
[08:58:07] (PS2) Hashar: Drop mwext-PoolCounter-build-jessie [integration/config] - https://gerrit.wikimedia.org/r/416913 (https://phabricator.wikimedia.org/T187797)
[09:12:46] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4031287 (hashar)
[09:32:23] legoktm: when does pingback ping back?
[09:32:26] only at install?
[09:42:15] hashar: if you have too much time on your hands, a look at https://gerrit.wikimedia.org/r/#/c/416743/ would be welcomed!
[09:46:10] Continuous-Integration-Config, OOUI: Speed up oojs/ui CI job/tests - https://phabricator.wikimedia.org/T189055#4031319 (hashar)
[10:02:46] Release-Engineering-Team (Kanban): Find out if there's an existing task/plan to get rid of globals. - https://phabricator.wikimedia.org/T189059#4031342 (Aklapper) Potentially relevant tasks (which are disconnected): * {T11968} and dependencies * {T72470} for JS. (Not sure they should be connected.) * {T5412...
[10:19:39] Gerrit, Release-Engineering-Team (Kanban), Patch-For-Review, Wikimedia-Incident: In Gerrit set receive.rejectImplicitMerges = True in All-Projects - https://phabricator.wikimedia.org/T189024#4031390 (hashar) Open>Resolved a:hashar mediawiki/core now has: Reject implicit merges when c...
[10:23:22] Project beta-update-databases-eqiad build #23959: FAILURE in 3 min 20 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/23959/
[10:31:15] gehel: yeah I will look at it this afternoon
[10:31:44] hashar: Thanks! And no emergency!
[10:32:39] gehel: I have a few other java based repos to test out. Some failing when running on docker for x reason :D
[10:32:56] hashar: let me know if I can help!
[10:54:06] PROBLEM - Host deployment-videoscaler01 is DOWN: CRITICAL - Host Unreachable (10.68.19.130)
[10:54:52] PROBLEM - Host deployment-tmh01 is DOWN: CRITICAL - Host Unreachable (10.68.16.211)
[10:57:22] we broke the database update in beta, the fix is coming
[11:00:02] gehel: I am trying out dcausse's Jenkins job for search/extra-analysis at https://integration.wikimedia.org/ci/job/search-extra-analysis-maven-java8-docker/2/console
[11:00:39] The xml element org.apache.lucene should be placed before org.elasticsearch
[11:00:40] pfff
[11:00:41] looks good! It is failing for a good reason!
[11:01:08] (CR) Hashar: [C: 2] "Well done :]" [integration/config] - https://gerrit.wikimedia.org/r/416743 (https://phabricator.wikimedia.org/T188686) (owner: DCausse)
[11:01:23] That's actually a check we voluntarily added. Having a standard order in pom.xml makes diff'ing them soo much easier
[11:02:36] (Merged) jenkins-bot: Add search/analysis-extra to jenkins [integration/config] - https://gerrit.wikimedia.org/r/416743 (https://phabricator.wikimedia.org/T188686) (owner: DCausse)
[11:03:02] strange... `/src/pom.xml` ? That should be `/pom.xml`
[11:07:38] gehel: the docker container has the project code under /src
[11:08:43] !log reloading Zuul for "Add search/analysis-extra to jenkins" | T188686
[11:08:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:08:49] T188686: Set up CI and github sync for new extra-analysis repo - https://phabricator.wikimedia.org/T188686
[11:09:37] hashar: ok, that makes some kind of sense. Slightly confusing as we usually have an src directory inside the project
[11:09:45] ahhh
[11:10:01] so yeah you would have stuff like /src/src/org/java/foo/nightmare/path/toolong.java
[11:10:14] Yep
[11:10:42] hashar: thanks! :)
[11:10:58] and hopefully the site:publish step will work as well
[11:11:25] <_joe_> hashar: I like that example
[11:12:08] <_joe_> but you forgot to throw some org/wikimedia/search/AbstractProxy/Singleton/AbstractFactory/Interface/Implementation in the middle
[11:12:13] ahahha
[11:12:44] I have seen an example with 6 classes and ~ 2k lines of code/comment
[11:12:51] <_joe_> ++designpatterns
[11:12:54] just to differentiate between relative and absolute file paths
[11:13:34] <_joe_> hashar: https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/aop/framework/AbstractSingletonProxyFactoryBean.html
[11:13:38] so you would PathRelativeness::singleton( MaybeRelativePath('/foobar') )
[11:14:00] which would yield a RelativeNessResult() object to let you figure out whether yeah it is starting with a / or not :]
[11:14:36] meh :)
[11:23:32] Yippee, build fixed!
[11:23:32] Project beta-update-databases-eqiad build #23960: FIXED in 3 min 31 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/23960/
[11:30:23] Scap, Operations, Packaging: Install git-lfs client (at least on scap targets & masters) - https://phabricator.wikimedia.org/T180628#4031604 (akosiaris) >>! In T180628#4028541, @mmodell wrote: > @akosiaris: What would it take to get the git-lfs package back-ported to stretch? It's written in go, howe...
[12:13:29] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[12:30:53] PROBLEM - Puppet errors on deployment-ores01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:34:18] Scap, Operations, Packaging: Install git-lfs client (at least on scap targets & masters) - https://phabricator.wikimedia.org/T180628#4031758 (akosiaris) I've just uploaded `git-lfs` to stretch-wikimedia/main.
[12:43:30] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[12:49:17] Phabricator (Upstream), Upstream: The star for the "favorites" menu is non-obvious - https://phabricator.wikimedia.org/T160125#4031811 (Aklapper) p:Low>Lowest
[13:07:32] Scap, Operations, Packaging: Install git-lfs client (at least on scap targets & masters) - https://phabricator.wikimedia.org/T180628#4031839 (Paladox) @akosiaris would it be easy to backport that to Jessie too?
[14:09:49] greg-g: It'd be interesting to see if that's counting the same one numerous times (I thought we had some sort of GUID)
[14:10:01] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4031981 (hashar)
[14:10:50] And it also suggests Oracle isn't used
[14:15:17] (PS1) Hashar: Drop empty lines in jjb/language-screenshots.yaml [integration/config] - https://gerrit.wikimedia.org/r/416944
[14:18:13] (CR) Zfilipin: [C: 2] Drop empty lines in jjb/language-screenshots.yaml [integration/config] - https://gerrit.wikimedia.org/r/416944 (owner: Hashar)
[14:19:37] (Merged) jenkins-bot: Drop empty lines in jjb/language-screenshots.yaml [integration/config] - https://gerrit.wikimedia.org/r/416944 (owner: Hashar)
[14:22:47] PROBLEM - Free space - all mounts on deployment-mediawiki04 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki04.diskspace.root.byte_percentfree (<11.11%)
[14:26:40] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found) deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<10.00%)
[14:26:44] Release-Engineering-Team (Kanban), User-zeljkofilipin: Find a few people interested in reviewing Selenium patches - https://phabricator.wikimedia.org/T188744#4018049 (zeljkofilipin) @Niedzielski, @dcausse: great! I'll make sure not to spam you, but I might add you to a commit and a task once in a while.
[14:26:47] Scap, Operations, Packaging: Install git-lfs client (at least on scap targets & masters) - https://phabricator.wikimedia.org/T180628#4032006 (mmodell) @akosiaris Awesome, thank you!
[14:27:50] RECOVERY - Free space - all mounts on deployment-mediawiki04 is OK: OK: All targets OK
[14:42:25] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, VisualEditor: Migrate language-screenshots-VisualEditor off of Nodepool to Docker containers - https://phabricator.wikimedia.org/T189122#4032062 (hashar) p:Triage>High
[14:42:51] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4032074 (hashar)
[14:48:05] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094#4032086 (zeljkofilipin) Thanks a lot to @Jdrewniak. From [[ https://gerrit.wikimedia.org/r/#/c/412956/ | 412956 ]...
[14:51:25] (PS1) Hashar: Migrate VisualEditor screenshot job to Docker [integration/config] - https://gerrit.wikimedia.org/r/416958 (https://phabricator.wikimedia.org/T189122)
[14:52:02] (CR) Hashar: "Quick and dirty first iteration. That first requires proper entry points to be added to mediawiki/extensions/VisualEditor https://gerrit.w" [integration/config] - https://gerrit.wikimedia.org/r/416958 (https://phabricator.wikimedia.org/T189122) (owner: Hashar)
[14:52:21] (CR) jerkins-bot: [V: -1] Migrate VisualEditor screenshot job to Docker [integration/config] - https://gerrit.wikimedia.org/r/416958 (https://phabricator.wikimedia.org/T189122) (owner: Hashar)
[14:52:56] (PS2) Hashar: Migrate VisualEditor screenshot job to Docker [integration/config] - https://gerrit.wikimedia.org/r/416958 (https://phabricator.wikimedia.org/T189122)
[14:53:16] PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76)
[15:01:18] (CR) Thiemo Kreuz (WMDE): "I think the comments removed in this patch here illustrate the dark side of that particular sniff quite nicely. When you force people to a" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416695 (owner: Thiemo Kreuz (WMDE))
[15:17:13] (PS2) Thiemo Kreuz (WMDE): Remove comments describing a File object as a "File object" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416695
[15:19:12] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094#4032204 (zeljkofilipin)
[15:23:57] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094#4032219 (zeljkofilipin)
[15:28:48] oh man
[15:29:00] zeljkof: I think I got the VisualEditor language screenshot working on docker :]
[15:29:12] hashar: oh, cool! :)
[15:29:57] zeljkof: https://phabricator.wikimedia.org/P6815 :]
[15:30:13] copy pasted from the jenkins job
[15:30:32] and I have stuff piling up in src/screenshots/
[15:30:39] hashar: that looks so beautiful and clean
[15:32:08] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094#4032255 (zeljkofilipin)
[15:34:38] zeljkof: yeah maybe. I think I will do the switch tomorrow morning
[15:34:47] not sure how commons_upload will work
[15:37:32] thcipriani: the reason in 2014 we did not want hhvm is that it was broken
[15:37:48] or to be more precise, mediawiki/hhvm integration was at an early stage
[15:38:04] so experimenting with the integration of hhvm on beta would have made beta useless most of the time
[15:38:12] though now
[15:38:20] beta is used to test out the hhvm package before it lands in prod
[15:38:32] interesting
[15:38:37] and in my experience a lot of people are doing the puppet work / basic work on beta BEFORE production
[15:38:45] so beta definitely has its usefulness there
[15:39:09] in the sense it is easy to iterate on a puppet patch, catch all the low hanging fruit
[15:39:21] the outcome is still not production ready
[15:39:29] but at least it is a few steps forward
[15:49:52] (CR) Hashar: "Ran it locally with https://phabricator.wikimedia.org/P6815 and that seems to work." [integration/config] - https://gerrit.wikimedia.org/r/416958 (https://phabricator.wikimedia.org/T189122) (owner: Hashar)
[15:53:34] zeljkof: https://commons.wikimedia.beta.wmflabs.org/w/index.php?title=Special:ListFiles/Hashar&ilshowall=1 !]]]]]]]]]]]]]]]]]]
[15:54:00] IT WORKS
[15:57:27] hashar: nice
[16:05:23] zeljkof: and I even found a couple issues https://github.com/amire80/commons_upload/issues :D
[16:06:13] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, VisualEditor, Patch-For-Review: Migrate language-screenshots-VisualEditor off of Nodepool to Docker containers - https://phabricator.wikimedia.org/T189122#4032359 (hashar) And I have filled...
[16:06:25] zeljkof: so yeah we need a visualeditor patch to be merged (that adds the npm/rake entry point) and then we can migrate the jenkins job \o/
[16:06:37] so I think I am done with the longtail of migrating jobs off Nodepool
[16:06:46] and will start looking at migrating MediaWiki tests yeahhhhh
[16:08:25] hashar: it is highly unlikely either Amir or me would be fixing problems in commons_upload :(
[16:09:01] it's more likely I would rewrite it in node https://phabricator.wikimedia.org/T139747
[16:13:24] Project mwext-phpunit-coverage-publish build #1852: FAILURE in 20 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1852/
[16:22:47] Project mwext-phpunit-coverage-publish build #1853: STILL FAILING in 35 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1853/
[16:23:30] Yippee, build fixed!
[16:23:30] Project mwext-phpunit-coverage-publish build #1854: FIXED in 42 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1854/
[16:26:24] Project mwext-phpunit-coverage-publish build #1855: FAILURE in 31 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1855/
[16:33:17] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[16:41:35] Yippee, build fixed!
[16:41:36] Project mwext-phpunit-coverage-publish build #1856: FIXED in 1 min 45 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1856/
[16:46:46] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q4, VisualEditor, Patch-For-Review: Migrate language-screenshots-VisualEditor off of Nodepool to Docker containers - https://phabricator.wikimedia.org/T189122#4032470 (zeljkofilipin) >>! In T189...
[16:57:12] zeljkof: too late https://github.com/amire80/commons_upload/pull/12 :]
[16:59:25] (Abandoned) Lucas Werkmeister (WMDE): Disable coverage builds for WBQEV [integration/config] - https://gerrit.wikimedia.org/r/414774 (https://phabricator.wikimedia.org/T185697) (owner: Lucas Werkmeister (WMDE))
[17:00:32] hashar: cool :)
[17:07:39] addshore: it pings back at install and whenever any of the values change IIRC
[17:12:33] (CR) Legoktm: [C: -1] "Sorry, I meant removing it from MW-CS outright, not just disabling it for this repo (I'd like to keep this as exclusion free as possible s" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416695 (owner: Thiemo Kreuz (WMDE))
[17:13:37] zeljkof: and a lame https://github.com/amire80/commons_upload/pull/13 :]
[17:15:27] (CR) Thiemo Kreuz (WMDE): "I understand that and support the idea of not having any exceptions in the local .phpcs.xml. I'm happy to hear you support the idea of rem" [tools/codesniffer] - https://gerrit.wikimedia.org/r/416695 (owner: Thiemo Kreuz (WMDE))
[17:27:16] (CR) Legoktm: [C: -1] "I think just a separate patch is fine." [tools/codesniffer] - https://gerrit.wikimedia.org/r/416695 (owner: Thiemo Kreuz (WMDE))
[18:34:37] PROBLEM - App Server Main HTTP Response on deployment-mediawiki07 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 hphp_invoke - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 287 bytes in 0.009 second response time
[18:43:42] Continuous-Integration-Infrastructure, Front-end-Standards-Group, MediaWiki-extensions-General: Decide whether we want the package-lock.json to commit or ignore - https://phabricator.wikimedia.org/T179229#4032957 (Jhernandez) I think we should have them committed as npm states, as they give you full...
[18:49:54] hashar: ping?
[18:54:29] PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:54:35] Deployments, Release-Engineering-Team (Watching / External), Operations, HHVM, and 2 others: Translation cache exhaustion caused by changes to PHP code in file scope - https://phabricator.wikimedia.org/T103886#1401646 (Dzahn) applied on mwdebug1001 via puppet, other canary appservers to follow
[19:06:33] PROBLEM - puppet last run on contint1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[19:07:30] Krinkle: So I'm thinking a slightly different direction for updating beta.
[19:07:54] We could *push* repos as they're changed to deployment-tin....then some cron will run scap every so often
[19:08:01] Rather than having a pull job do it too
[19:08:50] legoktm: Do we correlate values from the same install? I could be wrong but it seems the current graphs in labs just indefinitely increase based on total observed count, is that right? E.g. when an install changes from PHP 5.6 to PHP 7.0 and then upgrades MediaWiki, can we currently observe that the old 5.6 install now effectively no longer exists?
[19:09:29] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[19:09:30] Alternatively, we could like, make it send unconditionally once every N months, like twice per year, and then graph our data points on a 6-month cycle count(), instead of totals. Just a couple thoughts :)
[19:10:31] no_justification: Hm.. what would be the gain of doing git push to beta, but still a cron for scap? Could go as well with a git hook and make it fully pushed. Just curious why not the cron though, do you want it to happen more frequently?
[19:10:53] Hmm, could do that as a hook
[19:11:07] http://toroid.org/git-website-howto
[19:11:33] RECOVERY - puppet last run on contint1001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[19:11:34] That's what I use for my wordpress sites where I have limited shell access to the hoster, for pushing updates to the wp config and wp themes
[19:11:35] Probably post-receive?
[19:11:43] Yeah
[19:12:12] In my case I actually separate the bare repo and the actual working tree checkout so that the checkout doesn't have .git anywhere.
[19:12:34] so I push to srv:repo.git, and then post-receive runs the hook to update the worktree and do anything else I want.
[19:12:47] A-ha. So I'd thought about separating the git work tree and the actual checkout before anyway for other reasons
[19:13:13] But for similarity with prod, we might not want to do that.
[19:13:28] no_justification: So from where would we push?
[19:13:41] Just gerrit directly.
[19:13:44] A replication destination
[19:13:50] ...oh
[19:13:53] Interesting.
[19:14:00] And then whatever user (I guess mwdeploy) on beta would get the public key as an authorized user
[19:14:31] I guess this kinda diverges from prod, you're right there....
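
Editor's aside: the push-then-hook idea described above (push to a bare repository, let `post-receive` update a separate work tree that carries no `.git` directory) could look roughly like the sketch below. This is only an illustration under stated assumptions: the paths, the branch name, and the helper names `pushed_branches` and `checkout_command` are hypothetical, and nothing here reflects how beta or deployment-tin is actually set up.

```python
#!/usr/bin/env python3
"""Hypothetical post-receive hook: refresh a separate work tree after a push."""
import subprocess
import sys

# Assumed locations and branch, for illustration only.
GIT_DIR = "/srv/example/repo.git"
WORK_TREE = "/srv/example/checkout"
DEPLOY_BRANCH = "master"


def pushed_branches(lines):
    """Return the branch names updated by a push.

    Git feeds post-receive one line per updated ref on stdin, in the form
    "<old-sha> <new-sha> <refname>". Deleted refs (new sha made only of
    zeros) and non-branch refs such as tags are ignored.
    """
    branches = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        _old, new, ref = parts
        if ref.startswith("refs/heads/") and set(new) != {"0"}:
            branches.append(ref[len("refs/heads/"):])
    return branches


def checkout_command(git_dir, work_tree, branch):
    """Build the git command that force-updates work_tree to branch."""
    return [
        "git",
        f"--git-dir={git_dir}",
        f"--work-tree={work_tree}",
        "checkout",
        "-f",
        branch,
    ]


def main(stdin=sys.stdin):
    # Only touch the work tree when the deploy branch itself was pushed.
    if DEPLOY_BRANCH in pushed_branches(stdin):
        subprocess.run(
            checkout_command(GIT_DIR, WORK_TREE, DEPLOY_BRANCH), check=True
        )


if __name__ == "__main__":
    main()
```

Saved as `hooks/post-receive` in the bare repository and made executable, a script like this would run after every accepted push. Whether a later sync step runs from the hook or from cron is a separate choice, as the conversation notes.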
[19:14:47] The cron would be a good starting point though? Feels like you're getting very close :)
[19:15:05] Cron would be an improvement over now and more consistent with prod (pull, not push)
[19:15:46] Ok, so we'll keep at what we're doing now
[19:18:48] Krinkle: don't know, that's a question for cindy
[19:20:12] no_justification: I'm not disagreeing on the idea btw, just really wanna see it move away from jenkins as a first step :)
[19:23:32] Step 0 :D
[19:27:27] Release-Engineering-Team (Kanban): Find out if there's a plan to get rid of using globals in PHP code - https://phabricator.wikimedia.org/T189059#4033074 (Krinkle)
[19:38:53] legoktm: Regarding PHPUnit 4/6, "After PHP 7 is voting", so that means Pu4 works on P7? Didn't know that.
[19:44:47] no_justification: when you're at a stopping point (and if you're not still out sick) I would appreciate supervision while merging my wikitech config patches.
[19:47:31] Continuous-Integration-Config, Release-Engineering-Team, Wikimedia-Site-requests: Something like puppet-compile for wmf-config - https://phabricator.wikimedia.org/T189152#4033138 (Krinkle)
[19:47:40] no_justification: ^ Something to break your brain over :)
[19:48:09] Krinkle: yes. but once you hit PHP 7.2 it starts spamming warnings all over the place
[19:48:26] cool, not bad :)
[19:49:34] Krinkle: I will not be nerd sniped!
[19:53:13] andrewbogott: Soooo, I kinda wanna split this into two changes. I'd feel safer. The first would have the filebackend.php and CommonSettings.php changes. Those *should* be fine to land alone? Right? Especially filebackend (since we won't be loading it yet). CommonSettings is partially hidden by the HHVM flag (good) plus enabling InstantCommons shouldn't /break/ anything
[19:53:21] Then the second change would be InitialiseSettings.php
[19:53:28] That's the part that is more likely to break things
[19:53:44] no_justification: sure, I'll split it up
[19:54:06] I mean I s'pose it should all Just Work
[19:54:10] Long as the Swift side is set up
[19:54:16] Release-Engineering-Team (Kanban), MediaWiki-Configuration: Find out if there's a plan to get rid of using globals in PHP code - https://phabricator.wikimedia.org/T189059#4033153 (Legoktm) I don't think the problem in the LegacyEncoding incident would have been helped much if we weren't using global conf...
[19:54:17] And we got the upload paths right
[20:01:49] no_justification: ok, split into https://gerrit.wikimedia.org/r/#/c/417017/ and https://gerrit.wikimedia.org/r/#/c/416607/. I could also use https://gerrit.wikimedia.org/r/#/c/415914/ (which I'm currently live-hacking on labweb)
[20:06:38] In theory the upload paths are right, but if we do break image upload on wikitech no one else will notice for weeks so we definitely will have time to test and iterate if needed.
[20:10:12] So the last one we'll land asap
[20:10:16] Then the first, then the second
[20:10:51] Live hacks on labsweb -> easily overwritten
[20:12:30] paladox: I've rebuilt its-phabricator from master. I'll get it into archiva today, I hope
[20:12:40] no_justification ah thanks :)
[20:12:45] no_justification should i make https://gerrit-review.googlesource.com/c/plugins/its-base/+/161730 public?
[20:12:54] They have merged it into the branches already
[20:12:59] so basically already public
[20:13:53] 404 not found, add me to change ;-)
[20:13:56] Oh duh
[20:13:58] Not logged in
[20:14:28] heh
[20:14:42] no_justification: so, the modern way to apply a config patch is 1) mark +2 2) wait a few minutes for it to merge itself 3) fetch, rebase, submodule-update in /srv/mediawiki on tin 4) scap deploy in /srv/mediawiki on tin
[20:14:43] ?
[20:14:46] no_justification once the plugin is deployed, i can do the config change.
[20:16:11] andrewbogott: (1), (2) yes.
[20:16:30] (3) is "fetch/rebase/submodule-update" but it's in /srv/mediawiki-staging/
[20:16:51] (4) Is `scap sync-file .....` or `scap sync` [if you want a full site deploy]
[20:17:09] (MW deploys don't use `scap deploy` yet)
[20:17:14] what is in /srv/mediawiki-staging vs /srv/mediawiki?
[20:17:24] (when I have to write several paragraphs to justify a sleep(20ms) :( https://gerrit.wikimedia.org/r/#/c/417021/ )
[20:17:30] poolcounterd is fun
[20:17:32] /srv/mediawiki is the live version
[20:17:41] -staging is....only on deploy masters (tin and naos)
[20:18:02] ah, I see, so -staging deploys /to/ /srv/mediawiki
[20:18:06] Yes
[20:18:16] ok, mind if I give that a whirl with https://gerrit.wikimedia.org/r/#/c/415914/ ?
[20:18:21] I mean, is now a good time?
[20:18:27] Check with Tyler, he's in train window :)
[20:18:31] Release-Engineering-Team (Kanban), MediaWiki-extensions-PoolCounter, Patch-For-Review: Fix tests of PoolCounter extension - https://phabricator.wikimedia.org/T178517#4033256 (hashar)
[20:29:04] Release-Engineering-Team (Kanban), MediaWiki-extensions-PoolCounter, Patch-For-Review: Fix tests of PoolCounter extension - https://phabricator.wikimedia.org/T178517#4033282 (hashar) I can reproduce by running poolcounterd under strace and then running: ``` bundle exec cucumber --name 'Just readers'...
[20:46:15] Phabricator (Upstream), Upstream: Phame blog posts don't have 'published' metadata, only 'updated' - https://phabricator.wikimedia.org/T188890#4022822 (mmodell) This should be resolved with tonight's phabricator upgrade
[20:57:35] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033358 (daniel) TechCom wonders what to do with this. Can we drop it from the RFC process? Or close it?
[21:08:22] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:57:01] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033567 (mmodell) Open>declined {icon trash} :'(
[22:03:28] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033583 (demon) (was about to close as well, here's what I was writing) For a variety of reasons, this RfC...
[22:10:17] Differential, Documentation, Gerrit-Migration: Update Commit Message Guidelines for Differential - https://phabricator.wikimedia.org/T123081#4033593 (demon) Open>declined Per T119908
[22:10:36] Differential, Analytics-Tech-community-metrics, DevRel-April-2016, DevRel-March-2016, Developer-Relations (Jan-Mar-2018): Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#4033598 (Aklapper)
[22:10:52] Differential, Analytics-Tech-community-metrics, DevRel-April-2016, DevRel-March-2016, Developer-Relations (Jan-Mar-2018): Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#1808364 (Aklapper) stall...
[22:11:42] andre was quick :)
[22:11:53] Differential, translatewiki.net, Gerrit-Migration, I18n: Train L10n-bot to work with repositories hosted in Wikimedia Phabricator (Diffusion) - https://phabricator.wikimedia.org/T92493#4033607 (demon) Open>declined Per T119908
[22:12:09] Differential, Gerrit-Migration: Cross-repository gating of changes pre-merge in Differential - https://phabricator.wikimedia.org/T131955#4033615 (demon) Open>declined Per T119908
[22:13:07] Phabricator, Wikimedia Phabricator RfC: Configure Phabricator for our needs - https://phabricator.wikimedia.org/T34#4033632 (demon)
[22:13:09] Differential, releng-201516-q4, Documentation, Gerrit-Migration, WorkType-NewFunctionality: Initial documentation of example Differential workflows (with Gerrit equivalents) - https://phabricator.wikimedia.org/T117058#4033631 (demon)
[22:13:23] Continuous-Integration-Config, Utilities-mwdumper, Jenkins, Patch-For-Review: Re-add mwdumper builds to continuous integration / jenkins - https://phabricator.wikimedia.org/T133456#4033640 (demon)
[22:13:26] Differential, Utilities-mwdumper, Gerrit-Migration: Migrate mwdumper to Differential - https://phabricator.wikimedia.org/T134434#4033636 (demon) Open>declined Per T119908
[22:13:39] releng-201516-q4, Gerrit-Migration, Goal: Phase 1 repository migrations to Differential (goal - end of June 2016) - https://phabricator.wikimedia.org/T130418#4033648 (demon)
[22:13:49] releng-201516-q4, Gerrit-Migration, Goal: Phase 1 repository migrations to Differential (goal - end of June 2016) - https://phabricator.wikimedia.org/T130418#2135504 (demon)
[22:14:24] Diffusion, GitHub-Mirrors, Repository-Admins, Gerrit-Migration: Have Phabricator take over replication to Github - https://phabricator.wikimedia.org/T115624#4033660 (demon) Open>declined Per T119908
[22:14:27] Gerrit, Gerrit-Migration: Provide static dump of Gerrit - https://phabricator.wikimedia.org/T617#4033664 (demon) stalled>declined Per T119908
[22:15:06] Release-Engineering-Team (Someday), Gerrit-Migration, WorkType-NewFunctionality: Write script to migrate open changes from Gerrit to Differential by repository - https://phabricator.wikimedia.org/T122979#4033673 (demon) stalled>declined Per T119908
[22:21:52] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033696 (hashar) Should we also close tasks from #gerrit-migration and archives it?
[22:25:03] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033700 (Aklapper) >>! In T119908#4033696, @hashar wrote: > Should we also close tasks from #gerrit-migrati...
[22:35:25] Gerrit, Phabricator, Release-Engineering-Team, releng-201516-q3, and 3 others: [RfC]: Migrate code review / management from Gerrit to Phabricator - https://phabricator.wikimedia.org/T119908#4033712 (demon) >>! In T119908#4033700, @Aklapper wrote: >>>! In T119908#4033696, @hashar wrote: >> Should...
[22:55:34] Phabricator (Upstream), Upstream: Phame blog posts don't have 'published' metadata, only 'updated' - https://phabricator.wikimedia.org/T188890#4033764 (Samwilson) Thank you!
[23:06:49] Continuous-Integration-Config, Release-Engineering-Team, GitHub-Mirrors, Repository-Admins, Patch-For-Review: Set up CI and github sync for new extra-analysis repo - https://phabricator.wikimedia.org/T188686#4033774 (hashar) Open>Resolved a:dcausse The CI job passed on the first c...
[23:14:55] Phabricator (2018-03-07): Phame blog posts don't have 'published' metadata, only 'updated' - https://phabricator.wikimedia.org/T188890#4033795 (mmodell)
[23:26:46] Beta-Cluster-Infrastructure, Wikidata: Stack overflow in WikibaseRepo initialization on Wikidata Beta - https://phabricator.wikimedia.org/T188924#4033843 (Addshore) p:Normal>Low a:MoritzMuehlenhoff Switching to low now, the train seemed to go fine and test.wikidata.org is fine. Assigning mor...
[23:53:48] PROBLEM - Free space - all mounts on deployment-mediawiki04 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki04.diskspace.root.byte_percentfree (<11.11%)
[23:56:39] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<30.00%)
[23:58:47] RECOVERY - Free space - all mounts on deployment-mediawiki04 is OK: OK: All targets OK