[01:22:09] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<12.50%) [01:32:05] ffs [02:45:56] 10Continuous-Integration, 10VisualEditor, 5§ VisualEditor Q3 Blockers: Concurrent builds using local Chromium/Firefox browsers on Linux host fail - https://phabricator.wikimedia.org/T90673#1076934 (10Jdforrester-WMF) [03:05:27] Yippee, build fixed! [03:05:27] Project browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs build #479: FIXED in 26 sec: https://integration.wikimedia.org/ci/job/browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs/479/ [03:22:16] (03PS1) 10Legoktm: Add sniff to check for spacey function calls [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/193766 [03:40:50] Yippee, build fixed! [03:40:50] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #345: FIXED in 33 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/345/ [04:11:00] Yippee, build fixed! [04:11:01] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce build #343: FIXED in 40 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce/343/ [04:33:14] (03PS2) 10Legoktm: Add sniff to check for spacey function calls [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/193766 [05:44:21] Yippee, build fixed! 
[05:44:22] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #498: FIXED in 25 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/498/ [06:37:05] PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL: Connection refused [06:37:14] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK [06:40:00] PROBLEM - App Server bits response on deployment-mediawiki02 is CRITICAL: Connection refused [06:50:02] RECOVERY - App Server bits response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 3895 bytes in 0.003 second response time [06:52:08] RECOVERY - App Server Main HTTP Response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 49384 bytes in 0.688 second response time [06:55:36] Yippee, build fixed! [06:55:37] Project beta-scap-eqiad build #43533: FIXED in 1 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43533/ [07:02:25] RECOVERY - Free space - all mounts on deployment-jobrunner01 is OK: OK: All targets OK [07:14:42] Project beta-scap-eqiad build #43535: FAILURE in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43535/ [07:26:56] PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused [07:34:52] Yippee, build fixed! [07:34:53] Project beta-scap-eqiad build #43537: FIXED in 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43537/ [07:36:55] RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.019 second response time [07:39:55] 6Release-Engineering, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1077235 (1080686) Can we get some feedback on this please? We want to start tracking stuff in the wiki and build some automated tools, which need the structured data in the wiki. 
[07:50:13] (03CR) 10Tobias Gritschacher: [C: 031] "@krinkle?" [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (https://phabricator.wikimedia.org/T86176) (owner: 10Adrian Lang) [08:32:54] PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused [08:37:55] RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.021 second response time [08:57:07] good morning [08:58:31] hi hashar [08:58:33] good morning [08:59:26] 10Quality-Assurance, 6Release-Engineering, 10MediaWiki-extensions-GettingStarted, 5Patch-For-Review: Pass MEDIAWIKI_CAPTCHA_BYPASS_PASSWORD in on Jenkins so GettingStarted browser tests pass - https://phabricator.wikimedia.org/T91220#1077413 (10phuedx) After this work has been done – and the GettingStarted... [09:01:13] zeljkof: good morning :) [09:01:29] hashar: good morning! :) [09:02:22] hashar: did you set up hangouts? [09:02:25] zeljkof: can you look at your google chat ? 
[09:02:36] would like to test out video over a chat client I have [09:02:44] hashar: looking [09:22:54] hashar: https://gerrit.wikimedia.org/r/#/c/193361/ [09:23:25] (03CR) 10Zfilipin: WIP Created the first Android centralNotice Jenkins job (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/193361 (https://phabricator.wikimedia.org/T86092) (owner: 10Zfilipin) [09:24:05] (03PS3) 10Zfilipin: WIP Created the first Android CentralNotice Jenkins job [integration/config] - 10https://gerrit.wikimedia.org/r/193361 (https://phabricator.wikimedia.org/T86092) [09:24:44] (03PS6) 10Zfilipin: Delete unused mwext-browsertests Jenkins jobs [integration/config] - 10https://gerrit.wikimedia.org/r/193566 (https://phabricator.wikimedia.org/T91161) [09:27:02] (03PS7) 10Zfilipin: Delete unused mwext-browsertests Jenkins jobs [integration/config] - 10https://gerrit.wikimedia.org/r/193566 (https://phabricator.wikimedia.org/T91161) [09:27:15] hashar: https://gerrit.wikimedia.org/r/#/c/193566/ [09:29:57] (03CR) 10Hashar: [C: 032] "Pairing with Zeljkof, we found out both jobs are broken for random reasons. The jobs have always been experimental and has a bunch of dup" [integration/config] - 10https://gerrit.wikimedia.org/r/193566 (https://phabricator.wikimedia.org/T91161) (owner: 10Zfilipin) [09:34:44] hashar: https://integration.wikimedia.org/ci/job/integration-zuul-layoutdiff/2998/console [09:36:09] $ diff -u <(echo foo) <(echo bar); echo "Diff exit code: $?" 
[09:36:09] @@ -1 +1 @@ [09:36:09] -foo [09:36:09] +bar [09:36:09] Diff exit code: 1 [09:36:12] zeljkof: ^ [09:36:31] (03Merged) 10jenkins-bot: Delete unused mwext-browsertests Jenkins jobs [integration/config] - 10https://gerrit.wikimedia.org/r/193566 (https://phabricator.wikimedia.org/T91161) (owner: 10Zfilipin) [09:47:06] 10Beta-Cluster, 6Labs, 10Labs-Vagrant, 10MediaWiki-extensions-OpenStackManager, and 10 others: Labs' Phabricator tags overhaul - https://phabricator.wikimedia.org/T89270#1077511 (10Aklapper) [09:52:43] (03PS3) 10Zfilipin: Refactor VisualEditor JJB builder for production status browsertest [integration/config] - 10https://gerrit.wikimedia.org/r/193577 (https://phabricator.wikimedia.org/T90423) [09:52:49] hashar: https://gerrit.wikimedia.org/r/#/c/193577/ [09:54:38] (03CR) 10Hashar: [C: 031] "Sounds good to me, just confirm with VE team they agree on using @production tag which they should :)" [integration/config] - 10https://gerrit.wikimedia.org/r/193577 (https://phabricator.wikimedia.org/T90423) (owner: 10Zfilipin) [09:55:00] hashar: https://gerrit.wikimedia.org/r/#/c/193579/ [10:00:47] +1ed boths [10:17:00] good morning hashar [10:17:19] do you think this is ready? https://gerrit.wikimedia.org/r/#/c/193393 [10:21:16] (03PS2) 10Zfilipin: Setup Gather browser tests job [integration/config] - 10https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082) (owner: 10Jdlrobson) [10:22:13] joakino: do you know how to deploy the jenkins job and test it? [10:22:43] zeljkof: i have the instructions somewhere around, is that what is needed? 
[10:23:17] joakino: well, deploying the job, running it and getting it to pass would prove that the code is fine :) [10:33:45] joakino: good morning [10:33:56] joakino: I am still catching up reinstalling my system :( [10:34:46] ok, i'll try to do it when i can, we're really late this sprint, so not time for now, thx zeljkof hashar [10:54:46] 10Continuous-Integration, 10Tool-Labs: labs-toollabs-debian-glue fails apparently with a timeout - https://phabricator.wikimedia.org/T91247#1077688 (10scfc) 3NEW [11:00:37] 10Quality-Assurance, 10MediaWiki-extensions-UploadWizard, 6Multimedia, 10Multimedia-Sprint-2015-02-25, 5Patch-For-Review: UploadWizard browser test for chunked upload - https://phabricator.wikimedia.org/T89289#1077699 (10Gilles) 5Open>3Resolved Works fine with Chrome on the integration server: https:... [11:07:19] 10Continuous-Integration, 10MediaWiki-ResourceLoader, 10MediaWiki-Vagrant, 10Wikidata, and 2 others: qunit test broken without explicitly setting $wgResourceLoaderMaxQueryLength - https://phabricator.wikimedia.org/T90453#1077706 (10JanZerebecki) The test run without that setting failed even though it shows... [11:12:33] 10Quality-Assurance, 6Release-Engineering, 10MediaWiki-extensions-GettingStarted, 5Patch-For-Review: Pass MEDIAWIKI_CAPTCHA_BYPASS_PASSWORD in on Jenkins so GettingStarted browser tests pass - https://phabricator.wikimedia.org/T91220#1077718 (10zeljkofilipin) [11:13:19] 10Continuous-Integration, 6Release-Engineering, 7Jenkins, 5Patch-For-Review: Delete unused mwext-browsertests Jenkins jobs - https://phabricator.wikimedia.org/T91161#1077720 (10zeljkofilipin) The patch is merged and the jobs are deleted from Jenkins. 
[11:13:35] 10Continuous-Integration, 6Release-Engineering, 7Jenkins, 5Patch-For-Review: Delete unused mwext-browsertests Jenkins jobs - https://phabricator.wikimedia.org/T91161#1077722 (10zeljkofilipin) 5Open>3Resolved [11:24:39] (03PS3) 10Zfilipin: Setup Gather browser tests job [integration/config] - 10https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082) (owner: 10Jdlrobson) [11:26:51] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #1: FAILURE in 7.6 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/1/ [11:28:57] PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused [11:33:11] (03CR) 10Zfilipin: [C: 04-1] "I have deployed the job, but it fails:" [integration/config] - 10https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082) (owner: 10Jdlrobson) [11:38:54] RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.020 second response time [11:52:24] (03CR) 10Zfilipin: [C: 031] "The reason the job fails is that the feature file is missing @chrome and @en.m.wikipedia.beta.wmflabs.org tags." [integration/config] - 10https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082) (owner: 10Jdlrobson) [12:23:34] hashar: Hm.. why are phpunit-zend jobs running on gallium instead of precise labs? [12:27:52] (03PS22) 10Krinkle: Fix WikibaseJavaScriptApi tests [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (https://phabricator.wikimedia.org/T86176) (owner: 10Adrian Lang) [12:30:15] (03CR) 10Krinkle: "prepare-mediawiki runs mw-run-update-script, shouldn't composer be run before that?" 
[integration/config] - 10https://gerrit.wikimedia.org/r/180418 (https://phabricator.wikimedia.org/T86176) (owner: 10Adrian Lang) [12:43:03] (03PS3) 10Hashar: Revert "Switch wikidata-gremlin to Java 8 on Trusty" [integration/config] - 10https://gerrit.wikimedia.org/r/191893 (https://phabricator.wikimedia.org/T85964) [12:43:09] (03CR) 10Hashar: [C: 032] Revert "Switch wikidata-gremlin to Java 8 on Trusty" [integration/config] - 10https://gerrit.wikimedia.org/r/191893 (https://phabricator.wikimedia.org/T85964) (owner: 10Hashar) [12:49:56] (03Merged) 10jenkins-bot: Revert "Switch wikidata-gremlin to Java 8 on Trusty" [integration/config] - 10https://gerrit.wikimedia.org/r/191893 (https://phabricator.wikimedia.org/T85964) (owner: 10Hashar) [13:26:10] hashar: ping [13:29:58] (03PS2) 10Krinkle: Remove mediawiki-core-regression-* jobs [integration/config] - 10https://gerrit.wikimedia.org/r/193583 (https://phabricator.wikimedia.org/T88018) [13:34:15] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #171: ABORTED in 56 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/171/ [13:34:48] Project beta-scap-eqiad build #43573: FAILURE in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43573/ [13:38:22] hashar: I need you :) [13:39:23] !log integration-slave12xx and integration-slave14xx instances still depooled due to T90984 [13:39:28] Logged the message, Master [13:41:38] 10Continuous-Integration, 5Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor repository for non-WMF deployment branches - https://phabricator.wikimedia.org/T90303#1077855 (10Krinkle) p:5Triage>3Normal [13:42:18] 10Continuous-Integration, 6Release-Engineering, 10MediaWiki-File-management, 6Multimedia: Parser tests intermittently failing on Zend due to unexpected thumbnail error - https://phabricator.wikimedia.org/T91016#1077857 (10Krinkle) p:5Triage>3Unbreak! 
[13:44:10] 10Continuous-Integration: MediaWiki PHPUnit false negative: "Error creating thumbnail: Unable to save thumbnail to destination" - https://phabricator.wikimedia.org/T91208#1077863 (10Krinkle) [13:44:12] 10Continuous-Integration, 6Release-Engineering, 10MediaWiki-File-management, 6Multimedia: Parser tests intermittently failing on Zend due to unexpected thumbnail error - https://phabricator.wikimedia.org/T91016#1072472 (10Krinkle) [13:44:57] 10Continuous-Integration: Create composer validate --no-check-publish job for MediaWiki extensions - https://phabricator.wikimedia.org/T91176#1077873 (10Krinkle) [13:46:59] 10Continuous-Integration: Use dedicated jobs for mediawiki-core wmf branches that honour submodules - https://phabricator.wikimedia.org/T88239#1077880 (10Krinkle) [13:47:27] 10Continuous-Integration: php-composer-validate job should not be triggered if a composer.json file is removed from the repository - https://phabricator.wikimedia.org/T89263#1077884 (10Krinkle) p:5Triage>3Low [13:47:43] 10Continuous-Integration, 5Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor repository for non-WMF deployment branches - https://phabricator.wikimedia.org/T90303#1077891 (10Krinkle) [13:47:45] 10Continuous-Integration, 10MediaWiki-Unit-tests: Support running MediaWiki PHPUnit tests via composer - https://phabricator.wikimedia.org/T89626#1077890 (10Krinkle) [13:49:50] 10Continuous-Integration, 7Jenkins: Install and use load based balancer plugin - https://phabricator.wikimedia.org/T84911#1077895 (10Krinkle) p:5Triage>3Low [13:50:20] 10Continuous-Integration, 7Jenkins: Install and use load based balancer plugin - https://phabricator.wikimedia.org/T84911#1077897 (10Krinkle) 5Open>3declined a:3Krinkle [13:50:40] 10Continuous-Integration, 3Continuous-Integration-Isolation: Isolate contintcloud labs project from rest of the labs project - https://phabricator.wikimedia.org/T86168#1077899 (10Krinkle) p:5Triage>3Normal [13:50:46] 
10Continuous-Integration, 10Wikimedia-Labs-Infrastructure, 3Continuous-Integration-Isolation: Figure out how to dedicate baremetal to a specific labs project - https://phabricator.wikimedia.org/T84989#1077901 (10Krinkle) p:5Triage>3Normal [13:50:57] 10Continuous-Integration, 7HHVM: HHVM Jenkins job throw: Unable to set CoreFileSize to 8589934592: Operation not permitted (1) - https://phabricator.wikimedia.org/T78799#1077903 (10Krinkle) p:5Triage>3Low [13:51:48] 10Continuous-Integration: Add mediawiki-extensions-hhvm and mediawiki-extensions-zend jobs to integration/phpunit.git repo - https://phabricator.wikimedia.org/T88479#1077905 (10Krinkle) 5Open>3declined a:3Krinkle integration/phpunit has been frozen and deprecated in favour of using local entry points for... [13:52:39] 10Continuous-Integration: Create composer validate --no-check-publish job for MediaWiki extensions - https://phabricator.wikimedia.org/T91176#1077908 (10Krinkle) p:5Triage>3Low [13:53:22] 10Continuous-Integration, 10MediaWiki-Unit-tests: Support running MediaWiki PHPUnit tests via composer - https://phabricator.wikimedia.org/T89626#1077911 (10Krinkle) p:5Triage>3Normal [13:54:43] 10Continuous-Integration, 10MediaWiki-Unit-tests: Support running MediaWiki PHPUnit tests via composer - https://phabricator.wikimedia.org/T89626#1040683 (10Krinkle) Moving to Done from #contint since the infrastructure for this is now available. It's not up to T90303 to apply it to mediawiki-core. [13:55:00] Yippee, build fixed! 
[13:55:00] Project beta-scap-eqiad build #43575: FIXED in 55 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43575/ [13:55:04] 10Continuous-Integration: Use dedicated jobs for mediawiki-core wmf branches that honour submodules - https://phabricator.wikimedia.org/T88239#1077919 (10Krinkle) p:5Triage>3Normal [13:55:17] 10Continuous-Integration, 10MediaWiki-Codesniffer: Convert existing legacy phpcs jobs to use composer entry point + versioning - https://phabricator.wikimedia.org/T90943#1077924 (10Krinkle) p:5Triage>3Low [13:59:56] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1077938 (10Krinkle) [14:00:47] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1076362 (10Krinkle) Looks like deployment-bastion is missing the `scap-recompile` deployment tool. CC-ing #Bet... 
[14:01:16] 10Continuous-Integration, 5Patch-For-Review, 7Technical-Debt: Remove mediawiki-core-regression-* jobs from mediawiki-core#postmerge pipeline - https://phabricator.wikimedia.org/T88018#1077944 (10Krinkle) p:5Triage>3Low [14:01:39] 10Continuous-Integration, 5Patch-For-Review, 7Technical-Debt: Remove mediawiki-core-regression-* jobs from mediawiki-core#postmerge pipeline - https://phabricator.wikimedia.org/T88018#1002548 (10Krinkle) p:5Low>3Lowest [14:01:53] 10Continuous-Integration, 5Patch-For-Review, 7Technical-Debt: Remove mediawiki-core-regression-* jobs from mediawiki-core#postmerge pipeline - https://phabricator.wikimedia.org/T88018#1002548 (10Krinkle) p:5Lowest>3Low [14:02:20] 10Continuous-Integration: LocalSettings.php should be copied in teardown instead of setup - https://phabricator.wikimedia.org/T90613#1077949 (10Krinkle) p:5Triage>3Low [14:02:37] 10Continuous-Integration, 5Patch-For-Review, 7Technical-Debt: Add regression tests for slave-script tools - https://phabricator.wikimedia.org/T86158#1077951 (10Krinkle) p:5Triage>3Low [14:03:20] 10Continuous-Integration, 5Patch-For-Review, 7Technical-Debt: Add regression tests for slave-script tools - https://phabricator.wikimedia.org/T86158#1077954 (10Krinkle) 5Open>3declined a:3Krinkle Most integration/slave-script tools have been frozen and deprecated in favour of test entry points. Writing... 
[14:03:26] 10Continuous-Integration, 7Technical-Debt: Add regression tests for slave-script tools - https://phabricator.wikimedia.org/T86158#1077957 (10Krinkle) [14:03:49] 10Continuous-Integration: scap-recompile missing - https://phabricator.wikimedia.org/T90337#1077958 (10Krinkle) [14:03:51] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1077959 (10Krinkle) [14:03:58] 6Release-Engineering, 10Wikidata: Travis failure for mysql + hhvm due to DBConnectionError - https://phabricator.wikimedia.org/T90178#1077960 (10Lydia_Pintscher) p:5Triage>3Normal [14:04:19] 10Continuous-Integration: Merge extensions PHPUnit and QUnit jobs - https://phabricator.wikimedia.org/T88207#1077962 (10Krinkle) p:5Triage>3Low [14:05:19] 10Continuous-Integration, 10MediaWiki-Configuration, 6MediaWiki-Core-Team, 7Blocked-on-Continuous-Integration: Update jenkins for extension registration changes - https://phabricator.wikimedia.org/T86359#1077966 (10Krinkle) 5Open>3Resolved p:5Triage>3Normal a:3Legoktm [14:05:57] 10Continuous-Integration, 3Continuous-Integration-Isolation: Puppetize Nodepool configuration - https://phabricator.wikimedia.org/T89143#1077975 (10Krinkle) p:5Triage>3Normal [14:06:06] 10Continuous-Integration: Set up salt for integration slaves in labs - https://phabricator.wikimedia.org/T87819#1077977 (10Krinkle) p:5Triage>3Low [14:07:39] 10Continuous-Integration, 6Mobile-Web, 7Documentation, 7Technical-Debt: [jsduck] Various custom tags should be easily shareable between projects - https://phabricator.wikimedia.org/T86587#1077982 (10Krinkle) p:5Triage>3Low [14:12:46] 10Continuous-Integration: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1077986 (10hashar) The puppet master yaml report files are being garbage collected by a cronjob. 
That has been done by YuviPanda because /var was filling quickly. The value... [14:13:12] 10Continuous-Integration, 5Patch-For-Review: Jenkins: Overhaul the phpcs macro - https://phabricator.wikimedia.org/T50420#1077989 (10Krinkle) 5Open>3Resolved a:3Krinkle [14:13:13] 10Continuous-Integration: split phpcs in voting and non-voting sniffs - https://phabricator.wikimedia.org/T48500#1077991 (10Krinkle) [14:14:43] Krinkle: are your questions regarding: https://gerrit.wikimedia.org/r/#/c/193130/ answered? [14:16:16] 10Continuous-Integration: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1077992 (10Krinkle) As I said before, that is not the problem nor the solution. The line can be seen as alternating between up and down in short-term graphs. That is indeed t... [14:17:04] jzerebecki: Have you determined that this does not affect production? E.g. Wikibase is not creating so many modules or requesting so many at once? [14:18:01] We may want to track this [14:19:02] By instrumenting the inner loop of the request splitting with a log/track call. [14:19:05] Krinkle: if it affects production currently IE would be broken, so this would fix it [14:19:20] jzerebecki: No, it would mask the problem. [14:19:22] which might be bad from some point of view ;) [14:20:01] That many modules is either Wikibase requesting too much for an acceptable front-end performance, or requesting a reasonable amount but registering too many modules and thus not understanding how modules are meant to be used [14:20:28] So logging of this "error" would be appropriate. [14:21:00] Also, if 2000 is the limit, this would presumably affect EventLogging [14:21:23] Having some references to back that up would avoid this becoming a blackbox [14:21:32] Whatever data you used to come up with that, please include it. [14:22:33] Especially considering we don't support all web servers and versions of IE for the Grade A payload. 
[14:22:53] Supporting "some" servers and "some" versions of IE. I'd rather have it be explicit so it is more open to improvement. [14:23:36] Krinkle: it is a widely known problem, i can look up e.g. a blog of the IE team about it for you. is that enough? even if we discard IE there would still be a higher limit in nginx we would be hitting [14:24:14] I don't mind the values. I need it documented so that there is a path forward in the future where we have an explicit choice to make if we want to drop support for some part or another. [14:24:25] As opposed to "it was there for a good reason" and never being able to confidently change it. [14:24:50] Information about IE is quite often misinterpreted or distributed or generalised. Having a source just makes things easier to reason about. [14:25:02] For example, does it affect IE11? [14:25:12] Or future versions of Nginx. [14:25:17] Those are all in flux. [14:25:47] Krinkle: there is no clear documentation, you would need to test them all [14:26:44] from experience I know that the 2k IE limit is the lowest one i ever encountered [14:27:38] Krinkle: do you want to implement the logging or can you tell me where the splitting is done and what i should use to log it? [14:27:45] Yeah. But stating that you only know it to affect IE5.5-IE8 would help. [14:28:00] Which seems to be claimed at http://support.microsoft.com/kb/208427 [14:28:09] It seems current nginx max is 8k [14:28:15] but configurable [14:29:13] 10Continuous-Integration: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1077995 (10hashar) Something is broken apparently: ``` integration-puppetmaster# crontab -l -u puppet 27 0,8,16 * * * find /var/lib/puppet/reports -type f -mmin +2160 -delet... 
[14:29:24] Krinkle: http://blogs.msdn.com/b/ieinternals/archive/2014/08/13/url-length-limits-in-internet-explorer.aspx claims "However, even today there are code paths that will not handle URLs that exceed the legacy limit; for instance, the WinINET cache will not store and retrieve resources that were retrieved from a URL longer than the legacy limit." when IE9 was already released [14:30:01] Again, I'm not saying 2k is wrong. Just include the data you already had with you when you got to that. [14:30:18] Just making sure it's in a state where we can take it somewhere without wondering why it is the way it is. [14:35:10] Krinkle: updated the commit message. [14:35:18] Krinkle: do you want to implement the logging or can you tell me where the splitting is done and what i should use to log it? [14:36:27] jzerebecki: it's done lower down in the same method you updated [14:36:36] .. if ( maxQueryLength > 0 && !$.isEmptyObject( moduleMap ) && l + bytesA .. [14:38:13] The main request is after the loop [14:39:34] mw.track() [14:40:32] 10Continuous-Integration: "/usr/local/bin/zuul-cloner: No such file or directory" on new instances - https://phabricator.wikimedia.org/T90984#1078010 (10Krinkle) Weird. It exists on integration-slave1405 but not integration-slave1401. [14:45:10] 10Continuous-Integration, 6Release-Engineering, 7Jenkins, 5Patch-For-Review: Delete unused mwext-browsertests Jenkins jobs - https://phabricator.wikimedia.org/T91161#1078018 (10Krinkle) p:5Low>3Unbreak! 
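The request splitting discussed above (batching module names so that each request stays under a maximum query length, and recording a tracking event whenever a split happens) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual mw.loader code: `splitIntoBatches`, the `track` callback and the topic name `example.requestSplit` are hypothetical. Only the `maxQueryLength > 0` guard and the idea of calling something like `mw.track()` inside the loop come from the log.

```javascript
// Hypothetical sketch: split module names into batches so that the joined
// names of each batch never exceed maxQueryLength characters.
// `track` stands in for a call such as mw.track(); the topic name is made up.
function splitIntoBatches( modules, maxQueryLength, track ) {
	var batches = [];
	var current = [];
	var length = 0;

	modules.forEach( function ( name ) {
		var nameLength = encodeURIComponent( name ).length;
		// One extra character for the separator when the batch has entries.
		var cost = nameLength + ( current.length ? 1 : 0 );

		// Split only when a limit is set, the batch is non-empty, and adding
		// this name would push the batch over the limit.
		if ( maxQueryLength > 0 && current.length > 0 && length + cost > maxQueryLength ) {
			batches.push( current );
			// Record the split so that oversized requests stay visible
			// instead of being silently masked.
			track( 'example.requestSplit', { modules: current.length, length: length } );
			current = [];
			length = 0;
			cost = nameLength;
		}

		current.push( name );
		length += cost;
	} );

	// Whatever is left forms the final request after the loop.
	if ( current.length > 0 ) {
		batches.push( current );
	}
	return batches;
}
```

Passing the tracker in as a callback keeps the sketch runnable outside a browser. In a MediaWiki page one could presumably pass a thin wrapper around `mw.track` instead.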
[14:45:33] 10Continuous-Integration, 6Release-Engineering, 7Jenkins, 5Patch-For-Review: Delete unused mwext-browsertests Jenkins jobs - https://phabricator.wikimedia.org/T91161#1075514 (10Krinkle) p:5Unbreak!>3Low [14:47:25] 10Continuous-Integration: Warn/alert on too many jobs queued - https://phabricator.wikimedia.org/T85034#1078033 (10Krinkle) p:5Triage>3Normal [14:48:24] 10Continuous-Integration: phplint fails on paths containing a space - https://phabricator.wikimedia.org/T89380#1078042 (10Krinkle) p:5High>3Normal [14:48:42] 10Continuous-Integration: phplint fails on paths containing a space - https://phabricator.wikimedia.org/T89380#1034975 (10Krinkle) jakub-onderka/php-parallel-lint is not affected I believe. [14:53:17] 10Continuous-Integration: [upstream] Jenkins: Link to job page from rebuild doesn't use correct url pattern - https://phabricator.wikimedia.org/T65660#1078063 (10Krinkle) p:5Normal>3Lowest [14:54:01] 10Continuous-Integration: phantomjs browsertests fails with invalid byte sequence in UTF-8 - https://phabricator.wikimedia.org/T59099#1078075 (10Krinkle) p:5Normal>3Low [14:55:00] 10Continuous-Integration, 6Release-Engineering, 7Jenkins: Jenkins: browser test host performance issue for timed builds - https://phabricator.wikimedia.org/T68449#1078078 (10Krinkle) p:5Normal>3Low [14:56:15] 10Continuous-Integration, 10MediaWiki-Unit-tests: Use parallel-phpunit in MediaWIki PHPUnit test jobs - https://phabricator.wikimedia.org/T50217#489950 (10Krinkle) [14:56:42] 10Continuous-Integration, 5Patch-For-Review: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1078095 (10hashar) I have applied change https://gerrit.wikimedia.org/r/#/c/193825/ on the integration puppetmaster [14:57:14] 10Continuous-Integration: Jenkins: point TMPDIR to dir in workspace or tmpfs and delete after build - https://phabricator.wikimedia.org/T70563#1078096 (10Krinkle) [14:57:31] 6Release-Engineering, 
10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078097 (10Krenair) You'll need to convince people to get the rejection on T10390 reverted, at least. Then you'll need it to be properly security reviewed. [14:57:35] 10Continuous-Integration, 5Patch-For-Review: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1078099 (10hashar) a:5hashar>3None [14:58:43] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078101 (1080686) I don't understand. We alread use SWM on wikitech wiki, why can't we just do the same for the affcom wiki then? Where is the difference? [14:59:33] 10Continuous-Integration: Jenkins: point TMPDIR to dir in workspace or tmpfs and delete after build - https://phabricator.wikimedia.org/T70563#1078102 (10Krinkle) In [aa484c156529b7](https://github.com/wikimedia/integration-jenkins/commit/aa484c156529b7) I changed mw-teardown.sh to remove the entire tmpfs direct... [15:01:24] 10Continuous-Integration: phplint should detect PHP files not having the .php suffix - https://phabricator.wikimedia.org/T65041#1078114 (10Krinkle) [15:13:05] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078158 (10Krenair) labsconsole (now wikitech) was much more separate from production at the time it got SMW installed. chapcomwiki being private seems like another goo... [15:16:05] 10Quality-Assurance, 6Release-Engineering, 10MediaWiki-extensions-GettingStarted, 5Patch-For-Review: Pass MEDIAWIKI_CAPTCHA_BYPASS_PASSWORD in on Jenkins so GettingStarted browser tests pass - https://phabricator.wikimedia.org/T91220#1078172 (10hashar) The secret password has to be added to the Jenkins cre... 
[15:20:32] 10Beta-Cluster, 6Release-Engineering: Beta cluster unable to create thumbnail for WebM video - https://phabricator.wikimedia.org/T90332#1078184 (10coren) [15:21:12] 10Beta-Cluster, 6Release-Engineering: Beta cluster unable to create thumbnail for WebM video - https://phabricator.wikimedia.org/T90332#1056292 (10coren) This does not appear to be an operations issue at this time. If diagnosis determines we can help, feel free to put this back on our queue. [15:22:19] 10Staging, 6operations: Package geoipupdate for jessie - https://phabricator.wikimedia.org/T90229#1078195 (10coren) a:3faidon [15:23:53] 10Staging, 6operations: Package geoipupdate for jessie - https://phabricator.wikimedia.org/T90229#1054117 (10coren) p:5Triage>3Normal [15:27:33] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078205 (1080686) Well, in the end we simply need the features and (since we lost quite some time since this discussion has started) we need them soon. We have a huge t... [15:32:11] 10Staging, 5Patch-For-Review: Setup staging-palladium as puppetmaster and saltmaster - https://phabricator.wikimedia.org/T88304#1078212 (10coren) p:5Triage>3Normal [15:33:33] 10Continuous-Integration, 5Patch-For-Review: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1078217 (10Krinkle) p:5High>3Normal [15:34:21] 10Continuous-Integration: "/usr/local/bin/zuul-cloner: No such file or directory" on new instances - https://phabricator.wikimedia.org/T90984#1078218 (10Krinkle) a:5hashar>3Krinkle [15:36:54] 10Continuous-Integration: "/usr/local/bin/zuul-cloner: No such file or directory" on new instances - https://phabricator.wikimedia.org/T90984#1078230 (10Krinkle) Looks like the disk space on integration-puppetmaster was the immediate cause. Due to integration-puppetmaster being low on space, the puppet run sto... 
[15:39:42] !log Removing /usr/local/src/zuul from integration-slave12xx and integration-slave14xx to let puppet re-install zuul-cloner (T90984) [15:39:44] Logged the message, Master [15:40:40] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078245 (10Krenair) You cannot reasonably expect Wikimedia's developers to just drop everything and rush over here to start preparing an extension for deployment becaus... [15:43:43] 10Continuous-Integration, 7Tracking: Gallium must be backed up (tracking) - https://phabricator.wikimedia.org/T65934#1078255 (10Krinkle) [15:44:15] 10Continuous-Integration: Zuul log should be compressed after rotation - https://phabricator.wikimedia.org/T65935#1078260 (10Krinkle) [15:44:16] 10Continuous-Integration: Figure out paths that needs to be backed up on gallium - https://phabricator.wikimedia.org/T65938#1078261 (10Krinkle) [15:44:17] 10Continuous-Integration, 5Patch-For-Review: Write a cronjob to compress old Jenkins builds' logs - https://phabricator.wikimedia.org/T65939#1078259 (10Krinkle) [15:44:18] 10Continuous-Integration: Drop some Jenkins jobs history - https://phabricator.wikimedia.org/T65936#1078262 (10Krinkle) [15:44:19] 10Continuous-Integration: Remove /var/lib/git replication from gallium - https://phabricator.wikimedia.org/T65937#1078263 (10Krinkle) [15:44:21] 10Continuous-Integration: Remove /var/lib/git replication from gallium - https://phabricator.wikimedia.org/T65937#710754 (10Krinkle) [15:44:22] 10Continuous-Integration: Zuul log should be compressed after rotation - https://phabricator.wikimedia.org/T65935#710464 (10Krinkle) [15:44:23] 10Continuous-Integration, 7Tracking: Gallium must be backed up (tracking) - https://phabricator.wikimedia.org/T65934#710390 (10Krinkle) [15:44:24] 10Continuous-Integration: Figure out paths that needs to be backed up on gallium - https://phabricator.wikimedia.org/T65938#710875 (10Krinkle) 
[15:44:25] 10Continuous-Integration: Drop some Jenkins jobs history - https://phabricator.wikimedia.org/T65936#710616 (10Krinkle) [15:44:26] 10Continuous-Integration, 5Patch-For-Review: Write a cronjob to compress old Jenkins builds' logs - https://phabricator.wikimedia.org/T65939#710931 (10Krinkle) [15:47:48] 10Continuous-Integration, 10MediaWiki-Unit-tests: Bogus unit test failures from UIDGeneratorTest - https://phabricator.wikimedia.org/T91070#1078273 (10hashar) That is a race condition when the same job run twice or more on the same node. The jobs are sharing the same temp directory and one of them delete the m... [15:49:08] 10Continuous-Integration, 10MediaWiki-Unit-tests: MediaWiki Jobs running concurrently on the same instance share the same $wgTmpDirectory causing race condition - https://phabricator.wikimedia.org/T91070#1078281 (10hashar) [15:56:03] (03PS1) 10Hashar: Fix $wgTmpDirectory race condition [integration/jenkins] - 10https://gerrit.wikimedia.org/r/193832 (https://phabricator.wikimedia.org/T91070) [15:56:34] 10Continuous-Integration, 10MediaWiki-Unit-tests, 5Patch-For-Review: MediaWiki Jobs running concurrently on the same instance share the same $wgTmpDirectory causing race condition - https://phabricator.wikimedia.org/T91070#1078305 (10hashar) a:3hashar [15:58:23] Krinkle: added track to https://gerrit.wikimedia.org/r/#/c/193130/ merge please :) [15:58:35] 10Continuous-Integration, 10MediaWiki-Unit-tests, 5Patch-For-Review: MediaWiki Jobs running concurrently on the same instance share the same $wgTmpDirectory causing race condition - https://phabricator.wikimedia.org/T91070#1078314 (10Krinkle) Nice catch. Changing that would make it match what `$MW_DB_PATH`... [16:00:13] (03CR) 10Krinkle: "See comment on T91070. This code relies on the directory existing, which semantically clashes with how it is created. 
If it is relying on " [integration/jenkins] - 10https://gerrit.wikimedia.org/r/193832 (https://phabricator.wikimedia.org/T91070) (owner: 10Hashar) [16:04:11] 10Continuous-Integration: "/usr/local/bin/zuul-cloner: No such file or directory" on new instances - https://phabricator.wikimedia.org/T90984#1078323 (10Krinkle) It created it but with the wrong permissions. Jobs still not working. ``` $ dsh-ci-slaves 'ls -l /usr/local/bin/zuul-cloner' integration-slave1401.eq... [16:06:40] 10Continuous-Integration, 5Patch-For-Review: Write a cronjob to compress old Jenkins builds' logs - https://phabricator.wikimedia.org/T65939#1078328 (10hashar) I have installed [[ https://wiki.jenkins-ci.org/display/JENKINS/Compress+Build+Log+Plugin | Jenkins Compress Build Log Plugin ]] which at least get the... [16:08:33] PROBLEM - Free space - all mounts on deployment-jobrunner01 is CRITICAL: CRITICAL: deployment-prep.deployment-jobrunner01.diskspace.root.byte_percentfree.value (<44.44%) [16:15:49] ^d: btw, sorry for not sending out the "weekly log errors summary" email on Friday; I had a migraine that put me out until about 2pm on Saturday [16:16:03] <^d> Feeling better I hope? [16:20:13] 10Beta-Cluster: Crashed tables in deployment-db1 (can not login on beta) - https://phabricator.wikimedia.org/T91055#1078380 (10hashar) 5Open>3Resolved a:3hashar I guess this has been fixed meanwhile. Anyway I ran REPAIR TABLE on all three tables: ``` hashar@deployment-bastion:~$ sql enwiki (wikiadmin@depl... [16:25:23] ^d: yeah, except for not really having a weekend :) [16:25:46] I had none either :( [16:26:01] hashar: :/ [16:26:40] * ^d took lots of pictures of mountains with snow on them [16:27:11] ^d: the ones on instagram looked purty [16:27:38] <^d> Oh man it was fantastic. Best weekend of the season by far [16:27:54] <^d> We got the better part of a foot of snow between Fri-Sat in north Tahoe area [16:28:07] nice [16:30:58] Colorado is, evidently, having the snowiest February of all time. 
This is what my backyard has looked like over the course of last week: https://archive.org/details/timelapse_20150301 [16:31:17] oo, you did it?! [16:32:10] nice [16:32:31] timelapsification. Wish I would have turned off auto white balance. Other than that it turned out pretty nice. [16:38:32] 10Continuous-Integration, 5Patch-For-Review: "/usr/local/bin/zuul-cloner" broken on new instances - https://phabricator.wikimedia.org/T90984#1078470 (10Krinkle) [16:43:03] 10Continuous-Integration: "/usr/local/bin/zuul-cloner" broken on new instances - https://phabricator.wikimedia.org/T90984#1078475 (10Krinkle) [16:44:08] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078477 (1080686) Well, we are a volunteer body who serve the movement, as volunteers, and provide a service. The techs are paid staff intended to provide their tech se... [16:54:59] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078504 (10greg) [16:55:17] (03CR) 10Mvolz: "Ping? :)" [integration/config] - 10https://gerrit.wikimedia.org/r/191063 (owner: 10Jforrester) [16:56:46] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1019342 (10greg) Hi Manuel, Sorry for the delay and confusion here. I'll talk with some people here to determine if and how we can do this... [16:57:11] ^d: thoughts re smw on affcom? 
[16:57:49] <^d> i'm inclined to say no [16:57:55] * greg-g nods [16:58:10] I don't want it to start creeping over places [16:58:34] <^d> wikitech was supposed to be a one-off exception because we were using it to build the open stack manager with it [16:58:50] <^d> (in retrospect, everyone hates that decision to couple openstack with MW :)) [16:59:01] yep :) [17:02:25] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078532 (1080686) Thanks a lot Greg for your response. That certainly brings some brightness into this discussion :-) I don't know the rat... [17:10:00] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078553 (10csteipp) SMW has had an unusually high number of security issues, so I would definitely advise against it for any wiki that cont... [17:14:07] 10Continuous-Integration: Jenkins: Set up PHPUnit testing on MySQL backend - https://phabricator.wikimedia.org/T37912#1078582 (10greg) p:5Lowest>3Normal Changing priority: testing on top of the software we use in production is a good thing. [17:16:12] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078587 (10greg) Regarding the decision not to use SMW on WMF production wikis: See https://phabricator.wikimedia.org/T10390#132245 I can'... [17:17:18] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078591 (10Krenair) > Of course I don't expect "drop everything", but so far we have made a resolution that we need to introduce certain wo... 
[17:19:18] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078601 (10Krenair) >>! In T88748#1078553, @csteipp wrote: > What's the affcom wiki dbname? Is it private? chapcomwiki, domain is chapcom.... [17:20:05] on beta, it seems we are getting served stale versions of some resource loader modules (e.g. gadgets) [17:20:12] http://wikidata.beta.wmflabs.org/wiki/Q12480 [17:20:28] aude: yeah, there's a bug for that [17:20:33] mw.loader.store.get('ext.gadget.AuthorityControl'); (after clearing local storage) [17:20:34] https://phabricator.wikimedia.org/T90983 [17:20:36] awww [17:21:29] 10Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1078605 (10greg) p:5High>3Normal [17:21:38] 10Beta-Cluster, 6Release-Engineering: Beta cluster unable to create thumbnail for WebM video - https://phabricator.wikimedia.org/T90332#1078608 (10greg) p:5Triage>3Normal [17:21:50] 10Beta-Cluster, 10VisualEditor, 10Wikimedia-Search: File search on beta labs returns results from production commons and beta commons. VE search results are in unexpected order also. - https://phabricator.wikimedia.org/T90650#1078611 (10greg) p:5Triage>3High [17:21:52] 10Beta-Cluster, 10VisualEditor, 10Wikimedia-Search: File search on beta labs returns results from production commons and beta commons. VE search results are in unexpected order also. 
- https://phabricator.wikimedia.org/T90650#1064118 (10greg) p:5High>3Normal [17:22:00] 10Beta-Cluster: Can't run mwscript without explicit sudo on Beta Cluster - https://phabricator.wikimedia.org/T89802#1078617 (10greg) p:5Triage>3Normal [17:22:16] 10Beta-Cluster, 10ContentTranslation-cxserver, 6Language-Engineering, 10MediaWiki-extensions-ContentTranslation: [Beta] cxserver: Permission of /data/project/cxserver/log is suspicious - https://phabricator.wikimedia.org/T90416#1078620 (10greg) p:5High>3Normal [17:22:29] 10Beta-Cluster: Account creation throttling too restrictive on Beta Cluster - https://phabricator.wikimedia.org/T87704#1078627 (10greg) p:5Lowest>3Normal [17:22:32] 10Beta-Cluster, 6Multimedia: Low-quality images are not rendered in beta - https://phabricator.wikimedia.org/T71757#1078630 (10greg) p:5Lowest>3Normal [17:24:43] 10Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1078635 (10aude) I keep getting stale versions of gadget js, even after clearing local storage. For http://wikidata.beta.wmflabs.org/wiki/Q12480, I am getting old ext.gadget.AuthorityContr... [17:24:46] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078636 (1080686) Hmm, having Wikibase could be an alternative. It would need some help from Wikidata pros to achieve what we planned to do... [17:30:57] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078643 (10Legoktm) >>! In T88748#1078636, @80686 wrote: > What we need: > * forms with specific fields for all types of affiliates, to sto... [17:39:22] greg-g, do you know if anyone did anything about the issue on deployment-jobrunner01? [17:40:40] Krenair: I... dont'. 
[17:41:06] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078695 (1080686) Ok, problem here: SMW is all simple MW syntax, there are a bunch of examples and experience around. I have no clue how t... [17:42:13] greg-g, okay [17:42:33] greg-g, I found that the job runner ran out of space before its jobrunner.log was like >4GB full [17:42:36] by the issue do oyu mean .... yeah [17:42:59] being spammed with the full HTML response basically saying "Sorry, we were not able to work out what wiki you were trying to view." [17:43:05] because it's sending broken requests [17:43:14] twentyafterfour: ^d thcipriani: deployment-jobrunner is running low on space ^ [17:43:16] because type/wiki data in redis is wrongly formatted in some way [17:43:22] twentyafterfour: ^d thcipriani: see also http://shinken.wmflabs.org/problems [17:43:24] <^d> Hmm [17:43:49] I compared the format of the data for the ready job queue in beta vs. prod and it looked all wrong [17:44:20] I probably left some more info in this channel's logs [17:46:02] <^d> /var/log is taking up about 7.5G of the 18G root partition [17:46:29] greg-g: I'm not sure I have access to shinken? [17:46:35] hmmm [17:46:37] <^d> wikitech log [17:46:47] 3.2GB of which is jobrunner.log [17:46:48] <^d> *login [17:46:54] /var/log/mediawiki/jobrunner.log [17:47:14] thcipriani: guest/guest [17:47:25] job queue data I looked at was from `hgetall jobqueue:aggregator:h-ready-queues:v2` from https://wikitech.wikimedia.org/wiki/Redis#Connecting (server is deployment-redis01) [17:47:31] <^d> dafuq? 
[17:47:40] * ^d is reading the log [17:48:00] 31<Krenair>30 because it's sending broken requests [17:48:03] because type/wiki data in redis is wrongly formatted in some way [17:48:59] <^d> curl -XPOST -s -a 'http://127.0.0.1:9005/rpc/RunJobs.php?wiki=webVideoTranscode%2Fcommonswiki&type=webVideoTranscode%2Fenwiki&maxtime=60&maxmem=300M' [17:49:05] <^d> Is failing [17:49:09] yes ^d [17:49:18] * ^d is caught up now [17:49:25] I told you that twice :p [17:49:44] <^d> Yes yes, but I was trying to correlate what I was seeing with what you said :) [17:50:53] <^d> Running it by hand gets the same result [17:51:06] yes [17:51:09] the request is broken [17:51:16] <^d> It's not a php5/hhvm thing. [17:51:16] because the data it's generated from is broken [17:51:19] <^d> I tried the former [17:51:20] see redis [17:51:23] I told you this already [17:51:43] <^d> Pardon me for trying to understand [17:53:36] in beta we have entries like "cirrusSearchLinksUpdatePrioritized%2Fenwiki/webVideoTranscode%2Fenwiki" [17:53:58] but prod - cirrusSearchLinksUpdatePrioritized/commonswiki [17:55:45] But I don't know why that's happening [17:56:20] 10Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1078737 (10aude) even though the gadget was modified today, the response has "Last-Modified:Thu, 26 Feb 2015 18:21:11 GMT" for: http://bits.beta.wmflabs.org/wikidata.beta.wmflabs.org/load.... [17:59:58] 6Release-Engineering, 10Security-Reviews, 10Wikimedia-Extension-setup, 10Wikimedia-Site-requests: Install SMW on AffCom wiki - https://phabricator.wikimedia.org/T88748#1078778 (10Nemo_bis) Lua and forms don't have much to do with each other. If the use case is simple, perhaps https://www.mediawiki.org/wiki... 
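Note on the malformed job queue entries compared above (17:53): the two quoted examples suggest that a healthy ready-queue entry is a plain `type/wiki` pair, while the broken beta entries carry an escaped `%2F` inside each half. The following is a minimal sketch of a check for that difference, assuming the entry format is exactly what those two examples show. The function names `is_well_formed` and `find_malformed` are hypothetical and are not part of MediaWiki or the job runner.

```python
from urllib.parse import unquote


def is_well_formed(entry: str) -> bool:
    """Return True if entry looks like a plain 'type/wiki' pair.

    Assumption: the expected shape is inferred only from the prod
    example quoted in the chat, not from the job queue source code.
    """
    parts = entry.split("/")
    if len(parts) != 2:
        return False
    # A part that changes when URL-decoded contains an escaped character
    # (such as %2F); the prod-style example has none.
    return all(part and unquote(part) == part for part in parts)


def find_malformed(entries):
    """Return the entries that do not match the plain 'type/wiki' shape."""
    return [entry for entry in entries if not is_well_formed(entry)]


# The two entries quoted in the discussion.
PROD_ENTRY = "cirrusSearchLinksUpdatePrioritized/commonswiki"
BETA_ENTRY = "cirrusSearchLinksUpdatePrioritized%2Fenwiki/webVideoTranscode%2Fenwiki"

if __name__ == "__main__":
    for entry in find_malformed([PROD_ENTRY, BETA_ENTRY]):
        print("malformed:", entry)
```

A check like this could be run over the field names returned by the `hgetall` call mentioned earlier to see how many entries are affected before deleting the key. It only detects the symptom and says nothing about which code wrote the bad entries.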
[18:05:58] (03CR) 10Jdlrobson: "@Zfilipin fixed in https://gerrit.wikimedia.org/r/#/c/193851/" [integration/config] - 10https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082) (owner: 10Jdlrobson) [18:07:11] <^d> Krenair: I really dunno [18:36:12] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #512: FAILURE in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/512/ [18:44:57] so all the requests are broken? [18:45:23] (re jobrunner01) [18:45:25] all the jobrunner requests are [18:45:52] Project beta-scap-eqiad build #43607: FAILURE in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43607/ [18:46:02] so probably something in the configuration for the jobrunner itself or mediawiki code? [18:46:20] (that failure is due to jobrunner01 being completely out of space there) [18:46:32] twentyafterfour, whatever code is dumping stuff into the job queue in redis [18:46:37] I think that's in mediawiki code [18:46:48] maybe ask AaronSchulz [18:47:49] * [AaronS] #wikimedia-operations #mediawiki-core [19:07:44] (per #mediawiki-core, have deleted the job queue key in redis, should get regenerated. also cleared screwed up log and restarted job runner service) [19:14:53] Yippee, build fixed! 
[19:14:54] Project beta-scap-eqiad build #43610: FIXED in 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43610/ [19:18:24] RECOVERY - Free space - all mounts on deployment-jobrunner01 is OK: OK: All targets OK [19:54:50] Project beta-scap-eqiad build #43619: FAILURE in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43619/ [20:11:56] 10Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1079397 (10hashar) [20:11:57] 10Beta-Cluster: Caching makes it impossible to test JS changes when logged out - https://phabricator.wikimedia.org/T65034#1079398 (10hashar) [20:12:34] Yippee, build fixed! [20:12:35] Project beta-scap-eqiad build #43625: FIXED in 52 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/43625/ [20:13:57] 10Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1079404 (10hashar) I have merged this task with {T65034}. I have no idea how we handle the bits cache invalidation on production, I thought it had an expiry of 5 minutes with revalidation a... [20:18:38] 10Beta-Cluster: Account creation throttling too restrictive on Beta Cluster - https://phabricator.wikimedia.org/T87704#1079426 (10hashar) 5Open>3Resolved a:3hashar The WMF office is no more throttled (T87841). One can further tweak InitialiseSettings-labs.php ([[ https://gerrit.wikimedia.org/r/#/c/191489... [20:22:24] 10Beta-Cluster, 10ContentTranslation-Deployments, 10MediaWiki-extensions-ContentTranslation, 5ContentTranslation-Release4, 3LE-Sprint-83: Setup new wikis in Beta Cluster for Content Translation - https://phabricator.wikimedia.org/T90683#1079473 (10hashar) Note: the beta cluster is not that powerful and i... [20:25:29] (03CR) 10Werdna: "I notice https://gerrit.wikimedia.org/r/#/c/193555/ has been merged – does that mean I should now adjust this patch to work with composer?" 
[integration/config] - 10https://gerrit.wikimedia.org/r/191888 (owner: 10Werdna) [20:32:48] 10Beta-Cluster: Parser cache (memcached?) broken in beta labs - https://phabricator.wikimedia.org/T91310#1079554 (10Catrope) 3NEW [20:43:59] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #219: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/219/ [20:47:39] Yippee, build fixed! [20:47:40] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #2: FIXED in 3 min 38 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/2/ [20:51:50] 10Beta-Cluster: Caching makes it impossible to test JS changes when logged out - https://phabricator.wikimedia.org/T65034#1079612 (10matmarex) If this is indeed the same issue as T90983, I think the priority should be bumped somewhat. It makes beta somewhat less than useful for, you know, actually verifying the... [20:53:38] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1079622 (10hashar) There is still a symlink on deployment-bastion: /usr/local/bin/scap-recompile -> /srv/... [20:54:09] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1079623 (10hashar) p:5Triage>3Normal [20:57:15] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1079642 (10bd808) There is a deb package that is used in production: ``` $ apt-cache show mediawiki-math-tex... 
[20:58:34] 10Beta-Cluster, 10Continuous-Integration, 10Math: beta-recompile-math-texvc-eqiad job fails with "/usr/local/bin/scap-recompile: No such file or directory" - https://phabricator.wikimedia.org/T91191#1079643 (10bd808) The old script from ... [21:05:59] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #493: FAILURE in 18 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/493/ [21:12:34] 10Continuous-Integration, 6Release-Engineering, 6Fundraising Sprint Enya, 10Fundraising Tech Backlog, and 2 others: Create CI slave instance for CiviCRM testing - https://phabricator.wikimedia.org/T89894#1079746 (10hashar) The integration labs project is out of quota: Cores: 79/80 RAM: 161792/2048... [21:16:05] 10Quality-Assurance, 10Gather, 10Gather Sprint C_, 5Patch-For-Review: Setup Gather browser tests Jenkins job - https://phabricator.wikimedia.org/T91082#1079812 (10Jdlrobson) [21:17:39] 10Continuous-Integration, 6Release-Engineering, 6Fundraising Sprint Enya, 10Fundraising Tech Backlog, and 2 others: Create CI slave instance for CiviCRM testing - https://phabricator.wikimedia.org/T89894#1048122 (10hashar) Once the instance is created, we will need to update the [[ https://wikitech.wikimed... [21:19:27] 10Continuous-Integration, 10Wikimedia-Fundraising, 10Wikimedia-Fundraising-CiviCRM: Add Fundraising Tech team to the labs Integration project - https://phabricator.wikimedia.org/T90472#1079829 (10hashar) And I have made Awight a project admin. That should not be needed, but in case it is you will not be bloc... [21:20:20] 10Continuous-Integration, 10Wikimedia-Fundraising, 10Wikimedia-Fundraising-CiviCRM: Add Fundraising Tech team to the labs Integration project - https://phabricator.wikimedia.org/T90472#1079832 (10awight) That was fast! 
Thanks for setting us up :) [21:35:12] !log (per #mediawiki-core, have deleted the job queue key in redis, should get regenerated. also cleared screwed up log and restarted job runner service) [21:35:14] Logged the message, Master [21:35:16] (for posterity) [21:35:28] thanks. I need to remember to do that [21:38:15] 10Continuous-Integration: "/usr/local/bin/zuul-cloner" broken on new instances - https://phabricator.wikimedia.org/T90984#1079891 (10hashar) /usr/local/src/zuul is a local git clone made by puppet. The files and modules are then installed via setup.py under /usr/local/lib/python2.7/dist-packages . The umask err... [21:46:10] 10Continuous-Integration: "/usr/local/bin/zuul-cloner" broken on new instances - https://phabricator.wikimedia.org/T90984#1079932 (10Krinkle) Ah, okay. After I applied the umask fix, I removed `/usr/local/src/zuul` to let puppet re-create it. But didn't know about `/usr/local/lib/python2.7`. Ran the following t... [21:53:56] 10Beta-Cluster: Parser cache (memcached?) broken in beta labs - https://phabricator.wikimedia.org/T91310#1079955 (10hashar) I barely know how memcached is configured. Last time there was a port conflict between nutcracker / twemproxy. I think @yuvipanda has been recreating instances on beta cluster to take adv... [22:02:00] 10Beta-Cluster: Parser cache (memcached?) broken in Beta Cluster - https://phabricator.wikimedia.org/T91310#1079999 (10greg) [22:02:08] 6Release-Engineering, 6Team-Practices: Make log responsibilities changes - https://phabricator.wikimedia.org/T89049#1080002 (10ggellerman) [22:25:55] 10Continuous-Integration: "/usr/local/bin/zuul-cloner" broken on new instances - https://phabricator.wikimedia.org/T90984#1080120 (10Krinkle) It's now installed on Trusty instances, but breaking on Precise still. ``` .. Mar 2 21:56:21 integration-slave1401 puppet-agent[8049]: (/Stage[main]/Contint::Packages::L... 
[23:04:27] 6Release-Engineering, 10Code-Review: Import all gerrit.wikimedia.org repositories with Diffusion - https://phabricator.wikimedia.org/T616#1080301 (10chasemp) [23:27:05] bd808, Krinkle: so I'm trying to get mwcore phpunit jobs to use composer instead of vendor...I'm running into https://integration.wikimedia.org/ci/job/mediawiki-phpunit-hhvm-composer/2/console right now. Do either of you understand that error? [23:27:50] legoktm: invalid LocalSettings [23:28:00] probably something appending ' Krinkle: but above that it said: [23:28:15] 23:04:17 + php -l /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm-composer/LocalSettings.php [23:28:16] 23:04:17 No syntax errors detected in /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm-composer/LocalSettings.php [23:28:23] That runs earlier [23:28:45] Hm.. [23:28:46] yeah [23:29:22] the very next command is running phpunit though, so I don't see where something would be able to touch LocalSettings.php... [23:30:21] (https://gerrit.wikimedia.org/r/#/c/193757/ is the configuration the job is currently using) [23:31:26] on slave1005 right now, /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm-composer/LocalSettings.php looks ok [23:31:49] I don't see any rogue [23:32:26] but I've kind of always wonders why that file is the concat rather than just something that includes the separate files [23:32:50] the zend job had the same issue so I don't think it's a one-off random error (https://integration.wikimedia.org/ci/job/mediawiki-phpunit-zend-composer/2/console) [23:34:08] legoktm: What about php -l on the file on the server? [23:34:13] And if you run phpunit? [23:34:24] (using /slave-scripts/bin/ run phpunit) [23:34:31] I...don't think I have access to the labs slaves. 
[23:34:33] $ php -l /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm-composer/LocalSettings.php [23:34:33] No syntax errors detected in /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm-composer/LocalSettings.ph [23:35:46] I can repeat the error on the command line though [23:36:16] by using the mw-run-phpunit script? [23:36:54] sudo -u jenkins-slave php phpunit.php .... [23:37:00] hmm [23:37:16] But running php -l like that works [23:37:17] but why :||||| [23:37:24] eg no syntax error [23:37:38] so .... something funky in phpunit.php? [23:38:01] so the main difference between this job and the normal phpunit job is that core is cloned to "." and not "src" (and that we're using composer and not mw/vendor) [23:43:49] Can I get added to the labs integration project? I assume that's what I need to be able to ssh into slaves.. [23:44:59] Krinkle, bd808: ^ [23:55:48] <^d> done [23:56:08] thanks! [23:56:14] <^d> yw