[00:00:03] Phabricator: Verify identity of and remove 2fa for Sam Wilson - https://phabricator.wikimedia.org/T196522#4259650 (greg) p:Triage>High
[00:03:06] Phabricator: Verify identity of and remove 2fa for Sam Wilson - https://phabricator.wikimedia.org/T196522#4259650 (kaldari) I can verify that Sam and I discussed this over Google Hangouts.
[00:04:08] Beta-Cluster-Infrastructure, Release-Engineering-Team (Next), Patch-For-Review: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#4259668 (Krenair) Anyway just to be more unhelpful I'm going to take that automatically generated table and add helpful manual comments to...
[00:04:31] no_justification: yeah, nice.
[00:05:36] Plus if we remove repos from the list cuz we stop deploying them a clean wouldn't get them on the old branch 😉
[00:05:36] seems like it stopped deleting branches after wmf/1.31.0-wmf.20
[00:05:49] for mw vendor
[00:05:53] That sounds dumb
[00:06:03] Also: I hate the vendor repo
[00:06:07] PROBLEM - SSH on integration-slave-docker-1014 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:06:40] It's fine on mw-core
[00:06:44] but... oops
[00:06:51] mw/skins/Vector also has them since the same branch
[00:06:58] I guess it might be something more general then
[00:07:38] extensions seem to have them since wmf/1.32.0-wmf.1
[01:36:02] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.32.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T191052#4259738 (Krinkle)
[01:47:33] Release-Engineering-Team (Kanban), Scap, Goal: Automate the Train - https://phabricator.wikimedia.org/T196515#4259740 (Reedy)
[02:17:31] Phabricator: Verify identity of and remove 2fa for Sam Wilson - https://phabricator.wikimedia.org/T196522#4259763 (Aklapper) Open>Resolved a:Aklapper <3! ``` aklapper@phab1001:~$ sudo /srv/phab/phabricator/bin/auth strip --all-types --user Samwilson These auth factors will be stripped: Samwil...
[02:21:43] Phabricator: Verify identity of and remove 2fa for Sam Wilson - https://phabricator.wikimedia.org/T196522#4259768 (Samwilson) Hurrah! I'm back. And I've re-added 2FA (still with no backup codes, but that's T85706 I think). Thanks everyone. :)
[02:23:10] Phabricator, LDAP: Having difficulty logging into Phabricator via LDAP when multiple accounts returned for username - https://phabricator.wikimedia.org/T138672#4259771 (Andrew) I doubt that there's any interaction between the two accounts (mech vs. smccandlish) and there's definitely no interaction betwe...
[02:42:02] Phabricator: Verify identity of and remove 2fa for Sam Wilson - https://phabricator.wikimedia.org/T196522#4259776 (Aklapper) Yay. You're welcome, and sorry it took me a bit longer.
[02:55:53] Continuous-Integration-Config, Release-Engineering-Team, Pywikibot-core, Pywikibot-tests: Magul's quick tests doesn't run anymore - https://phabricator.wikimedia.org/T186208#3937263 (Dalba) I think it would be rather easy to add live tests to tox.ini so that Jenkins will run them. The problem is...
[03:05:42] Continuous-Integration-Config, Release-Engineering-Team, Pywikibot-core, Pywikibot-tests: Magul's quick tests doesn't run anymore - https://phabricator.wikimedia.org/T186208#4259793 (Dvorapa) It seems too long to me, especially for patches which does not change the code at all (doc patches...). I...
[04:36:51] Continuous-Integration-Config, Release-Engineering-Team, GitHub-Mirrors, Pywikibot-core, Pywikibot-tests: AppVeyor test not running since months - https://phabricator.wikimedia.org/T183860#4259823 (Dvorapa)
[04:37:52] Continuous-Integration-Config, Release-Engineering-Team, GitHub-Mirrors, Pywikibot-core, and 2 others: AppVeyor test not running since months - https://phabricator.wikimedia.org/T183860#3865572 (Dvorapa)
[05:53:04] Continuous-Integration-Infrastructure: Is there a way to specify different different inter-extension-dependencies for REL-Branches? - https://phabricator.wikimedia.org/T196454#4259863 (Osnard) Thanks. I'd also love to see `extension.json` be evaluated for dependency resolution. So in the meantime the best wa...
[06:33:20] Release-Engineering-Team (Kanban), Scap, Patch-For-Review, Scoring-platform-team (Current): Support git-lfs in scap - https://phabricator.wikimedia.org/T180627#4259920 (awight)
[08:09:50] Beta-Cluster-Infrastructure, RelEng-Archive-FY201718-Q1, media-storage, Patch-For-Review: deployment-ms-be03.deployment-prep and deployment-ms-be04.deployment-prep have high load / system CPU - https://phabricator.wikimedia.org/T160990#4260001 (hashar) Swift triggers a lot of name resolution to `...
[08:16:11] PROBLEM - SSH on integration-slave-docker-1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:19:32] (PS1) Hashar: BlueSpice REL1_27 should not depend on ExtJSBase [integration/config] - https://gerrit.wikimedia.org/r/437680 (https://phabricator.wikimedia.org/T196454)
[08:20:01] (CR) Hashar: [C: -1] "I will need to write a test for that." [integration/config] - https://gerrit.wikimedia.org/r/437680 (https://phabricator.wikimedia.org/T196454) (owner: Hashar)
[08:21:03] RECOVERY - SSH on integration-slave-docker-1021 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[08:21:20] Continuous-Integration-Infrastructure, Patch-For-Review: Is there a way to specify different different inter-extension-dependencies for REL-Branches? - https://phabricator.wikimedia.org/T196454#4260020 (hashar) >>! In T196454#4259863, @Osnard wrote: > Thanks. I'd also love to see `extension.json` be eval...
[08:27:33] Continuous-Integration-Config, ContentTranslation, ContentTranslation-Deployments, Jenkins, Language-2018-Apr-June: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260049 (hashar)
[08:35:51] (PS1) Hashar: Use backports for apertium-apy [integration/config] - https://gerrit.wikimedia.org/r/437685 (https://phabricator.wikimedia.org/T196037)
[08:38:11] Continuous-Integration-Config, ContentTranslation, ContentTranslation-Deployments, Jenkins, and 2 others: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260072 (hashar) The repository `debian/changelog` uses `jessie` as the distribution. If you make...
[08:38:15] (CR) Hashar: [C: 2] Use backports for apertium-apy [integration/config] - https://gerrit.wikimedia.org/r/437685 (https://phabricator.wikimedia.org/T196037) (owner: Hashar)
[08:39:38] (Merged) jenkins-bot: Use backports for apertium-apy [integration/config] - https://gerrit.wikimedia.org/r/437685 (https://phabricator.wikimedia.org/T196037) (owner: Hashar)
[08:54:44] kart_: apertium-apy now uses a Jenkins job that always enables BACKPORTS. It still fails though but for a different reason https://phabricator.wikimedia.org/T196037#4260160
[08:55:01] some issue with distutils not recognizing long_description_content_type = 'text/markdown; charset=UTF-8'
[08:56:27] Continuous-Integration-Config, ContentTranslation, ContentTranslation-Deployments, Jenkins, and 2 others: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260160 (hashar) With the CI job always injecting BACKPORTS and the patch targetting `jessie-wikim...
[08:57:17] Continuous-Integration-Config, ContentTranslation, ContentTranslation-Deployments, Jenkins, and 2 others: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260163 (hashar) Another note is that upstream uses Travis and refers to Trusty / python3.4. Appar...
[09:10:05] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260200 (hashar)
[09:10:34] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing with permission denied for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260203 (MarcoAurelio)
[09:20:15] (PS1) Hashar: Reflect depends on LiquidThreads [integration/config] - https://gerrit.wikimedia.org/r/437690 (https://phabricator.wikimedia.org/T196532)
[09:20:25] (CR) Hashar: [C: 2] Reflect depends on LiquidThreads [integration/config] - https://gerrit.wikimedia.org/r/437690 (https://phabricator.wikimedia.org/T196532) (owner: Hashar)
[09:21:42] (Merged) jenkins-bot: Reflect depends on LiquidThreads [integration/config] - https://gerrit.wikimedia.org/r/437690 (https://phabricator.wikimedia.org/T196532) (owner: Hashar)
[09:26:19] (Draft1) MarcoAurelio: Mark repository as read-only [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692
[09:26:22] (PS2) MarcoAurelio: Mark repository as read-only [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692
[09:26:40] (CR) MarcoAurelio: [V: 2 C: 2] "Please submit." [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692 (owner: MarcoAurelio)
[09:26:48] (PS1) Hashar: Migrate Reflect to Quibble [integration/config] - https://gerrit.wikimedia.org/r/437693 (https://phabricator.wikimedia.org/T183512)
[09:27:08] (CR) Hashar: [C: 2] Migrate Reflect to Quibble [integration/config] - https://gerrit.wikimedia.org/r/437693 (https://phabricator.wikimedia.org/T183512) (owner: Hashar)
[09:27:17] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260272 (hashar)
[09:28:16] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<33.33%)
[09:28:28] (Merged) jenkins-bot: Migrate Reflect to Quibble [integration/config] - https://gerrit.wikimedia.org/r/437693 (https://phabricator.wikimedia.org/T183512) (owner: Hashar)
[09:30:45] (PS3) MarcoAurelio: Mark repository as read-only [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692 (https://phabricator.wikimedia.org/T196513)
[09:30:56] (CR) MarcoAurelio: Mark repository as read-only [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692 (https://phabricator.wikimedia.org/T196513) (owner: MarcoAurelio)
[09:32:38] RECOVERY - Puppet errors on deployment-jobrunner03 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:35:32] (CR) Hashar: [V: 2 C: 2] "Excellent :]" [extensions/SolrStore] (refs/meta/config) - https://gerrit.wikimedia.org/r/437692 (https://phabricator.wikimedia.org/T196513) (owner: MarcoAurelio)
[09:37:15] hashar: https://gerrit.wikimedia.org/r/#/c/437689/ before another merge conflict svp?
[09:37:53] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260306 (hashar)
[09:43:57] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260346 (hashar)
[09:45:00] Release-Engineering-Team (Kanban), Wikidata, Wikidata-Campsite, Wikidata-Ministry-Of-Magic-Tech-Debt, and 5 others: Run Wikibase daily browser tests on Jenkins - https://phabricator.wikimedia.org/T167432#4260353 (RazShuty)
[09:53:46] (Draft1) MarcoAurelio: Mark repository as read only [extensions/Genderize] (refs/meta/config) - https://gerrit.wikimedia.org/r/437702
[09:53:48] (PS2) MarcoAurelio: Mark repository as read only [extensions/Genderize] (refs/meta/config) - https://gerrit.wikimedia.org/r/437702
[09:54:13] (CR) MarcoAurelio: [C: -2] "Not until https://gerrit.wikimedia.org/r/#/c/437701/ is merged." [extensions/Genderize] (refs/meta/config) - https://gerrit.wikimedia.org/r/437702 (owner: MarcoAurelio)
[09:55:39] (PS3) MarcoAurelio: Mark repository as read only [extensions/Genderize] (refs/meta/config) - https://gerrit.wikimedia.org/r/437702 (https://phabricator.wikimedia.org/T196108)
[10:08:16] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)
[10:14:16] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<22.22%)
[10:23:37] Release-Engineering-Team (Kanban), Scap, Patch-For-Review, Scoring-platform-team (Current): Support git-lfs in scap - https://phabricator.wikimedia.org/T180627#4260482 (awight) @mmodell Awesome, thanks for this workaround. I confirmed that running your command from deploy1001 made the following...
[10:35:17] (CR) Hashar: [V: 2 C: 2] "I have merged https://gerrit.wikimedia.org/r/#/c/437701/ :]" [extensions/Genderize] (refs/meta/config) - https://gerrit.wikimedia.org/r/437702 (https://phabricator.wikimedia.org/T196108) (owner: MarcoAurelio)
[10:37:56] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing with permission denied for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260500 (hashar)
[10:39:34] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing with permission denied for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260203 (hashar) The rsync error is due to castor restoring the composer/npm cache. But it is ignored. The actual error is...
[10:44:45] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing with permission denied for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260519 (Reedy) >>! In T196530#4260500, @hashar wrote: > That follow the upgrade of composer on Friday. @Legoktm and @Reedy...
[10:54:16] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)
[11:00:17] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<22.22%)
[11:04:45] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260536 (MarcoAurelio)
[11:05:28] thcipriani: The services deployment window overlaps with the train on Wednesdays. Can you confirm that this is crazy and we should move the window?
[11:07:39] Continuous-Integration-Infrastructure, Patch-For-Review: Is there a way to specify different different inter-extension-dependencies for REL-Branches? - https://phabricator.wikimedia.org/T196454#4260539 (Osnard) Awesome, thanks! Yes, ExtJSBase was created after REL1_27. But as I own the extension, I coul...
[11:10:06] (Draft1) MarcoAurelio: Mark repository as read only [extensions/SkelJS] (refs/meta/config) - https://gerrit.wikimedia.org/r/437727 (https://phabricator.wikimedia.org/T196509)
[11:10:09] (PS2) MarcoAurelio: Mark repository as read only [extensions/SkelJS] (refs/meta/config) - https://gerrit.wikimedia.org/r/437727 (https://phabricator.wikimedia.org/T196509)
[11:14:15] (CR) MarcoAurelio: [V: 1] Mark repository as read only [extensions/SkelJS] (refs/meta/config) - https://gerrit.wikimedia.org/r/437727 (https://phabricator.wikimedia.org/T196509) (owner: MarcoAurelio)
[11:20:04] Release-Engineering-Team, Cleanup, GitHub-Mirrors, Repository-Admins, User-MarcoAurelio: Archive the CategorySlideShow extension - https://phabricator.wikimedia.org/T186969#4260562 (MarcoAurelio) So somebody can take care of deleting that GitHub mirror.
[11:31:48] leszek_wmde, addshore, hasharAway: if you have a minute to review this, it would be great https://gerrit.wikimedia.org/r/#/c/437718/
[11:32:32] zeljkof: done, thanks!
[11:32:41] leszek_wmde: thanks!
[11:39:40] Release-Engineering-Team (Kanban), Wikidata, Wikidata-Campsite, Wikidata-Ministry-Of-Magic-Tech-Debt, and 5 others: Run Wikibase daily browser tests on Jenkins - https://phabricator.wikimedia.org/T167432#4260579 (zeljkofilipin)
[11:45:03] Release-Engineering-Team (Kanban), Wikidata, Wikidata-Campsite, Wikidata-Ministry-Of-Magic-Tech-Debt, and 5 others: Run Wikibase daily browser tests on Jenkins - https://phabricator.wikimedia.org/T167432#4260590 (zeljkofilipin)
[11:46:13] Release-Engineering-Team (Kanban), Wikidata, Wikidata-Campsite, Wikidata-Ministry-Of-Magic-Tech-Debt, and 5 others: Run Wikibase daily browser tests on Jenkins - https://phabricator.wikimedia.org/T167432#3687367 (zeljkofilipin) a:zeljkofilipin>None Nothing left for me to do here, so un-as...
[11:56:41] leszek_wmde: one more :) https://gerrit.wikimedia.org/r/#/c/437737/
[11:57:37] zeljkof: can I run npm-based selenium tests in a real browser?
[11:57:55] zeljkof: oh, thanks!
[11:58:00] Continuous-Integration-Config, Release-Engineering-Team (Kanban), ContentTranslation, ContentTranslation-Deployments, and 3 others: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260621 (KartikMistry) a:hashar
[11:58:18] tgr: they always run in a real browser! :D
[11:58:38] tgr: what do you want to do, see the browser? is it hidden by default on your machine?
[11:59:47] zeljkof: I run the tests in a labs-vagrant box on WMCloud
[12:00:03] I have set up X11 forwarding but don't see the screen
[12:00:07] Continuous-Integration-Config, Release-Engineering-Team (Kanban), ContentTranslation, ContentTranslation-Deployments, and 3 others: Add Backport Repository to Jenkins Package Build - https://phabricator.wikimedia.org/T196037#4260626 (KartikMistry) >>! In T196037#4260160, @hashar wrote: > I guess...
[12:00:31] tgr: uh, I have no experience with labs-vagrant :/
[12:01:08] well, it's vagrant inside an OpenStack VM, nothing special
[12:01:10] and I have limited experience with x11 forwarding, but only from a vm on my machine to my main machine
[12:01:29] tgr: sorry, I have no clue what could go wrong there
[12:01:38] is there a reason you don't run the tests locally?
[12:01:52] why are you running the tests in labs-vagrant?
[12:02:07] does something work locally, but fails there? and you want to debug?
[12:02:12] I would have to have the exact same setup (MediaWiki with dozens of roles) locally and on the target machine
[12:02:30] ok, is there a problem?
[12:02:37] and so would everyone else who is testing the same project
[12:02:49] you should be able to record a video of the test run, would that work for you?
[12:03:15] so it was easier to have one central box for testing, which can have all the complex setup (centralauth, DB replication, whatever)
[12:03:20] I really have no clue how x11 forwarding works and what could go wrong
[12:03:27] that's my fallback plan, yes
[12:03:36] ok, thanks
[12:03:55] no obvious switch then, like in the old ruby-based setup?
[12:04:28] can you get it working with ruby tests? (x11 forwarding)
[12:04:33] I had no clue anybody used it
[12:04:47] we had support for it?! :D
[12:04:50] last time I did was two years ago I think?
[12:04:55] but yeah it worked
[12:05:03] HEADLESS=false or something like that
[12:05:25] ah, here it's DISPLAY=:1
[12:05:48] :1 is the port, or whatever it's called with xvfb
[12:06:06] let me see the docs
[12:06:55] https://phabricator.wikimedia.org/source/mediawiki/browse/master/tests/selenium/
[12:07:30] > By default, Chrome will run in headless mode. If you want to see Chrome, set DISPLAY environment variable to any value:
[12:07:33] > DISPLAY=1 npm run selenium
[12:07:56] tgr: if you add tests to a repo that did not have them, please add the repo here https://www.mediawiki.org/wiki/Selenium/Node.js#write-tests
[12:12:49] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260642 (Reedy) https://github.com/wikimedia/integration-config/commit/6d9035c59834cb4c49f7d82c35fc90be7ea6d7b4 updated composer-hhvm to 0.2.4 and...
[12:13:56] (Draft2) Reedy: Bump version for composer-package-hhvm-docker [integration/config] - https://gerrit.wikimedia.org/r/437740 (https://phabricator.wikimedia.org/T196530)
[12:14:44] (CR) Reedy: [C: 2] Bump version for composer-package-hhvm-docker [integration/config] - https://gerrit.wikimedia.org/r/437740 (https://phabricator.wikimedia.org/T196530) (owner: Reedy)
[12:15:35] zeljkof: thanks, that sort of works!
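Editor's sketch, not part of the channel log: the rule quoted from the selenium README above (Chrome runs headless unless DISPLAY is set) could be modelled roughly as below. The function name `chrome_args` is hypothetical, and treating an empty DISPLAY as unset is an assumption. The real logic lives in tests/selenium/wdio.conf.js and is JavaScript, so this only illustrates the decision, not the actual configuration.

```python
import os


def chrome_args(env=None):
    """Return Chrome flags for a test run (hypothetical helper).

    Assumption: an unset or empty DISPLAY means "no screen available",
    so Chrome is started with the --headless flag.
    """
    env = os.environ if env is None else env
    if env.get("DISPLAY"):
        # A display is configured (for example via X11 forwarding):
        # start a visible browser window.
        return []
    return ["--headless"]
```

With X11 forwarding set up as described in the log, DISPLAY is set in the SSH session, so a check of this kind would pick the visible-browser branch.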
[12:16:01] I still don't see anything but it does use the X11 connection
[12:16:19] so probably the rest of the problem is on my machine
[12:16:32] (Merged) jenkins-bot: Bump version for composer-package-hhvm-docker [integration/config] - https://gerrit.wikimedia.org/r/437740 (https://phabricator.wikimedia.org/T196530) (owner: Reedy)
[12:17:59] leszek_wmde: argh! `No Rakefile found` working on it
[12:24:33] leszek_wmde: the last one, I hope :/ https://gerrit.wikimedia.org/r/#/c/437741/
[12:25:02] it's been a while since I have used Ruby Selenium framework, I have forgot all the files it needs :/
[12:29:26] Continuous-Integration-Config, Tool-stewardbots, Zuul: composer-hhvm-docker failing for labs/tools/stewardbots - https://phabricator.wikimedia.org/T196530#4260686 (Reedy) Open>Resolved a:Reedy Looks like that did the job
[12:29:42] addshore, hasharAway: if you have a minute, this is blocking me from getting selenium-WikibaseLexeme-chrome working https://gerrit.wikimedia.org/r/#/c/437741/
[12:32:08] zeljkof: dont you need a Gemfile as well ? :]
[12:32:24] hashar: it's there already
[12:32:45] hashar: https://gerrit.wikimedia.org/r/#/c/437737/
[12:34:21] bundle exec rake test
[12:34:21] Running RuboCop...
[12:34:22] No such file or directory @ rb_sysopen - /home/hashar/projects/mediawiki/extensions/WikibaseLexeme/.rubocop.yml
[12:34:37] lets drop rubocop entirely
[12:34:46] hashar: we never run `rake test` for the repo :D
[12:34:50] PROBLEM - SSH on integration-slave-docker-1015 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:35:01] only `rake selenium`
[12:35:26] hashar: I could not care less about rubocop at this point :/
[12:35:58] zeljkof: so without X11 forwarding the tests run normally (sort of - they all fail but it's an error in the test, not the test runner)
[12:36:16] tgr: cool!
[12:36:46] with X11 forwarding it opens a new browser window, which freezes immediately (no content, just an empty frame) and after a while the test runner dies with ESOCKETTIMEDOUT
[12:36:47] tgr: do you need help fixing the tests?
[12:36:57] also selenium-Wikibase-chrome is on permanent slaves DebianJessie && contintLabsSlave
[12:37:00] with a chrome -suffix
[12:37:15] tgr: can you get x11 forwarding working on local mw-vagrant?
[12:37:34] I'm not touching any tests, we want to run some pervasive core changes through as many existing extension tests as possible
[12:37:46] hashar: that job is making me more and more sad :/
[12:38:05] zeljkof: let me check
[12:38:12] and there is https://integration.wikimedia.org/ci/job/selenium-WikibaseLexeme-chrome-434016/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=DebianJessie%20&&%20contintLabsSlave/7/console
[12:39:05] tgr: `npm run selenium` will find all node selenium tests on a local machine, if it follows a convention https://phabricator.wikimedia.org/source/mediawiki/browse/master/tests/selenium/wdio.conf.js;dc6d8d2c3e4822ec9a4387255218e63a60aea6fe$44-50
[12:39:45] hashar: looking into that https://phabricator.wikimedia.org/T194252
[12:39:59] hashar: that's after https://gerrit.wikimedia.org/r/#/c/437741/ is merged
[12:40:21] hashar: https://integration.wikimedia.org/ci/job/selenium-WikibaseLexeme-chrome-434016/jobConfigHistory/showDiffFiles?timestamp1=2018-06-06_11-33-04&timestamp2=2018-06-06_12-28-05
[12:40:37] ah
[12:41:19] hashar: the repo never run in CI, looks like, so it probably is set up to run from tests/browser and the job runs if from root of the repo
[12:41:23] hashar: something like that
[12:41:44] * zeljkof is out of lunch, back for SWAT
[12:41:45] and why is it running on permanent slaves when selenium-CirrusSearch-jessie and selenium-RelatedArticles-jessie are on nodepool instances?
[12:42:29] hashar: this is not a -jessie job, but a -chrome job, looks like they use different slaves
[12:42:37] selenium-WikibaseLexeme-chrome
[12:43:30] and selenium-CirrusSearch-jessie and selenium-RelatedArticles-jessie seems to use wdio
[12:47:43] zeljkof: leszek_wmde beat me to it!
[12:48:50] zeljkof: and the jobs requires Wikibase to be cloned as well
[12:48:54] it tries to require /../../../../../Wikibase/tests/browser/features/support/modules
[12:54:41] RECOVERY - SSH on integration-slave-docker-1015 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[12:55:59] Phabricator: Private space for tracking internal security team activities - https://phabricator.wikimedia.org/T196542#4260758 (Reedy)
[13:00:18] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<22.22%) WARN: deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%)
[13:00:33] zeljkof: FWIW I documented how to set up X11 forwarding (which had some unrelated issues too) at https://wikitech.wikimedia.org/wiki/Help:MediaWiki-Vagrant_in_Cloud_VPS#SSH_to_the_Vagrant_box
[13:00:50] this sets the display automatically
[13:01:48] tgr: cool!
[13:02:31] hashar: I'm back, looking into it
[13:02:53] addshore, leszek_wmde: thanks! :) the job should run now, there will be failures, I'll explain in the task
[13:03:06] zeljkof: ok, cool
[13:03:22] zeljkof: adding rubocop.yml file to the patch
[13:04:04] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260773 (hashar)
[13:04:10] leszek_wmde: oh, do we need it?
[13:04:36] zeljkof: yeah, as hashar also noticed [13:05:04] leszek_wmde: it's just hashar being pedantic and running rubocop locally :) [13:05:31] I think we only run selenium rake target, not rubocop, so it doesn't matter if rubocop is set up or not [13:05:40] zeljkof: oh [13:05:41] but rubocop config will not hurt [13:05:45] added it any way [13:06:10] yeah, it will not hurt, but it is not set up to run for the repo, as far as I can see [13:07:12] leszek_wmde: I would actually prefer if rubocop was set up separatelly :/ [13:07:24] zeljkof: ack [13:07:30] now that patch is a mess of rake/rubocop setup plus rubocop fixes [13:09:00] it took me a while to remember how to run ruby tests in CI and how to debug at CI, my several patches could be merged, but rubocop setup and fixes should be a separate patch, it's not at all related, I don't think the job will fail [13:09:06] (without rubocop) [13:11:23] zeljkof: understood. moved the robocop out [13:13:16] leszek_wmde: thanks! [13:13:23] we can set it up as the next step [13:14:26] leszek_wmde: now if the job fails because of missing rubocop... :') that would be funny, right? right? :D [13:16:00] leszek_wmde: did you see that selenium-Wikibase-chrome/MEDIAWIKI_ENVIRONMENT=test finished with just a few failures? in 9.5 hours! :D https://phabricator.wikimedia.org/T167432 [13:28:30] (03PS3) 10Zfilipin: Added WikibaseLexeme for a daily Ruby selenium test run [integration/config] - 10https://gerrit.wikimedia.org/r/434016 (https://phabricator.wikimedia.org/T194252) (owner: 10WMDE-leszek) [13:29:13] (03CR) 10Zfilipin: "PS3 removes `Depends-On` from commit message." 
[integration/config] - 10https://gerrit.wikimedia.org/r/434016 (https://phabricator.wikimedia.org/T194252) (owner: 10WMDE-leszek) [13:30:05] (03CR) 10Zfilipin: [C: 032] Added WikibaseLexeme for a daily Ruby selenium test run [integration/config] - 10https://gerrit.wikimedia.org/r/434016 (https://phabricator.wikimedia.org/T194252) (owner: 10WMDE-leszek) [13:31:42] (03PS1) 10Hashar: Migrate SpellingDictionary to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437755 (https://phabricator.wikimedia.org/T183512) [13:31:53] (03Merged) 10jenkins-bot: Added WikibaseLexeme for a daily Ruby selenium test run [integration/config] - 10https://gerrit.wikimedia.org/r/434016 (https://phabricator.wikimedia.org/T194252) (owner: 10WMDE-leszek) [13:33:38] (03CR) 10Hashar: [C: 032] Migrate SpellingDictionary to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437755 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [13:35:03] (03Merged) 10jenkins-bot: Migrate SpellingDictionary to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437755 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [13:35:19] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260903 (10hashar) [13:36:11] (03PS1) 10Hashar: Migrate Theme to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437756 (https://phabricator.wikimedia.org/T183512) [13:36:22] (03CR) 10Hashar: [C: 032] Migrate Theme to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437756 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [13:37:00] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - 
https://phabricator.wikimedia.org/T183512#4260912 (10hashar) [13:37:28] hi [13:37:43] (03Merged) 10jenkins-bot: Migrate Theme to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437756 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [13:38:09] we are planning on doing a gerrit database failover, heads up to check -operations [13:50:02] reminder: I'm going to start rebooting VMs in a few minutes; all will be chaos in the beta cluster until I finish [13:59:36] addshore: do you have a few minutes for a few questions about https://github.com/addshore/mediawiki-docker-dev#6-configure-the-environment? [14:00:01] where should local.env be located? [14:00:19] o/ [14:00:25] in the root of that repo :) [14:00:53] addshore: ah, makes sense :) [14:01:32] I got confused with docker being in one place, your repo at another, mediawiki at yet another... [14:03:40] :D [14:04:34] ok, so most of the stuff happens in your repo, but composer install in mw repo [14:06:45] (03PS1) 10Hashar: Migrate TranslateSvg to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437763 (https://phabricator.wikimedia.org/T179774) [14:07:08] zeljkof: yup [14:07:33] (03CR) 10Hashar: [C: 032] Migrate TranslateSvg to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437763 (https://phabricator.wikimedia.org/T179774) (owner: 10Hashar) [14:08:47] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4261031 (10hashar) [14:08:58] (03Merged) 10jenkins-bot: Migrate TranslateSvg to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/437763 (https://phabricator.wikimedia.org/T179774) (owner: 10Hashar) [14:15:58] addshore: it's working! 
:D [14:16:07] :D [14:16:12] It is meant to be easy ;) [14:16:24] one thing to note, is that currently php always runs in debug mode, so php requests can be a bit slow :) [14:16:26] :( [14:16:37] that's reasonable [14:16:42] it was easy enough [14:17:05] i should be able to migrate it to a kubernetes solution pretty easily too, but dont tihnk I'll bother yet [14:23:15] PROBLEM - Host integration-slave-docker-1003 is DOWN: CRITICAL - Host Unreachable (10.68.23.87) [14:23:19] PROBLEM - Host deployment-puppetmaster03 is DOWN: CRITICAL - Host Unreachable (10.68.23.29) [14:23:46] gerrit db failover done, check if you see any issues [14:23:57] you may have suffered from temporary slow request to gerrit [14:28:25] addshore: everything worked up until I tried to run this https://phabricator.wikimedia.org/diffusion/EWLE/browse/master/README.md;6bf24252d28797b1d69d7d8512c6295dec759f09$44 [14:28:58] `addshore/mediawiki-docker-dev$ docker run -it --rm --user $(id -u):$(id -g) -v ~/.composer:/composer -v $(pwd):/app docker.io/composer install` [14:29:17] `Composer could not find a composer.json file in /app` [14:29:43] where did you run it from? 
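The `Composer could not find a composer.json file in /app` error above comes from the bind mount: `-v $(pwd):/app` exposes whatever directory the command is run from, so `/app` only contains a `composer.json` if the current directory does. A minimal sketch of a guard around the same `docker run` invocation quoted above; the function name `composer_in_docker` is hypothetical and is not part of either README:

```shell
# Hypothetical helper (an assumption, not from the README): refuse to start
# the composer container unless the directory that will be mounted at /app
# actually contains a composer.json.
composer_in_docker() {
    if [ ! -f "$PWD/composer.json" ]; then
        echo "no composer.json in $PWD; cd into the directory that has one" >&2
        return 1
    fi
    # Same invocation as quoted in the log, with the expansions quoted.
    docker run -it --rm --user "$(id -u):$(id -g)" \
        -v "$HOME/.composer:/composer" -v "$PWD:/app" \
        docker.io/composer install
}
```

Run from `mediawiki-docker-dev`, which has no `composer.json`, this would stop before starting a container; run from a directory that has one, it behaves like the original command.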
[14:29:53] you should run that in the WikibaseLexeme dir [14:29:54] addshore/mediawiki-docker-dev [14:29:58] ah [14:30:06] :) [14:30:08] of course [14:30:11] phabricator db is next- so I will need your help later or tomorrow [14:30:21] $(pwd):/app needs to mount the dir with the composer.json in it [14:31:06] addshore: but wait, the readme does not say to clone WikibaseLexeme [14:31:16] just Wikibase [14:31:41] zeljkof: in the Wikibase dir then, sorry I didn't see where you were in the readme [14:31:51] ah, thanks [14:32:04] Wikibase/WikibaseLexeme confuse me :) [14:32:11] not sure what is what [14:32:17] or what's the difference [14:33:07] it looks like the readme is slightly conflicting, as it tells you to run composer install for the Wikibase install, but then also tell you to run it using the composer merge plugin later [14:33:18] zeljkof: they are 2 different extensions :) [14:33:23] tarrow: ^^ fyi re the readme [14:35:41] addshore, tarrow: sorry for more questions :/ but I don't understand where to run this from https://phabricator.wikimedia.org/diffusion/EWLE/browse/master/README.md;6bf24252d28797b1d69d7d8512c6295dec759f09$65 [14:36:05] or do I need to set PWD environment variable? [14:36:07] I think you should be able to run that anywhere [14:36:39] unless it needs to be in the same folder as the docker-compose.yml [14:36:44] tarrow: thanks! will try running from mediawiki-docker-dev [14:36:47] tarrow: no, docker-compose needs to be run from the mediawiki-docker-dev repo :) [14:37:05] ok, then I guessed it correctly :) [14:37:06] otherwise it doesnt know what containers to talk to as it doesnt know where its compose file is [14:37:25] maybe the readme should be slightly more explicit :) [14:37:57] yep! 
Thanks for being the guinea pig :) [14:39:15] tarrow: also, the readme uses ssh in line 75 and https in line 33 for cloning repos [14:39:50] and ssh clone fails :) [14:40:09] `z@gerrit.wikimedia.org: Permission denied (publickey).` [14:40:15] will try via https [14:40:26] zeljkof: you should setup your ssh key properly ;) [14:40:41] Awesome, I'll change that [14:41:12] addshore: it should be set up :P looks like this repo is doing something it was not ready for :) [14:42:01] zeljkof: as in the key for your gerrit.wikimedia.org is in your sshconfig / provided by an agent? :O [14:42:06] interesting :P [14:42:40] addshore: I remember vagrant setup required some fancy setup [14:43:02] addshore: the README suggests you should run all the composer commands from the root dir of mediawiki because the composer.local.json should be set up [14:44:28] tarrow, addshore: you got me really confused with where to run docker from :) [14:46:02] tarrow, addshore: ok, this did not work, will paste the error to the task https://phabricator.wikimedia.org/diffusion/EWLE/browse/master/README.md;6bf24252d28797b1d69d7d8512c6295dec759f09$102 [14:46:49] cool, where is the task? [14:49:09] tarrow: https://phabricator.wikimedia.org/T194252#4261170 [14:50:49] Oooh, I'll have a look [14:56:09] I'm not a composer expert but I wonder if you can try deleting your composer.lock file? There should be one in the root folder and also one in Wikibase and WikibaseLexeme [14:56:21] zeljkof: ^^ [14:57:35] tarrow: will do, so all three lock files should be deleted? [14:58:17] If you can try that. And does your composer.local.json look like in the readme? [15:01:10] tarrow: I have copy/pasted composer.local.json so it is exactly the same :) [15:02:47] If it still doesn't work then I suggest deleting composer.local.json and running `docker run -it --rm --user $(id -u):$(id -g) -v ~/.composer:/composer -v $(pwd):/app docker.io/composer install` 3 times. 
Once from the root, once from Wikibase and, once from WikibaseLexeme [15:03:33] it works now! (after deleting composer.lock from mediawiki and wikibase folders, it did not exist in lexeme folder) [15:03:45] woo! [15:04:20] Project-Admins, PAWS: Create "JupyterHub 0.9" milestone for PAWS project - https://phabricator.wikimedia.org/T196559#4261206 (Chicocvenancio) [15:04:33] tarrow: please add deleting lock files to the readme! :D [15:04:58] I think it is only needed because you ran from extensions/Wikibase (I think) [15:05:27] (which I think Adam asked you to do) [15:05:29] I was strictly following instructions! [15:05:45] oh, maybe I did do something wrong then :) [15:05:49] Phabricator: Private space for tracking internal security team activities - https://phabricator.wikimedia.org/T196542#4261217 (Aklapper) a:Aklapper>None Please see https://www.mediawiki.org/wiki/Phabricator/Creating_and_renaming_projects#Restricting_access_via_Space_policies for more information. [15:06:39] but that was still really my fault because the readme wasn't clear in the first place; sorry it took so long [15:07:05] tarrow: no problem at all, you were very responsive [15:07:24] I am not really experienced with docker, so I get confused easily [15:10:19] Great! 
I'll add some lines about exactly where to run things [15:10:46] it is confusing with both docker and docker-compose and how they sometimes care where they are run and others they don't [15:17:09] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, User-zeljkofilipin: Document running Selenium tests targeting custom MediaWiki install - https://phabricator.wikimedia.org/T196561#4261284 (zeljkofilipin) [15:19:11] thcipriani: I have some crazy train and MCR related idea for when you are around :) [15:19:12] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, User-zeljkofilipin: Document running Selenium tests targeting custom MediaWiki install - https://phabricator.wikimedia.org/T196561#4261284 (zeljkofilipin) p:Triage>Normal [15:20:30] addshore: oh boy. :) What's up? [15:21:08] :D [15:21:10] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, User-zeljkofilipin: Document running Selenium tests targeting custom MediaWiki install - https://phabricator.wikimedia.org/T196561#4261330 (zeljkofilipin) [15:21:19] So, we have these MCR patches that we want to merge, in mediawiki core only [15:21:38] is there any way to have them hang around on testwikis / group0 but not follow the rest of the train? [15:22:25] or, my other thought, have them deployed to testwikis / group0 next week after the train has rolled, and have them on there for the week of no deploys to extend the time for bug spotters [15:23:00] MCR? ...My Chemical Romance? Sorry, it's early :) [15:23:11] :D [15:23:21] MultiContentRevisions, the great and scary ;) [15:23:25] ahhhh [15:24:08] I think the typical pattern is to put the code behind a feature flag that is controlled in mediawiki-config [15:24:37] is that a solution that could work? [15:24:52] not really, i mean, its around 2500 lines :P [15:25:06] I see :) [15:25:11] or more depending on how many patches we get merged [15:26:14] other question: is there a week of no deploys coming up? 
[15:26:27] In my head I was thinking about once the branch was made, making another branch called wmf.xx+mcr or something,and then switching group0 to that at some point [15:26:41] Week of June 18th apparently, SRE Offsite this week, no non-emergency deploys [15:26:45] ah [15:29:16] we could do what you're suggesting. Swap out group0 on Thursday once that week's train is deployed everywhere. I'd prefer if we could somehow turn this on and off with changes to configuration, but failing that that plan seems possible. [15:30:22] we'd want to run that by greg-g (greg-g see ^) to get the official stamp of approval [15:30:43] that is, swap out group0 on thursday and let +mcr bake there for a week [15:30:58] thcipriani: okay [15:31:01] :) [15:31:16] I wonder how much of our tooling will break if we add +mcr to the version :) [15:31:23] I should check on that. [15:31:40] I'm less than excited about doing this without a feature flag [15:31:42] hahaa, I was also thinking that, but perhaps we could change it to wmf.0 for fun? ;) [15:32:37] * greg-g goes into a meeting [15:33:49] Phabricator seems very slow [15:34:02] https://news.ycombinator.com/item?id=17245649 sadly [15:34:35] I see [15:34:39] addshore: could you look into the feasiblity of putting this behind something in mw-config and releng will investigate the alternative to see if there are any show-stoppers? [15:34:55] thcipriani: can do [15:35:00] cool, thanks [15:36:50] also also: is there any gotchas with some wikis going to this new branch/feature flag and then not? Are things backwards compat? [15:36:56] cc addshore ^ :) [15:38:21] greg-g: should not be, this doesn't do any new db writing, that functionality is there but behind feature flags / migration flags [15:38:57] greg-g: thcipriani i'll touch base again later [15:39:04] addshore: great, thanks! 
[15:46:46] aww so sweet, we can still be slashdotted [15:47:30] makes one feel ten years younger [15:49:30] Continuous-Integration-Infrastructure, Operations, SRE-Access-Requests, Patch-For-Review: Add Reedy to contint-docker group - https://phabricator.wikimedia.org/T196192#4261418 (RobH) Open>Resolved a:RobH This has been merged live. All affected servers will call into puppet and get th... [15:50:21] Nikerabbit ah i can reproduce too [15:50:28] i suspect this is the apache problem [15:50:33] there's a task for this some where [15:50:42] * paladox files a task [15:50:45] but it's sloww heh [15:51:09] cc twentyafterfour ^^ [15:53:46] Nikerabbit https://phabricator.wikimedia.org/T196565 [15:55:23] Phabricator, Operations: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4261464 (Paladox) [15:55:57] or is it just overloaded by the additional traffic? [15:55:58] currently slashdotted: https://news.ycombinator.com/item?id=17245649 [15:56:01] Phabricator, Operations: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4261482 (Paladox) p:Triage>Unbreak! [15:56:40] heh [15:56:55] so phab is overloaded now? I wonder is there a way to increase capacity? [15:57:05] moar servers [15:57:13] == $$$ [15:57:14] Phabricator, Operations: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4261464 (greg) See also: https://news.ycombinator.com/item?id=17245649 [15:57:16] Well phab is being upgraded stretch [15:57:24] to fix a apche issue [15:57:25] apache [15:57:33] i wonder would php-fpm increase capacity? 
[15:58:09] that's cool twentyafterfour, crazy times :) [15:58:27] I just restarted apache, that seems to have helped [15:58:54] PROBLEM - Puppet errors on deployment-redis02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:58:54] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:59:03] Project beta-scap-eqiad build #210679: FAILURE in 5 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210679/ [15:59:13] Phabricator, Operations: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4261505 (mmodell) Restarted apache to free up some stuck processes, this seems to have helped quite a bit, I'm not sure for how long though. [15:59:23] PROBLEM - Host deployment-mediawiki-09 is DOWN: CRITICAL - Host Unreachable (10.68.17.159) [15:59:32] thanks twentyafterfour! [16:00:24] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:00:26] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:00:30] PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [16:00:51] PROBLEM - Puppet errors on deployment-jobrunner03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [16:00:51] PROBLEM - Host integration-slave-docker-1013 is DOWN: CRITICAL - Host Unreachable (10.68.23.152) [16:00:57] PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:01:06] vm restart probably ^^ [16:06:50] Can someone make me a member of deployment-prep? [16:09:22] Yippee, build fixed! 
[16:09:22] Project beta-scap-eqiad build #210680: FIXED in 5 min 36 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210680/ [16:11:32] and the problem is back [16:11:46] (intermittent now) slow at some point then fast again. [16:16:24] James_F: what's your shell name? [16:17:04] thcipriani: jforrester. [16:18:21] horizon is thinking about it [16:19:01] James_F: you're listed now [16:20:46] thcipriani: Thanks! [16:32:22] thcipriani: greg-g so, feature flag probably isn't possible [16:34:31] unless we did some crazy stuff with class aliases and stuff, and the total lines of code is closer to ~7000 or something [17:23:13] Project beta-update-databases-eqiad build #25986: FAILURE in 3 min 11 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/25986/ [17:47:51] Project mwext-phpunit-coverage-publish build #5244: FAILURE in 29 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/5244/ [17:52:56] Phabricator, Operations: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4261908 (mmodell) I can't reproduce currently, load average isn't particularly high and phabricator has been snappy fast for a while now. I think we can close this as resolved. [18:00:56] Project beta-scap-eqiad build #210692: FAILURE in 7 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210692/ [18:05:35] (PS1) Thcipriani: deploy-promote: Remove extra newline from results [tools/release] - https://gerrit.wikimedia.org/r/437798 [18:05:40] (PS1) Thcipriani: deploy-promote: Remove extra newline from results [tools/release] - https://gerrit.wikimedia.org/r/437798 [18:06:39] Yippee, build fixed! 
[18:06:39] Project mwext-phpunit-coverage-publish build #5245: FIXED in 6 min 8 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/5245/ [18:06:50] Phabricator, Operations, User-greg: Phabricator is very slow to load - https://phabricator.wikimedia.org/T196565#4262003 (greg) Open>Resolved a:greg Please reopen if something looks off in the future. [18:20:21] Project beta-update-databases-eqiad build #25987: STILL FAILING in 20 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/25987/ [18:24:08] Project beta-scap-eqiad build #210693: STILL FAILING in 20 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210693/ [18:25:03] re update: previous failure was an out of date composer.lock. Latest is waiting for DB lag. [18:25:09] wait for the next one I suppose :/ [18:26:48] (PS7) Dduvall: Perform helm deployment in service-pipeline [integration/config] - https://gerrit.wikimedia.org/r/425936 (https://phabricator.wikimedia.org/T188935) [18:26:49] (PS7) Dduvall: Perform helm deployment in service-pipeline [integration/config] - https://gerrit.wikimedia.org/r/425936 (https://phabricator.wikimedia.org/T188935) [18:27:30] Can someone disconnect and reconnect wikibugs? The double-posting is a pain. [18:29:57] Project beta-scap-eqiad build #210694: STILL FAILING in 3 min 57 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210694/ [18:34:15] just came here to ask the same :) [18:34:42] do I still have access... [18:37:05] Project beta-scap-eqiad build #210695: STILL FAILING in 3 min 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210695/ [18:37:58] hurrah [18:44:23] Reedy: That killed it… is there a replacement coming back? 
:-) [18:44:29] God knows [18:44:43] $ fab start_jobs [18:44:43] [tools-login.wmflabs.org] Executing task 'start_jobs' [18:44:43] [tools-login.wmflabs.org] sudo: /usr/bin/jsub -N wb2-phab -l release=trusty -mem 1G -once -v PYTHONIOENCODING="utf8:backslashreplace" -continuous /data/project/wikibugs/py-wikibugs2/bin/python /data/project/wikibugs/wikibugs2/wikibugs.py --logfile /data/project/wikibugs/wikibugs.log [18:44:43] [tools-login.wmflabs.org] out: Please set git user and e-mail *in your own $HOME* before running git [18:44:43] [tools-login.wmflabs.org] out: in this project. In addition, make sure this tool user can read your [18:45:00] It just then spewed... [18:45:01] [tools-login.wmflabs.org] out: [18:45:01] [tools-login.wmflabs.org] out: [18:46:01] Great. [18:46:50] Project beta-scap-eqiad build #210696: STILL FAILING in 3 min 5 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210696/ [18:48:49] * Reedy tries fixing his git config on tools as per the spam [18:50:19] paramiko.ssh_exception.SSHException: Timeout openning channel. [18:55:23] https://gerrit-review.googlesource.com/c/gerrit/+/116790 [18:56:46] Project beta-scap-eqiad build #210697: STILL FAILING in 2 min 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210697/ [18:57:29] hrm, maybe keyholder is unarmed in beta [18:58:24] sho'nuff [18:58:46] !log @deployment-tin: sudo keyholder arm [18:58:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:01:08] ah right, reboots [19:01:32] Didn't we have a check for keyholder not being armed? Or is that prod? [19:02:17] thcipriani: sanity check https://gerrit.wikimedia.org/r/#/c/436601/ ? [19:03:03] I guess the state on deploy1001 was copied from tin? I see the repo does exist there, that's good. [19:08:28] Yippee, build fixed! 
[19:08:28] Project beta-scap-eqiad build #210698: FIXED in 4 min 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/210698/ [19:14:37] !log github: deleted https://github.com/wikimedia/mediawiki-extensions-SolrStore (archived repo) | T196513 [19:14:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:14:40] T196513: Archive the SolrStore extension - https://phabricator.wikimedia.org/T196513 [19:22:17] Yippee, build fixed! [19:22:17] Project beta-update-databases-eqiad build #25988: FIXED in 2 min 16 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/25988/ [19:25:08] \o [19:25:29] So, greg-g thcipriani feature flag basically not possible :) [19:31:05] James_F: Jobs have just been submitted... So hopefully it should come back in a few mins [19:31:39] * James_F crosses fingers. [19:32:33] Maybe not though [19:32:33] Warning: sudo() received nonzero return code 1 while executing '/usr/bin/jsub -N wb2-phab -l release=trusty -mem 1G -once -v PYTHONIOENCODING="utf8:backslashreplace" -continuous /data/project/wikibugs/py-wikibugs2/bin/python /data/project/wikibugs/wikibugs2/wikibugs.py --logfile /data/project/wikibugs/wikibugs.log'! [19:32:35] * Reedy files a bug [19:33:02] =o [19:35:57] Is `release=trusty` likely to be the reason? [19:36:20] don't think so [19:36:24] it does 3 jobs... [19:36:24] [tools-login.wmflabs.org] sudo: /usr/bin/jsub -N wb2-irc -l release=trusty -mem 1G -once -v PYTHONIOENCODING="utf8:backslashreplace" -continuous /data/project/wikibugs/py-wikibugs2/bin/python /data/project/wikibugs/wikibugs2/redis2irc.py --logfile /data/project/wikibugs/redis2irc.log [19:36:24] [tools-login.wmflabs.org] out: Your job 9958477 ("wb2-irc") has been submitted [19:36:33] other two look alright [19:37:21] Right. [19:37:40] * Reedy looks what qstat is doing [19:37:42] (1G of RAM?! Oy veh. 
In my day…) [19:37:54] 1G of ram is a rounding error :P [19:38:08] 9956833 0.30003 wb2-phab tools.wikibu r 06/06/2018 18:45:38 continuous@tools-exec-1406.eqi 1 [19:38:08] 9958476 0.30000 wb2-grrrri tools.wikibu r 06/06/2018 19:30:39 continuous@tools-exec-1412.too 1 [19:38:08] 9958477 0.30000 wb2-irc tools.wikibu r 06/06/2018 19:30:41 continuous@tools-exec-1437.too 1 [19:44:22] https://www.mediawiki.org/wiki/Wikibugs#Muting_wikibugs [19:44:25] uh [19:44:43] https://www.mediawiki.org/wiki/Wikibugs#Deploying_changes [19:47:09] !log github: delete https://github.com/wikimedia/mediawiki-extensions-Genderize (archived repository) | T196108 [19:47:12] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:47:12] T196108: Archive the Genderize extension - https://phabricator.wikimedia.org/T196108 [19:59:09] !log Update mobileapps to 3bf9be5 on BC [19:59:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:09:45] addshore: :/ (re feature flag) [20:11:29] :/ [20:12:07] so, the only reason all the code is scary is because it's tearing apart WikiPage, which is one of the hearts of mediawiki ;) [20:12:26] and when we did the same thing with Revision it probably would have been nice to have even more time on test [20:13:04] and to answer the earlier question a rollback from this would indeed be totally possible [20:13:51] kk [20:15:15] so, is the evil plan possible? :) [20:16:03] I'm not sure... my brain isn't focused enough on it right now. Can you write it down somewhere else other than IRC (phab task?) and we can iterate? [20:16:50] greg-g: yes :0 [20:16:51] :) [20:16:55] cool, thanks :) [20:22:10] hello beta cluster folks [20:22:23] i have a testing request from the parsoid team [20:22:44] no testing [20:22:47] just use enwiki [20:22:55] we're working on language variant support in parsoid. we just deployed it to beta -- but guess what? 
none of the beta/labs wikis are in languages with variants [20:23:10] Reedy: can't use enwiki, unless you enable pig latin ;) [20:23:14] :D [20:23:30] How urgently do you need a new wiki? And do you have a preferred variant language? [20:23:43] which would also be fine. you could enable pig latin on en.wikipedia.beta.wmflabs.org/ or simple.wikipedia.beta.wmflabs.org/ [20:24:37] Reedy: Well, our work around for now would be to create a test page (eg on deployment.wikimedia.beta.wmflabs) with page-language set to serbian or kurdish [20:24:48] i think I need admin rights in order to reset the page language for an article IIRC [20:25:08] !log deploying 65e979f on ores [20:25:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:25:50] cscott: What's your username on beta? [20:27:04] bugs upon bugs -- i was just trying to create an account on beta... and the captcha image is blank [20:27:19] and it won't let me create an account without entering in the captcha text which i can't see [20:28:14] (firefox 52.8.0 on debian) [20:29:06] Reedy: but my preferred beta wiki for testing language converter would probably be crhwiki [20:29:21] and/or srwiki [20:30:12] Well, as I asked about how urgent it was... I'm creating some prod wikis tomorrow, and I should be able to do some on beta at the same time (deploy windows are a bit busy atm) [20:30:35] ok, crhwiki and srwiki would be greatly appreciated! 
[20:31:50] The captcha problem is "Failed to load resource: the server responded with a status of 400" from https://deployment.wikimedia.beta.wmflabs.org/w/index.php?title=Special:Captcha/image&wpCaptchaId=32531770 [20:32:06] tested in chrome, too, same thing (blank captcha image) [20:32:13] Beta-Cluster-Infrastructure: Create crhwiki and srwiki on beta - https://phabricator.wikimedia.org/T196583#4262403 (Reedy) [20:32:21] Could be related to the reboots [20:33:04] Beta-Cluster-Infrastructure: Create crhwiki and srwiki on beta - https://phabricator.wikimedia.org/T196583#4262414 (cscott) For future documentation: this is to help testing language converter support in Parsoid/RESTbase. [20:34:23] cscott: I can create you an account and send you a password by email... [20:34:28] What username do you want? [20:34:37] cscott [20:35:01] "There seems to be a problem with your login session; this action has been canceled as a precaution against session hijacking. Please resubmit the form." [20:35:05] Sessions are fscked then [20:35:11] Beta-Cluster-Infrastructure: Captchas on beta are blank, so account creation is impossible - https://phabricator.wikimedia.org/T196584#4262415 (cscott) [20:36:35] Beta-Cluster-Infrastructure: Captchas on beta are blank, so account creation is impossible - https://phabricator.wikimedia.org/T196584#4262427 (cscott) [20:36:44] !log deploying 65ce165 on ores [20:36:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:36:58] or cscott you can just create a page ... and Reedy can update the page language as an admin. [20:37:04] subbu: I can't login xD [20:37:08] oh :) [20:37:10] right. [20:37:16] never mind then. 
[20:37:24] heh [20:37:53] Reedy, I think there is a task about session problems on beta [20:38:03] Yeah, it's dejavu [20:40:25] Release-Engineering-Team, Multi-Content-Revisions, User-Addshore: Investigate possibility of having some MCR related patches on test / group0 for an extended period - https://phabricator.wikimedia.org/T196585#4262430 (Addshore) [20:40:36] Release-Engineering-Team, Multi-Content-Revisions, User-Addshore: Investigate possibility of having some MCR related patches on test / group0 for an extended period - https://phabricator.wikimedia.org/T196585#4262440 (Addshore) One idea would be to create an extra branch for the train / week. So for... [20:40:48] greg-g: thcipriani ^^ :) [20:40:51] Release-Engineering-Team, Multi-Content-Revisions, User-Addshore: Investigate possibility of having some MCR related patches on test / group0 for an extended period - https://phabricator.wikimedia.org/T196585#4262441 (Addshore) [20:41:01] thanks addshore [20:41:11] feel free to fire any questions my way [20:41:58] Reedy, that error would be triggered by a failed CaptchaCacheStore::retrieve [20:42:10] reedy@deployment-tin:/srv/mediawiki-staging$ mwscript extensions/ConfirmEdit/maintenance/CountFancyCaptchas.php --wiki=aawiki [20:42:10] Current number of captchas is 10000. [20:42:15] so either the key didn't get set or ObjectCache::getMainStashInstance() is pointing to something bad [20:42:44] with sessions being broken I wouldn't be surprised if something's up with the cache servers [20:44:13] the count script only checks what's in swift [20:46:30] Release-Engineering-Team (Kanban), Wikibugs, Patch-For-Review: Deprecate -devtools and redirect to -releng? - https://phabricator.wikimedia.org/T185285#4262451 (Peachey88) I don't see a need in redirecting the channel/closing it, I've actually seen it a few times when people didn't want to use a busy... 
[20:47:19] generating new captchas to see if that fixes it [20:47:31] will take 5 minutes [20:47:44] based on https://phabricator.wikimedia.org/T164047#4021553 [20:50:41] Probably no use having 10k on beta with a limited word list [20:54:33] Phabricator, Developer-Relations (Apr-Jun-2018), Patch-For-Review: Try to identify new developers (via assignee field) in Phab tasks and potentially follow up - https://phabricator.wikimedia.org/T195780#4236793 (bd808) @Aklapper have you looked to see if it would be possible to get the data you need... [21:18:47] jeez [21:18:52] Generated 10000 captchas in 921.3 seconds [21:19:00] Copied 10000 captchas to storage in 159.7 seconds [21:19:09] Deleted 10000 old captchas in 143.4 seconds [21:20:24] hhvm? [21:20:46] Krenair: You should've seen the script before I did a lot of work on it :P [21:23:24] Most of that time is the python script [21:38:27] anyway it doesn't appear to have helped [21:39:02] heh [21:40:36] !log BC: Update mobileapps to 5ea008c [21:40:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:07:44] (PS1) Legoktm: makerelease: Use mediawiki/vendor instead of composer [tools/release] - https://gerrit.wikimedia.org/r/437867 [22:09:04] (PS2) Legoktm: makerelease: Use mediawiki/vendor instead of composer [tools/release] - https://gerrit.wikimedia.org/r/437867 [22:10:02] (CR) Legoktm: makerelease: Use mediawiki/vendor instead of composer (1 comment) [tools/release] - https://gerrit.wikimedia.org/r/437867 (owner: Legoktm) [22:31:03] login not working yet? [22:32:22] I don't think anyone has tried to fix it [22:32:47] (PS3) Legoktm: makerelease: Don't run composer [tools/release] - https://gerrit.wikimedia.org/r/437867 [22:42:21] probably means sessions are buggared up... 
[22:45:23] (CR) Krinkle: [C: 1] makerelease: Don't run composer [tools/release] - https://gerrit.wikimedia.org/r/437867 (owner: Legoktm) [22:51:43] Reedy: something something kick redis? [23:02:29] probably [23:14:12] is logstash beta behind a hardcoded username/password? [23:19:42] yeah [23:20:04] Chrome doesn't show it (stupidly) but it's in the popup notification when trying to login [23:20:09] it == how to get it from tin [23:20:27] The site says: “Logstash (ssh deployment-tin.eqiad.wmflabs sudo cat /root/secrets.txt)” [23:23:46] confd.service: Unit entered failed state. [23:23:49] for deployment-redis [23:23:54] 2018-06-06T23:23:27Z deployment-redis05 /usr/bin/confd[1649]: FATAL Cannot get nodes from SRV records lookup _etcd._tcp.-scheme: invalid domain name [23:23:58] DNS issues? [23:24:49] I love how a fatal is INFO [23:37:47] greg-g: confd won't start for some reason, seems to be causing most of the problems [23:44:58] :/ [23:45:13] And no one from ops seems to be around to help [23:45:23] (ignoring cloud as they're pretty busy) [23:54:31] Beta-Cluster-Infrastructure, Operations: confd broken on deployment-redis hosts - https://phabricator.wikimedia.org/T196596#4262770 (Reedy) [23:55:20] Beta-Cluster-Infrastructure, Operations: confd broken on deployment-redis hosts - https://phabricator.wikimedia.org/T196596#4262785 (Reedy) p:Triage>High [23:56:27] Beta-Cluster-Infrastructure, Operations: confd broken on deployment-redis hosts - https://phabricator.wikimedia.org/T196596#4262770 (Reedy)