[00:14:47] Release-Engineering-Team (Pipeline), Wikibugs: wikibugs should not notify IRC about "business as usual" messages from PipelineBot - https://phabricator.wikimedia.org/T235009 (MaxSem)
[00:24:26] Release-Engineering-Team, Wikibugs, Patch-For-Review: Exclude secondary jenkins-bot/PipelineBot messages from Gerrit in Wikibugs on IRC - https://phabricator.wikimedia.org/T201261 (Krinkle)
[00:24:29] Release-Engineering-Team (Pipeline), Wikibugs: wikibugs should not notify IRC about "business as usual" messages from PipelineBot - https://phabricator.wikimedia.org/T235009 (Krinkle)
[05:01:39] PROBLEM - Gerrit JSON on gerrit.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Gerrit%23Monitoring
[05:02:01] PROBLEM - Gerrit Health Check on gerrit.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://gerrit.wikimedia.org/r/config/server/healthcheck%7Estatus
[05:09:02] Project mediawiki-core-doxygen-docker build #10471: FAILURE in 5 min 0 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen-docker/10471/
[05:13:24] Project beta-code-update-eqiad build #267139: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/267139/
[05:13:57] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, serviceops-radar, Upstream: Gerrit account cache has a faulty reentrant lock causing http/sendemail threads to stall completely - https://phabricator.wikimedia.org/T224448 (Marostegui) I have restarted ger...
[05:14:29] RECOVERY - Gerrit JSON on gerrit.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 25575 bytes in 0.042 second response time https://wikitech.wikimedia.org/wiki/Gerrit%23Monitoring
[05:14:48] Project beta-code-update-eqiad build #267140: STILL FAILING in 1 min 24 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/267140/
[05:14:49] RECOVERY - Gerrit Health Check on gerrit.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 865 bytes in 0.057 second response time https://gerrit.wikimedia.org/r/config/server/healthcheck%7Estatus
[05:24:26] Yippee, build fixed!
[05:24:26] Project beta-code-update-eqiad build #267141: FIXED in 1 min 25 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/267141/
[06:10:35] Yippee, build fixed!
[06:10:36] Project mediawiki-core-doxygen-docker build #10472: FIXED in 6 min 33 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen-docker/10472/
[07:01:24] Yippee, build fixed!
[07:01:25] Project mwcore-phpunit-coverage-master build #225: FIXED in 4 hr 1 min: https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/225/
[07:16:53] awight: kostajh: good morning. Eventually Quibble 0.0.36 I have cut yesterday night is broken ... :)
[07:18:15] hashar: lmk if I can do anything!
[07:18:43] I broke it out of a dumb mistake :-\
[07:18:53] when moving to setup.cfg, some of the section/setting was plain wrong
[07:20:24] ah hehe I probably shouldn't have reviewed those, since I've never used personally
[07:20:36] well stuff breaks :-]
[07:20:45] until we get a full scale integration test suite, those would be hard to catch
[07:22:00] That setuptools-scm trick is cool...
[07:22:08] (PS1) Hashar: Release Quibble 0.0.37 [integration/quibble] - https://gerrit.wikimedia.org/r/541748
[07:22:32] +1 I've been thinking about gate-and-submit for quibble, basically just running the mw-core job.
[07:23:56] yeah the devil is that we would need a docker image having all the binary we need
[07:24:08] which are the releng/quibble* containers
[07:24:23] but those lack tox or would not recognize installing quibble out of the blue
[07:24:44] (CR) Hashar: [C: +2] Release Quibble 0.0.37 [integration/quibble] - https://gerrit.wikimedia.org/r/541748 (owner: Hashar)
[07:25:42] (Merged) jenkins-bot: Release Quibble 0.0.37 [integration/quibble] - https://gerrit.wikimedia.org/r/541748 (owner: Hashar)
[07:26:31] Maybe we could just run quibble "bare"?
[07:27:47] awight: bare?
[07:28:08] !log Tag Quibble 0.0.37 @ 387d33c13
[07:28:09] Meaning, not as a docker image but just the python package.
[07:28:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[07:28:36] Although, that might end up being more complex than just building a temporary docker image.
[07:30:11] (PS1) Hashar: changelog: being new version cycle [integration/quibble] - https://gerrit.wikimedia.org/r/541750
[07:30:31] (CR) Hashar: [C: +2] changelog: being new version cycle [integration/quibble] - https://gerrit.wikimedia.org/r/541750 (owner: Hashar)
[07:30:43] anyway awight, kudos for the idea of wrapping the generated LocalSettings.php
[07:30:58] One of my rare insights ;-)
[07:31:20] decoupling the build plan from actual execution was a great one as well
[07:31:24] and all the tests!
[07:31:28] (Merged) jenkins-bot: changelog: being new version cycle [integration/quibble] - https://gerrit.wikimedia.org/r/541750 (owner: Hashar)
[07:32:15] I had a crazy idea yesterday, which I muddled by trying to claim it was "monadic". it is not... but the idea is https://gerrit.wikimedia.org/r/#/c/integration/quibble/+/519776/
[07:32:22] (don't read now :-)
[07:33:36] tl;dr, I think I have a way to make all pipelines and steps the same type of computation, with the ability to expand into parallelizable steps as needed.
[07:34:18] You probably noticed that I've been obsessing over this detail for months, with nothing to show for it :-) This is probably just more of the same, but it finally feels "right".
[07:35:18] & it gives us a one-repo, unit-y phase with minimal cloning, followed by the full integration tests
[07:35:35] (PS1) Hashar: docker: bump quibble to 0.0.37 [integration/config] - https://gerrit.wikimedia.org/r/541751
[07:35:49] (CR) Hashar: [C: +2] docker: bump quibble to 0.0.37 [integration/config] - https://gerrit.wikimedia.org/r/541751 (owner: Hashar)
[07:36:44] awight: do not blame yourself, it surely takes a lot of thought / trial and errors before figuring out a solution
[07:37:00] For dependencies (e.g. `npm install` before certain steps), my idea is to pass in a command object to each dependent step to inject before itself. The NpmInstall command will be idempotent. It's just a start...
[07:37:26] I should have used a Makefile (famous last words)
[07:37:28] (Merged) jenkins-bot: docker: bump quibble to 0.0.37 [integration/config] - https://gerrit.wikimedia.org/r/541751 (owner: Hashar)
[07:37:38] Well I enjoy trial & error, the only thing that really bothers me about this particular issue is that I cannot get perspective on whether there's actually a problem I'm solving or not.
[07:37:45] !log Build Quibble 0.0.37 docker containers
[07:37:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[07:37:56] Hahaha no way. I've spent years in the Makefile trenches, it's not a nice place.
[07:38:07] actually
[07:38:36] my first iteration ( to what would eventually become Quibble ) has been a Makefile
[07:39:15] I secretly love makefiles. But u don't want them to get too complex, IMO
[07:39:29] And it would severely restrict the number of people who could contribute.
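The dependency idea from 07:37:00 (a command object injected before each dependent step, with an idempotent NpmInstall) could look roughly like the Python sketch below. This is only an illustration under assumptions: the class names `Command`, `Step` and `run_pipeline` are hypothetical and are not taken from the Quibble code base, and "npm install" is recorded in a log rather than actually executed.

```python
from dataclasses import dataclass, field
from typing import List


class Command:
    """Base type: every pipeline element is the same kind of object."""

    def execute(self, log: List[str]) -> None:
        raise NotImplementedError


@dataclass
class NpmInstall(Command):
    """Hypothetical dependency command; a second run is a no-op."""

    done: bool = False

    def execute(self, log: List[str]) -> None:
        if self.done:
            return  # idempotent: already ran once, nothing to do
        log.append("npm install")  # stand-in for really running npm
        self.done = True


@dataclass
class Step(Command):
    """A step that runs its injected dependencies before itself."""

    name: str
    dependencies: List[Command] = field(default_factory=list)

    def execute(self, log: List[str]) -> None:
        for dependency in self.dependencies:
            dependency.execute(log)
        log.append(self.name)


def run_pipeline(steps: List[Command]) -> List[str]:
    """Execute the steps in order and return what was run."""
    log: List[str] = []
    for step in steps:
        step.execute(log)
    return log


if __name__ == "__main__":
    npm = NpmInstall()
    plan = [Step("npm test", [npm]), Step("selenium", [npm]), Step("phpunit")]
    print(run_pipeline(plan))
```

Sharing one `NpmInstall` instance between steps is what keeps the install from running twice, whichever dependent step happens to come first.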
[07:39:39] definitely
[07:39:41] so I dished that out
[07:39:51] wasted a couple hours trying to write something in PHP
[07:40:02] (which I hate for that kind of tools)
[07:40:19] and just went porting the existing jjb/shell script to a single python script
[07:40:41] This is the sort of origin story that billion-dollar startups are made of ;-)
[07:41:55] I don't think that project is going to be worth an IPO anytime soon!
[07:42:03] I certainly hope not
[08:10:00] * hashar tries docker run -v /home/hashar/projects:/srv/git:ro docker-registry.wikimedia.org/releng/quibble-stretch-php72:0.0.37
[08:35:02] [0-5] Error in "User should be able to log in @daily"
[08:35:04] Can't call getText on element with selector "#pt-userpage" because element wasn't found
[08:35:06] :-\
[08:35:08] really
[08:35:10] I should have built the container locally before cutting a new release
[08:35:12] grr
[09:17:14] awight: sorry for missing the problem
[09:17:46] With setup.cfg
[09:18:03] Anyway, can I do something to help, is 0.0.37 having issues too?
[09:28:09] kostajh: yeah it is broken :-\
[09:28:15] setting $wgServer messes up the session somehow
[09:28:19] causing a selenium test to fail bah
[09:28:23] https://phabricator.wikimedia.org/T235023
[09:36:43] I guess there is some confusion between http://localhost vs http://127.0.0.1
[09:36:50] which confuses the session id somehow
[09:38:58] Is it possible to rollback the $wgServer patch for now?
[09:52:18] probably yes
[09:52:27] eventually tracked it down to https://phabricator.wikimedia.org/T235023#5558660
[09:52:36] we set MW_SERVER=127.0.0.1:9412
[09:52:50] but have the installer use $wgServer=http://localhost:9412
[09:52:57] Very nice!
[09:53:03] so one logs in properly and the cookie is set for 127.0.0.1
[09:53:11] then Special:UserLogin redirects to http://localhost:9412
[09:53:14] and there is no cookie
[09:53:20] so that is an anonymous session :-\
[09:53:29] guess wgServer should be set to 127.0.0.1 instead
[09:53:30] bah
[09:53:48] Continuous-Integration-Infrastructure, MediaWiki-General, Quibble: Quibble should not rely on dynamically detecting the value of $wgServer - https://phabricator.wikimedia.org/T233140 (hashar) Setting $wgServer leads to a Selenium test to fail :-\ T235023
[09:54:03] Continuous-Integration-Infrastructure, MediaWiki-General, Quibble: Quibble should not rely on dynamically detecting the value of $wgServer - https://phabricator.wikimedia.org/T233140 (hashar)
[09:54:28] so hmm , needs some fixes in Quibble regarding install.php --server
[09:54:29] bah
[09:54:46] might handle that this afternoon. At least the new quibble containers are not deployed yet
[09:57:23] PROBLEM - Host deployment-cache-upload05 is DOWN: CRITICAL - Host Unreachable (172.16.6.210)
[09:59:03] (PS1) Hashar: Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023)
[09:59:08] will recheck later today. For now I gotta head out
[09:59:47] (CR) jerkins-bot: [V: -1] Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:02:46] (PS2) Awight: Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:03:10] (CR) Awight: "PS 2: fix test to match code" [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:03:50] Continuous-Integration-Config, Test-Coverage: phpunit:coverage-edit - Add configuration flag so it can replace phpunit-suite-edit.py - https://phabricator.wikimedia.org/T235031 (kostajh)
[10:06:25] (CR) Kosta Harlan: [C: +1] Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:06:26] RECOVERY - Host deployment-cache-upload05 is UP: PING OK - Packet loss = 0%, RTA = 2.77 ms
[10:09:10] (CR) Awight: [C: +2] Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:09:54] (Merged) jenkins-bot: Fix $wgServer to match MW_SERVER [integration/quibble] - https://gerrit.wikimedia.org/r/541769 (https://phabricator.wikimedia.org/T235023) (owner: Hashar)
[10:10:57] (CR) Daimona Eaytoy: [C: +1] jjb: Replace docker-ci-src-setup-mw with docker-zuul-cloner followed by docker-ci-src-setup-simple [integration/config] - https://gerrit.wikimedia.org/r/539987 (https://phabricator.wikimedia.org/T234062) (owner: Jforrester)
[10:24:25] PROBLEM - Host deployment-ms-be06 is DOWN: CRITICAL - Host Unreachable (172.16.7.115)
[10:24:39] PROBLEM - Host deployment-cumin02 is DOWN: CRITICAL - Host Unreachable (172.16.6.176)
[10:34:28] RECOVERY - Host deployment-ms-be06 is UP: PING OK - Packet loss = 0%, RTA = 2.87 ms
[10:34:33] RECOVERY - Host deployment-cumin02 is UP: PING OK - Packet loss = 0%, RTA = 3.74 ms
[10:37:28] PROBLEM - Host deployment-aqs03 is DOWN: CRITICAL - Host Unreachable (172.16.1.50)
[10:47:28] RECOVERY - Host deployment-aqs03 is UP: PING OK - Packet loss = 0%, RTA = 2.77 ms
[12:10:06] zeljkof: Fairly safe and concise change, if you feel like reviewing? https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/TwoColConflict/+/541256/
[12:11:30] awight: sure!
[12:12:10] (as I claimed with the previous two patches,) this might be the one that fixes everything :-)
[12:18:34] awight: I've noticed a different way of disabling VE: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/539942
[12:18:37] would that work for you?
[12:28:54] zeljkof: Nice find! replied in CR
[12:29:13] * zeljkof is looking...
[12:33:30] awight: as far as I can see, it doesn't break anything :D +2ed
[12:36:09] I agree that I'm probably missing some glaringly obvious way to disable VE...
[12:56:19] awight: daily job is green! https://integration.wikimedia.org/ci/job/selenium-daily-beta-TwoColConflict/
[12:57:44] /o
[12:57:53] /o\ \o/
[12:58:00] zeljkof: Thanks for all the help :-)
[12:58:13] :D
[15:19:24] Do we have any examples of test coverage analysis for C/C++?
[15:22:46] RECOVERY - Free space - all mounts on deployment-mwmaint01 is OK: OK: All targets OK
[15:33:43] https://gerrit-review.googlesource.com/c/gerrit/+/240194 - performance improvements in JGIT (with NoteDB)!
[15:33:55] cc thcipriani ^
[15:49:57] (CR) Jforrester: [C: +2] docker: [mediawiki-phan] Avoid unbound variable [integration/config] - https://gerrit.wikimedia.org/r/541779 (https://phabricator.wikimedia.org/T235049) (owner: Daimona Eaytoy)
[15:51:36] (Merged) jenkins-bot: docker: [mediawiki-phan] Avoid unbound variable [integration/config] - https://gerrit.wikimedia.org/r/541779 (https://phabricator.wikimedia.org/T235049) (owner: Daimona Eaytoy)
[15:55:09] !log Docker: [mediawiki-phan] Avoid unbound variable -> 0.5.3
[15:55:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:59:51] Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO (201910), Release Pipeline, Maps (Kartotherian): Deployment Pipeline fails with CPS error for Kartotherian - https://phabricator.wikimedia.org/T233316 (Jdforrester-WMF) Now you're back to the `lerna: not found` error we had before...
[16:19:03] (PS1) Jforrester: jjb: Upgrade phan jobs to 0.5.3 [integration/config] - https://gerrit.wikimedia.org/r/541846
[16:19:25] (CR) Jforrester: [C: +2] jjb: Upgrade phan jobs to 0.5.3 [integration/config] - https://gerrit.wikimedia.org/r/541846 (owner: Jforrester)
[16:21:28] (Merged) jenkins-bot: jjb: Upgrade phan jobs to 0.5.3 [integration/config] - https://gerrit.wikimedia.org/r/541846 (owner: Jforrester)
[16:23:57] (CR) Daimona Eaytoy: "Thanks!" [integration/config] - https://gerrit.wikimedia.org/r/541846 (owner: Jforrester)
[16:28:06] (CR) Jforrester: [C: +2] Add a sniff to ensure that setUp and tearDown have :void typehints [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:29:04] (Merged) jenkins-bot: Add a sniff to ensure that setUp and tearDown have :void typehints [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:29:34] (CR) jenkins-bot: Add a sniff to ensure that setUp and tearDown have :void typehints [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:29:49] James_F: thank you! I'll tag 0.29 shortly
[16:30:10] Cool.
[16:30:43] (CR) Lucas Werkmeister (WMDE): Add a sniff to ensure that setUp and tearDown have :void typehints (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:32:05] (CR) Daimona Eaytoy: Add a sniff to ensure that setUp and tearDown have :void typehints (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:32:07] (CR) Jforrester: Add a sniff to ensure that setUp and tearDown have :void typehints (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/541813 (https://phabricator.wikimedia.org/T192167) (owner: Daimona Eaytoy)
[16:33:31] hi again
[16:46:20] MediaWiki-Codesniffer, LibUp: Upgrade PHPCS to 28.0.0 in all repos - https://phabricator.wikimedia.org/T235113 (Daimona)
[16:50:44] (PS1) Jforrester: Start branching WebAuthn for wmf/ [tools/release] - https://gerrit.wikimedia.org/r/541855 (https://phabricator.wikimedia.org/T227242)
[16:56:10] (CR) Umherirrender: "I see there no problem when sniffs get suppressed at the begin and than worked on to get it enabled." [tools/codesniffer] - https://gerrit.wikimedia.org/r/540813 (owner: Umherirrender)
[17:10:10] (PS2) Hashar: Consistent http host [integration/quibble] - https://gerrit.wikimedia.org/r/541806
[17:12:47] (PS3) Hashar: Consistent http host [integration/quibble] - https://gerrit.wikimedia.org/r/541806
[17:12:49] (PS1) Hashar: changelog: begin new version cycle [integration/quibble] - https://gerrit.wikimedia.org/r/541860
[17:36:52] (PS1) Hashar: releasing: do run quibble before tagging [integration/quibble] - https://gerrit.wikimedia.org/r/541862
[17:51:18] !sal
[17:51:19] https://tools.wmflabs.org/sal/releng
[17:52:12] !log Tagged Quibble 0.0.38 @ d22e0f55b earlier today to fix $wgServer # T235023
[17:52:14] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:52:15] T235023: Selenium test "User should be able to log in @daily" fails with Quibble 0.0.37 - https://phabricator.wikimedia.org/T235023
[17:52:27] !log Build Quibble 0.0.38 CI Docker images
[17:52:29] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:55:10] Continuous-Integration-Config, Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Operations: Fix operations/puppet.git "rebase hell" - https://phabricator.wikimedia.org/T224033 (CDanis) Was this discussed during the Monday meeting? What was the outcome?
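The T235023 failure fixed by Quibble 0.0.38 above comes down to the login cookie being bound to one host spelling while the redirect goes to another. A minimal Python sketch of that mismatch follows, assuming a simplified "host-only" cookie rule (exact hostname match, port ignored); real browser cookie matching has more cases, and the URLs and function names here are illustrative only.

```python
from urllib.parse import urlsplit


def cookie_host(url: str) -> str:
    """Hostname a host-only cookie is bound to; the port is not part of it."""
    return urlsplit(url).hostname or ""


def cookie_is_sent(set_by_url: str, request_url: str) -> bool:
    """Simplified rule: the cookie is only sent back to the identical hostname."""
    return cookie_host(set_by_url) == cookie_host(request_url)


if __name__ == "__main__":
    # Illustrative URLs: login happens on the MW_SERVER host,
    # the redirect uses the $wgServer written by the installer.
    login_url = "http://127.0.0.1:9412/index.php?title=Special:UserLogin"
    redirect_url = "http://localhost:9412/index.php?title=Main_Page"

    # False here means the redirected request looks anonymous
    print(cookie_is_sent(login_url, redirect_url))
    # Same host on both sides: the session cookie would be sent
    print(cookie_is_sent(login_url, "http://127.0.0.1:9412/index.php"))
```

Although `localhost` and `127.0.0.1` normally reach the same server, the comparison is on the name as written, which is why keeping $wgServer and MW_SERVER identical matters.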
[17:55:33] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), Quibble: Create an integration test running Quibble with mediawiki/core - https://phabricator.wikimedia.org/T235118 (hashar)
[17:57:44] (PS2) Hashar: releasing: do run quibble before tagging [integration/quibble] - https://gerrit.wikimedia.org/r/541862 (https://phabricator.wikimedia.org/T235118)
[17:58:39] (CR) Hashar: "I have filled a task to add a test / CI job that actually runs Quibble: T235118 . PS2 points to the task." [integration/quibble] - https://gerrit.wikimedia.org/r/541862 (https://phabricator.wikimedia.org/T235118) (owner: Hashar)
[17:58:59] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), Quibble, Patch-For-Review: Create an integration test running Quibble with mediawiki/core - https://phabricator.wikimedia.org/T235118 (hashar) p:Triage→Normal
[18:15:34] (CR) Daimona Eaytoy: "> I see there no problem when sniffs get suppressed at the begin and" [tools/codesniffer] - https://gerrit.wikimedia.org/r/540813 (owner: Umherirrender)
[18:16:44] MediaWiki-Codesniffer, LibUp: Upgrade PHPCS to 28.0.0 in all repos - https://phabricator.wikimedia.org/T235113 (Daimona) As seen in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/541867/, another thing to do: - disable PSR12.Properties.ConstantVisibility.NotFound in .phpcs.xml if...
[18:20:25] (CR) Reedy: [C: +2] "Not gonna do any harm" [tools/release] - https://gerrit.wikimedia.org/r/541855 (https://phabricator.wikimedia.org/T227242) (owner: Jforrester)
[18:21:01] (Merged) jenkins-bot: Start branching WebAuthn for wmf/ [tools/release] - https://gerrit.wikimedia.org/r/541855 (https://phabricator.wikimedia.org/T227242) (owner: Jforrester)
[18:42:18] James_F: thank you again! Just a question: is it intentional not to add the PHP7.2 requirement in composer.json?
[18:43:05] Daimona: Do we normally do that? I can add it to my script, but I don't think I've seen it in extensions before? Normally we just do it in MediaWiki itself, don't we?
[18:43:35] Apparently, yes: https://codesearch.wmflabs.org/deployed/?q=%22php%22%3A&i=nope&files=composer%5C.json&repos=
[18:44:09] The new PHPCS will allow PHP72 features, and add void typehints to test, so...
[18:44:20] Huh. Yeah, we should probably fix all of those. Maybe drop them all?
[18:44:48] Or at least, drop where it's lower than 7.3.
[18:44:58] If an extension is using PHP 7.3+ features, that should remain.
[18:45:04] Heh, yes
[18:45:09] I don't know if dropping is good
[18:45:27] The other problem may be if some extensions are still supporting old PHP
[18:45:29] `require` rather than `platform`?
[18:45:46] Well, they have old branches to continue to try to support that.
[18:45:52] For instance, I know that Yaron's extensions still support PHP 5.3
[18:45:58] * James_F sighs.
[18:46:10] I wonder if we could somehow respect that.
[18:46:13] File a task and we can decide in slower time with more voices?
[18:46:56] About breaking old PHP compat or about removing the explicit PHP requirement?
[18:47:43] About deciding if we're removing or adding the item.
[18:47:49] Or ignoring it.
[18:48:12] E.g. the one for Vector is definitely wrong – it relies on MediaWiki code in 1.34, so it doesn't support PHP 5.3.
[18:48:58] Sure
[18:49:41] Eurgh. `Declaration of function setForceHTTPSCookie($set, ?\MediaWiki\Session\SessionBackend $backend, \WebRequest $request) should be compatible with function setForceHTTPSCookie(bool $set, ?\MediaWiki\Session\SessionBackend $backend = null, \WebRequest $request`
[18:49:57] I guess we're going to have to land the core one first and then unbreak repos? Fun.
[18:50:37] Heh, it depends. The one for nullables will require changing some signatures - not that many, see https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/541094/
[18:51:51] So, tomorrow I'm going to open a task for that (or feel free to open it now if you wish, I really cannot at the moment). As for nullables, I think that whatever we change first (i.e. core or extensions) will break the other. But as I said, hopefully there aren't many overridden methods
[18:53:06] Oh, actually, I've already checked all methods in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/541094/ and that's the only one which is overridden in an extension
[18:53:24] So it's easier than I thought
[18:53:42] Well, got to go now. Thanks again!
[19:12:41] Release-Engineering-Team-TODO, Operations, serviceops, wikitech.wikimedia.org, and 3 others: switch wikitech to PHP 7.2 - https://phabricator.wikimedia.org/T223393 (dduvall)
[19:13:04] Release-Engineering-Team-TODO, Operations, serviceops, wikitech.wikimedia.org, and 3 others: switch wikitech to PHP 7.2 - https://phabricator.wikimedia.org/T223393 (bd808) Spotted while using `eval.php` on labweb1002: we are currently missing the php7.2-ldap package there.
[19:17:25] Continuous-Integration-Config, Release-Engineering-Team (Unit & Int & System Tooling), MediaWiki-Core-Testing, Browser-Tests, and 2 others: Make MediaWiki Wdio tests less slow (Sept 2019) - https://phabricator.wikimedia.org/T234002 (Krinkle)
[19:23:32] Release-Engineering-Team-TODO, Operations, serviceops, wikitech.wikimedia.org, and 3 others: switch wikitech to PHP 7.2 - https://phabricator.wikimedia.org/T223393 (bd808) >>! In T223393#5560962, @bd808 wrote: > Spotted while using `eval.php` on labweb1002: we are currently missing the php7.2-lda...
[19:46:54] !log Upgrading deployment-restbase01.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803
[19:47:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:47:01] T200803: Test/evaluate Cassandra 3.11.4 for production upgrade - https://phabricator.wikimedia.org/T200803
[19:49:24] !log Upgrading deployment-restbase02.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803
[19:49:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:55:56] !log Upgrading deployment-sessionstore01.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803
[19:55:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:55:59] T200803: Test/evaluate Cassandra 3.11.4 for production upgrade - https://phabricator.wikimedia.org/T200803
[20:00:03] Project mwcore-phpunit-coverage-master build #226: FAILURE in 5 hr 0 min: https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/226/
[20:00:44] Beta-Cluster-Infrastructure, MediaWiki-extensions-CentralAuth, Security-Team, Beta-Cluster-reproducible, and 2 others: Beta Cluster cross-wiki login request would be blocked by CSP - https://phabricator.wikimedia.org/T211539 (Krinkle)
[20:01:47] !log gerrit set-account owl --active # T234328
[20:01:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:01:50] T234328: biterg.io Gerrit crawling probably stresses the server too much - https://phabricator.wikimedia.org/T234328
[20:12:57] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201910), Developer-Advocacy, wikimedia.biterg.io: biterg.io Gerrit crawling probably stresses the server too much - https://phabricator.wikimedia.org/T234328 (hashar) Sorry I have completely missed @Aklapp...
[20:26:30] ^^ forgot about that one :/
[20:32:45] that is to feed Gerrit data to the nice dashboard at https://wikimedia.biterg.io/
[21:32:19] Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO (201910), Release Pipeline, Maps (Kartotherian): Deployment Pipeline fails with CPS error for Kartotherian - https://phabricator.wikimedia.org/T233316 (dduvall) Looks like `lerna` is one of `devDependencies` and so `npm install --...
[21:34:27] thcipriani there are issues with gerrit2001 I think
[21:34:29] replication isn't working at all
[21:38:47] oh
[21:38:52] mutante ^
[21:38:57] we merged a change yesterday
[21:39:37] https://gerrit.wikimedia.org/r/monitoring?part=graph&graph=activeThreads
[21:39:45] the threads issue happened again?
[21:42:22] Gerrit, Operations: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135 (MarcoAurelio)
[21:42:43] Gerrit, Operations: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135 (MarcoAurelio) p:Triage→High
[21:43:15] i think https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/541386/ broke it?
[21:43:19] paladox: the renaming of the replication target?
[21:43:37] i'm thinking that, as gerrit appears to have been restarted this morning
[21:43:42] due to thread issues
[21:43:57] ack
[21:44:42] looking at replication log
[21:45:47] I've just killed a `gerrit replication [...] start --wait`
[21:45:48] it says the replication to gerrit-replica started.. and then "Cannot replicate" now let's see why
[21:46:04] hauskater: oh, where?
[21:46:24] mutante: via the CLI
[21:46:40] see T235135
[21:46:40] T235135: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135
[21:46:41] paladox: i suspected this to be the reason
[21:46:45] reject HostKey: gerrit-replica.wikimedia.org
[21:46:53] oh
[21:46:55] yup
[21:48:13] Gerrit, Operations: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135 (Dzahn) Broken by https://gerrit.wikimedia.org/r/c/operations/puppet/+/541386 when we renamed the replication target yesterday. root cause: reject HostKey: gerrit-replica.wikimedia.org as shown in replicati...
[21:49:24] Should we revert, i'm not sure how to fix that one.
[21:49:28] there are new phab repos using gerrit.wikimedia instead of gerrit-replica fwiw
[21:49:55] not yet, i am looking at the code
[21:51:12] ok
[21:52:04] 281 if $ssh_host_key != undef {
[21:52:04] 282 file { '/var/lib/gerrit2/review_site/etc/ssh_host_key':
[21:52:26] paladox: ok, you know what. yes, i will revert it for now. but simply because i dont have the time
[21:52:37] we can always do this again later the right way
[21:52:43] ok
[21:52:53] revert, puppet-merge, let's see if the error goes
[21:56:36] mutante: let me know when puppet finishes to merge so I can retry
[21:56:39] please
[21:57:07] gerrit will need a restart
[21:57:42] great
[21:57:47]
[21:59:01] paladox: when i ran ssh manually as gerrit2 user i already accepted the host key. gerrit does not check in that location though i guess
[21:59:16] mutante don't we store that in puppet?
[21:59:22] hauskater: yes, doing the restart [21:59:25] so the file would just get changed when puppet runs [21:59:30] (known_hosts) [21:59:55] oh mutante, forgot that you can restart it [22:00:04] yes, it just needs to be added to that, paladox [22:00:08] ok [22:00:11] well, being on ops I guess you can access all systems [22:00:18] the real deal [22:02:40] hauskater: now [22:02:43] i see 5 tasks [22:02:56] with "show-queue -w" [22:03:27] mutante: great, I'll attempt a new replication run [22:03:35] replication.log looks like working [22:03:43] it is already doing stuff [22:03:56] it is pushing to both github and 2001 [22:04:37] thanks! [22:04:38] it's not saying anything to me [22:05:19] ssh -p 29418 dzahn@gerrit.wikimedia.org gerrit show-queue -w [22:05:27] 3965 tasks [22:06:00] https://gerrit-replica.wikimedia.org/r/monitoring?part=graph&graph=usedMemory that's really wierd how the heap built up around when gerrit.w.org was pilling threads [22:06:13] from 5 to 4000 to 3700. looks fine to me [22:06:14] I'll let the queue clear then [22:06:23] ok [22:07:51] ssh -p 29418 gerrit.wikimedia.org replication start mediawiki/extensions/DiscussionTools --wait <-- server didn't reply to me so I guess it's overloaded or very busy [22:09:08] hauskater: i see it as waiting in the queue [22:09:20] DiscussionTools.git [22:09:33] perfect then [22:10:48] 10Gerrit, 10Operations: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135 (10Dzahn) replication.log shows it is replicating again and working on the backlog queue right now. [22:14:05] hauskater: should be already done. 
the first one is done and the second one is called "retry 1" that was your second attempt [22:15:31] mutante: alright, I'll see if I can get Phab Diffusion clear that error message [22:15:51] I guess now that it can fetch again from gerrit-replica it will vanish in the next update [22:16:17] ok [22:16:29] fsk [22:16:31] STDERR [22:16:31] fatal: your current branch 'master' does not have any commits yet [22:16:43] diffusion is high on drugs or something [22:18:53] was in a meeting: thanks for looking at this all. [22:20:11] thcipriani: now trying to get https://phabricator.wikimedia.org/diffusion/EDTO/manage/basics/ updated [22:20:25] * thcipriani looks [22:20:38] Hitting "Update Now" [22:20:44] but keeps erroring [22:21:34] obviously has content: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/DiscussionTools/ [22:21:36] cloning from gerrit-replica gives me: warning: You appear to have cloned an empty repository. [22:21:47] looks like it's still in the process of mirroring [22:22:00] > 33815f79 waiting .... 22:12:22.758 (retry 1) [68e0f4ef] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions/DiscussionTools.git [22:22:02] it still has 944 tasks left [22:22:09] but down from about 4k [22:22:19] it's fast mutante [22:22:34] yeah, it takes a bit, but not too terribly long in my experience: 10/15 minutes to catchup with everything after restart [22:22:36] thcipriani: that retry 1 was me doing a manual replication run [22:22:52] gotcha [22:22:53] there were 2 of that DiscussionTools.git task [22:22:59] and one is gone now [22:23:07] the first one was when the repo was created [22:23:15] the retry 1 was me [22:23:30] once the retry is clear in the queue, I'd expect to see the diffusion repo be happy again [22:23:37] probably we could have just added the key instead of reverting, but just to much going on [22:24:01] fair enough [22:24:53] server name in known hosts: something I should have considered when +1ing. 
Every solution is a problem :) [22:26:35] I completely forgot about that thcipriani :P [22:26:49] (that it needed to go into known_hosts file in puppet) [22:26:53] thanks hauskater for finding out the bug [22:26:57] :P [22:26:57] until today [22:27:07] hauskater lol [22:27:22] this time i should have done the restart right after merge.. oh well [22:29:10] mutante i have a change so that we have less to worry [22:29:16] when merging replication changes [22:29:16] i take blame also because i actually asked paladox to do this change just to avoid the hardcoded host name for the future. the good part should be we can probably keep the same key that way [22:29:30] the auto_reload change? [22:29:33] yup [22:29:44] though that can wait till next time as that'll need another restart [22:30:05] then we won't need to do anymore restarts :) (only to get replication to re read the config) [22:30:25] did the retry 1 finish too? [22:30:34] having the option to separate merge and restart is not bad though in other cases [22:30:38] * hauskater cannot see the replication.log [22:30:44] (I don't think I can) [22:30:58] not yet [22:31:35] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (201910), 10Release Pipeline, 10Maps (Kartotherian): Deployment Pipeline fails with CPS error for Kartotherian - https://phabricator.wikimedia.org/T233316 (10Mathew.onipe) @dduvall Thanks!. I removed the test stage also forced devdeps t... [22:31:35] hauskater: i am not looking in replication.log for that part. 
i run "gerrit show-queue -w" via gerrit ssh port [22:32:11] I'll see which error throws to me ;) [22:32:38] 197 [22:32:41] no error, wow [22:32:43] :D [22:32:54] yeh gerrit-managers can view that command [22:33:16] well, what happened now, it has 1000 new tasks again [22:33:28] index changes , heh [22:33:59] yup [22:34:28] "Get changes to reindex caused by refs/notes/review update of project mediawiki/core" [22:34:38] and they all got in the queue before your retry1 which existed first, heh [22:37:23] diffusion repo is now fully imported [22:37:26] finally [22:37:33] now I can go to sleep [22:37:55] hauskater: ok, great! good night then [22:38:51] Danke mutante, same for you when the time comes [22:40:32] Diffusion is busy publishing new commits to Phabricator as well, so it seems it's catching up too [22:41:49] 10Gerrit, 10Operations: replication/gerrit2001 issues - https://phabricator.wikimedia.org/T235135 (10MarcoAurelio) 05Open→03Resolved a:03Dzahn It looks everything is back to normal now. [22:42:02] Bitte. sorry for breaking it. it's under 100 tasks again [22:42:20] meanwhile installing php7.3 packages on buster server for phabricator now [22:42:37] You didn't. Paladox did (lol) [22:42:47] some other issues left like mailx package missing [22:42:47] :D [22:43:10] it's not like we deleted operations/puppet.git [22:43:12] xD [22:43:43] i should have tested after merge though [22:43:58] anyways.. it has another 1300 new tasks but seems normal [22:44:14] indexing changes in batches maybe? [22:44:25] it is index change in mw/core again [23:09:08] somebody remember why are we installing "subversion" package :) [23:09:24] on phab servers [23:12:44] mutante for the the old svn [23:12:45] repo [23:13:24] https://phabricator.wikimedia.org/diffusion/TSVN/ [23:15:14] to import it once or ..needed permanently? [23:17:18] needed permanently [23:17:30] since phabricator would run the svn commands on the repo. 
[23:18:26] ok, thanks [23:19:12] next issue: Package[wikimedia-lvs-realserver]/ensure: change from 'purged' to 'present' failed: [23:19:55] oh. this does not look too great: [23:20:01] Cloning into '/srv/deployment/phabricator/deployment-cache/cache/arcanist'.. [23:20:08] error: Unable to find f43c63ef5aaafaa8bf32ba4784d167a0448efd1a [23:20:23] Cannot obtain needed object [23:22:14] :O [23:22:30] twentyafterfour ^ [23:22:54] hmm [23:23:00] paladox: looks like the biggest blocker will be for now that we need wikimedia-lvs-realserver for buster [23:23:09] i gotta make a subtask [23:23:13] ok [23:23:53] twentyafterfour: hi. the phab role is now on phab1001. this is about the puppet run there cloning it the first time [23:23:54] is it a scap package? why is it cloning into arcanist? [23:24:52] 23:19:07 deploy-local failed: [23:24:56] RAN: /usr/bin/git submodule update --init --recursive --jobs 30 [23:25:20] Execution of '/usr/bin/scap deploy-local --repo phabricator/deployment -D log_json:False' [23:28:18] mutante: I just updated things on deploy1001, wanna try it again? [23:28:38] yep, running it! [23:29:23] looks like it did not change yet [23:29:27] it comes from Dependency Package[phabricator/deployment] has failures: [23:30:35] hmm [23:32:07] that scap::target has a: [23:32:09] 210 require => File['/usr/local/sbin/phab_deploy_finalize'], [23:32:36] maybe we have to run that [23:33:17] For it to say "error: Unable to find f43c63ef5aaafaa8bf32ba4784d167a0448efd1a " it means the object is not in the repo. [23:33:47] yea, that part is a bit worrying [23:34:22] it tries to update all submodules and this is when this happens [23:34:39] https://phabricator.wikimedia.org/source/arcanist/ [23:34:53] that object is certainly in the repo [23:34:58] twentyafterfour is it expected to have last updated in 2018? 
[23:35:14] https://phabricator.wikimedia.org/rARCf43c63ef5aaafaa8bf32ba4784d167a0448efd1a [23:35:16] it claims it is looking "under http://deploy1001.eqiad.wmnet/phabricator/deployment/.git/modules/arcanist" [23:35:21] http? [23:35:29] er [23:35:36] https://phabricator.wikimedia.org/rARCf43c63ef5aaafaa8bf32ba4784d167a0448efd1a [23:35:59] mutante: yes http that's how scap does it [23:36:04] ok [23:36:09] unfortunately [23:38:45] I made sure that commit is fetched onto the deployment server, there should be no reason for it to fail now [23:39:18] 10Release-Engineering-Team-TODO, 10Operations, 10serviceops, 10wikitech.wikimedia.org, and 2 others: switch wikitech to PHP 7.2 - https://phabricator.wikimedia.org/T223393 (10Dzahn) 05Open→03Resolved switched over by @andrew and @bd808 [23:41:13] twentyafterfour: unfortunately puppet still showing that error [23:41:34] Unable to find f43c63ef5aaafaa8bf32ba4784d167a0448efd1a [23:41:40] Unable to find fa50d1a5eaa7901d0f34125e190a5be52db6f8ce [23:42:13] the second one is from Cloning into '/srv/deployment/phabricator/deployment-cache/cache/libext/misc'... [23:45:11] hmm [23:45:12] we dont need to fix it right now. there are other issues that keep us from switching [23:45:19] just pointing out what we have open [23:45:26] this and the lvs package [23:46:49] twentyafterfour: maybe we did the submodule init manually last time? 
this is "/usr/bin/git submodule update --init --recursive --jobs 30" [23:47:02] though all that does not change what paladox said [23:47:22] mutante: maybe, but I'd like it to deploy automatically with scap so I'm trying to figure out why it won't [23:47:32] i am making a subtask for the lvs package we need [23:47:45] twentyafterfour: cool, yes [23:48:11] i don't think it's an issue with the submodule command [23:49:27] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, 10serviceops: package wikimedia-lvs-realserver for buster - https://phabricator.wikimedia.org/T235140 (10Dzahn) [23:49:43] paladox: it's trying to fetch those objects from the deployment server. The objects exist on the diffusion repo but somehow it's missing from deploy1001 or it's just not looking for the right thing [23:49:46] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, 10serviceops: package wikimedia-lvs-realserver for buster - https://phabricator.wikimedia.org/T235140 (10Dzahn) a:05Dzahn→03None [23:49:55] oh [23:50:07] twentyafterfour do the objects exist on deploy1001? [23:50:13] they should [23:50:20] I'm trying to debug that now [23:50:22] does deploy1001 use phabricator o gerrit for the repo? [23:50:23] *or [23:52:36] paladox: phabricator. but good thinking [23:52:38] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, and 2 others: Reimage both phab1001 and phab2001 to stretch / buster - https://phabricator.wikimedia.org/T190568 (10Dzahn) >>! In T190568#5320142, @MoritzMuehlenhoff wrote: >>>! In T190568#53193... 
[23:52:52] hmm [23:53:13] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, and 2 others: Reimage both phab1001 and phab2001 to stretch / buster - https://phabricator.wikimedia.org/T190568 (10Dzahn) @Muehlenhoff Currently moving to buster is blocked by T235140 [23:53:59] mutante https://phabricator.wikimedia.org/T190568 (since the db is read only in codfw, we won't be able to switch to it) [23:54:03] paladox: well, puppet says it can't run that command successfully. so in some way it must be an issue with that command. but maybe indirectly [23:54:06] we can run a read only replica [23:54:08] Unable to find keyholder key for phab_deploy [23:54:18] paladox: i know. see second comment after that [23:54:37] ok [23:54:49] twentyafterfour that'll be it :P [23:54:52] oh [23:55:00] hold on, did it even clone the repo correctly? [23:55:14] i cant continue working on this. it is 5pm and i was supposed to have OKRs :/ [23:55:30] it was just a status report :) [23:57:54] mutante: thanks you don't have to! [23:58:12] the status report is much appreciated though, [23:59:45] yep, very welcome. wanted you to know there is movement. thanks for looking at it