[00:09:14] Deployments, Release-Engineering-Team: Make "scap lock" easy to use and adopt as standard practice - https://phabricator.wikimedia.org/T234407 (Krinkle)
[00:09:29] Deployments, Release-Engineering-Team: Make "scap lock" easy to use and adopt as standard practice - https://phabricator.wikimedia.org/T234407 (Krinkle)
[03:07:31] !log Fix "setup quibble mw-install" Jenkins Console section to not eat all output on new quibble-selenium jobs (T232759)
[03:07:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[03:07:35] T232759: Move CI selenium/qunit tests of mediawiki repository to a standalone job - https://phabricator.wikimedia.org/T232759
[06:31:23] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), HHVM, Patch-For-Review: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (Joe) FTR, we did remove hhvm from production meaning we're not serving traffic with it, and we won't go back anymore.
[06:32:27] Beta-Cluster-Infrastructure, Puppet: Puppet fail on deployment-mediawiki-07, missing private hiera variable - https://phabricator.wikimedia.org/T210497 (mobrovac) Open→Resolved a:fgiunchedi Puppet is running fine there, closing.
[06:35:01] Beta-Cluster-Infrastructure, Operations, Traffic, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (mobrovac)
[07:03:12] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Neolexx) >>! In T233561#5537361, @Tgr wrote: > In general autoreviews are hard to verify since you have to...
[07:12:18] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (MBH) To all discussion participants: don't listen Neolexx. This user can't understand Flagged Revs mechani...
[07:53:10] (PS1) Hashar: Raise timeout of operations-puppet-test to 5 minutes [integration/config] - https://gerrit.wikimedia.org/r/540358
[07:53:45] (CR) Hashar: [C: +2] "INFO:jenkins_jobs.builder:Reconfiguring jenkins job operations-puppet-tests-stretch-docker" [integration/config] - https://gerrit.wikimedia.org/r/540358 (owner: Hashar)
[07:56:51] (Merged) jenkins-bot: Raise timeout of operations-puppet-test to 5 minutes [integration/config] - https://gerrit.wikimedia.org/r/540358 (owner: Hashar)
[08:36:54] (PS1) Hashar: Revert "Move mwext-codehealth jobs to a dedicated label" [integration/config] - https://gerrit.wikimedia.org/r/540362
[08:37:16] (CR) Hashar: [C: +2] Revert "Move mwext-codehealth jobs to a dedicated label" [integration/config] - https://gerrit.wikimedia.org/r/540362 (owner: Hashar)
[08:38:34] (PS1) Hashar: Revert "Tie mwcore-phpunit-coverage-master to a Jessie docker" [integration/config] - https://gerrit.wikimedia.org/r/540363
[08:38:44] (CR) Hashar: [C: +2] Revert "Tie mwcore-phpunit-coverage-master to a Jessie docker" [integration/config] - https://gerrit.wikimedia.org/r/540363 (owner: Hashar)
[08:40:21] (Merged) jenkins-bot: Revert "Move mwext-codehealth jobs to a dedicated label" [integration/config] - https://gerrit.wikimedia.org/r/540362 (owner: Hashar)
[08:41:24] (Merged) jenkins-bot: Revert "Tie mwcore-phpunit-coverage-master to a Jessie docker" [integration/config] - https://gerrit.wikimedia.org/r/540363 (owner: Hashar)
[08:45:13] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Neolexx) >>! In T233561#5540276, @MBH wrote: > Don't listen him and his false statements, for example abou...
[08:50:29] Beta-Cluster-Infrastructure, DNS, Operations, Traffic, and 4 others: Ferm's upstream Net::DNS Perl library questionable handling of NOERROR responses without records causing puppet errors when we try to @resolve AAAA in labs - https://phabricator.wikimedia.org/T153468 (MoritzMuehlenhoff) Open...
[09:13:12] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), serviceops, PHP 7.2 support, Test-Coverage: Upgrade our php-xdebug package for php7.2 - https://phabricator.wikimedia.org/T234418 (hashar)
[09:21:47] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), serviceops, PHP 7.2 support, Test-Coverage: Upgrade our php-xdebug package for php7.2 - https://phabricator.wikimedia.org/T234418 (hashar) perf for the php process (which runs under a Docker stretch conta...
[10:29:08] Beta-Cluster-Infrastructure, Operations, Traffic, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (ema) p:Triage→Normal
[10:31:35] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Quibble, and 3 others: CI: Create a way to share a secret between MediaWiki and the testing framework. - https://phabricator.wikimedia.org/T233092 (LarsWirzenius) I'm not familiar with...
[10:36:42] Beta-Cluster-Infrastructure, Operations, Traffic, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (Vgutierrez) This is caused by adding the ATS-TLS instance to the text cluster. So you need to provide a valid configuration for the ats-tls profile. See:...
[10:43:01] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Quibble, and 3 others: CI: Create a way to share a secret between MediaWiki and the testing framework. - https://phabricator.wikimedia.org/T233092 (daniel) >>! In T233092#5540581, @Lars...
[10:57:51] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), User-zeljkofilipin: Upgrade webdriverio to version 5 for all repositories - https://phabricator.wikimedia.org/T234314 (zeljkofilipin) Open→Stalled Blocked on T234002.
[10:59:38] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), User-zeljkofilipin: Upgrade webdriverio to version 5 for all repositories - https://phabricator.wikimedia.org/T234314 (zeljkofilipin) @Krinkle Thanks for the warning, I've assumed wdio-mediawiki was update...
[11:00:15] Beta-Cluster-Infrastructure, Operations, Traffic, Core Platform Team Workboards (Clinic Duty Team), Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (mobrovac) Open→Resolved a:mobrovac As per @Vgutierrez' instructions, I looked up the ATS...
[11:43:29] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (MBH) Of course, this bug should never happen for users with "reviewer" rights, 'cause it happens only for...
[11:56:16] (PS1) Awight: Parameter to skip npm install [integration/quibble] - https://gerrit.wikimedia.org/r/540387 (https://phabricator.wikimedia.org/T225008)
[12:02:34] (PS1) Awight: Skip `npm install` for PHPUnit coverage jobs [integration/config] - https://gerrit.wikimedia.org/r/540389 (https://phabricator.wikimedia.org/T225008)
[12:07:00] (CR) WMDE-Fisch: [C: +1] Reenable TwoColConflict browser tests [integration/config] - https://gerrit.wikimedia.org/r/540098 (https://phabricator.wikimedia.org/T234311) (owner: Awight)
[12:07:30] (CR) WMDE-Fisch: [C: +1] "...and the dependency is merged. Hope this works now." [integration/config] - https://gerrit.wikimedia.org/r/540098 (https://phabricator.wikimedia.org/T234311) (owner: Awight)
[12:23:50] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201909), User-zeljkofilipin: Update existing Selenium documentation - https://phabricator.wikimedia.org/T232598 (zeljkofilipin)
[12:37:44] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201909), User-zeljkofilipin: Update existing Selenium documentation - https://phabricator.wikimedia.org/T232598 (zeljkofilipin)
[12:38:36] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201909), User-zeljkofilipin: Update existing Selenium documentation - https://phabricator.wikimedia.org/T232598 (zeljkofilipin)
[12:38:53] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201909), User-zeljkofilipin: Update existing Selenium documentation - https://phabricator.wikimedia.org/T232598 (zeljkofilipin) Open→Resolved
[12:59:34] Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO (201910), MediaWiki-extensions-CentralAuth: [betalabs] memcached listens solely on 127.0.0.1 (was: Cannot create a new user account) - https://phabricator.wikimedia.org/T232796 (zeljkofilipin)
[13:01:29] Continuous-Integration-Config, Release-Engineering-Team (Unit & Int & System Tooling), MediaWiki-Core-Testing, Browser-Tests, and 3 others: Make MediaWiki Wdio tests less slow (Sept 2019) - https://phabricator.wikimedia.org/T234002 (kostajh) Have folks considered using `child_process.exec()` to r...
[13:04:02] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Neolexx) As so far there are very few of "of course" things in it. Let's us take fact that "patrolling" is...
[13:11:31] Release-Engineering-Team (Local Dev), Release-Engineering-Team-TODO (201910), local-charts, User-zeljkofilipin: Error: error installing: the server could not find the requested resource - https://phabricator.wikimedia.org/T233960 (zeljkofilipin) a:zeljkofilipin→None
[13:25:04] Phabricator: Reset 2FA for Phabricator account `Apap04` - https://phabricator.wikimedia.org/T234233 (Aklapper) @Apap04: The basic question is: In which way do you think you could verify that you are who you state that you are? For copying files, probably `scp` but that is out of scope for this task.
[13:37:50] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Neolexx) >>! In T233561#5540891, @Neolexx wrote: > but only `reviewer` may set it to 2, 3 or 1001. OK, pur...
[13:38:56] Project mwcore-phpunit-coverage-master build #211: FAILURE in 5 hr 0 min: https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/211/
[13:48:32] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Neolexx) Fresh to whatever UTC-related time (you figure out) problematic article states: https://ru.wikipe...
[13:55:51] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin: Selenium tests should be easier to run - https://phabricator.wikimedia.org/T182691 (zeljkofilipin)
[13:55:56] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, Documentation, and 2 others: Blog posts about new Selenium framework features - https://phabricator.wikimedia.org/T191982 (zeljkofilipin)
[13:56:00] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, Browser-Tests, and 2 others: Usage instructions in tests/selenium/README.md are confusing - https://phabricator.wikimedia.org/T214708 (zeljkofilipin)
[13:56:02] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin: Update files required by Selenium in core, extensions and skins - https://phabricator.wikimedia.org/T210726 (zeljkofilipin)
[13:56:10] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, Patch-For-Review, User-zeljkofilipin: Run tests daily targeting beta cluster for all repositories with Selenium tests - https://phabricator.wikimedia.org/T188742 (zeljkofilip...
[13:56:22] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin, good first bug: All repositories with Selenium tests should use wdio-mediawiki - https://phabricator.wikimedia.org/T199113 (zeljkofilipin)
[13:56:37] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), Ruby, User-zeljkofilipin: Mark mediawiki_api and mediawiki_selenium Ruby gems as deprecated - https://phabricator.wikimedia.org/T228160 (zeljkofilipin)
[13:56:49] (PS1) Awight: Log command versions [integration/quibble] - https://gerrit.wikimedia.org/r/540413 (https://phabricator.wikimedia.org/T181942)
[13:57:02] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, MediaWiki-Vagrant, User-zeljkofilipin: mediawiki/core Selenium tests fail when targeting mediawiki/vagrant VM - https://phabricator.wikimedia.org/T233820 (zeljkofilipin)
[13:57:47] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), TCB-Team, Two-Column-Edit-Conflict-Merge, and 5 others: Fix and restore daily browser tests for TwoColConflict - https://phabricator.wikimedia.org/T234311 (zeljkofilipin) p:Triage→Normal
[14:08:37] Release-Engineering-Team-TODO (201909), Release Pipeline: Update PipelineLib Base Image to use WMF-registry image - https://phabricator.wikimedia.org/T230426 (thcipriani) Open→Resolved a:thcipriani→dduvall @dduvall did this last week! ` version: v4 base: docker-registry.wikimedia.org/rel...
[14:09:12] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201910): Upgrade Gerrit to 2.15.17 - https://phabricator.wikimedia.org/T229110 (thcipriani)
[14:10:59] Deployments, Release-Engineering-Team-TODO (201910), User-MModell: Automate the recurring management of wikitech:Deployments and phab:#train_deployments - https://phabricator.wikimedia.org/T114488 (thcipriani)
[14:11:11] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), TCB-Team, Two-Column-Edit-Conflict-Merge, and 5 others: Fix and restore daily browser tests for TwoColConflict - https://phabricator.wikimedia.org/T234311 (zeljkofilipin) > We understand what failed a...
[14:12:00] Release-Engineering-Team-TODO, Scap: SCAP python error on successful deploy - https://phabricator.wikimedia.org/T233644 (thcipriani)
[14:13:14] Gerrit, Release-Engineering-Team-TODO, Documentation: Update Gerrit documentation on mediawiki.org before upgrading to Gerrit 2.16.x / PolyGerrit UI - https://phabricator.wikimedia.org/T227562 (thcipriani)
[14:15:56] MediaWiki-Releasing, Release-Engineering-Team-TODO (201910), Core Platform Team, MW-1.34-notes, MW-1.34-release: Branch REL1_34 for MediaWiki and deployed extensions - https://phabricator.wikimedia.org/T232024 (thcipriani) a:dduvall Tentatively assigning to @dduvall following IRC discussi...
[14:16:18] MediaWiki-Releasing, Release-Engineering-Team-TODO (201910), Core Platform Team, MW-1.34-notes, MW-1.34-release: Branch REL1_34 for MediaWiki and deployed extensions - https://phabricator.wikimedia.org/T232024 (thcipriani) p:Triage→Normal
[14:19:22] Release-Engineering-Team-TODO: Request Sauce Labs access for niedzielski - https://phabricator.wikimedia.org/T206358 (thcipriani) a:Niedzielski >>! In T206358#5467853, @zeljkofilipin wrote: > @Niedzielski do you have the same problem as @Etonkovidova? Or do you get the error message all the time? Assign...
[14:21:34] Release-Engineering-Team-TODO: Request Sauce Labs access for niedzielski - https://phabricator.wikimedia.org/T206358 (zeljkofilipin) I've been using Sauce Labs recently and I had no trouble. Let me know if the problem is still there.
[14:29:26] (PS3) Zfilipin: Reenable TwoColConflict browser tests [integration/config] - https://gerrit.wikimedia.org/r/540098 (https://phabricator.wikimedia.org/T234311) (owner: Awight)
[14:36:12] Phabricator: Herald rule: KaiOS -> Inuka - https://phabricator.wikimedia.org/T234217 (SBisson) Resolved→Open @Aklapper something is not quite right with the rule... See activity history on T234438
[14:41:34] (CR) Zfilipin: [C: +2] "Deployed job selenium-daily-beta-TwoColConflict" [integration/config] - https://gerrit.wikimedia.org/r/540098 (https://phabricator.wikimedia.org/T234311) (owner: Awight)
[14:42:59] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin, good first bug: All repositories with Selenium tests should use wdio-mediawiki - https://phabricator.wikimedia.org/T199113 (zeljkofilipin)
[14:44:15] (Merged) jenkins-bot: Reenable TwoColConflict browser tests [integration/config] - https://gerrit.wikimedia.org/r/540098 (https://phabricator.wikimedia.org/T234311) (owner: Awight)
[14:46:10] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), TCB-Team, Two-Column-Edit-Conflict-Merge, and 5 others: Fix and restore daily browser tests for TwoColConflict - https://phabricator.wikimedia.org/T234311 (zeljkofilipin) [[ https://integration.wikimed...
[14:49:12] zeljkof: Thanks for the test deployment! We received the failure email loud and clear... I'm looking at it now, seems to be something exciting and possibly cross-browser related.
[14:50:04] awight__: I'm glad I could help :) let me know if you have questions or need further help
[14:51:44] zeljkof: In order to test fixes, is it safe to "rebuild last"? It looks like the job will pick up any new changes on master.
[14:52:21] awight: or just build
[14:52:35] Great!
[14:52:42] there are even docs for that :) https://www.mediawiki.org/wiki/Selenium/Node.js/selenium-daily-SITE-EXTENSION_Jenkins_job
[14:52:56] Very futuristic ;-)
[14:53:19] just updated, I think today
[14:53:41] you can also test locally, if they pass on your machine, they should 99.99% work in CI
[14:54:07] the docs for running tests locally targeting beta cluster: https://www.mediawiki.org/wiki/Selenium/Node.js/Target_beta_cluster
[14:54:59] Well the odd thing is that CI passes here, https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/70697/console
[14:55:27] Release-Engineering-Team, MediaWiki-User-management, MediaWiki-extensions-FlaggedRevs, User-DannyS712: Pending changes: autoreview randomly fails - https://phabricator.wikimedia.org/T233561 (Aklapper) In general, please follow the [Phabricator etiquette](https://www.mediawiki.org/wiki/Bug_managem...
[14:55:55] Ah. Right, if it targets the beta cluster then I bet it's just something about the configuration there.
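The local run against the beta cluster mentioned at 14:53:41 and 14:54:07 could be prepared roughly as follows. This is only a sketch: the variable names (MW_SERVER, MW_SCRIPT_PATH, MEDIAWIKI_USER, MEDIAWIKI_PASSWORD), the beta URL, and the npm script name are assumptions for illustration and do not appear in this log; the Target_beta_cluster page linked above is the place to check the real values.

```python
import os

# Assumed values: none of these names or URLs come from the log itself.
BETA_OVERRIDES = {
    "MW_SERVER": "https://en.wikipedia.beta.wmflabs.org",
    "MW_SCRIPT_PATH": "/w",
}


def beta_env(user, password, base=None):
    """Return a copy of the environment with the target-site variables overridden."""
    env = dict(os.environ if base is None else base)
    env.update(BETA_OVERRIDES)
    env["MEDIAWIKI_USER"] = user
    env["MEDIAWIKI_PASSWORD"] = password
    return env


def selenium_command():
    """Assumed npm script name; check the repository's package.json."""
    return ["npm", "run", "selenium-test"]


if __name__ == "__main__":
    env = beta_env("ExampleUser", "example-password", base={})
    # Show which site the tests would target.
    print(env["MW_SERVER"] + env["MW_SCRIPT_PATH"])
    # To actually run the tests from a repository checkout:
    #   subprocess.run(selenium_command(), env=beta_env(user, password), check=True)
```

Building the environment as a copy keeps the shell's own variables untouched, so switching back to a local wiki is just a matter of not applying the overrides.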
[14:58:25] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin, good first bug: All repositories with Selenium tests should use wdio-mediawiki - https://phabricator.wikimedia.org/T199113 (zeljkofilipin)
[14:58:50] awight: yes, mediawiki at CI and beta cluster are not the same
[14:58:56] in some cases, far from it
[14:59:20] good thing is that you can test it from your machine, it's faster than merging into master, then running the job
[14:59:29] all you need to do is change a few environment variables
[15:09:13] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), MediaWiki-Core-Testing, User-zeljkofilipin, good first bug: All repositories with Selenium tests should use wdio-mediawiki - https://phabricator.wikimedia.org/T199113 (zeljkofilipin)
[15:31:03] Phabricator: Reset 2FA for Phabricator account `Apap04` - https://phabricator.wikimedia.org/T234233 (Apap04) I could record a video of myself proving my identity. I'll have to email it to you (aklapper) because of privacy issues.
[15:50:01] Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO, Wikimedia-Logstash, observability, Patch-For-Review: logstash-beta.wmflabs.org does not receive any mediawiki events - https://phabricator.wikimedia.org/T233134 (herron) Open→Resolved a:herron This occurred in prod as wel...
[16:00:17] Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO (201909), Release Pipeline, Maps (Kartotherian): Deployment Pipeline fails with CPS error for Kartotherian - https://phabricator.wikimedia.org/T233316 (thcipriani) >>! In T233316#5536374, @Mathew.onipe wrote: > Post merge builds s...
[16:14:27] (CR) Brennen Bearnes: "Was about to click "abandon" on this one, then wondered if Dduvall might want anything here. The comments, at least, seem mildly useful.." [integration/pipelinelib] - https://gerrit.wikimedia.org/r/518333 (https://phabricator.wikimedia.org/T225335) (owner: Brennen Bearnes)
[16:26:10] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201910), Developer-Advocacy, wikimedia.biterg.io: biterg.io Gerrit crawling probably stresses the server too much - https://phabricator.wikimedia.org/T234328 (Aklapper) Bitergia asked internally: > can you...
[16:27:09] Phabricator: acl*phabricator description update - https://phabricator.wikimedia.org/T174279 (MarcoAurelio) Open→Resolved a:MarcoAurelio
[16:27:28] Phabricator: acl*phabricator description update - https://phabricator.wikimedia.org/T174279 (MarcoAurelio) a:MarcoAurelio→None Part done by Andre, part by me.
[16:53:53] random question… I recently noticed that I can log into Jenkins using my Wikitech account, and if I do, there’s an option to “keep this build forever”
[16:54:09] this sounds quite useful, but I haven’t heard of it before, so I’m wondering if it’s discouraged for some reason / I shouldn’t be using it? :)
[16:57:37] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201910), Developer-Advocacy, wikimedia.biterg.io: biterg.io Gerrit crawling probably stresses the server too much - https://phabricator.wikimedia.org/T234328 (thcipriani) >>! In T234328#5541545, @Aklapper...
[17:16:09] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910): Bump CI php72 images given we've moved production to 7.2.22 - https://phabricator.wikimedia.org/T232165 (Jdforrester-WMF)
[17:16:39] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), MobileFrontend, Readers-Web-Backlog (Tracking): Javascript test failures on REL1_31 / REL1_32 / REL1_33 - https://phabricator.wikimedia.org/T230454 (Jdforrester-WMF)
[17:17:02] Phabricator, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Operations, and 2 others: Prepare Phame to support heavy traffic for a Tech Department blog - https://phabricator.wikimedia.org/T226044 (Jdforrester-WMF)
[17:18:02] Release-Engineering-Team-TODO (201910), User-greg, Wikimedia-extension-review-queue: Investigate and make improvements to the extension review process - https://phabricator.wikimedia.org/T195244 (Jdforrester-WMF) @greg, is this still actively waiting to be done or should we move this up to TODO?
[17:20:17] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO: Collect and expose Jenkins build metrics for visualization, reporting, and analysis - https://phabricator.wikimedia.org/T205927 (dduvall)
[17:20:19] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Operations, and 2 others: Add Prometheus exporter to Jenkins instances - https://phabricator.wikimedia.org/T182759 (dduvall) Open→Stalled Work on this has stalled. I've...
[17:21:42] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Operations, and 2 others: Add Prometheus exporter to Jenkins instances - https://phabricator.wikimedia.org/T182759 (dduvall) Stalled→Declined Marking this as "declined"...
[17:21:44] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO: Collect and expose Jenkins build metrics for visualization, reporting, and analysis - https://phabricator.wikimedia.org/T205927 (dduvall)
[17:22:25] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO: TEC3:O1:O1.2 Goal – Formalize the collection of CI infrastructure and tooling metrics - https://phabricator.wikimedia.org/T205923 (dduvall) a:dduvall→None
[17:28:56] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Fundraising-Backlog: Create composer-test-php70 docker image for fundraising tech's crm tests - https://phabricator.wikimedia.org/T230446 (Jdforrester-WMF) Hey, happy to help ou...
[17:28:58] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201909), Test-Coverage: mwcore-phpunit-coverage-master times out after 5 hours - https://phabricator.wikimedia.org/T232706 (hashar) There is another issue, the builds that run o...
[17:31:08] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), Fundraising-Backlog: Create composer-test-php70 docker image for fundraising tech's crm tests - https://phabricator.wikimedia.org/T230446 (Jdforrester-WMF)
[17:31:32] Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO (201909), Release Pipeline, Maps (Kartotherian): Deployment Pipeline fails with CPS error for Kartotherian - https://phabricator.wikimedia.org/T233316 (dduvall) >>! In T233316#5541473, @thcipriani wrote: >>>! In T233316#5536374, @...
[17:33:33] Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO, Wikimedia-Logstash, observability: logstash-beta.wmflabs.org does not receive any mediawiki events - https://phabricator.wikimedia.org/T233134 (hashar) Resolved→Open Reopening since: * we still have to fix deployment-logstash03 (...
[17:45:38] Continuous-Integration-Config, Quibble, Patch-For-Review: Run composer --version in CI jobs - https://phabricator.wikimedia.org/T181942 (hashar) a:awight
[17:47:52] PROBLEM - Puppet staleness on integration-agent-docker-1008 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[17:47:52] PROBLEM - Puppet staleness on integration-agent-docker-1004 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[17:47:52] PROBLEM - Parsoid on deployment-mediawiki-parsoid10 is CRITICAL: connect to address 172.16.0.141 and port 8000: Connection refused
[17:47:52] PROBLEM - Parsoid on deployment-parsoid09 is CRITICAL: connect to address 172.16.5.63 and port 8000: Connection refused
[17:54:57] some puppet failures :\
[17:57:20] !log Fixed puppet ssl issue on integration-agent-docker-1004 and integration-agent-docker-1008 ( rm -fR /var/lib/puppet/ssl )
[17:57:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:07:26] RECOVERY - Puppet staleness on integration-agent-docker-1004 is OK: OK: Less than 1.00% above the threshold [3600.0]
[18:09:22] RECOVERY - Puppet staleness on integration-agent-docker-1008 is OK: OK: Less than 1.00% above the threshold [3600.0]
[18:10:43] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201907): Investigate gerrit session expiration - https://phabricator.wikimedia.org/T222472 (hashar) For reference, with 1 day of uptime: ` Gerrit Code Review 2.15.14-16-g855b179b5f now 18:08:48 UT...
[18:11:20] Phabricator: Reset 2FA for Phabricator account `Apap04` - https://phabricator.wikimedia.org/T234233 (Aklapper) @Apap04: Okay, but how does an "identity" (not sure what that is) directly verify that Apap04 is the mediawiki.org/Phab username of that person?
[18:13:58] Phabricator: Herald rule: KaiOS -> Inuka - https://phabricator.wikimedia.org/T234217 (Aklapper) @sbisson: Looking at T234438 it behaves as specified. Maybe that's not what you wanted though. :P What would you like it to do? Shall I change `every time` to `only the first time`? Or exclude it from being trigge...
[18:17:57] Phabricator: Reset 2FA for Phabricator account `Apap04` - https://phabricator.wikimedia.org/T234233 (Apap04) My real life identity. This is one way that I know of that I can verify that I'm "Apap04".
[18:21:02] hashar: Would you be able to test https://gerrit.wikimedia.org/r/c/integration/config/+/539987 ? I'm not sure how.
[18:23:00] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), Fundraising-Backlog: Create composer-test-php70 docker image for fundraising tech's crm tests - https://phabricator.wikimedia.org/T230446 (Jdforrester-WMF) a:Jdforre...
[18:31:06] James_F: hmm
[18:31:38] one way is to manually replay the new commands
[18:31:51] another way is to create a temporary job and trigger it manually
[18:32:11] eg name: jamestest-mediawiki-core-php72-phan-docker
[18:32:13] Eurgh. I guess I could.
[18:32:18] which would have the new config
[18:32:20] Right now I'm dropping HHVM. :-)
[18:32:45] then trigger the build in jenkins and add ZUUL_PROJECT=whatever ZUUL_URL=https://gerrit.wikimedia.org/r/p ZUUL_BRANCH=master ZUUL_REF=master
[18:33:12] but i can surely review it ;]
[18:34:11] (PS6) Jforrester: layout: Drop HHVM jobs from -wmf branches [integration/config] - https://gerrit.wikimedia.org/r/534520 (https://phabricator.wikimedia.org/T234384)
[18:34:34] hashar thcipriani i think we should raise the web session cache, thoughts? (i think a few restarts ago, i was logged out)
[18:39:13] (PS7) Jforrester: layout: Drop HHVM jobs from -wmf branches [integration/config] - https://gerrit.wikimedia.org/r/534520 (https://phabricator.wikimedia.org/T234384)
[18:41:46] paladox: thcipriani: so now that we have support for openjdk8 on buster.. am i reinstalling gerrit1001 with buster again?
[18:41:58] because it is stretch ..i had reinstalled
[18:42:11] I would say yes, let's reinstall! (may as well)
[18:42:14] but that was when we only had opendjk 11
[18:42:38] the connector we can download
[18:42:45] and put it into the lib folder
[18:42:51] so nothing blocks the upgrade now
[18:42:56] ok, then let me do that and not merge the role change once again :)
[18:43:02] ok, good
[18:43:53] (PS8) Jforrester: layout: Drop HHVM jobs from -wmf branches [integration/config] - https://gerrit.wikimedia.org/r/534520 (https://phabricator.wikimedia.org/T234384)
[18:45:30] It is massively unhelpful to not be able to have a local copy of zuul and so endlessly rely on gerrit to actually do the CI for my layout changes. :-(
[18:47:19] paladox: of course this means repeating the rsync one more time but wasn't that long and already pays off we puppetized it
[18:47:41] yup
[18:47:54] mutante thanks for puppitizing that! :)
[18:48:28] we shall see if it works without reverting it to role::spare first for reinstall
[18:48:39] Gah, I think I might just drop all the HHVM jobs together.
[18:49:54] (PS6) Jforrester: layout: Drop HHVM testing from all quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/534521 (https://phabricator.wikimedia.org/T234384)
[18:49:59] (Abandoned) Jforrester: layout: Drop HHVM jobs from -wmf branches [integration/config] - https://gerrit.wikimedia.org/r/534520 (https://phabricator.wikimedia.org/T234384) (owner: Jforrester)
[18:51:49] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), HHVM, Patch-For-Review: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (Jdforrester-WMF) In that case, let's proceed.
[18:52:16] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201910), HHVM, Patch-For-Review: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (Jdforrester-WMF) Stalled→Open
[18:52:35] (CR) Jforrester: [C: +2] layout: Drop HHVM testing from all quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/534521 (https://phabricator.wikimedia.org/T234384) (owner: Jforrester)
[18:52:36] hashar: Whee.
[18:54:14] (03Merged) 10jenkins-bot: layout: Drop HHVM testing from all quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/534521 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [18:55:06] !log Zuul: Drop HHVM testing from all quibble jobs T234384 [18:55:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:55:09] T234384: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 [18:55:43] (03PS5) 10Jforrester: layout: Collapse -nohhvm jobs into their base as we've dropped that [integration/config] - 10https://gerrit.wikimedia.org/r/534522 (https://phabricator.wikimedia.org/T234384) [18:56:16] (03CR) 10Jforrester: [C: 03+2] layout: Collapse -nohhvm jobs into their base as we've dropped that [integration/config] - 10https://gerrit.wikimedia.org/r/534522 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [18:56:31] (03PS4) 10Jforrester: layout: Drop all HHVM testing except for Fundraising [integration/config] - 10https://gerrit.wikimedia.org/r/534523 (https://phabricator.wikimedia.org/T234384) [18:59:29] (03Merged) 10jenkins-bot: layout: Collapse -nohhvm jobs into their base as we've dropped that [integration/config] - 10https://gerrit.wikimedia.org/r/534522 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [19:01:01] !log Zuul: Drop -nohhvm jobs, no longer needed [19:01:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:09:11] James_F: do you have a moment? i'm trying to figure out why debug logging i added doesn't work in production (https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/540428). can we push a debugging-the-debugging patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/540464 to mwdebug1002 or something? [19:09:33] MatmaRex: Sure, let me just finish this commit. 
[19:09:34] i'm sure that the code in 540428 is correct, it works as expected for me locally [19:09:59] thanks. no hurry [19:12:12] (03PS1) 10Jforrester: jjb: Drop unused HHVM jobs [integration/config] - 10https://gerrit.wikimedia.org/r/540467 (https://phabricator.wikimedia.org/T234384) [19:13:56] (03CR) 10Jforrester: [C: 03+2] layout: Drop all HHVM testing except for Fundraising [integration/config] - 10https://gerrit.wikimedia.org/r/534523 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [19:15:03] James_F: I am going to have dinner and will head to bed after that [19:15:11] hashar: No worries. [19:15:15] Speak tomorrow. [19:15:32] James_F: be safe, and if there is anything weird, just fill a task and I can catch up tomorrow morning :] [19:15:38] :-) [19:15:41] (03Merged) 10jenkins-bot: layout: Drop all HHVM testing except for Fundraising [integration/config] - 10https://gerrit.wikimedia.org/r/534523 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [19:15:46] i am too tired to assist tonight :\ [19:16:10] !log Zuul: Drop all HHVM testing except for Fundraising T234384 [19:16:10] one sure thing, i am very happy to see hhvm gone from CI. That will free up ton of capacity and make changes a little bit faster to merge [19:16:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:16:13] T234384: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 [19:16:44] on those good words. See you all later :] [19:16:56] Yeah. [19:21:41] 10Gerrit, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, and 2 others: Gerrit Hardware Upgrade (+ upgrade from jessie to stretch or buster) - https://phabricator.wikimedia.org/T222391 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on... [19:21:46] Yippee, build fixed! 
[19:21:46] Project mwcore-phpunit-coverage-master build #212: 09FIXED in 4 hr 21 min: https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/212/ [19:24:23] (03CR) 10Jforrester: [C: 03+2] "Deployed." [integration/config] - 10https://gerrit.wikimedia.org/r/540467 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [19:26:51] (03Merged) 10jenkins-bot: jjb: Drop unused HHVM jobs [integration/config] - 10https://gerrit.wikimedia.org/r/540467 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [19:38:11] 10Gerrit, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, and 2 others: Gerrit Hardware Upgrade (+ upgrade from jessie to stretch or buster) - https://phabricator.wikimedia.org/T222391 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['gerrit1001.w... [20:06:21] 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (201910), 10HHVM, 10Patch-For-Review: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (10Jdforrester-WMF) [20:07:33] 10Release-Engineering-Team (Local Dev), 10Release-Engineering-Team-TODO (201910), 10dev-images, 10MW-1.35-notes (1.35.0-wmf.1; 2019-10-08), 10Patch-For-Review: MediaWiki pipeline config: Correctly tag development images with dev - https://phabricator.wikimedia.org/T234379 (10brennen) 05Open→03Resolved... [20:07:35] 10Release-Engineering-Team (Local Dev), 10Release-Engineering-Team-TODO (201910), 10dev-images, 10local-charts: Point deployment-charts/mediawiki-dev at latest dev image published by pipeline - https://phabricator.wikimedia.org/T234391 (10brennen) [20:12:36] James_F: let me know if we can do it, i'm around for an hour or two more today [20:13:21] MatmaRex: Yeah, was just about to ping you, prod looks stable enough. 
[20:18:43] (03PS1) 10Jforrester: layout: Migrate fundraising tests from HHVM to PHP70 [integration/config] - 10https://gerrit.wikimedia.org/r/540474 [20:18:45] (03PS1) 10Jforrester: jjb: Drop mwgate-composer-hhvm-docker, now unused [integration/config] - 10https://gerrit.wikimedia.org/r/540475 [20:18:47] (03PS1) 10Jforrester: jjb: Make the quibble conditional publisher track php72 jobs [integration/config] - 10https://gerrit.wikimedia.org/r/540476 [20:18:49] (03PS1) 10Jforrester: jjb: Migrate parsoidsvc-parsertests-docker from hhvm to php72 [integration/config] - 10https://gerrit.wikimedia.org/r/540477 [20:18:51] (03PS1) 10Jforrester: Misc. HHVM clean-up [integration/config] - 10https://gerrit.wikimedia.org/r/540478 [20:20:29] 10Release-Engineering-Team-TODO, 10TechCom: Expand Gerrit Manager permissions - https://phabricator.wikimedia.org/T234474 (10Jdforrester-WMF) [20:22:38] 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10TechCom: Expand Gerrit Manager permissions - https://phabricator.wikimedia.org/T234474 (10Jdforrester-WMF) [20:22:46] 10Gerrit, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10TechCom: Expand Gerrit Manager permissions - https://phabricator.wikimedia.org/T234474 (10Paladox) [20:31:10] (03CR) 10Jforrester: [C: 03+2] layout: Migrate fundraising tests from HHVM to PHP70 [integration/config] - 10https://gerrit.wikimedia.org/r/540474 (owner: 10Jforrester) [20:32:48] (03Merged) 10jenkins-bot: layout: Migrate fundraising tests from HHVM to PHP70 [integration/config] - 10https://gerrit.wikimedia.org/r/540474 (owner: 10Jforrester) [20:37:35] !log Zuul: Migrate fundraising tests from HHVM to PHP70 T234384 [20:37:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:37:39] T234384: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 [20:39:20] 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (201910), 
10HHVM: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (10Jdforrester-WMF) [20:40:18] (03CR) 10Jforrester: [C: 03+2] "Deployed." [integration/config] - 10https://gerrit.wikimedia.org/r/540475 (owner: 10Jforrester) [20:43:15] (03Merged) 10jenkins-bot: jjb: Drop mwgate-composer-hhvm-docker, now unused [integration/config] - 10https://gerrit.wikimedia.org/r/540475 (owner: 10Jforrester) [20:47:01] (03PS2) 10Jforrester: jjb: Make the quibble conditional publisher track php72 jobs [integration/config] - 10https://gerrit.wikimedia.org/r/540476 [20:49:26] (03CR) 10Jforrester: [C: 03+2] "Deployed." [integration/config] - 10https://gerrit.wikimedia.org/r/540476 (owner: 10Jforrester) [20:49:48] (03PS2) 10Jforrester: jjb: Migrate parsoidsvc-parsertests-docker from hhvm to php72 [integration/config] - 10https://gerrit.wikimedia.org/r/540477 [20:52:05] (03Merged) 10jenkins-bot: jjb: Make the quibble conditional publisher track php72 jobs [integration/config] - 10https://gerrit.wikimedia.org/r/540476 (owner: 10Jforrester) [20:54:43] (03CR) 10Jforrester: [C: 03+2] "Deployed." [integration/config] - 10https://gerrit.wikimedia.org/r/540477 (owner: 10Jforrester) [20:56:03] (03PS2) 10Jforrester: Misc. HHVM clean-up [integration/config] - 10https://gerrit.wikimedia.org/r/540478 [20:57:11] (03Merged) 10jenkins-bot: jjb: Migrate parsoidsvc-parsertests-docker from hhvm to php72 [integration/config] - 10https://gerrit.wikimedia.org/r/540477 (owner: 10Jforrester) [21:02:19] (03CR) 10Jforrester: [C: 03+2] Misc. 
HHVM clean-up [integration/config] - 10https://gerrit.wikimedia.org/r/540478 (owner: 10Jforrester) [21:02:27] (03PS1) 10Jforrester: jjb: [mediawiki-core-jsduck-docker] Stop defining php_flavour (no-op) [integration/config] - 10https://gerrit.wikimedia.org/r/540483 [21:03:33] (03CR) 10Jforrester: [C: 03+2] jjb: [mediawiki-core-jsduck-docker] Stop defining php_flavour (no-op) [integration/config] - 10https://gerrit.wikimedia.org/r/540483 (owner: 10Jforrester) [21:04:08] (03Merged) 10jenkins-bot: Misc. HHVM clean-up [integration/config] - 10https://gerrit.wikimedia.org/r/540478 (owner: 10Jforrester) [21:06:44] (03Merged) 10jenkins-bot: jjb: [mediawiki-core-jsduck-docker] Stop defining php_flavour (no-op) [integration/config] - 10https://gerrit.wikimedia.org/r/540483 (owner: 10Jforrester) [21:06:55] !log Zuul: Misc HHVM test clean-ups. [21:06:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:07:28] (03PS1) 10Jforrester: jjb: [docker-run] We're not supporting HHVM_REPO_CENTRAL_PATH any more [integration/config] - 10https://gerrit.wikimedia.org/r/540484 [21:11:50] (03CR) 10Jforrester: [C: 03+2] "Deployed. All 301 jobs. Whew." [integration/config] - 10https://gerrit.wikimedia.org/r/540484 (owner: 10Jforrester) [21:14:14] (03Merged) 10jenkins-bot: jjb: [docker-run] We're not supporting HHVM_REPO_CENTRAL_PATH any more [integration/config] - 10https://gerrit.wikimedia.org/r/540484 (owner: 10Jforrester) [21:14:27] (03PS1) 10Jforrester: dockerfiles: Drop all HHVM containers [integration/config] - 10https://gerrit.wikimedia.org/r/540485 [21:14:59] 10Gerrit, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO, 10Operations, and 2 others: Gerrit Hardware Upgrade (+ upgrade from jessie to stretch or buster) - https://phabricator.wikimedia.org/T222391 (10Dzahn) >>! In T222391#5542166, @ops-monitoring-bot wrote: > Completed... 
[21:15:16] 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (201910), 10HHVM: Drop HHVM from CI - https://phabricator.wikimedia.org/T234384 (10Jdforrester-WMF) [21:15:26] (03PS2) 10Jforrester: dockerfiles: Drop all HHVM containers [integration/config] - 10https://gerrit.wikimedia.org/r/540485 (https://phabricator.wikimedia.org/T234384) [21:16:27] (03CR) 10Jforrester: "OK, this is the last of HHVM in this repo (except the CI config for the repos operations/debs/hhvm and operations/software/hhvm_exporter, " [integration/config] - 10https://gerrit.wikimedia.org/r/540485 (https://phabricator.wikimedia.org/T234384) (owner: 10Jforrester) [21:22:51] (03CR) 10Daimona Eaytoy: "Should the ci-src-setup dockerfile be killed, too?" [integration/config] - 10https://gerrit.wikimedia.org/r/539987 (https://phabricator.wikimedia.org/T234062) (owner: 10Jforrester) [21:23:57] (03CR) 10Jforrester: "> Patch Set 3:" [integration/config] - 10https://gerrit.wikimedia.org/r/539987 (https://phabricator.wikimedia.org/T234062) (owner: 10Jforrester) [21:24:24] (03CR) 10Daimona Eaytoy: "> > Patch Set 3:" [integration/config] - 10https://gerrit.wikimedia.org/r/539987 (https://phabricator.wikimedia.org/T234062) (owner: 10Jforrester) [21:24:41] Daimona: :-) [21:24:56] Daimona: E_TOOMANYPATCHES. I run into it myself a lot. :-) [21:25:45] Heh, that's what happens when you comment stuff instead of going to bed [21:26:11] Although I have to say, the "related changes" section is not that visible [21:27:15] It's more visible in the "new UI", and no-doubt it'll be different in 2.16+. [21:30:34] Yeah, I'm waiting for 2.16 [21:30:47] Since right now the new UI doesn't have project dashboards [21:33:57] yup [21:35:11] James_F: re https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/540488/ where can we start merging HHVM deprecation stuff? 
[21:43:30] 10Continuous-Integration-Config, 10Release-Engineering-Team-TODO (201910), 10Patch-For-Review, 10phan: ci-src-setup job (used by mediawiki-core-php72-phan-docker) is still running on PHP 7.0.33 - https://phabricator.wikimedia.org/T234062 (10Jdforrester-WMF) p:05Triage→03High This is blocking us properl... [21:43:47] Daimona: Pretty much as soon as that patch lands, it's open-season. [21:44:52] Cool. That patch already LGTM, I just wanted to know whether I can load +2 ammo [21:45:24] If so, I can review tomorrow-ish [22:04:44] 10Phabricator (Search), 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO (201910), 10User-MModell: Test out the Phabricator 'ferrit' search engine. - https://phabricator.wikimedia.org/T230787 (10mmodell) [22:05:26] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO (201910), 10User-MModell: Make sure elasticsearch 6 is supported in phabricator - https://phabricator.wikimedia.org/T181393 (10mmodell) [22:05:50] 10Phabricator, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO (201910), 10Documentation, 10User-MModell: Make PHD run on the backup phabricator server (phab2001, currently) - https://phabricator.wikimedia.org/T232883 (10mmodell) [22:06:34] 10Release-Engineering-Team (Local Dev), 10Developer Productivity, 10local-charts, 10Patch-For-Review: Create an interface for the local-charts ecosystem - https://phabricator.wikimedia.org/T224939 (10mmodell) [22:31:17] paladox: so yea. moving that conversation here a bit [22:31:23] ok [22:31:28] i am looking at the compiler output.. change catalog.. again [22:31:36] the host name situation looks better [22:31:44] and IP i just see gerrit1001 so far [22:31:47] awesome! :) [22:31:51] even though where are the other ones [22:32:02] i have this https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/540500/ change to add gerrit-new to acme. 
[22:32:06] search for ipv4_address in https://puppet-compiler.wmflabs.org/compiler1001/18719/gerrit1001.wikimedia.org/change.gerrit1001.wikimedia.org.pson [22:32:35] also if you search for "gerrit.wikimedia.org" you now find: [22:32:46] puppet:///gerrit/gerrit.wikimedia.org.erb [22:32:56] but we are using ServerName gerrit-new.wikimedia.org\n [22:33:00] so far so good [22:33:09] this is what we did last time too, btw [22:33:11] yup! [22:33:35] i kind of expected it to add a second IP [22:33:39] just the right one [22:33:53] still looking [22:34:19] oh [22:35:41] "ipv4": "208.80.154.87", [22:35:42] "host": "gerrit-new.wikimedia.org", [22:35:57] "master_host": "cobalt.wikimedia.org", [22:36:08] and the list of "gerrit_servers" has all 3 [22:36:22] so that means we are a replica [22:36:42] yup [22:36:50] master_host i think is unused now? [22:36:55] so we can remove that later [22:37:08] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki configuration Error - string 'Wikipedia' not found on 'https://en.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 2324 bytes in 0.011 second response time [22:37:14] here is what i was still trying to find. the "type": "Interface::Ip" lines [22:37:27] good point about master_host [22:37:39] let me take care of the cert change then, brb [22:37:44] ok :) [22:37:46] thanks! 
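Editor's sketch of the catalog search discussed above: rather than searching the compiled catalog by hand for `"type": "Interface::Ip"` entries, the JSON can be walked programmatically. This is a minimal illustration, not the tooling used in the channel. It assumes the `.pson` file parses as JSON and that each resource is an object with `type` and `parameters` keys. The sample catalog and its address (a documentation-range IP) are made up.

```python
import json


def find_resources(node, resource_type):
    """Recursively yield every JSON object whose "type" equals resource_type."""
    if isinstance(node, dict):
        if node.get("type") == resource_type:
            yield node
        for value in node.values():
            yield from find_resources(value, resource_type)
    elif isinstance(node, list):
        for item in node:
            yield from find_resources(item, resource_type)


def interface_addresses(catalog_text):
    """Return the "address" parameter of every Interface::Ip resource found."""
    catalog = json.loads(catalog_text)
    return [
        resource.get("parameters", {}).get("address")
        for resource in find_resources(catalog, "Interface::Ip")
    ]


# Hypothetical, heavily trimmed catalog used only to exercise the functions.
SAMPLE = json.dumps({
    "resources": [
        {"type": "Interface::Ip", "title": "example",
         "parameters": {"address": "192.0.2.10"}},
        {"type": "File", "title": "/etc/motd", "parameters": {}},
    ]
})

print(interface_addresses(SAMPLE))
```

The walker makes no assumption about where the resource list sits in the document, so it should tolerate catalogs that wrap their data in an outer envelope.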
[22:38:27] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki configuration Error - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 1740 bytes in 0.007 second response time [22:38:46] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki configuration Error - string 'Wikipedia' not found on 'https://en.m.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 2306 bytes in 0.014 second response time [22:39:08] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-07 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki configuration Error - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 1740 bytes in 0.003 second response time [22:39:12] paladox: meanwhile still let's check for all ""type": "Interface::Ip"," followed by parameter "address": [22:39:21] in that catalog link [22:39:50] * paladox has 21 matcheds [22:39:52] *matches [22:40:52] not sure yet if the cert change works like that. when LE does the challenge/response thing that would only work on new server but maybe fail on cobalt [22:41:10] hmm [22:41:18] i dont see a difference in config which backend creates it [22:42:16] hmm [22:42:21] oh we have the list of authorized_hosts too where we already added gerrit1001 [22:43:07] yup [22:43:21] mutante i think it uses the dns [22:43:31] to do this. [22:43:45] yup challenge: dns-01 [22:44:05] yea, that's why i said that will only work on one of the servers [22:45:19] mutante i would have thought this would work? [22:45:20] last time we migrated i think we were using letsencrypt::certificate or so in puppet [22:45:26] yup [22:45:33] before acme_chief [22:45:36] we use that in the cloud [22:45:38] but prod uses acme [22:46:04] which means special code to make the same role work in both.. 
hrmm ..yea [22:46:46] mutante i doin't think i understand what would break or what will not work? [22:49:17] paladox: looking at acme_chief code, just found role::acme_chief::cloud fwiw [22:49:28] oh, heh [22:49:34] include ::profile::acme_chief::cloud [22:50:28] 11 # server_name acmechieftest.beta.wmflabs.org; [22:51:35] Krenair hi, around? :) [22:52:13] hi [22:52:30] Krenair wondering if you could help us with acme. [22:52:52] ok [22:52:59] We are adding gerrit-new in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/540500/ [22:53:08] i am just pointing out there are efforts to make it work in cloud or there wouldnt be this class [22:53:21] because you mentioned it's different between prod and cloud [22:53:51] so maybe we can get it more similar again [22:53:58] who is trying to do what? [22:55:26] We are trying to add gerrit-new to acme. Would you know if it'll fail due to cobalt/gerrit2001 being in there? [22:55:35] (in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/540500/) [22:56:29] Krenair: my comment was in response to "in cloud we use what we used before acme_chief but prod uses acme_chief" [22:57:15] Why would it fail due to that? [22:57:19] and then i found the profile acme_chief::cloud .. so maybe it wouldn't have to be different anymore [22:57:36] mutante earlier said " not sure yet if the cert change works like that. when LE does the challenge/response thing that would only work on new server but maybe fail on cobalt" [22:57:49] what [22:58:28] acme-chief is not doing HTTP-01 challenges [22:59:09] all that matters is whether acme-chief will be able to make the relevant records to prove ownership of the domain [23:00:12] gerrit-new.wikimedia.org is under wikimedia.org, the zone update config near the bottom of that file will make it update the authoritative nameservers for that domain [23:01:26] thanks! [23:01:53] Krenair: is the ::cloud profile in use and ready? [23:02:42] It's in use [23:02:44] What do you mean by ready? 
[23:02:49] so there are acme servers in cloud vps or just "test" so far [23:03:14] yes this is how we get the *.wikimedia.beta.wmflabs.org certs [23:03:14] i saw a host name with "test" in it [23:03:27] ok, cool! [23:03:41] Obviously you can only use this in a project that's been set up for it [23:04:10] paladox: so yea, if there is a still a cherry-pick on your local puppetmaster to do that differently.. you can probably remove it [23:04:25] cherry pick for what? [23:04:25] (if it was setup for that) [23:04:26] unless you run your own gdnsd instance and have labs DNS delegate to that [23:04:39] i have no cherry pick related to LE [23:04:51] AFAIK the projects that are capable of doing this right now are deployment-prep and traffic [23:05:21] paladox: my entire comment about the ::cloud profile that triggered this was in response to your comment that we still use the old puppet code in cloud VPS while "prod uses acme" [23:05:33] yup [23:05:52] so that difference is somewhere and i assumed it's your puppetmaster [23:06:00] which is used by the gerrit instance [23:06:24] https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Service_accounts#Known_examples [23:06:32] Oh, it uses the legacy way as the traffic folks reverted it back when acme did not work in labs [23:06:48] also this would require a new instance i guess so needs a quota bump in the git project. [23:07:21] acme has always worked in labs [23:07:37] acme-chief has always worked in labs too [23:07:47] oh [23:07:49] sigh @ quota - projects need more instances over time, yea [23:08:02] it's just not straightforward [23:08:27] you can't just swap out the traditional single-server LE puppetisation for it without arranging for some means of changing DNS records [23:09:09] paladox: move gerrit to beta or deployment prep ?:p [23:09:14] lol [23:09:31] that's not a bad idea i guess. [23:11:38] Krenair: Oops, apparently Beta Cluster is still running HHVM in some places. 
[23:11:45] :S [23:11:56] Which means beta loginwiki now fatals at you when you try to login. [23:12:00] * James_F sighs. [23:12:16] I mean, so is prod right? [23:12:25] No, prod is all-php72. [23:12:29] Krenair: thanks. i guess it might make sense to ask for it to live in beta and use the existing setup [23:12:37] Hence why we landed the not-HHVM-requirement code. [23:14:18] mutante, I'm not convinced by the idea of putting misc things into beta in general, especially not given the current maintainership status, and not for the purposes of letting external stuff take advantage of the acme-chief setup either. And I don't think paladox has access to it anyway [23:14:34] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team: "MediaWiki 1.35 internal error" upon login - https://phabricator.wikimedia.org/T234491 (10Etonkovidova) [23:14:48] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team: "MediaWiki 1.35 internal error" upon login - https://phabricator.wikimedia.org/T234491 (10Jdforrester-WMF) Whoops. [23:15:33] ok [23:15:49] James_F, is it advisable to build new deployment-mediawiki boxes or is there some easy way to 'convert' them? [23:16:24] I don't know, but clean(er) new boxes seems most sensible. [23:16:33] It depends how much we're relying on local hacks. [23:16:34] Krenair: Ok. I think Gerrit is kind of important enough to not be just "misc" and it's unfortunate that quota is the issue. [23:17:02] And most definitely not to evade quota in other projects. [23:18:45] deployment-prep has 9 VCPUs remaining in quota, we need to maintain some space to facilitate replacement of existing instances from time to time [23:19:00] it's not there for other projects to get merged in [23:20:20] James_F, I don't suppose we have some ticket somewhere where people were converting existing prod boxes? [23:20:45] tbh I haven't been following work on HHVM -> PHP7 closely [23:21:04] Krenair: There's a lot of tasks. 
I know boxes are being re-imaged to get rid of HHVM, but it wasn't needed for switching to PHP72. [23:21:15] that has nothing to do with "evade quota" at all [23:22:26] instances with hhvm installed: (7) deployment-deploy[01-02].deployment-prep.eqiad.wmflabs,deployment-jobrunner03.deployment-prep.eqiad.wmflabs,deployment-mediawiki-[07,09].deployment-prep.eqiad.wmflabs,deployment-mwmaint01.deployment-prep.eqiad.wmflabs,deployment-snapshot01.deployment-prep.eqiad.wmflabs [23:22:59] Do they have php72 installed too already? [23:24:16] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team: "MediaWiki 1.35 internal error" upon login - https://phabricator.wikimedia.org/T234491 (10Jdforrester-WMF) I'm getting "There seems to be a problem with your login session; this action has been canceled as a precaution against session hijacking. Please... [23:24:18] looks like they do [23:24:37] most of those have PHP 7.2.22-1+0~20190902.26+debian9~1.gbpd64eb7+wmf1 (cli) (built: Sep 3 2019 08:55:19) ( NTS ) [23:24:47] Hmm. There's some magic code to decide which to use, clear. [23:24:53] mediawiki-07 and mwmaint01 have PHP 7.2.8-1+0~20180725124257.2+stretch~1.gbp571e56 (cli) (built: Jul 25 2018 12:43:00) ( NTS ) [23:25:11] Yeah, ideally everything should be 7.2.22 now. [23:25:43] Krenair is there some special thing we have to do for it to generate a new cert? Or does it do it automatically? [23:27:16] paladox, you update the config in hiera and wait for puppet to run on the main acme-chief server, it should fetch a new cert shortly thereafter... after that you wait for puppet to run on each of the servers that actually use the cert [23:27:27] ah [23:27:29] thanks! [23:27:30] mutante ^ [23:27:34] it should all take care of itself [23:28:58] paladox: ack! the puppet run on acmechief1001 already happened [23:29:04] the second one right now [23:29:05] ok! [23:29:17] mutante did cobalt pull in the new cert? [23:29:20] well.. 
that will make sense after the role [23:35:36] Krenair appears it didn't generate it :( [23:35:54] it might not have completed the process yet [23:36:17] oh, ok [23:36:22] paladox: it is in /etc/acmecerts/gerrit/ [23:36:23] how long does it usually take? [23:36:29] mutante oh! [23:36:32] it updated it? [23:36:42] don't remember [23:36:43] X509v3 Subject Alternative Name: [23:36:44] DNS:gerrit-new.wikimedia.org, DNS:gerrit-replica.wikimedia.org, DNS:gerrit.wikimedia.org [23:36:52] awesome! [23:37:02] there is a link to "live" and one to "new" [23:37:03] https://crt.sh/?id=1950706338 [23:37:05] they both have it [23:37:13] mutante we can do https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/539204/ now ! :) [23:37:37] paladox: there is /etc/acme (from before) and /etc/acmecerts (new ) [23:37:39] live and new should probably match most of the time, except when it's in the process of renewing etc [23:37:52] mutante it's deployed! [23:37:55] see Krenair link! [23:38:02] yea, i saw [23:39:01] Krenair: I'm not sure exactly how things were switched over, but T195392 suggests just `PHP=php7.2`… [23:39:01] T195392: Switch cronjobs on maintenance hosts to PHP7 - https://phabricator.wikimedia.org/T195392 [23:39:27] James_F: that was temp while switching them and testing [23:39:27] mutante, you'll notice in that config file, most certs don't have a staging_time set... so `live` and `new` will normally be the same thing [23:39:33] the unified one however has a staging_time: 604800 [23:39:41] James_F: now it is that just "php" points to the new version [23:39:58] mutante: Not in Beta Cluster, hence why it is down. [23:40:02] so sometimes expect a delay between new and live of up to that amount of time [23:40:16] Krenair: And https://phabricator.wikimedia.org/rOPUPe0d83ea66e09910554f875ccb5702bae8359027a was using `profile::mediawiki::install_hhvm: false` to uninstall HHVM, I guess? [23:40:21] James_F: oh, ok [23:40:36] Krenair: gotcha! thx [23:40:36] Aha. 
[23:40:37] https://phabricator.wikimedia.org/rOPUP35b2eca1ba9a12f09e28e6a1d5b535a03cdee21e [23:40:46] `profile::mediawiki::vhost_feature_flags:\n php72_only: true` [23:41:45] 15 profile::mediawiki::php::php_version: "7.2" [23:41:45] 16 profile::mediawiki::install_hhvm: false [23:42:10] oh jeez the whole of beta is off :/ [23:42:24] guess we can't break it any more than it already is [23:43:44] Yeah. :-( [23:43:51] James_F: hieradata/role/common/parsoid/testing.yaml shows these all together [23:44:24] mutante: Awesome, thanks. [23:44:36] all the profile::mediawiki:: Hiera keys needed to remove HHVM and switch to 7.2 [23:44:53] like "enable_fpm: true" as well [23:45:07] Neat. [23:45:26] you should be able to copy all that start with profile::mediawiki:: [23:45:38] Krenair: Does that help? [23:46:09] probably [23:46:12] will try [23:46:17] You rock [23:48:29] James_F, okay well this looks better [23:48:46] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 37182 bytes in 0.560 second response time [23:48:52] shinken likes it too [23:49:09] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-07 is OK: HTTP OK: HTTP/1.1 200 OK - 48092 bytes in 0.603 second response time [23:49:34] * James_F grins. [23:49:42] this was literally just setting that `profile::mediawiki::vhost_feature_flags:\n php72_only: true` on deployment-mediawiki-07 [23:49:50] and running puppet of course [23:50:51] Very neat. [23:51:02] am tempted to also tell it not to install hhvm [23:51:12] but maybe not right now [23:51:12] Next step. ;-) [23:51:20] Maybe when it's not midnight, yes. 
[23:52:12] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 48624 bytes in 0.515 second response time [23:53:28] yeah, should probably be done on some of these too: deployment-deploy[01-02].deployment-prep.eqiad.wmflabs,deployment-jobrunner03.deployment-prep.eqiad.wmflabs,deployment-mediawiki-09.deployment-prep.eqiad.wmflabs,deployment-mwmaint01.deployment-prep.eqiad.wmflabs,deployment-snapshot01.deployment-prep.eqiad.wmflabs [23:53:51] thanks James_F & mutante [23:55:03] * Krenair -> zzz [23:56:10] James_F was mw 1.34 branched? [23:56:37] i don't see REL1_34 in https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/core,branches [23:57:09] paladox: Not yet, waiting for marxarelli to finish the train. [23:57:17] ah, ok :) [23:57:49] paladox: cutting the branch on Friday [23:57:57] marxarelli thanks! [23:58:01] np! [23:58:49] thanks Krenair, mutante, and James_F re hhvm+beta
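Editor's sketch of the Hiera change discussed in the log: the three fully named keys (`profile::mediawiki::vhost_feature_flags` with `php72_only`, `profile::mediawiki::php::php_version`, and `profile::mediawiki::install_hhvm`) can be checked against a host's Hiera data once it is loaded into a dictionary. The function name and the sample host data below are hypothetical. `enable_fpm` is left out because its full key is not given in the conversation.

```python
# Values quoted in the conversation for moving a host from HHVM to PHP 7.2.
EXPECTED = {
    "profile::mediawiki::php::php_version": "7.2",
    "profile::mediawiki::install_hhvm": False,
    "profile::mediawiki::vhost_feature_flags": {"php72_only": True},
}


def hiera_mismatches(host_data, expected=EXPECTED):
    """Return {key: current value} for each expected setting that is missing or different."""
    problems = {}
    for key, want in expected.items():
        have = host_data.get(key)
        if isinstance(want, dict):
            # Nested flags are compared one by one and reported as "key.flag".
            have = have if isinstance(have, dict) else {}
            for flag, flag_want in want.items():
                if have.get(flag) != flag_want:
                    problems[f"{key}.{flag}"] = have.get(flag)
        elif have != want:
            problems[key] = have
    return problems


# Hypothetical host data: only the vhost flag has been set so far.
partially_migrated = {
    "profile::mediawiki::vhost_feature_flags": {"php72_only": True},
}

for key, current in hiera_mismatches(partially_migrated).items():
    print(f"{key}: currently {current!r}")
```

A host with only `php72_only` set, as described for deployment-mediawiki-07, would still be reported as lacking the `php_version` and `install_hhvm` settings, which matches the "next step" mentioned in the discussion.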