[00:41:37] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10577637 (Scott_French) After a bit more thought, I think sticking with the original plan (i.e., the backport goes in `component/pcre2`), and being sure to u...
[08:39:54] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10578092 (MatthewVernon) I'd seen your question about `${shlibds:Depends}`, but I see you answered it yourself. But yes, `Depends:` should express the necess...
[09:03:28] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10578123 (elukey) Very interesting graph to check: https://grafana.wikimedia.org/d/1T_4O08Wk/ats-backends...
[09:54:29] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10578280 (elukey) Given the graphs above, I think that we'll probably need something around 40/50 pods for...
[11:24:57] serviceops, MW-on-K8s: Ensure tls-proxy container is started before launching main container - https://phabricator.wikimedia.org/T387208 (Clement_Goubert) NEW
[11:25:19] serviceops, MW-on-K8s: Ensure tls-proxy container is started before launching main container - https://phabricator.wikimedia.org/T387208#10578565 (Clement_Goubert)
[11:25:21] serviceops, MW-on-K8s, Patch-For-Review: Allow running periodic jobs for mw on k8s - https://phabricator.wikimedia.org/T341555#10578566 (Clement_Goubert)
[12:12:27] serviceops, MediaWiki-extensions-ReadingLists, MW-Interfaces-Team, RESTBase Sunsetting: Switchover plan from RESTbase to REST Gateway for Reading Lists endpoints - https://phabricator.wikimedia.org/T384891#10578633 (MSantos) LGTM.
[12:29:14] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10578653 (jijiki) Deployed the latest images which include `php8.1_8.1.31-1+wmf11u3 php-apcu_5.1.23-1+wmf11u3` and `pcre2_10.42-1~wmf11+1`! Many thanks to @M...
[12:31:18] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10578662 (jijiki) We have deployed 1 php8.1 replica per DC for `shellbox-timeline` and `shellbox-media`. If everything is uneventful until tomorrow, we will move forward and complete...
[12:35:30] serviceops, Content-Transform-Team-WIP, Page Content Service, RESTBase Sunsetting, and 2 others: hewiki: Route mobile-html to the backing node service instead of RESTBase - https://phabricator.wikimedia.org/T372746#10578677 (Jgiannelos) a:hnowlan
[12:35:41] serviceops, Content-Transform-Team-WIP, Page Content Service, RESTBase Sunsetting, and 2 others: hewiki: Route mobile-html to the backing node service instead of RESTBase - https://phabricator.wikimedia.org/T372746#10578678 (Jgiannelos) Open→Resolved
[13:05:20] hey folks
[13:05:33] I posted some ideas for kartotherian's capacity on wikikube in https://phabricator.wikimedia.org/T386926#10578280
[13:06:02] the tl;dr is that I think we'd need 40ish pods to run the service (for each DC)
[13:06:21] given the current pod cpu/memory sizes, it is a lot
[13:06:45] so if anybody wants to follow up or add ideas please do in the task or here :)
[13:07:36] serviceops, MW-on-K8s, Patch-For-Review: Periodic job alerting - https://phabricator.wikimedia.org/T385709#10578781 (Clement_Goubert) Hmm so obviously it's not as simple as I thought it would be. If I understand our [[ https://github.com/wikimedia/operations-puppet/blob/production/modules/alertmanage...
[13:08:01] serviceops, MW-on-K8s, SRE Observability, Patch-For-Review: Periodic job alerting - https://phabricator.wikimedia.org/T385709#10578783 (Clement_Goubert)
[13:10:35] serviceops, MW-on-K8s, SRE Observability, Patch-For-Review: Periodic job alerting - https://phabricator.wikimedia.org/T385709#10578800 (Clement_Goubert)
[13:21:03] serviceops, MediaWiki-Engineering, Traffic, Upstream, Wikimedia-production-error: 503 error when edit large size pages on PHP 8.1 - https://phabricator.wikimedia.org/T385395#10578816 (jijiki)
[13:21:08] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10578817 (jijiki)
[13:27:17] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10578832 (cmooney) >>! In T386926#10578280, @elukey wrote: > it is also true that we are replacing a clust...
[14:10:34] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10578969 (elukey) >>! In T386926#10578832, @cmooney wrote: >>>! In T386926#10578280, @elukey wrote: >> it...
[14:14:47] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10578984 (Clement_Goubert) We had the discussion around `num_workers` for service_runner based services in...
[14:24:29] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10579052 (Jgiannelos) Just a comment around the usage in the bare metal nodes, keep in mind that each node...
[14:29:31] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10579073 (Clement_Goubert) >>! In T386926#10579052, @Jgiannelos wrote: > Just a comment around the usage i...
[14:30:59] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10579078 (Jgiannelos) I think so, yeah both master nodes and postgres read replicas.
[14:48:03] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10579175 (elukey) @Clement_Goubert thanks a lot for the in depth review, I'll file the request to increase...
[14:48:27] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10579179 (Scott_French) Alright, so it looks like there's been some miscommunication: for the time being, **please wait before taking any action** the wmf11u...
[15:21:20] serviceops, Gerrit: Remove explicitly enablement of G1 garbage collector for Gerrit - https://phabricator.wikimedia.org/T387223 (hashar) NEW
[15:31:55] serviceops, MW-on-K8s, Patch-For-Review: Ensure tls-proxy container is started before launching main container - https://phabricator.wikimedia.org/T387208#10579422 (Joe) In the case of pods accepting traffic, our readiness probe should be enough to ensure this. For scripts, I think the simplest thi...
[16:02:10] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10579630 (Scott_French) Alright, `component/php81` is now back up to date. @Jdforrester-WMF - Thank you so much for getting the CI images up to date in http...
[16:05:43] serviceops, Page Content Service, RESTBase Sunsetting, Epic: Pregeneration performance optimizations for PCS - https://phabricator.wikimedia.org/T386919#10579644 (Jgiannelos) @Joe I spent some time figuring out how EventBus works in order to create and emit events but I don't think the current sc...
[16:16:52] <_joe_> nemo-yiannis: I think the change is possible, we just need to coordinate with data eng if you change what mediawiki emits
[16:19:06] I think we don't really need to change, we can add keys under meta. I thought that additionalProperties were false but defaults to true so something like that validates against the json schema: https://phabricator.wikimedia.org/P73585
[16:43:49] nemo-yiannis: additionalProperties should be false everywhere, and if it isn't it should be :)
[16:43:49] also meta field is probably not the right place https://wikitech.wikimedia.org/wiki/Event_Platform/Flaws#meta_field
[16:44:11] but, adding new fields is def okay
[16:48:40] thanks ottomata for the feedback
[17:17:52] serviceops, MediaWiki-Core-HTTP-Cache, MW-on-K8s: mwscript-k8s purgeList does not reliably purge cached URLs - https://phabricator.wikimedia.org/T387127#10579857 (RLazarus) →Duplicate dup:T387208
[17:17:58] serviceops, MW-on-K8s, Patch-For-Review: Ensure tls-proxy container is started before launching main container - https://phabricator.wikimedia.org/T387208#10579859 (RLazarus)
[18:05:43] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10580041 (Scott_French) p:High→Medium Alright, production is now caught up to the `8.1.34-1-s2` images, which both install the 10.42 pcre2 backport a...
[18:25:53] serviceops, MediaWiki-Engineering, Traffic, Patch-For-Review, and 2 others: 503 error when edit large size pages on PHP 8.1 - https://phabricator.wikimedia.org/T385395#10580120 (Scott_French) Thanks again, all. T386006 should now be substantially resolved from the standpoint of upgrading to a PC...
[18:58:35] serviceops, Patch-For-Review: MediaWiki on PHP 8.1 production traffic ramp-up - https://phabricator.wikimedia.org/T383845#10580259 (Scott_French) Alright, we believe we're at a point where the issue in T385395 should be addressed, by way of updating the version of PCRE2 used in the PHP 8.1 images to 10.4...
[20:22:16] swfrench-wmf: Would it be worth making a CI test that fails on old PCRE2 and works on new, so we feel confident? Or is that too much.
[20:25:45] James_F: ah, interesting idea! as in, simply checks the version matches the expected one, or something more like a regression test based on the repro for T385395?
[20:27:44] swfrench-wmf: I was thinking a regression test.
[20:27:57] But it may be over-engineering the issue.
[20:47:35] James_F: got it, thanks for clarifying. that sounds like a nice thing to have, though I guess I'm wondering what it concretely buys us as a safety net.
[20:47:35] it would assert that the CI images contain packages that do not carry the regression. that's clearly a good thing in the abstract, but the implication that the same is true in _production_ is a bit indirect ...
[20:47:58] Fair.
[20:48:04] (yes they're ideally the same packages from the same components, but there are also intermediate steps)
[20:48:04] Let's not worry about it, then.
[20:48:15] CI is now updated to 8.1.31-1+wmf11u4 / PCRE 10.42
[20:48:21] sounds good, and thank you for raising it!
[20:48:31] oh, awesome - thank you on that front as well :)
[20:48:38] Happy to help. Thanks!
[22:16:01] If only we had an integration testing environment that used the production containers and config... Oh wait, we are working on that now with Pretrain! :)
[22:17:49] serviceops, MW-on-K8s, Patch-For-Review: Allow members of restricted to run maintenance scripts - https://phabricator.wikimedia.org/T378429#10580902 (RLazarus) Hmm, @jrbs, who's helping me test (as a member of restricted) reports getting this: ` [...] Error: Kubernetes cluster unreachable: invalid...
[23:58:36] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10581145 (tstarling) Possibly related error in CI https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php81-noselenium/26736/console ` 10:40:00 IN...
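Note on the 20:22-20:48 exchange: of the two options raised there, the simpler one ("simply checks the version matches the expected one") could look roughly like the sketch below. It is not something the channel agreed to build (the conversation ends with "Let's not worry about it, then"), and it is not the regression test based on the T385395 repro. The function names and the `php` binary path are hypothetical; the 10.39 floor comes from the T386006 task title, and the sketch assumes PHP's `PCRE_VERSION` constant yields a string of the form `10.42 2022-12-11`.

```python
import re
import subprocess
from typing import Tuple

# Minimum PCRE2 version named in the T386006 task title.
MIN_PCRE2 = (10, 39)


def parse_pcre_version(raw: str) -> Tuple[int, int]:
    """Turn a string such as '10.42 2022-12-11' into (10, 42)."""
    match = re.match(r"\s*(\d+)\.(\d+)", raw)
    if match is None:
        raise ValueError(f"unrecognised PCRE version string: {raw!r}")
    return int(match.group(1)), int(match.group(2))


def pcre_is_new_enough(raw: str, minimum: Tuple[int, int] = MIN_PCRE2) -> bool:
    """Return True when the reported version is at or above the minimum."""
    return parse_pcre_version(raw) >= minimum


def php_pcre_version(php_binary: str = "php") -> str:
    """Ask a PHP CLI which PCRE it was built against (hypothetical helper).

    Assumes a `php` binary is on PATH inside the image under test.
    """
    result = subprocess.run(
        [php_binary, "-r", "echo PCRE_VERSION;"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    reported = php_pcre_version()
    if not pcre_is_new_enough(reported):
        raise SystemExit(f"PCRE2 too old: {reported}")
    print(f"PCRE2 OK: {reported}")
```

As noted at 20:47:35, a check like this only says something about the image it runs in; it does not by itself show that production carries the same packages.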