[01:15:51] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217836 (Krinkle)
[01:22:54] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217854 (Krinkle)
[01:45:23] Traffic, MediaWiki-Platform-Team, Reader Experience Team, MobileFrontend (Core PHP), Patch-For-Review: Toggling desktop view doesn't toggle user back into mobile mode - https://phabricator.wikimedia.org/T403866#11217879 (Krinkle)
[02:13:31] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217909 (Krinkle)
[02:13:45] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217910 (Krinkle)
[02:14:08] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217911 (Krinkle)
[02:14:32] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217913 (Krinkle)
[02:17:12] Traffic, MediaWiki-Platform-Team, Reader Experience Team, MobileFrontend (Core PHP), Patch-For-Review: Toggling desktop view doesn't toggle user back into mobile mode - https://phabricator.wikimedia.org/T403866#11217929 (Krinkle) Open→Resolved
[02:21:34] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11217937 (Krinkle)
[02:21:45] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217938 (Krinkle)
[02:31:14] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11217946 (Krinkle)
[02:52:39] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw:frack:rack/install/configuration new switches in rack F5 - https://phabricator.wikimedia.org/T405618#11217960 (Papaul)
[02:53:55] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw:frack:rack/install/configuration new switches in rack F5 - https://phabricator.wikimedia.org/T405618#11217962 (Papaul)
[06:39:00] FIRING: PurgedHighBacklogQueue: Large backlog queue for purged on cp5032:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[06:43:40] FIRING: VarnishHighThreadCount: Varnish's thread count on cp5032:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:44:00] RESOLVED: PurgedHighBacklogQueue: Large backlog queue for purged on cp5032:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[06:48:40] FIRING: [2x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:53:40] FIRING: [2x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:03:40] FIRING: [3x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:23:40] FIRING: [2x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:38:40] FIRING: [2x] VarnishHighThreadCount: Varnish's thread count on cp5032:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:58:40] RESOLVED: VarnishHighThreadCount: Varnish's thread count on cp5032:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[09:41:40] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11218562 (Tgr) >>! In T152882#11217659, @Krinkle wrote: > That means MobileFrontend on loginwiki, in theory, provides just two things: > * Allowing calls to `M...
[09:43:59] Traffic, Data-Engineering: Request for a new request dataset for caching research - https://phabricator.wikimedia.org/T401331#11218567 (GGoncalves-WMF) Hi, just a quick update after my chat with Sukhbir. We should do this, not only for the value of the dataset itself, but also because it will be an excel...
[10:53:50] Traffic, Data-Engineering: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383#11218788 (Clement_Goubert) These changes seem to cause some puppet CI failures due to the `lua` `utf8` module not being present in the test environment: ` 12:45:52...
[14:01:11] claime: re: https://phabricator.wikimedia.org/T401383#11218788, thanks for reporting. what the invocation point for this error? as in, do you have a CR we can check to see what is triggering it?
[14:01:24] sukhe: ofc, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1191671
[14:01:47] Is the trigger
[14:02:52] sukhe: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1191698
[14:03:09] thanks :>
[14:03:15] we might need to do that in the CI image, also, actually
[14:03:33] in the releng repo
[14:05:34] I wonder what changed though? like why wasn't this a problem before? this has been out for a while (the lua converter) and we have had multiple CI runs between now and then probably
[14:07:15] Maybe because of my change forcing a full rerun of all CI checks?
[14:07:39] sukhe: probably that specific test module wasn't getting invoked most of the time
[14:08:40] tox.ini
[14:08:42] 114:commands = /bin/sh -c 'busted --verbose --helper=modules/profile/files/trafficserver/mock.helper.lua --lpath=modules/profile/files/trafficserver/?.lua ./modules/profile/files/trafficserver/*.lua'
[14:08:44] 119:commands = /bin/sh -c 'busted --lpath=modules/profile/files/cache/?.lua modules/profile/files/cache/*_test.lua'
[14:09:43] hmm
[14:09:51] # Adding an environment to this list is not enough to get the tests to be
[14:09:54] # executed on-demand by CI. You need to modify rake_modules/taskgen.rb too.
[14:10:00] is `haproxylua` actually glued in to rake_modules/taskgen.rb?
[14:10:02] which is what claime's change did and hence the trigger
[14:10:04] I see `tslua` there
[14:10:14] haproxylua, yeah it is in the envlist
[14:10:31] ah
[14:10:39] I don't think this puppet CI was ever being invoked anyway?
[14:10:48] aside from specifically edits to taskgen.rb
[14:10:52] yeah :]
[14:11:12] yeah I think the test isn't being run by the normal CI, it just is defined through taskgen
[14:11:17] And my editing it triggers a full run
[14:11:32] so that also needs to be fixed because I *think* we probably did need it for normal CI runs
[14:11:59] claime: yeah I think so
[14:12:02] Thanks for looking at that so quick tho <3
[14:12:09] ok thanks, that adds up
[14:12:16] * sukhe looks at cdanis' patches
[14:12:27] and for this I will flag it to fabfu.r when he is back
[14:12:36] It's not urgent that it's fixed on my side, the CRs are prep for the k8s upgrade that probably will happen next week or even later, so it's fine
[14:12:45] But you probably want your CI checks to actually run
[14:12:54] lol cdanis already added it
[14:12:57] :)
[14:13:01] claime: yeah just making sure CI is unblocked basically
[14:13:06] <3
[14:13:10] you never know when you need it
[14:13:37] I think we will need the integration-config PR to be merged, and new images pushed, to fix puppet CI
[14:15:41] I can ping releng.
[14:16:56] [done, see -releng]
[14:31:16] thanks to Reedy!
[15:10:10] Reedy finished deploying it. we can try running it again
[15:12:38] HTTPS, Traffic, MediaWiki-Action-API, MediaWiki-REST-API, and 4 others: Proposal: fail explicitly and revoke relevant API keys over plain-text HTTP connection for all Wikimedia APIs - https://phabricator.wikimedia.org/T368344#11220259 (Tgr) Per [[https://datatracker.ietf.org/doc/html/rfc6749#sect...
[15:44:57] oh duh 😅 sorry sukhe
[15:45:00] yeah idk either
[15:46:03] cdanis: or maybe it bailed out early when the haproxy bit failed? though I don't think that's how it works.
[15:47:49] hmmm
[15:47:52] idk
[15:49:17] no all good
[15:49:18] it was failing then as well
[15:49:19] 06:47:15 wmcs: FAIL code 1 (37.88=setup[29.41]+cmd[0.92,0.48,1.19,5.89] seconds)
[15:49:23] https://integration.wikimedia.org/ci/job/operations-puppet-tests-bullseye/17737/console
[15:49:24] 😌
[15:49:31] two failures
[15:49:32] 06:47:15 haproxylua: FAIL code 1 (3.13=setup[3.09]+cmd[0.04] seconds)
[15:49:37] 06:47:15 wmcs: FAIL code 1 (37.88=setup[29.41]+cmd[0.92,0.48,1.19,5.89] seconds)
[15:59:47] sukhe: seems like the image rebuild also bumped some other things like mypy versions
[16:00:03] yeah that probably explains it and mypy being mypy is not new
[16:00:12] :)
[16:02:21] taavi has a fix
[16:02:47] nice
[17:50:22] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11220873 (Krinkle)
[17:54:35] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11220892 (Krinkle)
[18:24:02] Traffic, Data-Engineering: improved x-analytics data on Edge Uniques status - https://phabricator.wikimedia.org/T405783 (CDanis) NEW
[18:24:47] Traffic, Data-Engineering: improved x-analytics data on Edge Uniques status - https://phabricator.wikimedia.org/T405783#11220967 (CDanis) See also T402994, T391411
[20:28:16] Traffic, MediaWiki-Platform-Team (Radar): Write Hadoop query for progres metric of unified mobile routing metric - https://phabricator.wikimedia.org/T405429#11221223 (Krinkle) >>! In T405429#11216639, @Krinkle wrote: > […] Revised plot: > {F66698629 height=100} Compared to: [[https://stats.wikimedia.org...
[21:34:57] Traffic, DNS, MediaWiki-Platform-Team, SRE, and 3 others: Many misc wikis lack mobile domains - https://phabricator.wikimedia.org/T152882#11221462 (Krinkle) @Tgr Thanks, I'll include loginwiki in the next batch of rollouts on Monday 29 Sep (T403510).