[01:18:05] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11192299 (Krinkle)
[01:24:44] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11192303 (Krinkle) >>! In T403510#11192102, @Ladsgroup wrote: > If you feel like it, fawiki is an early adopter w...
[06:38:00] FIRING: PurgedHighBacklogQueue: Large backlog queue for purged on cp5028:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5028 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[06:39:40] FIRING: VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5028 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:43:00] RESOLVED: [2x] PurgedHighBacklogQueue: Large backlog queue for purged on cp5028:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[06:44:40] FIRING: [2x] VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:17:20] <_joe_> fabfur: ^^ have you seen?
[07:17:29] <_joe_> looks like a specific node only
[07:28:09] yeah I was investigating this
[07:30:26] looks like a single residential ip
[07:34:40] FIRING: [3x] VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:39:40] FIRING: [4x] VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:54:40] FIRING: [3x] VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:59:40] RESOLVED: [2x] VarnishHighThreadCount: Varnish's thread count on cp5028:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[08:19:25] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: SwitchCoreInterfaceDown (instance ssw1-f1-codfw:9804) - https://phabricator.wikimedia.org/T404946 (LSobanski) NEW
[10:32:46] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193089 (Ladsgroup) Thanks!
[10:40:46] Traffic, Hiddenparma, SRE: Integrate code from the private repository into the CDN - https://phabricator.wikimedia.org/T404826#11193102 (Joe) Coming to @SLyngshede-WMF's concern, I think some of them are valid, like having disjoint configuration going besides the actual content of a file, including t...
[11:19:58] netops, DC-Ops, Infrastructure-Foundations, SRE: ssw1-f1-eqiad: Fan Spinning Upgraded - https://phabricator.wikimedia.org/T400783#11193185 (cmooney) So draining traffic from the node did not go as planned.
This config was applied: ` set protocols bgp graceful-shutdown sender set routing-instanc...
[11:20:20] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193186 (Ladsgroup) Notified the community: https://fa.wikipedia.org/wiki/ویکی‌پدیا:قهوه‌خانه/گوناگون#c-Ladsgrou...
[11:42:43] netops, Traffic, DC-Ops, Infrastructure-Foundations, and 2 others: Move lvs1020 link from ssw1-f1-eqiad to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T404959 (cmooney) NEW p:Triage→High
[11:43:47] netops, DC-Ops, Infrastructure-Foundations, SRE: ssw1-f1-eqiad: Fan Spinning Upgraded - https://phabricator.wikimedia.org/T400783#11193246 (BTullis) Hi. In case it helps with your investigation, I can tell you that we observed a brief loss of connectivity on the dse-k8s cluster, which may well ha...
[11:59:20] netops, DC-Ops, Infrastructure-Foundations, SRE: ssw1-f1-eqiad: Fan Spinning Upgraded - https://phabricator.wikimedia.org/T400783#11193277 (cmooney) >>! In T400783#11193246, @BTullis wrote: > Hi. In case it helps with your investigation, I can tell you that we observed a brief loss of connectivit...
[12:17:17] Traffic, collaboration-services, Gerrit, SRE: Document how to deploy changes to DNS repo without Gerrit working - https://phabricator.wikimedia.org/T336754#11193359 (ABran-WMF) In progress→Resolved this has been [[ https://wikitech.wikimedia.org/wiki/DNS#Emergency_Measures | done ]],...
[12:22:41] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11193382 (Jhancock.wm) @elukey heads up, i'm gonna try swapping the console card with another CP server to see if it's the card or something else. will probab...
[12:57:16] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193463 (IKhitron) I think there is a problem. The option "use desktop sites only" on tablet does not work when...
[13:35:07] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193651 (Krinkle) >>! In T403510#11193463, @IKhitron wrote: > I think there is a problem. The option "use deskto...
[13:45:14] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193700 (IKhitron) > Which operating system, which version, which browser? Samsung Galaxy Android 14 (Upside Dow...
[14:19:53] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11193851 (elukey) @Jhancock.wm I tried to use the firmware upgrade cookbook but it bails out due to this: ` cp2048: SKIPPING - iDRAC version (1.20.25.0) is t...
[14:33:18] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11193969 (Jhancock.wm) since everything is reachable via idrac/mgmt now, i should be able to tackle that as a background task. I'll see how many i can get don...
[14:34:57] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11193993 (TheDJ) oh this option "use desktop/mobile sites" provided by browsers is an important distinction.. For...
[14:35:27] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11193996 (elukey) >>! In T392851#11193969, @Jhancock.wm wrote: > since everything is reachable via idrac/mgmt now, i should be able to tackle that as a backgr...
[14:41:39] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11194033 (Krinkle) >>! In T403510#11193700, @IKhitron wrote: >> What happens on the new ones? > Ditto. Okay, so...
[14:42:26] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11194034 (Jhancock.wm) @elukey I can wait! wasn't trying to rush you. lemme know next week and we'll take care of it then. =)
[15:26:35] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11194247 (IKhitron) > Has this behaviour changed and did this work in the past? Is it specific to Wikimedia sites...
[17:55:38] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11195081 (RobH)
[18:06:02] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11195144 (RobH)
[18:08:00] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11195172 (Krinkle)
[18:18:48] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11195213 (Krinkle) >>! In T403510#11194247, @IKhitron wrote: >> […] And many other site work well always. […] Eve...
[18:28:19] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11195264 (IKhitron) >They "work", yes, but the "prefer mobile/desktop version" does nothing on these websites, be...
[19:44:50] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Rollout Phase 3] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11195538 (Krinkle) @IKhitron Can you try with https://doc.wikimedia.org/T403510/T403510-check.php in your three m...
[20:32:02] brett: running slightly late, 5’
[20:32:14] no worries!
[21:15:53] If any pybal errors show up they're expected
[21:16:02] we're currently removing wdqs from lvs