[00:39:14] ^ I was wrong on the above. Hosts: cumin:A:lvs does not work
[00:39:24] what works is cumin:O: or cumin:P:
[00:39:41] actual run for double confirmation https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/8600/console
[00:40:00] just as an FYI, in case someone wants to do this as well
[00:40:45] er, I meant this for -sre, for the puppet run on A:lvs
[00:49:52] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11961688 (Papaul)
[00:54:05] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11961689 (Papaul) VRRP is up on cr2-eqsin ` cr2-eqsin> show interfaces terse | match "ae1.512|ae1.522" et-0/0/1.512 up...
[02:04:43] FIRING: [3x] HaproxyKafkaSocketDroppedMessages: Sustained high rate of dropped messages from HaproxyKafka - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaSocketDroppedMessages - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaSocketDroppedMessages
[02:09:43] FIRING: [5x] HaproxyKafkaSocketDroppedMessages: Sustained high rate of dropped messages from HaproxyKafka - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaSocketDroppedMessages - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaSocketDroppedMessages
[02:14:43] RESOLVED: [5x] HaproxyKafkaSocketDroppedMessages: Sustained high rate of dropped messages from HaproxyKafka - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaSocketDroppedMessages - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaSocketDroppedMessages
[02:14:55] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465 (Ladsgroup) NEW
[06:56:39] Traffic, Data-Engineering, Patch-For-Review: Add X-Provenance data to webrequest_sampled_live - https://phabricator.wikimedia.org/T427068#11961935 (JAllemandou) > Just to frame the context, am I correct? AFAIU I have the same understanding as you @Fabfur :)
[07:07:46] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465#11961947 (cmooney) > It would also allow us to set network QoS for upload to be something lower to avoid overwhelming backhauls Funny only yesterday I was suggesting that we could...
[08:03:26] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465#11962083 (Volans) Some random questions/comments: # Do you already have in mind the final split of # of hosts between text and upload at the end of the migration? How much will thi...
[08:44:03] netops, Infrastructure-Foundations, Patch-For-Review: Investigate internal rejected prefixes - https://phabricator.wikimedia.org/T423384#11962295 (ayounsi) `lang=python { address="10.128.0.20", afi_safi="IPV4_UNICAST", instance="asw1-22-ulsfo:9804", job="gnmi", network_instance_name="default", peer_a...
[08:58:38] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465#11962406 (jcrespo) I have only one suggestion regarding the ticket name and the framing (not the project itself, which looks to me like a great idea): I would pitch it as "having a de...
[09:01:52] netops, Infrastructure-Foundations, Patch-For-Review: Investigate internal rejected prefixes - https://phabricator.wikimedia.org/T423384#11962424 (ayounsi) Another one was: ` pfw1-codfw> show route receive-protocol bgp 208.80.153.202 hidden extensive inet.0: 26 destinations, 29 routes (26 active...
[09:22:34] netops, Infrastructure-Foundations, Patch-For-Review: Investigate internal rejected prefixes - https://phabricator.wikimedia.org/T423384#11962509 (ayounsi) Another one, pfw1 re-advertises its uplinks subnets to cr1/2-codfw: ` cr2-codfw# run show route receive-protocol bgp 208.80.153.203 hidden extens...
[09:34:44] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491 (daniel) NEW
[10:13:30] FIRING: HAProxyRestarted: HAProxy server restarted on cp5025:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqsin%20prometheus/ops&var-instance=cp5025&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[10:18:29] RESOLVED: HAProxyRestarted: HAProxy server restarted on cp5025:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqsin%20prometheus/ops&var-instance=cp5025&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[12:58:32] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963146 (Papaul) @BCornwall hello can you please provide me with one CP node in rack 604 that i can use later on today to test th...
[13:00:31] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN:Switch refresh diagram and wiring - https://phabricator.wikimedia.org/T423724#11963150 (Papaul)
[13:13:34] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963174 (Papaul)
[13:15:10] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963178 (Papaul)
[13:20:11] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963191 (Papaul)
[14:08:29] FIRING: HAProxyRestarted: HAProxy server restarted on cp1114:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cp1114&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[14:08:34] ha
[14:09:02] that's actually text
[14:12:08] It does a lot of disk reads
[14:13:29] RESOLVED: HAProxyRestarted: HAProxy server restarted on cp1114:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cp1114&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[14:34:08] sukhe: are you taking care of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1282764 (and its parent) or want me to?
[14:34:37] XioNoX: I am happy to. I just got busy with other stuff after I finished running PCC
[14:34:44] sounds good, thx!
[14:34:51] thanks for patching it :)
[14:35:00] I am debating if we should roll this out now or Monday
[14:35:08] you know, the usual...
[14:46:35] sukhe: could I get a quick review on https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1295011 oversight discovered thanks to https://phabricator.wikimedia.org/T423384 (and soon an alert for it)
[14:46:55] oh nice
[14:46:57] looking
[14:47:27] at least we had 198.35.27.0/24 in there :)
[14:47:53] :)
[15:04:11] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465#11963777 (Ladsgroup) >>! In T427465#11961947, @cmooney wrote: >> We could eventually move MPEG-DASH files to text too, they are similar in nature. > > On the QoS side we might be bet...
[15:14:14] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11963857 (ssingh) Per https://www.mediawiki.org/wiki/Manual:Pywikibot/User-agent, users can set a custom UA easily by setting `user_agent_format`. Should we perhaps not be encouraging that vs blanket all...
[15:21:05] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963925 (BCornwall) Hi, @papaul: cp5032 is depooled/downtimed and ready for reimaging.
[15:24:31] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, and 2 others: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11963940 (Papaul) @BCornwall thank you will do that after lunch doing some onsite work
[16:07:15] Traffic, Data-Persistence: Move thumbnail caching from upload cluster to text - https://phabricator.wikimedia.org/T427465#11964169 (cscott) There's the potential to break various parser tests that hardcode a file URL. I don't think this will be a big concern in practice, but something to look out for.
[16:14:59] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964208 (daniel) >>! In T427491#11963857, @ssingh wrote: > Per https://www.mediawiki.org/wiki/Manual:Pywikibot/User-agent, users can set a custom UA easily by setting `user_agent_format`. Should we perh...
[16:34:18] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964312 (ssingh) >>! In T427491#11964208, @daniel wrote: >>>! In T427491#11963857, @ssingh wrote: >> Per https://www.mediawiki.org/wiki/Manual:Pywikibot/User-agent, users can set a custom UA easily by s...
[17:12:34] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964475 (BCornwall) If pywikibot's main execution is scripts, then I'd say the username is sufficient. If it's used as a frequent import into other, bigger, tools, I would say it's not sufficient. Given...
[17:48:49] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964658 (daniel) >>! In T427491#11964475, @BCornwall wrote: > While spoofing isn't new, this allows operators to spoof users, not programs. It's another thing we'd have to sift through. Since x-ua-cont...
[17:52:34] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964661 (CDanis) +1 from me on supporting the format pywikibot already emits for on-wiki usernames. I think the right long-term answer to spoofability would be doing something like sending a valid JWT...
[18:09:23] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11964711 (Joe) The pattern pywikibot uses for user-agents is not only [[ https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/files/cache/contact_info....
[18:52:30] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11964838 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host cp5032.eqsin.wmnet with OS...
[19:00:45] Traffic: Reboot lvs1019 for memory self-healing - https://phabricator.wikimedia.org/T426109#11964845 (BCornwall) Open→Resolved BIOS updated to 2.27.0. After a reboot: > The self-heal operation successfully completed at DIMM DIMM_B2. > A problem was detected during Power-On Self-Test (POST). > T...
[19:27:18] Traffic, DC-Ops, ops-codfw, SRE: Investigate hardware RAID usage in codfw LVS hosts - https://phabricator.wikimedia.org/T426912#11964942 (BCornwall) @ssingh @BBlack Okay with me switching write-back to write-through slowly through the codfw cluster or shall we leave this as-is until the refresh?
[20:14:02] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN: Setup VRRP on both routers for the new subnets - https://phabricator.wikimedia.org/T427393#11965068 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host cp5032.eqsin.wmnet with OS trix...
[20:32:52] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11965102 (daniel) Open→Invalid >>! In T427491#11964711, @Joe wrote: > So I'm not sure what this task is about. > > Care to provide examples? I... must have been dreaming? I swear I saw requ...
[20:38:37] Traffic: x-ua-contact: recognize pywikibot user-agent strings - https://phabricator.wikimedia.org/T427491#11965123 (daniel) Oh, I think I know what happened. There are log entries for `category_redirect (commons:commons; User:RussBot) Pywikibot/10.2.0 (g19732) requests/2.32.3 Python/3.11.2.final.0` with...
[21:51:57] <[Staff]Eddie> [Network Announcement] Libera staff requests a respectful minute of silence in honor of Harambe's memory on the 10th anniversary of His death. R.I.P. Harambe (May 27, 1999 - May 28, 2016)
[22:27:44] ....wtf
[22:33:20] dpsm
[22:33:23] spam even