[03:01:13] Traffic, MediaWiki-ResourceLoader, MediaWiki-extensions-CentralNotice, Operations, and 2 others: Provide location, logged-in status and device information in ResourceLoaderContext - https://phabricator.wikimedia.org/T103695#3254003 (Krinkle) Open>declined Declining as I don't think we s...
[05:13:25] netops, Operations: Zayo Circuit ulsfo<->codfw down - https://phabricator.wikimedia.org/T165006#3254023 (ayounsi)
[05:47:48] netops, DC-Ops, Operations: Interface errors on asw-c-eqiad:xe-8/0/38 - https://phabricator.wikimedia.org/T165008#3254070 (ayounsi)
[05:54:24] netops, Operations: JSNMP flood of errors across multiple switches - https://phabricator.wikimedia.org/T83898#3254097 (ayounsi) Not sure yet if related, but LibreNMS doesn't poll all the interfaces from at least asw-c-eqiad. For example, xe-8/0/38 is missing.
[07:54:56] netops, Operations: Zayo Circuit ulsfo<->codfw down - https://phabricator.wikimedia.org/T165006#3254242 (ayounsi) > We are expecting to have a tech onsite in El Paso around 1:45 AM MST to swap out an optic. Will provide another update once optic has been replaced.
[10:35:07] netops, Operations: Zayo Circuit ulsfo<->codfw down - https://phabricator.wikimedia.org/T165006#3254588 (ayounsi) Open>Resolved >Our equipment vendor performed cold restart on a card in El Paso TX which has restored your service. I am currently seeing two traffic passing on the circuit. If you ar...
[10:37:12] netops, Operations, fundraising-tech-ops: BGP session between pfw clusters flapping - https://phabricator.wikimedia.org/T164777#3254591 (ayounsi) Nothing explicit in the logs. I've opened case 2017-0511-0002 with JTAC
[12:18:26] bblack: https://gerrit.wikimedia.org/r/#/c/353274/, right values TBD
[12:24:29] oh and I've downgraded varnish on cp4010 to 4.1.5 so that we can compare transient storage usage with machines running 4.1.6
[13:03:53] now added per-cluster hfp rate to varnish-transient-storage-usage
[13:03:55] sum by (job,layer) (rate(varnish_main_cache_hitpass[5m]))
[13:04:04] https://grafana.wikimedia.org/dashboard/db/varnish-transient-storage-usage
[13:05:38] there doesn't seem to be a noticeable increase in the hfp rate to justify transient storage usage increase
[14:37:17] Traffic, ChangeProp, ORES, Operations, Scoring-platform-team-Backlog: [Discuss] Split ORES scores in datacenters based on wiki - https://phabricator.wikimedia.org/T164376#3255257 (Halfak)
[14:37:28] Traffic, ChangeProp, ORES, Operations, Scoring-platform-team-Backlog: [Discuss] Split ORES scores in datacenters based on wiki - https://phabricator.wikimedia.org/T164376#3231630 (Halfak) p:Triage>Low
[16:10:45] Traffic, netops, Operations, Pybal: Frequent RST returned by appservers to LVS hosts - https://phabricator.wikimedia.org/T163674#3255626 (elukey) Compared the strace of two requests, one with Connection close and one without it. Something interesting came up: With Connection: close ``` [pid 4...
[16:16:06] Traffic, Operations: varnish frontend transient memory usage keeps growing - https://phabricator.wikimedia.org/T165063#3255683 (ema)
[16:18:42] Traffic, Operations: varnish frontend transient memory usage keeps growing - https://phabricator.wikimedia.org/T165063#3255727 (ema) p:Triage>High
[16:23:59] Traffic, Operations: varnish frontend transient memory usage keeps growing - https://phabricator.wikimedia.org/T165063#3255772 (ema)
[18:46:22] Traffic, DBA, Operations, Performance-Team: Cache invalidations coming from the JobQueue are causing slowdown on masters and lag on several wikis, and impact on varnish - https://phabricator.wikimedia.org/T164173#3256370 (aaron) Open>declined >>! In T164173#3253516, @jcrespo wrote: > I th...
[20:10:53] Traffic, ArchCom-RfC, Commons, MediaWiki-File-management, and 15 others: Define an official thumb API - https://phabricator.wikimedia.org/T66214#3256693 (Tgr) We'll also need a way to display old versions of images. Clients can encounter old versions without expecting to due to FlaggedRevs hiding...
[21:19:32] Traffic, DBA, Operations: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141#3256864 (Dzahn) status update: nowadays terbium and wasat use the identical role and profile in site.pp, as in: ``` 2600 # mediawiki maintenance servers (https://wik...
[21:22:13] Traffic, DBA, Operations: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141#3256871 (Dzahn) reason: `database connection to tendril on tendril-backend.eqiad.wmnet failed`
[21:36:00] Traffic, MediaWiki-Cache, MediaWiki-JobQueue, Operations, and 2 others: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan - https://phabricator.wikimedia.org/T124418#3256951 (Krinkle) Open>Resolved a:aaron
[21:36:03] Traffic, Operations: Content purges are unreliable - https://phabricator.wikimedia.org/T133821#3256953 (Krinkle)
[22:03:50] Traffic, Operations, Page-Previews, Performance-Team, and 3 others: Performance review #2 of Hovercards (Popups extension) - https://phabricator.wikimedia.org/T70861#3257036 (Tbayer) Thanks @Gilles! Speaking for myself, I also found yesterday's meeting really useful to better understand your pers...
[22:13:38] Traffic, Operations: AS43821 contact details not as up to date as AS14907 - https://phabricator.wikimedia.org/T165104#3257060 (Reedy)
[22:14:30] Traffic, Operations: AS43821 contact details not as up to date or as detailed as AS14907 - https://phabricator.wikimedia.org/T165104#3257074 (Reedy)
[22:16:05] Traffic, netops, Operations: AS43821 contact details not as up to date or as detailed as AS14907 - https://phabricator.wikimedia.org/T165104#3257060 (Reedy)
[22:24:15] netops, DC-Ops, Operations: mr1-ulsfo crashed - https://phabricator.wikimedia.org/T164970#3257116 (ayounsi) RMA# R200124729
[22:35:49] netops, DC-Ops, Operations: mr1-ulsfo crashed - https://phabricator.wikimedia.org/T164970#3257123 (RobH) Juniper emailed us the tracking info, and I've opened an inbound shipment ticket with unitedlayer. I'll plan to go onsite next Wednesday and swap them.
[22:47:05] Traffic, netops, Operations: AS43821 contact details not as up to date or as detailed as AS14907 - https://phabricator.wikimedia.org/T165104#3257146 (faidon) Open>Invalid Just different databases (ARIN/RIPE) with different anti-spam measures. Nothing we can do about it :)