[00:05:31] netops, Operations: Bird multihop BFD - https://phabricator.wikimedia.org/T209989 (ayounsi) After changing the port range to IANA recommended range and restarting Bird, we can see the BFD packets leaving from the proper port: `IP dns2001.wikimedia.org.55170 > cr1-codfw.wikimedia.org.4784: UDP, length 24`...
[10:34:15] Krenair: so... I've rewritten the directory deploy following another approach
[10:34:24] instead of handling only in the API
[10:34:35] I made acme-chief-backend store the certificates in unique directories
[10:34:37] https://gerrit.wikimedia.org/r/c/operations/software/acme-chief/+/494956
[10:35:07] and then provided support on the API for the puppet file_metadatas request -> https://gerrit.wikimedia.org/r/c/operations/software/acme-chief/+/494957/
[10:38:27] it makes more sense.. cause fixes some issues like deploying one cert for the rsa-2048 version and the previous/next one for the ec-prime256v1 one
[10:41:55] we're going to need a script that helps with the migration from one directory schema to the other one, I'm working on that now
[11:10:35] Traffic, MobileFrontend, Operations, TechCom, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (dr0ptp4kt) Great framing, nice job! One question, though, what's this part about and how does...
[11:59:12] Traffic, MobileFrontend, Operations, TechCom, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (dr0ptp4kt) ^ Well, I intended for that to be on email. But it stands: I think Olga put this i...
[12:55:06] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (jcrespo)
[13:03:18] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (jcrespo)
[13:06:05] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (Vgutierrez) {F28347567} it looks indeed like purge requests
[13:06:58] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (jcrespo)
[13:08:12] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (jcrespo) >>! In T217893#5011033, @Vgutierrez wrote: > {F28347567} it looks indeed like purge requests I updated the comment- Those seem recurring, there wa...
[13:08:51] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (hashar) The spike of PURGE requests to the Varnish text frontends seems to be recurring. A view over 24 hours from https://grafana.wikimedia.org/d/000000180...
[13:13:37] Traffic, Operations: Traffic (text) instability due to unknown cause, causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (Vgutierrez) cp1077 effectively depooled at 13:09 UTC
[13:40:59] Traffic, Operations: Traffic (text) instability due to misbehaving cache server (cp1077), causing a 1.5-2% requests failing - https://phabricator.wikimedia.org/T217893 (jcrespo)
[13:41:54] I have updated title at https://phabricator.wikimedia.org/T217893 handing it to you for followup
[13:54:25] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Gehel)
[14:13:28] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (BBlack) Looking at an internal version of the flavor=dump outputs of an entity, related observations: Test request from the in...
[17:20:36] netops, Operations: Increase network capacity (2018-19 Q3 Goal) - https://phabricator.wikimedia.org/T213122 (ayounsi)
[17:20:43] netops, Operations, ops-eqiad, ops-eqsin, Patch-For-Review: Deploy cr2-eqsin - https://phabricator.wikimedia.org/T213121 (ayounsi) Open→Resolved the redundancy testing is outside the scope of the goal, so everything needed here is done.
[21:03:17] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) We've been around this topic a number of times, so I'll write a summary where we're at so far. I'm sorry it's going...
[21:11:20] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) > disable cache busting by default, enable it internally This would immediately break all external updaters. They'd...
[21:11:23] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) > disable cache busting by default, enable it internally This would immediately break all external updaters. They'd...
[21:47:45] Traffic, Operations, Wikidata, Wikidata-Query-Service: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) > don't do cache busting on events older than X This however gave me an idea. If we kept a map of all latest revisi...