[08:02:08] netops, DC-Ops, Operations, ops-esams: cr2-esams temperature warning - https://phabricator.wikimedia.org/T176816#3736634 (MoritzMuehlenhoff) p:Triage>Normal
[12:22:53] Traffic, Operations, Phabricator, Zero: Missing IP addresses for Maroc Telecom - https://phabricator.wikimedia.org/T174342#3737107 (Dispenser) **7 days**. In 7 days the IP information will start disappearing.
[12:27:45] Traffic, Operations, Phabricator, Zero: Missing IP addresses for Maroc Telecom - https://phabricator.wikimedia.org/T174342#3737108 (Aklapper) https://phabricator.wikimedia.org/T174342#3559407 already lists the IP ranges I'd say. Link seems to be https://meta.wikimedia.org/wiki/Steward_requests/Ch...
[14:23:26] netops, Operations, ops-eqiad: Setup eqsin RIPE Atlas anchor - https://phabricator.wikimedia.org/T179042#3737395 (Cmjohnson) I have connected the Ripe atlas anchor to iron if you want to load the image.
[15:36:24] Traffic, Operations: Renew unified certificates 2017 - https://phabricator.wikimedia.org/T178173#3737533 (BBlack) Copying in some commentary I was accidentally putting in the wrong ticket (the private purchasing one) for the new globalsign certs over the past few days: >>! In T178831#3731564, @BBlack wro...
[16:18:58] mmh
[16:19:01] Error: /Stage[main]/Varnish::Logging/Rsyslog::Conf[varnish]/File[/etc/rsyslog.d/80-varnish.conf]: Could not evaluate: Could not retrieve information from environment future source(s) puppet:///modules/varnish/rsyslog.conf.erb
[16:19:56] _joe_: any idea about that error message? ^
[16:20:42] reverting in the meanwhile
[16:25:13] ema: re the syslog change in general, something that was crossing my mind this morning is we need to be careful about PII there (e.g. client IPs/cookies in syslog outputs)
[16:25:21] <_joe_> ema: I have no idea, no
[16:25:24] I don't think we have any such cases today, but I'm not 100% sure.
[16:25:28] <_joe_> what's the context though?
[16:25:48] _joe_: merged https://gerrit.wikimedia.org/r/#/c/388482/ which looked fine to pcc
[16:26:00] <_joe_> ema: ah!
[16:26:02] <_joe_> I see it
[16:26:12] <_joe_> s/source/content ?
[16:26:25] <_joe_> and then template()
[16:26:39] <_joe_> https://gerrit.wikimedia.org/r/#/c/388482/9/modules/varnish/manifests/logging.pp line 23
[16:26:46] ah, snap!
[16:26:48] ty
[16:27:25] bblack: yeah that's a good point! In particular if we add 'slow request logging' we should avoid adding PII to the logs
[17:04:42] netops, Operations, fundraising-tech-ops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3737774 (Jgreen)
[17:04:46] netops, Operations, fundraising-tech-ops, ops-codfw: connect second ethernet interface for fundraising codfw hosts - https://phabricator.wikimedia.org/T176175#3737772 (Jgreen) Open>Resolved This is done, all codfw hosts have bonded ethernet now.
[21:13:41] HTTPS, Traffic, Operations, Parsoid, VisualEditor: Parsoid, VisualEditor not working with SSL / HTTPS - https://phabricator.wikimedia.org/T178778#3738796 (PlanetKrypton) @Arlolra ``` kryptonit3@ubuntu-2gb-nyc3-01:/etc/mediawiki/parsoid$ cat config.yaml # This is a sample configuration file...
[21:37:52] Traffic, ORES, Operations, Scoring-platform-team, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3738870 (BBlack) Open>stalled p:High>Normal The timeout changes above will offer some insulation, and as time...
[22:13:04] I thought you might be interested =) https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Miscellaneous/Improve_DNS-_and_TSL(SSL)security
[22:15:35] lol
[22:16:32] The problem of bad roots can be split into two sub-cases: The user controls their device and has reasonably-secure/standard software on it, or they don't.
[22:17:39] in the former case, injecting fake roots would have to be via exploitation, which we can't stop (along with many other fallouts for that user who is exploited). That or badly-managed/complicit well-known CAs, but CT/CAA/etc are already pinning that problem down pretty hard and will only get stronger.
[22:18:05] in the latter case (e.g. gov, employer, etc is in control of the user's device), there's nothing anyone can do to protect the user.
[22:18:34] and DNSSEC doesn't actually make any pragmatic difference in any of these cases.
[22:19:46] also lol at "Implement DNSSEC for the DNS of the Wikipedia-domains as a first step. [...]" -> "The work (without testing and documentation) should need less than 1 day. I can offer help, if needed"
[22:20:10] I don't think anybody has seen any country systematically MITMing anyway
[22:20:44] and for targeted attacks, we honestly can't offer any reasonable protection and nobody can
[22:21:48] right, the point of everything we do at this level is to prevent mass-scale surveillance and censorship. We can't defend a targeted individual in most cases.
[22:22:11] Where we can, we make changes to make targeted *remote* attacks harder, but nothing we can do about local ones (or all of them in general)
[22:25:51] For some reason xkcd.com/538/ comes to my mind.
[22:28:01] also it was only a couple weeks ago I commented negatively on the long-outstanding DNSSEC ticket: https://phabricator.wikimedia.org/T26413#3704222
[22:28:11] I'm sure that had some influence on trying to find new avenues to push it
[22:35:53] https://kate.io/blog/simple-hash-collisions-in-lua/
[22:36:19] this is one of the problems with having so many programming languages in the world. the same problems get found and solved 10 times over.
[22:36:47] I remember when Perl got past that one, it has to have been well over a decade ago. I think most of the major scripting languages lack this problem at this point.
[22:37:00] yet here's Lua not learning the lesson :P
[22:37:17] (note to future selves: care about this when we start writing Lua extensions for ATS)
[22:37:23] lua's the worst of them all
[22:37:40] the language is open source in name only
[22:38:08] yeah
[22:38:17] on the bright side though, it's miles better than VCL! :)
[22:38:31] a small clique of devs develop it privately and only publish the code on release
[22:38:55] so you're looking forward to replacing Varnish with ATS? :P
[22:39:59] at least some of it, maybe eventually all of it
[22:40:35] BUT IN THIS LANGUAGE ARRAYS START AT ONE
[22:40:56] there's a pretty strong case for replacing our backend-most varnish instances with ATS. And ATS looks like a good alternative to nginx for the TLS termination parts at the very front edge, too (although it would be a strange choice if we weren't doing that for other reasons).
[22:41:15] only time and testing will tell whether it can replace the frontend varnishes as well, or we end up in a varnish-fe -> ats-be world for a while.
[22:41:36] any plans to ditch Apache too?
[22:41:51] we don't use apache for the traffic-edge stuff
[22:42:11] as for MW, I donno. I think there's a lot of deep tie-in between complicated custom apache config and MW in general?
[22:42:43] not as much these days
[22:43:14] the work on the non-canonical https redirect service might clean a lot of that up too
[22:43:36] at least our numerous domain redirects are done with a template generator so they can migrate very easily
[22:43:41] (it will take a lot of those redirects away, putting them exclusively at the traffic layer. our MW install will then only deal with the real canonical project domainnames, basically).
[22:43:57] yaaay
[22:54:51] https://puck.nether.net/pipermail/outages/2017-November/010947.html about today's outage