[10:48:54] cp1055 has several packages from backports installed, what's up with those?
[10:52:38] moritzm: uh, you're right
[10:53:05] systemtap-runtime was probably installed while gathering stats for the spdy -> h2 transition
[10:54:57] perhaps it was used for some temp testing, but I can't find relevant references to cp1055 in SAL/irc logs
[10:57:45] or maybe for something related to monitoring? I noticed that there are also updated versions of monitoring-plugins
[10:58:06] yeah, nagios and monitoring-plugins
[10:58:45] but also libgeoip and golang
[11:00:52] so on other cache hosts python-six is the only package from backports installed
[11:10:06] ema,bblack - https://gerrit.wikimedia.org/r/#/c/292172 should be ready for some review :)
[11:20:06] not monitoring related afaik
[13:16:40] bblack: thanks for the code review, really what I needed :) Working on the last comment on the big boolean check, but I followed your suggestions for the rest of the code and it works perfectly
[13:20:12] will add comments
[13:27:39] Traffic, Operations: californium and gallium ferm rules should disallow public access to port 80 - https://phabricator.wikimedia.org/T137106#2357419 (BBlack)
[13:35:05] (comments added)
[14:00:26] we keep on getting cron-spammed by apt-show-versions for reasons unclear to me
[14:00:33] run-parts: /etc/cron.daily/apt-show-versions exited with return code 255
[14:00:45] but the script does this:
[14:00:46] apt-show-versions -i > /dev/null 2>&1
[14:02:04] we don't really need apt-show-versions I think?
[14:03:12] paravoid: meaning that 'apt list' gets the job done?
[14:03:41] but besides that, I'd like to understand why cron is still sending emails even though stdout and stderr are sent to /dev/null
[14:04:43] oh, wait
[14:04:55] it's not the output from apt-show-versions!
[14:05:00] it's just a non-zero exit code
[14:45:41] paravoid: so yes, we don't seem to be using apt-show-versions, but perhaps some of us expect to find it installed given that it's been there for a while?
[14:46:58] to silence the cronspam we could either remove apt-show-versions altogether or run it with || true in cron
[14:50:13] try removing it and see what breaks :)
[14:50:28] that sounds like a plan!
[14:56:17] :)
[15:05:16] Traffic, Operations, Patch-For-Review: californium and gallium ferm rules should disallow public access to port 80 - https://phabricator.wikimedia.org/T137106#2357559 (BBlack) Open>Resolved a:BBlack
[15:07:38] we also have cron spam for apt-xapian-index failing, I don't think we need that either. IIRC it's only used for debtags
[15:15:15] elukey: review updated
[15:16:45] thanks bblack, will check in a bit :)
[15:26:23] bblack: your change (I think) made icinga warn about grafana-admin and piwik
[15:29:24] paravoid: looking
[15:30:55] Traffic, Operations: Scripts depending on varnishlog.py maxing out CPU usage on cache_misc - https://phabricator.wikimedia.org/T137114#2357668 (ema)
[15:32:24] paravoid: well yeah, but it's just a monitoring issue not a real issue. Those sites require auth and thus throw 401 if you don't, which the check doesn't. but since the check didn't set XFP either, before the change they got a 301 instead and accepted that without warning...
[15:32:46] why are they even listening on 80?
[15:32:58] aren't we doing the redirects in varnish?
[15:33:01] er
[15:33:06] sorry
[15:33:09] stupid of me :)
[15:33:16] :)
[15:33:29] the real problem is they shouldn't even be public hostnames/machines, since they're behind varnish
[15:33:41] there's already tickets about that, it might be complicated to move them, etc.
[15:33:50] wouldn't fix that monitoring check though
[15:33:53] I did fix ferm rules so the public can't actually talk to port 80 in related cases
[15:33:56] yeah
[15:34:03] the auth thing is separate
[15:34:14] the icinga check should probably accept 401 if possible, looking...
[15:34:42] Traffic, Operations: Scripts depending on varnishlog.py maxing out CPU usage on cache_misc - https://phabricator.wikimedia.org/T137114#2357701 (ema) p:Triage>High
[15:36:51] varnishxcache and friends are using 100% cpu on cache_misc apparently ^
[15:39:54] :/
[15:45:43] they were working fine at one point...
[15:46:04] indeed
[15:47:58] what is also interesting is that on cache_maps they're behaving properly
[15:49:44] does restarting them fix them?
[15:50:06] or does restarting them on maps break them there? :)
[15:51:26] bblack: no and no
[16:28:26] netops, Labs, Operations, Tool-Labs: 'German Wikipedia Broken Weblinks Bot' is ill-behaved and in danger of getting all of Labs blacklisted - https://phabricator.wikimedia.org/T136829#2357927 (Andrew) Open>Resolved Update: the bitninja people seem to be wrong about everything. I'm closi...
[16:29:57] netops, Labs, Operations, Tool-Labs: 'German Wikipedia Broken Weblinks Bot' is ill-behaved and in danger of getting all of Labs blacklisted - https://phabricator.wikimedia.org/T136829#2357942 (MoritzMuehlenhoff) FTR, I received the same vague reply as Andrew, seems mostly auto-generated...
[17:13:33] bblack: updated the code review, i also removed some extra bits like the "begin:" prefix because it didn't make much sense in the end.
[17:13:56] tested and checked, it works fine
[17:14:03] the code should be less convoluted now
[17:44:58] elukey: more comments
[17:53:28] bblack: thanks for the patience, I am reviewing the code now. I really thought that !strncmp was less clear than the ternary operator, but it looks less clean
[17:54:03] ah sorry, long day, what I wanted to say was that I considered the ternary operator more clear but I was mistaken
[17:56:26] I proposed to my team to either invest time in adding unit tests or to think about rewriting it. I am clearly missing some basic log flows but there should be some tests to back me up a bit :)
[17:57:14] the "end:" only use case is very interesting, didn't think about it
[17:58:33] also fmt_tmp += strlen(APACHE_LOG_END_PREFIX); looks nice. What I was thinking was if we want to support white space stripping in the begin/end of the format string, but it might be too much :D
[18:00:34] well the current code doesn't strip whitespace either, so either way :)
[18:01:08] yes yes I thought about it after reading your proposed change
[18:01:19] but it seems a bit overkill for this feature :)
[18:23:15] bblack: code review updated (and tested)
[18:23:40] now 'end:' correctly defaults to the predefined format
[18:30:38] thanks, will follow up with some long running testing and if everything will look good I'll package the new change and talk with ema about deploying it in misc/maps
[18:31:29] (ema: vk again! Are you happy? :P)
[18:31:53] Traffic, Wikimedia-Apache-configuration, DNS, Operations, Patch-For-Review: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2338844 (Dzahn) added to DNS: moon.wikimedia.org has address 198.35.26.96...
[18:49:06] Traffic, MediaWiki-ResourceLoader, Operations, Performance-Team: Image urls in CSS remain cached with old $wgResourceBasePath - https://phabricator.wikimedia.org/T134368#2358567 (Krinkle) a:Krinkle
[20:12:24] Traffic, Wikimedia-Apache-configuration, DNS, Operations, Patch-For-Review: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2358803 (Dzahn) after having to purge the cached default page from varnish...
[20:43:09] Traffic, Wikimedia-Apache-configuration, DNS, Operations, Patch-For-Review: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2359006 (Dzahn) Open>Resolved a:Dzahn @MartinRulsch @Heather w...
[20:44:23] Traffic, Wikimedia-Apache-configuration, DNS, Operations: Create moon.wikimedia.org and redirect it to https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon - https://phabricator.wikimedia.org/T136557#2359067 (Dzahn)
[22:25:04] Traffic, Analytics, MediaWiki-extensions-CentralNotice, Operations: Generate a list of junk CN cookies being sent by clients - https://phabricator.wikimedia.org/T132374#2359414 (AndyRussG)
[22:42:29] Traffic, Operations, fundraising-tech-ops: Fix nits in Fundraising HTTPS/HSTS configs in wikimedia.org domain - https://phabricator.wikimedia.org/T137161#2359459 (BBlack)
[22:43:14] Traffic, Operations, fundraising-tech-ops: Fix nits in Fundraising HTTPS/HSTS configs in wikimedia.org domain - https://phabricator.wikimedia.org/T137161#2359472 (BBlack)
[22:54:06] Traffic, Operations, fundraising-tech-ops: Fix nits in Fundraising HTTPS/HSTS configs in wikimedia.org domain - https://phabricator.wikimedia.org/T137161#2359500 (BBlack)
[22:54:09] HTTPS, Traffic, Operations, Patch-For-Review: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination - https://phabricator.wikimedia.org/T132521#2359499 (BBlack)
[22:55:16] HTTPS, Traffic, Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#2359505 (BBlack)
[22:55:19] HTTPS, Traffic, Operations, Patch-For-Review: Preload STS for wikimedia.org - https://phabricator.wikimedia.org/T132685#2359503 (BBlack) Open>Resolved This is submitted for preload now (which takes an agonizingly long and unpredictable time to reach the chrome list and then browsers...)
[22:55:35] HTTPS, Traffic, Operations, Tracking: HTTPS Plans (tracking / high-level info) - https://phabricator.wikimedia.org/T104681#2359507 (BBlack)
[22:55:37] HTTPS, Traffic, Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#1411365 (BBlack) Open>Resolved
[22:57:54] HTTPS, Traffic, Operations, Tracking: HTTPS Plans (tracking / high-level info) - https://phabricator.wikimedia.org/T104681#2359537 (BBlack)
[23:28:33] Traffic, Operations: Scripts depending on varnishlog.py maxing out CPU usage on cache_misc - https://phabricator.wikimedia.org/T137114#2357668 (BBlack) Which hosts were exhibiting this? The first one I looked at (cp4001) seems normal.
[23:37:16] Traffic, Operations: Scripts depending on varnishlog.py maxing out CPU usage on cache_misc - https://phabricator.wikimedia.org/T137114#2359694 (BBlack) I found a few. It seems to be all the esams hosts, plus two of the eqiad hosts ( cp1051, cp1061 ).