[00:11:12] Traffic, Operations, Patch-For-Review, Prometheus-metrics-monitoring: Port gdnsd statistics from ganglia to prometheus - https://phabricator.wikimedia.org/T147426#2915297 (fgiunchedi) Open>Resolved a: fgiunchedi gdnsd metrics deployed
[13:12:46] netops, Operations: cr2-esams<->cr2-eqiad link flaps - https://phabricator.wikimedia.org/T154577#2916308 (faidon)
[13:17:18] netops, Operations: cr2-esams<->cr2-eqiad link flaps - https://phabricator.wikimedia.org/T154577#2916362 (faidon) This has been raised to Level3 as ticket #12023671.
[13:46:56] Traffic, Operations, Patch-For-Review: python-varnishapi daemons seeing "Log overrun" constantly - https://phabricator.wikimedia.org/T151643#2916386 (ema) Because of the Log overrun issue we are actually losing quite a lot of information. I'm now comparing the values produced by the [[ https://gerrit...
[16:42:52] new varnishreqstats merged, cpu usage going down nicely: https://grafana.wikimedia.org/dashboard/db/prometheus-machine-stats?var-server=cp3040:9100&var-datasource=esams%20prometheus%2Fops&from=now-1h&to=now
[16:50:42] nice :)
[16:54:36] \o/
[16:58:27] Traffic, Citoid, Operations, RESTBase, and 5 others: Set-up Citoid behind RESTBase - https://phabricator.wikimedia.org/T108646#2916808 (Jdforrester-WMF)
[16:58:32] Traffic, Citoid, ContentTranslation-CXserver, MediaWiki-extensions-ContentTranslation, and 5 others: Decom legacy ex-parsoidcache cxserver, citoid, and restbase service hostnames - https://phabricator.wikimedia.org/T133001#2916807 (Jdforrester-WMF)
[16:58:50] Traffic, Citoid, ContentTranslation-CXserver, MediaWiki-extensions-ContentTranslation, and 4 others: Decom legacy ex-parsoidcache cxserver, citoid, and restbase service hostnames - https://phabricator.wikimedia.org/T133001#2216638 (Jdforrester-WMF)
[17:16:54] don't be surprised if you see graphs on varnish-aggregate-client-status-code going up, that's because we were missing quite a few requests with the overruns
[17:18:41] ok
[17:18:52] I had seen the recurrent little dropouts
[17:19:00] I figured those were that
[17:19:19] I think so yeah
[17:19:28] but from 16:32 on, looks like a big increase in general
[17:20:25] all in all we didn't really notice, because we were missing metrics uniformly (not only 200s but a bit of everything), so the proportions made sense overall
[17:21:11] https://phabricator.wikimedia.org/T151643#2916386
[17:22:27] and varnishreqstats is the script most affected by the overruns given that we don't filter out anything at the VSL level, not even PURGEs
[17:23:03] right
[17:23:09] we probably should :)
[17:30:22] well not if we want to plot them :)
[17:31:03] oh right
[17:31:19] ops meeting!
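For context on the VSL-level filtering discussed above: the sketch below is illustrative only, not Wikimedia's varnishreqstats. It assumes varnishlog from Varnish 4+ is available and uses a standard VSL query (-q) to drop PURGE transactions before they reach the Python consumer, which is the kind of filtering that would reduce the volume a slow consumer has to keep up with (and thus the "Log overrun" losses). All Python names here (the counts dict, etc.) are made up for the example.

    #!/usr/bin/env python
    # Illustrative sketch: count client response status codes while filtering
    # out PURGE transactions at the VSL level via a varnishlog query, so the
    # Python side processes less data. Not the actual varnishreqstats code.
    import subprocess

    # -g request : group records per client request
    # -q ...     : VSL query, only emit transactions whose method is not PURGE
    # -i ...     : only include the RespStatus tag in the output
    cmd = [
        'varnishlog',
        '-g', 'request',
        '-q', 'ReqMethod ne "PURGE"',
        '-i', 'RespStatus',
    ]

    counts = {}
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    for line in proc.stdout:
        parts = line.split()
        # record lines look like: "-   RespStatus     200"
        if len(parts) >= 3 and parts[0].startswith('-') and parts[1] == 'RespStatus':
            counts[parts[2]] = counts.get(parts[2], 0) + 1

As the chat notes, filtering PURGEs out entirely would also remove them from the plots, which is why varnishreqstats keeps processing them despite the extra load.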
[19:51:26] netops, Discovery, Discovery-Search, Elasticsearch, and 2 others: codfw: elastic2025-elastic2036/switch port configuration - https://phabricator.wikimedia.org/T154605#2917474 (Papaul)
[22:18:00] netops, Discovery, Discovery-Search, Elasticsearch, and 2 others: codfw: elastic2025-elastic2036/switch port configuration - https://phabricator.wikimedia.org/T154605#2918012 (RobH) Open>Resolved ports setup
[23:38:17] Traffic, Citoid, Operations, RESTBase, and 5 others: Set-up Citoid behind RESTBase - https://phabricator.wikimedia.org/T108646#2918257 (GWicke)
[23:40:34] Traffic, Citoid, Operations, RESTBase, and 5 others: Set-up Citoid behind RESTBase - https://phabricator.wikimedia.org/T108646#2918289 (GWicke) With T152221 resolved it seems that only T152220 remains to be done until we can call this done. @Esanders, @mobrovac, @Jdforrester-WMF, could you take a...