[00:17:42] hi, when would be a good time to pair with a Traffic team member on adding a new LVS service? https://gerrit.wikimedia.org/r/q/topic:%2522shellbox-constraints-lvs%2522
[07:28:06] legoktm: hi, I'll check those CRs
[08:49:40] hello folks, I have a question about LVS - at the moment we don't contemplate the option of having VIPs with hosts in the Analytics vlans, but I don't recall if it is on purpose (for some limitation) or simply because there was no real need in the past
[08:52:03] (the current barrier is of course not having any interface on LVS nodes being able to L2 forward to any analytics VLAN in eqiad, but it doesn't seem to be unthinkable to add support)
[08:56:00] so I don't know if it wasn't needed in the past or there were some PII handling concerns
[08:56:26] I can raise your question during the weekly traffic meeting and get back to you
[08:56:34] <3
[08:56:42] that would be awesome thanks!
[08:56:52] the meeting is scheduled for this EU afternoon so you won't have to wait too much hopefully
[08:57:06] perfect timing :)
[08:57:31] of course don't forget to send some beers.. I tend to forget things if I'm thirsty
[08:57:40] O:)
[08:58:35] I am keeping a backlog for in-person meetings :D
[11:41:16] Traffic, SRE, decommission-hardware: decommission cescout1001.eqiad.wmnet - https://phabricator.wikimedia.org/T275696 (MoritzMuehlenhoff) Noticed this during clinic duty: @ssingh If the decom cookbook ran on the host, you can tick off the relevant parts under "Steps for service owner" and reassig...
[12:01:03] Traffic, SRE, decommission-hardware: decommission cescout1001.eqiad.wmnet - https://phabricator.wikimedia.org/T275696 (ssingh) a:Jclark-ctr
[12:22:34] Traffic, Analytics, SRE, Patch-For-Review: Compare logs produced by atskfafka with those produced by varnishkafka - https://phabricator.wikimedia.org/T254317 (klausman) I did an analysis of the ATS and Varnish Kafka topics as reported for `cp3050.esams.wmnet` (the only host that currently feeds...
[13:14:13] Traffic, Analytics, SRE, Patch-For-Review: Compare logs produced by atskfafka with those produced by varnishkafka - https://phabricator.wikimedia.org/T254317 (Ottomata) 10-20 seconds / 0.02% missing seems acceptable to me. Perhaps this is enough verification to proceed?
[13:45:26] Traffic, Analytics, SRE, Patch-For-Review: Compare logs produced by atskfafka with those produced by varnishkafka - https://phabricator.wikimedia.org/T254317 (BTullis) I'm trying to get my head around what the implications of these two statements are: > usually, there are 0.02% of events that ar...
[13:51:37] Traffic, Analytics, SRE, Patch-For-Review: Compare logs produced by atskfafka with those produced by varnishkafka - https://phabricator.wikimedia.org/T254317 (Ottomata) I think that's right!
[15:16:42] Traffic, Analytics, SRE, Patch-For-Review: Compare logs produced by atskfafka with those produced by varnishkafka - https://phabricator.wikimedia.org/T254317 (elukey) Something that I noticed, that may be totally off: ` scala> spark.sql("SELECT count(*) FROM wmf.webrequest where webrequest_sourc...
[18:02:13] vgutierrez: thanks :) I'll try to be around tomorrow EU morning if that would be a good time to roll it out together?
[18:05:36] That would be perfect
[18:56:09] Traffic, SRE, Sustainability (Incident Followup): Raw "upstream connect error or disconnect/reset before headers. reset reason: overflow" error message shown to users during outage - https://phabricator.wikimedia.org/T287983 (Legoktm)
[20:53:05] netops, Infrastructure-Foundations, SRE: ripe-atlas-codfw is down - https://phabricator.wikimedia.org/T267714 (Papaul)
[20:55:36] netops, DC-Ops, Infrastructure-Foundations, SRE: (Need By: TBD) rack/setup/install atlas-codfw.wikimedia.org - https://phabricator.wikimedia.org/T273114 (Papaul)