[04:56:38] Acme-chief, Horizon, cloud-services-team (Kanban): Create a service account to manage traffic.wmflabs.org. from acme-chief - https://phabricator.wikimedia.org/T229786 (Vgutierrez) Yeah, to be honest, a limited access for this account would be better than a full administrator role
[07:55:03] Traffic, Analytics, Analytics-Kanban, Operations, and 2 others: TLS certificates for Analytics origin servers - https://phabricator.wikimedia.org/T227860 (ema) Open→Resolved Thank you so much @elukey! ATS is now using TLS only for connections to #analytics origins.
[07:56:27] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (ema)
[08:09:59] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (ema)
[08:48:17] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (ema)
[09:06:09] nice.. QUIC & HTTP/3 support have been merged into ATS master branch
[09:09:46] wow!
[09:10:43] IMHO it's kinda scary
[09:11:00] that means allowing legit traffic in udp/80 udp/443
[09:11:19] and that's a mess with DDoS amplification attacks
[09:41:59] good point
[09:57:35] Acme-chief, Horizon, Patch-For-Review, cloud-services-team (Kanban): Create a service account to manage traffic.wmflabs.org. from acme-chief - https://phabricator.wikimedia.org/T229786 (aborrero) For the record, I just created a basic documentation page: https://wikitech.wikimedia.org/wiki/Portal...
[10:04:30] netops, Operations: BGP session down for AS 20485 on cr2-esams - https://phabricator.wikimedia.org/T230004 (elukey) p:Triage→Normal
[10:16:48] netops, Operations: BGP session down for AS4739 on cr4-ulsfo - https://phabricator.wikimedia.org/T230005 (elukey) p:Triage→Normal
[11:45:41] volans: hello! I am currently following https://wikitech.wikimedia.org/wiki/Ops_Onboarding#access_to_pwstore and it says "reach out to volans..."
[11:47:46] my PGP key is https://gpg.mozilla.org/pks/lookup?op=get&search=0xB01C8B006DA77FAA and it's already signed by more than two people. is there anything else that needs to be done? (other than setting up pwstore)
[11:49:12] like do I need to add the key somewhere else? it's also on keys.opengpg.org
[11:49:22] thanks!
[11:52:59] sukhe: I'll add you later to pwstore (out for lunch in a bit)
[11:53:10] there's also a pwstore deb at https://people.wikimedia.org/~jmm/
[11:54:15] thanks moritzm!
[11:59:33] I am guessing I need someone to add me to the wmf and ops LDAP groups as well: https://wikitech.wikimedia.org/wiki/Ops_Onboarding#add_to_wmf_and_ops_LDAP_groups
[12:01:05] I tried but I got "debug3: remaining preferred: password" so my pub key was not accepted
[12:01:11] thanks! (stepping away for a bit for breakfast)
[12:16:40] sukhe: wow, didn't know it mentioned me specifically :) any 2 signatures from SRE are good
[12:17:05] and I actually cannot add people there, amending docs
[12:21:38] lol jijiki: https://wikitech.wikimedia.org/w/index.php?title=Ops_Onboarding&type=revision&diff=1801299&oldid=1801297
[12:22:38] amended :)
[12:22:51] volans: it took you only a year to figure this out
[12:22:59] nobody ever pinged me for that :D
[12:23:11] at least not specifically or explicitly mentioning it ;)
[12:24:03] this means that either people don't read the docs as well as sukhe did
[12:24:05] or
[12:24:24] they skipped this part :p
[12:24:46] moritzm: can you confirm it requires 2 signatures from SRE? just to make sure I put the right doc there :)
[12:33:09] I am sure there are 2
[12:36:55] sukhe: so yeah you still need it signed by 2 SREs. As I'm leaving very soon for the rest of the week I volunteer jijiki to sign it instead of me ;)
[12:37:37] I was troll here, how did I end up the trollee ?
[12:37:54] I was the troll*
[12:38:19] that's the risk of the game :D
[12:39:28] sukhe: I will sign your key
[12:41:47] I don't think we've written down two sigs as a requirement, that more or less came up as established practice
[12:42:12] but with Hangouts it's also quick to do, so we can just as well stick with it
[12:42:34] volans: jijiki: I see the page has been updated :P
[12:43:28] or just log to the same host and write(1) the fingerprint, that's also fine
[12:43:38] yep I noticed the troll and fixed it :D
[12:44:53] moritzm: I am sure there is a trust path from my key to yours through Peter
[12:45:28] hmm I just realized that my new key is not signed by Peter, the old Debian key was
[12:46:16] if you are just looking to verify the fingerprint, https://2019.www.torproject.org/docs/signing-keys.html.en
[12:47:07] anyway, I didn't mean to interject so let me know what you prefer and we can do that
[12:48:10] which key do you want to use for pwstore, did you create a separate wikimedia.org one or add an identity to your existing key?
[12:49:00] moritzm: I added an identity to the existing key. should I create a separate one?
[12:49:46] no, that's perfectly fine, both work
[12:50:31] I was just wondering as the 0xB01C8B006DA77FAA I just fetched from SKS doesn't have a wikimedia.org identity, or are you planning to use a different one?
[12:50:47] moritzm: yeah that's weird, looks like the keyserver didn't sync
[12:52:17] moritzm: https://keys.openpgp.org/vks/v1/by-fingerprint/E4ACD3975427A5BA8450A1BEB01C8B006DA77FAA
[12:54:00] ok I need to go afk for a bit, sukhe ping me later if you need an extra person to sign your key
[12:54:16] jijiki: thanks, I will
[12:58:32] I wonder if we should update the docs and replace pgp.mit.edu with a functioning and perhaps non-SKS keyserver
[12:59:41] ack, I'll fetch from openpgp.net
[13:00:13] ideally at some point we have DNSSEC and simply add the keys to our wikimedia.org authdns :-)
[13:04:48] Wikimedia-Apache-configuration, User-Dereckson, patch-welcome: svn.wikimedia.org/doc/ should redirect to doc.wikimedia.org - https://phabricator.wikimedia.org/T109950 (Krinkle)
[13:04:52] Wikimedia-Apache-configuration, User-Dereckson, patch-welcome: svn.wikimedia.org/doc/ should redirect to doc.wikimedia.org - https://phabricator.wikimedia.org/T109950 (Krinkle) p:Triage→Low
[13:10:37] I've imported and verified sukhe's key, will add it to the pwstore users file
[13:11:29] sukhe: I'll wait with re-encrypting all the data files until Pham's key is also added (he's also being onboarded currently), probably later today or tomorrow
[13:15:43] thanks moritzm!
[13:57:35] Traffic, Operations, SRE-Access-Requests, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ssingh)
[14:11:55] ema: around by any chance? I have an LVS config change to deploy and I'm wondering if there is anything special to do besides puppet-merge: https://gerrit.wikimedia.org/r/c/operations/puppet/+/528491
[14:13:17] gehel: hi! I'm not 100% sure, but I think puppet-merge should be enough (considering that the change only touches icinga-related things)
[14:13:37] ema: glad to know that it's not obvious to you either :)
[14:14:04] gehel: maybe try pcc against the relevant LVSs/icinga.wm.o
[14:14:18] yeah, will do
[14:15:18] ema: since I already interrupted you, which LVS would that be for class: high-traffic2 ?
[14:15:58] I always ask cumin -- cumin 'A:lvs' 'grep cloudelastic /etc/pybal/pybal.conf'
[14:16:09] (because I have a terrible memory)
[14:16:25] good answer!
[14:21:33] ema: looks like a noop for the LVS servers, I'll merge and scream if things go south
[14:21:59] question: is there a recommended/preferred pastebin?
[14:22:51] (for IRC mostly)
[14:22:52] Phabricator supports pastes
[14:22:57] sukhe: phabricator has phaste that can be used (and also integrated with prod)
[14:23:16] but mostly interesting if you want to keep a paste for longer or if you want to lock down permissions
[14:23:17] ah nice! didn't know
[14:23:48] or use etherpad.wikimedia.org depending on what you want to paste
[14:24:39] thanks, noted
[14:26:30] gehel: ack!
[14:37:20] sukhe: on production machines btw there's the 'phaste' command installed everywhere, which will upload its stdin to a phabricator paste
[14:39:14] cdanis: interesting, thanks. I will check if I can restrict to WMF-NDA for some of the more private pastes
[14:57:15] gehel: it seems that icinga isn't happy about the change
[14:57:24] root@icinga1001:~# icinga -v /etc/icinga/icinga.cfg |grep Error
[14:57:24] Error: Could not find any host matching 'cloudelastic.wikimedia.org' (config file '/etc/nagios/nagios_service.cfg', starting on line 19277)
[14:57:27] Error: Could not expand hostgroups and/or hosts specified in service (config file '/etc/nagios/nagios_service.cfg', starting on line 19277) Error processing object config files!
[14:57:42] yep, trying to rollback, but ci taking forever
[14:58:24] I have some idea of what the issue is, but this is going to take some digging into it
[16:42:59] Traffic, Operations, SRE-Access-Requests, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (MoritzMuehlenhoff)
[17:18:53] Traffic, Operations, SRE-Access-Requests, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ssingh)
[17:23:43] quick question: https://gerrit.wikimedia.org/r/c/operations/puppet/+/528851 I merged this as my understanding was that you need a +2 (+1 from a reviewer and +1 from yourself)
[17:23:58] I had a +1 from ema but when I hover my mouse over it, it says "Looks good to me, but someone else must approve"
[17:24:13] should I have waited for another +1? or another "approval" (what does that mean in this case?)
[17:24:41] sukhe: "approve" in this context means +2
[17:24:59] ah! thanks, I was worried there for a second :D
[17:25:08] it's not a major patch or anything but I thought I broke protocol
[17:25:25] i don't think you did. the only rule i am aware of is maybe "at least one +1"
[17:25:35] and even that isn't strict
[17:25:52] but keep in mind if you merge stuff in gerrit you also need to go to the puppetmaster
[17:25:55] and run a command there
[17:26:14] do you have shell access working already?
[17:26:49] yes. I should probably read on how (or where) to do that
[17:34:13] sukhe: ssh to puppetmaster1001.eqiad.wmnet
[17:34:36] then type 'sudo puppet-merge'
[17:34:51] you will see a diff of your change and get a yes/no question if you accept it
[17:35:09] i just checked and it's just your change, not multiple ones
[17:35:27] so you can hit yes and then it will be actually synced to puppetmasters
[17:35:56] this is so that there is another "human has to deploy" sanity step after +2
[17:36:44] you will see it syncs the data to multiple masters. after that is done you should ssh to icinga1001.wikimedia.org and run 'puppet agent -tv' to watch puppet make the change to Icinga config
[17:37:54] finally you can do 'sudo icinga -v /etc/icinga/icinga.cfg' to check if icinga config syntax is fine, just in case. and that's it
[17:38:33] well, for this specific change the real test is you login on icinga web UI and try to run a command like "schedule downtime" or any other
[17:40:03] mutante: thank you! this is very helpful
[17:40:38] you are very welcome. we could do better with the docs i'm sure
[17:42:34] so always do the puppet-merge soon after gerrit, or icinga will start complaining about unmerged patches. if somebody else also merges in gerrit meanwhile then the puppet-merge script will start warning about "multiple" authors and make you literally type "multiple". but when that happens we usually start to ping each other on IRC to make sure
[17:42:35] mutante: it's probably also that I have not got to the docs yet (and in some cases haven't even figured out where they are). I wanted to get the checklist out of the way before I started reading about the workflows and stuff
[17:42:53] I have been bothering ema with some of the things but clearly there is a lot to learn
[17:43:30] yea, so i just tried to find a good link for you and also did not find it right away :p
[17:43:39] in general try wikitech.wikimedia.org for docs first
[17:43:52] in some cases it's office wiki, but mostly it's wikitech
[17:44:02] except if it's about team structures , then it's mediawiki.org
[17:44:03] heh
[17:44:44] and it's wiki, so if you find something to add or update, feel free
[17:44:58] I did start reading https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster but got lost :P
[17:45:35] ouch, yea, that is a bit overkill while still in onboarding, indeed
[17:45:49] that involves the whole wmcs, formerly known as "labs" setup
[17:46:01] so another quick question, the person who merges the patch on gerrit is also the person who should run puppet-merge? or is that supposed to be someone else since puppet-merge involves a quick review?
[17:46:23] it's always the same person except if somebody forgets. basically
[17:46:36] then they get pinged "hey, you got a patch on the master and i want to merge too"
[17:47:08] the review should happen before that
[17:47:46] thanks
[17:48:14] I should put wikitech.wm.org on my afternoon reading list
[17:48:30] also +1 to your change btw, but i did not check in LDAP the SN vs the CN field
[17:48:36] and half of the time we got the wrong one
[17:48:52] but worst case it does nothing and best case you can run commands and easiest you just test :)
[17:49:08] also this way you got to know puppet-merge
[17:51:49] did you get into SSH config yet to jump to a server behind the bastion hosts?
[17:51:51] yeah!
[17:52:05] you will need that to get to the puppetmaster
[17:52:06] mutante: yeah, I already had that as I have access to stat1007 for a while
[17:52:11] ok, cool
[17:58:58] technically you can also review +2 on gerrit but not hit "submit". that would mean "approved" without merging it. but in reality i almost never see people use that
[17:59:12] so having to run puppet-merge is just true if you submit
[17:59:48] some repos do auto-submit by a bot after a human does +2, but in puppet repo that isn't the case
[18:01:20] I see
[18:05:05] code review at my previous job involved Trac so this is far better :)
[18:06:56] heh, nice to hear that Gerrit isn't the worst option
[18:20:34] Domains, Traffic, Operations, Product-Design-Strategy: Add a repo reference to Design Strategy web address - https://phabricator.wikimedia.org/T230053 (Volker_E)
[18:20:56] Traffic, Operations, Readers-Web-Backlog: [Bug] iPadOS 13 shows the desktop version of Safari with a broken layout - https://phabricator.wikimedia.org/T229875 (Jdlrobson) a:Jdlrobson→ovasileva Not sure if you or Sam would be best placed to work out what to do with this. Talking to Operations...
[18:21:23] Domains, Traffic, Operations, Product-Design-Strategy: Add a repo reference to Design Strategy web address - https://phabricator.wikimedia.org/T230053 (Volker_E)
[18:25:15] Domains, Traffic, Operations, Product-Design-Strategy: Add a repo reference to Design Strategy web address - https://phabricator.wikimedia.org/T230053 (Dzahn) This is `modules/profile/manifests/microsites/design.pp` in the puppet repo. Happy to help adding a third repo there to git clone from....
[18:43:53] Domains, Traffic, Operations, Product-Design-Strategy: Add a repo reference to Design Strategy web address - https://phabricator.wikimedia.org/T230053 (Volker_E) > So the new one is called "strategy" and you want /strategy as the URL as well? That's correct. :)
[23:23:02] netops, Operations: cr4-ulsfo rebooted unexpectedly - https://phabricator.wikimedia.org/T221156 (ayounsi) Open→Resolved >According to engineering there is no much information that can be provided from the crash as the issue thread do not have any information and is blank. >This is was not reprodu...
[23:28:27] netops, Operations, observability: Add VCP stats monitoring - https://phabricator.wikimedia.org/T228824 (ayounsi) This is working! Why is that behind a configuration options and not enabled by default? I have no idea. Will let those two sit overnight and roll it to the whole fleet if all good.