[04:58:24] Krinkle: hey, I'm around
[07:46:40] PSA: PromCon EU 2021 recordings are available at https://www.youtube.com/playlist?list=PLj6h78yzYM2PZb0QuIkm6ZY-xTuNA5zRO
[08:05:55] godog: thanks! I guess I have to watch https://www.youtube.com/watch?v=bmcdqK2LATc
[08:11:20] XioNoX: heheh seems like a quick win (haven't watched myself)
[08:15:09] 5m, it's quick for sure :)
[08:22:34] well, the tip at 1:05 might actually speed up LibreNMS
[08:36:25] there: https://phabricator.wikimedia.org/T283060
[08:37:40] nice!
[09:14:00] volans: https://phabricator.wikimedia.org/T261861 can now be resolved, right?
[09:15:08] kormat: indeed! done
[09:15:15] 🍻
[10:39:06] marostegui: ack
[10:40:47] marostegui: so, to cut straight to it, I was thinking, as our first mitigation, we could perhaps do the purge by hand during the depooled state of each host. How feasible would that be, do you think? The query would be fairly simple
[10:41:04] E.g. (mostly?) unthrottled
[10:41:47] As part of the compaction cycle. Just skip the purge so that we have the margin going forward
[10:42:21] (Not skip entirely, but skip the normal script)
[10:42:27] Krinkle: I thought about that too yesterday. for pc1 we won't notice issues as the spare host has the keys; for pc2 and pc3 we'll see more misses, as the spare host will take over the depooled host so you can purge it
[10:43:52] Yeah, it means we'll be depooled slightly longer, since the purge query needs to complete before the optimise step
[10:44:04] I have no idea how long we would be talking about though
[10:44:16] we can see with pc1
[10:44:29] actually, we can try with pc1010 (which is the spare host for pc1)
[10:44:41] and based on that we can see how to proceed
[10:44:44] At this scale, how long would it take for a bold delete query with just an exptime clause?
[10:44:48] Krinkle: so you want to skip the optimize part at this point?
[10:45:51] Well, we'd do delete and then optimise. So, bringing the next step forward in our plan, and doing the purge by hand at the same time whenever we're in a depooled state
[10:46:13] Krinkle: yeah, that was my question, it wasn't clear if you wanted us to run the optimize too ;)
[10:47:03] Sure, yes. I think we need to do that either way. Might as well do that at the same time since it's so much work to do the cycling etc.
[10:47:28] yeah, so my suggestion would be to start with pc1010 so we can measure time spent, space saved and all that... without having to depool hosts for now
[10:47:38] Ok
[10:48:16] Let me do a sanity check in terms of overall count and the count of my naive would-be query.
[10:48:28] yeah, and I need to downtime the host and disable notifications
[10:53:05] PSA #2: SLO conf is happening https://www.sloconf.com/
[10:53:35] marostegui: ok, so pc1010 is not known to MW, but it contains all the latest/current data from pc1007
[10:53:57] and I assume it does not replicate any direct queries to anywhere?
[10:54:02] e.g. not part of the circle
[10:54:34] I think the circles are not on dbtree, so I'm checking
[10:55:49] yes
[10:55:52] it is fully spare
[10:55:59] I have downtimed it
[11:12:06] strange, 10.64.48.174 pc250 has 'datetime' as exptime field type
[11:13:05] the schema file says BINARY(14)
[11:13:22] and there is a third form, in the sqlbagostuff.php file when a new shard is first created lazily, which uses "BLOB"
[11:13:28] what a mess.
[11:13:51] welcome to parsercache!
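(Editorial aside: a minimal sketch of how the cutoff for the hand-run purge discussed above could be computed. This is illustrative only, not the actual MediaWiki/WMF tooling; it assumes the 14-digit timestamp format (YYYYMMDDHHMMSS) seen in the exptime values later in the log, and the 30-day nominal TTL implied by the "now+30-20" remark further down. The `purge_cutoff` helper is hypothetical.)

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: exptime is stored as a 14-digit MediaWiki-style timestamp
# (YYYYMMDDHHMMSS) set to "insert time + TTL" at write time.
TTL_DAYS = 30  # assumed nominal parsercache TTL ("now+30-20" later in the log)

def purge_cutoff(keep_days: int, now: Optional[datetime] = None) -> str:
    """Cutoff exptime (14-digit string) that keeps the most recent `keep_days` days."""
    now = now or datetime.utcnow()
    return (now + timedelta(days=TTL_DAYS - keep_days)).strftime("%Y%m%d%H%M%S")

# A row inserted more than keep_days ago has exptime < now + (TTL_DAYS - keep_days),
# so the hand-run purge query would take this shape:
print(f"DELETE FROM pc000 WHERE exptime < '{purge_cutoff(20)}';")
```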
[11:15:10] should I be looking for where the hack is that makes this work, or does mysql actually support our strange 14-digit timestamp format and turn it magically into a good datetime value upon update/insert?
[11:16:44] ORDER BY exptime ASC gives me the oldest row as being from Nov 2020
[11:25:55] Krinkle: those can be old keys that were never purged
[11:26:06] how does the same query work on pc1009, for instance?
[11:26:24] SELECT CAST('20110401235959' as DATETIME);
[11:26:25] > 2011-04-01 23:59:59
[11:26:30] ok, somehow this amazes me
[11:26:51] Krinkle: you can also try a min(exptime) from pcXXXX limit 1 and see what happens
[11:27:10] so despite having drifted and been forgotten, our current code for inserting still works.
[11:30:38] yeah, and the where selects work as well. I didn't think those would sort/cast correctly.
[11:30:48] e.g.:
[11:30:57] SELECT keyname,exptime FROM pc250 WHERE exptime < '20210517000000' ORDER BY exptime ASC LIMIT 2;
[11:30:57] l | 2020-11-18 12:02:57 |
[11:31:03] SELECT keyname,exptime FROM pc250 WHERE exptime > '20210517000000' ORDER BY exptime ASC LIMIT 2;
[11:31:03] | 2021-05-17 00:00:02 |
[11:31:29] ok, I'll ignore this fantastic mess
[11:31:29] is that pc1010?
[11:31:32] yeah
[11:31:47] keep in mind that pc1010 acts as spare for pretty much all the sections, so it might have additional mess
[11:36:48] SELECT COUNT(*) FROM pc250 WHERE exptime < '20210517000000'; | 26177 |
[11:36:48] SELECT COUNT(*) FROM pc250 WHERE exptime < '20210417000000'; | 6084 |
[11:36:48] SELECT COUNT(*) FROM pc250 WHERE exptime < '20210117000000'; | 3033 |
[11:36:50] pc1010
[11:37:20] Krinkle: yes, the additional mess I was talking about: when we rotate the host, there's stuff that doesn't replicate from pc1008, i.e. when the purging happens
[11:38:46] marostegui: it only replicates from one host at a time, right?
[11:39:04] yeah
[11:39:13] ah, you mean there's a gap where maybe it receives some more inserts but then never gets the purges for them
[11:39:18] exactly
[11:39:34] i.e. we set it to become pc2 master for a few days... so all those purges are missed
[11:39:37] does October/November seem strange? or is that when we last used the spare to cycle through all the hosts?
[11:39:38] once it becomes pc1 slave again
[11:40:01] Krinkle: I don't recall what happened those days, but it could be when it last served there, yes
[11:40:20] there's a more recent time when it served on pc3, when the master had to go for a kernel upgrade too
[11:41:22] OK, there's none of this stuff on pc3 / pc1009
[11:41:29] that one is more as expected
[11:41:32] yeah, expected
[11:41:47] it should be the same on pc1008 (pc2)
[11:41:56] OK, so what's next.
[11:42:24] Krinkle: I would say, let's clean pc1010
[11:42:34] then promote it to pc1 master, and then clean pc1007
[11:45:51] ack
[11:46:03] 250 shards... I guess this is worth writing a little script for.
[11:46:23] pc000 to pc255
[11:46:52] yeah
[11:47:18] marostegui: so... no limits?
[11:47:23] nope
[12:03:15] Krinkle: grafana-next.w.o old dashboards fixed
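(Editorial aside: a rough sketch of what the "little script" over pc000–pc255 could look like. Purely illustrative; the script actually used is not shown in this log. It assumes pymysql as the client library and hard-codes the cutoff from the manual run below. Per the "no limits" decision above, there is no batching or LIMIT: each shard is purged in a single statement.)

```python
import pymysql  # assumed client library; connection details are placeholders

CUTOFF = "20210528000000"  # 14-digit exptime cutoff, matching the manual run below

def purge_all_shards(host: str, user: str, password: str) -> None:
    """Run the unthrottled purge DELETE on every shard table, pc000 through pc255."""
    conn = pymysql.connect(host=host, user=user, password=password,
                           database="parsercache", autocommit=True)
    try:
        with conn.cursor() as cur:
            for n in range(256):  # "pc000 to pc255"
                table = f"pc{n:03d}"  # table names are generated here, not user input
                cur.execute(f"DELETE FROM {table} WHERE exptime < %s", (CUTOFF,))
                print(f"{table}: deleted {cur.rowcount} rows with exptime < {CUTOFF}")
    finally:
        conn.close()
```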
[12:24:09] marostegui: I'm ready to go. I'm running it on one table now to see how long it takes. Then waiting for ack before doing all tables on pc1010
[12:24:24] Krinkle: go ahead for all the tables
[12:24:36] I can optimize that table first if you like once you are done too
[12:25:30] https://phabricator.wikimedia.org/P16060
[12:25:49] wikiadmin@10.64.48.174(parsercache)> DELETE FROM pc000 WHERE exptime < 20210528000000;
[12:25:49] Query OK, 340929 rows affected (1 min 24.34 sec)
[12:26:08] ok, let me optimize it
[12:26:39] rows remaining: 927,168
[12:26:43] (not deleted)
[12:26:54] ok
[12:27:02] rebuilding pc000 now
[12:30:46] Krinkle: done, around 3 minutes; it went from 13GB to 7GB. Keep in mind that also includes defragmentation, so not all of it is because of the data deletion
[12:31:23] if we want to see how much space we get from the deletion, I can rebuild pc001, then you delete the data, and then I do it again
[12:31:29] marostegui: ok, running now
[12:31:39] pc000: ... found 59 where exptime less than 20210528000000
[12:31:45] right
[12:31:51] that's... odd?
[12:32:19] I guess given exptime is in the future, maybe some short-lived objects were created just now
[12:32:30] Krinkle: do you want me to optimize pc001 before you delete the data so we can see how much we get from the data deletion?
[12:32:33] that's the downside of our weird exptime logic
[12:32:43] to delete 20 days, we do now+30-20
[12:33:05] but that also means unusual objects with a short exptime, like hours or 1 day, get deleted immediately
[12:33:08] anyway, whatever
[12:33:17] marostegui: ok :)
[12:33:20] haven't begun yet
[12:33:55] ok, let me do that
[12:37:39] Krinkle: you can go ahead, the table is now 10G
[12:38:54] https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=28&orgId=1&var-server=pc1010&var-datasource=thanos&var-cluster=mysql&from=now-1h&to=now
[12:40:14] let me know when pc001 is done
[12:41:01] pc000: ... found 333 where exptime less than 20210528000000, deleted 333 rows.
[12:41:01] pc001: ... found 342682 where exptime less than 20210528000000, deleted 342681 rows.
[12:41:03] marostegui: done
[12:41:14] ok, let me optimize
[12:42:50] so back to 7GB, so we are getting 2GB of space per table
[12:43:05] so that's around 0.5T
[12:43:07] not too bad!
[12:45:44] marostegui: ok, so for the rest, shall I keep sending you bunches of tables that are done (in PM), or wait until they're all done, and then you loop over them in some way?
[12:45:57] nah, once they are all done we'll take care of it
[12:46:12] ok
[12:46:44] then we should depool the pc1 master, promote pc1010, do pc1007, and do the same thing with pc1008 and pc1009
[12:46:44] looks like it'll probably take... (250 * 2 min) / 60 = 8-10 hours?
[12:47:05] yeah, and the optimize might take around 5 hours too
[12:47:40] I was expecting the deletes to be a lot faster.
[12:48:06] this is almost within the same order of magnitude as it was in 2016 with the sleeps
[12:48:10] they might be faster on the other hosts, as they probably have a warmer cache
[12:57:46] Krinkle: kormat will be working with you on all of this process, so as a next step you can ping her once we are good to proceed with pc1010's optimizations!
[12:58:14] okido
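(Editorial aside: the rough arithmetic behind the estimates above, spelled out. The per-table inputs are the rounded figures quoted in the conversation, treated here as assumptions.)

```python
# Back-of-the-envelope check of the figures quoted above.
TABLES = 256                  # pc000..pc255; the chat rounds this to "250 shards"
MINUTES_PER_DELETE = 2        # pc000 took ~1 min 24 s, rounded up
GB_RECLAIMED_PER_TABLE = 2    # per the estimate above ("2GB of space per table")

print(f"delete pass : ~{TABLES * MINUTES_PER_DELETE / 60:.1f} hours")     # ~8.5 hours
print(f"space saved : ~{TABLES * GB_RECLAIMED_PER_TABLE / 1024:.2f} TB")  # ~0.5 TB
```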
[14:22:29] on a decom task, what does "any service group puppet/hiera/dsh config removed" mean?
[14:22:44] i'm guessing the hiera reference means to remove hieradata/host/
[14:22:50] but the other stuff i'm less clear on
[14:23:09] it's on a case-by-case basis, it refers to any specific reference to a host by hostname, IP, etc...
[14:23:23] the decommissioning cookbook tries to highlight those by grepping various repositories for you
[14:23:36] and showing what matches
[14:23:42] what does "service group" refer to?
[14:23:48] (or is that "service group puppet"?)
[14:24:12] if the host is part of any group/cluster that is hardcoded somewhere
[14:24:17] AFAIK
[14:25:10] I agree it's not super clear from the quote
[14:25:50] ok. i'm going to remove the host from site.pp, then run the decom script, and then clean up any other references.
[14:25:59] maybe that'll work out
[14:26:09] the other way around
[14:26:31] volans: uff. see https://phabricator.wikimedia.org/T282096
[14:26:44] the order there is wrong, then?
[14:28:02] the spare::system step IMHO is not needed, as the decom cookbook does a physical/VM shutdown (+VM removal) and wipes out the bootloader
[14:28:53] also the downtime is done automatically, but it's good to have the step there because the host might have additional monitoring that is not defined under its Icinga host
[14:29:30] also, from how you wrote it earlier I got that you were removing it from site.pp, not changing the role ;)
[14:29:54] the instructions say "remove site.pp", and then the bit about system::spare says it's optional
[14:29:55] and yes, that step is quite confusing: "remove site.pp, replace with role(spare::system)"
[14:30:28] I'd just run the decom cookbook and then clean up the puppet repo, including site.pp, afterwards
[14:30:32] ok :)
[14:30:35] if you ask me :)
[14:30:45] in my case puppet won't start the relevant service anyway
[14:51:11] volans: i like how i get prompted for a homer config diff. "does this look good to you?" "I.. guess? how does it normally look?"
[14:51:49] 301 XioNoX for homer's UI messages :-P
[14:52:16] it's hard to predict, it depends on the hosts, we could document it somewhere but it would get out of date pretty soon
[14:52:19] suggestions?
[14:52:40] is it realistic to expect users to understand the diffs?
[14:53:21] if not, maybe it's not useful to ask them to say yea/nae
[14:54:44] the problem is that currently we can't push partial changes to the network devices (it's in the longer plan), so you might get additional unsafe changes that were just merged into the repo or because data on netbox was changed
[14:54:45] cdanis: huh. how do you delete a db instance from dbctl? is removing it from the puppet list of db machines sufficient? there's no `dbctl instance delete`
[14:55:39] kormat: I don't recall if it's done automatically when removing conftool-data/dbconfig-instance/instances.yaml
[14:55:46] as part of puppet-merge
[14:55:46] it is, yes
[14:56:56] maybe worth documenting it on wikitech/Dbctl
[14:57:35] huh, fancy
[15:02:09] kormat: yes, please don't push changes you don't understand :)
[15:02:48] <_joe_> kormat: there is no delete command in confctl because you also add the instances from the yaml files
[15:02:51] kormat: I think initially it was for DCops, then more people started to use it? Which is great, and I'm happy to explain it to people (or for people to ping me to double-check a diff)
[15:02:56] <_joe_> so you can only modify entries at runtime
[15:03:17] there are ways to improve it, e.g. the partial diffs that volans mentioned
[15:04:48] XioNoX: hah, ok. i assumed this was ok: https://phabricator.wikimedia.org/P16068 - i looked db1085 up on netbox, and the interface name seemed to at least match
[15:05:29] kormat: cool, that's the right way of doing it :)
[15:05:42] if you were decomming db1085, of course
[15:05:50] let's say i was. ;)
[15:06:02] "surprise decom"
[15:06:30] there's always a chance. every time i type a hostname into something like this, i get cold sweats
[15:08:04] one thing that once caught me by surprise was unapplied changes that were unrelated
[15:08:25] so in that case I just checked with the owner of the other thing to confirm it was expected/wanted
[15:32:16] * jbond42 wonders why his 6-year-old desktop with 8GB of RAM is running slow. checks and sees 3 KVM VMs, minikube and 3 additional docker containers running. yep, that will do it (https://phabricator.wikimedia.org/P16070)
[15:33:05] :)
[15:35:38] I have a very naive question, if I may. hnowlan had run me through the process we use for making envoy images, e.g. build our own envoy package and then build a docker image with it. So the question: why are we not using pre-built packages, like https://www.getenvoy.io/install/envoy/debian/ which is referenced as the official way of installing envoy on debian? Is it a trust issue?
[15:38:45] Pchelolo: https://wikitech.wikimedia.org/wiki/Envoy#Building_envoy_for_WMF says getenvoy didn't exist when we got started, but also the debs they provide are incomplete -- j.oe can give you more details when he's back in the office
[15:39:36] thank you rzl, I'll follow up with j.oe
[15:49:33] <_joe_> Pchelolo: I had the bad idea to look at how that sausage is made
[15:49:39] <_joe_> the getenvoy packages from tetrate
[15:49:55] <_joe_> and they don't include any of the stuff we expect in a proper debian package to run on real hardware
[15:50:26] <_joe_> until we're using envoy outside of containers, we really don't want to use those packages
[15:50:35] oh, ok...
[15:50:37] <_joe_> sorry, s/until/for as long as/
[15:50:45] <_joe_> we did use tetrate's packages on jessie
[15:50:52] <_joe_> and had tons of puppet to add stuff
[15:51:01] <_joe_> that would more naturally fit in a debian package
[15:51:21] <_joe_> but yes, I should just reach out to tetrate and try to get them to make their debian packages better
[15:51:32] gotcha
[15:51:38] <_joe_> we also have the option to stop actually building envoy ourselves and just dump the binary in the deb
[15:52:56] I was just mostly wondering for general education purposes
[15:55:21] _joe_: maybe you can answer - do i need to manually run a dbctl config commit after the puppet change removing the instance got merged?
[15:55:31] because 2 cumin hosts are alerting
[15:55:59] https://phabricator.wikimedia.org/P16072
[15:56:13] the dbctl wikitech page doesn't seem to have anything covering adding or removing hosts
[15:58:52] I think so
[15:59:22] sorry, meeting
[16:00:03] <_joe_> kormat: yes
[16:22:06] let me take a look
[16:22:24] ah!
[16:22:26] yes
[16:22:28] you do
[16:22:36] because of the hostsByName section in the generated config, yes
[16:22:38] okay
[16:25:23] kormat: sorry, added instructions on wikitech
[16:26:10] cdanis: awesome, thank you! <3
[16:26:22] https://wikitech.wikimedia.org/wiki/Dbctl#Removing_/_decommissioning_a_host
[16:51:25] <_joe_> kormat: as a mental model, think that whatever change you make to instances or sections will only be compiled into the mediawiki configuration once you commit it
[16:51:40] <_joe_> so anything, including removing or adding hosts, needs a commit
[16:59:00] yeah, that wasn't always true for removing a host that wasn't referenced anywhere, but ever since we moved the hostsByName section in, it is
[17:23:04] jobo: this is the talk I mentioned about handling cct maintenances: https://ripe82.ripe.net/archives/video/516
[17:23:27] clearly early stages and unsure how applicable any of it might be, but interesting nonetheless.
[17:25:53] topranks: there is this too https://phabricator.wikimedia.org/T230835
[17:26:12] but not actively maintained
[17:26:58] ok cool, yeah it looks to tackle the same problem.
[17:27:18] <_joe_> Equinix open-sourced its provisioning tool https://tinkerbell.org/, now in the CNCF sandbox
[17:27:34] <_joe_> the concept is interesting, but... 5 microservices
[17:27:36] <_joe_> :P
[17:28:04] as long as they are transparently manged by our k8s... :-p
[17:28:07] *managed
[17:28:35] <_joe_> that's one of the things I feel uneasy about, having a tool needed to provision kubernetes running in kubernetes
[17:29:04] <_joe_> I can see arguments going both ways
[17:29:19] <_joe_> but yes, at least this is stuff that's designed to run in k8s AFAICT
[17:30:46] I used packet.net bare-metal-aaS, now Equinix Metal, a little bit at my last place. We had the odd hiccup, but mostly it worked very well.
[17:30:48] one of the microservices' READMEs: "This repository is Experimental ... and we strongly encourage you to NOT use this in production"
[17:33:28] <_joe_> topranks: oh so AIUI this is more or less equinix metal
[17:33:59] <_joe_> without I guess some of the better features :P
[17:34:57] Yes, I think so, it's the system they use to provision hosts: PXE boot, then install the OS image, etc.
[21:40:36] For a product ostensibly about "search", Kibana/Logstash are embarrassingly bad at doing basic search throughout their own interfaces. Random example: https://user.fm/files/v2-2f9d50fa52d98f521c95021df13ea4d6/capture-dashboard-search.png
[21:45:14] 😆