[07:53:57] for Gerrit 286858 I've removed the lock_wait_timeout as agreed the other day. everything else unchanged. I'm running the compiler now
[07:56:29] note that I would want to change it- but I am seeing breakage reports from when it was lower
[07:57:41] yes, ofc
[08:14:53] FYI, avoid the auto-sizing of thread_pool_size before 10.1.9 ;) https://jira.mariadb.org/browse/MDEV-7806
[08:16:14] I am worried about auto-sizing for max_connections
[08:16:34] let's check memory impact on some variables that depend on it
[08:16:45] after you merge
[08:16:52] such as p_s variables
[08:17:13] I hardcoded them, but maybe I left some out
[08:18:06] performance_schema_digests_size is auto-sized
[08:18:37] maybe a few others
[08:18:52] ok, I'll take a look
[10:01:08] I am checking several options for table archival
[10:01:43] for dbstores?
[10:01:49] for es
[10:02:13] .ibds (transportable tablespaces) are fast to recover, but allow little flexibility and could potentially be affected by file corruption
[10:02:46] .txt dumps are easy to handle and very flexible on reimport, but they could take some time to recover if the reimports are large
[10:04:22] I suppose the answer is to back up both?
[10:04:47] if we have the space :)
[10:04:55] and then there is the compression
[10:05:11] es is already compressed row by row in the original data
[10:05:58] but I am getting a 15% smaller size using pigz
[10:06:09] and 31% using plzip
[10:06:35] those gains are too small for regular backups, but considerable for such huge files and long-term storage
[10:07:10] if you have any thoughts on disaster recovery, please share them
[10:07:59] for compression you could also try LZMA, I think pxz is the parallel version
[10:08:20] plzip is lzma
[10:09:03] I can use the xz container if you prefer it, but I did not have good experiences with it in the past
[10:09:29] no, for single files it's not worth it I guess, simpler is safer here
[10:10:05] let me check real quick with the non-parallel version
[10:13:09] I get similar results, probably xz is more standard and recognizable
[10:15:43] could be, I don't know which of the two is more used/maintained
[12:46:03] a glitch on db1064 and db1068
[12:47:55] at 11:49
[12:48:05] yes, I saw it, but it had already recovered, and I don't see errors on the other s4 slaves with lower weight
[12:48:33] they are both in the same rack, could it have been network?
[12:49:06] I like to think software first
[12:49:25] but yeah, I see no correlation otherwise
[12:50:50] the spike in aborted connections on tendril is ONLY on db1064
[12:52:57] ok, slightly different errors
[12:53:13] (if you missed my last message: the spike in aborted connections on tendril is ONLY on db1064)
[12:53:35] on db1068 almost all are "Lost connection to MySQL server during query"
[12:53:36] ForeignDBFile::loadExtraFromDB
[12:54:15] while on db1064 they are mostly "Can't connect to MySQL server"
[12:54:27] jynus__ ^^
[12:58:24] yes, saw that, but (while not sure) I just think it is not relevant- that is probably a very common commons API call
[12:58:51] as in- connections could not be created vs. the existing ones failed
[12:59:24] yes, I can see the very common query to get img_metadata slowing to up to 17 seconds on both
[12:59:57] with an avg of 9, and it is by primary key
[13:00:07] is that now?
[13:00:15] or during the issues?
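
A minimal sketch of the kind of compressor comparison described above (pigz vs. plzip vs. xz on a single large archive file); the sample path, the helper function and the 8-thread setting are illustrative assumptions, not the actual procedure used:

#!/bin/bash
# Rough comparison of parallel compressors on one archive file.
# SRC is a hypothetical sample dump; the 8-thread setting is arbitrary.
SRC=/srv/backups/es_sample.txt
ORIG=$(stat -c%s "$SRC")

compress_and_report() {
    local name=$1 out=$2
    shift 2
    # time the compressor reading SRC from stdin and writing to the output file
    /usr/bin/time -f "$name: %e s" "$@" < "$SRC" > "$out"
    local size
    size=$(stat -c%s "$out")
    # percentage saved relative to the (already row-compressed) original
    echo "$name: $size bytes ($(( 100 - size * 100 / ORIG ))% smaller)"
}

compress_and_report pigz  "$SRC.gz" pigz  -c -p 8
compress_and_report plzip "$SRC.lz" plzip -c -n 8
compress_and_report xz    "$SRC.xz" xz    -c -T 8

plzip and xz both use LZMA, which is consistent with the observation above that the single-file ratios come out similar; the choice then mostly comes down to which container is more standard for long-term storage.
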
[13:00:20] during the issue
[13:00:58] then I would wait for now, consider it a one-time thing, and be alert if it happens again
[13:01:16] yep, agree
[13:01:46] it didn't happen before, and today there are no code deployments
[13:02:24] I will check my "while trues", however
[13:03:20] it doesn't have one, I will set one up
[13:03:36] (and be one step closer to feeling guilty for not puppetizing it)
[13:04:15] lol
[16:22:00] * volans back, a bit later than expected...
[16:35:11] slightly high traffic, some spikes here and there, but otherwise no surprises...
[16:35:13] yet
[16:38:23] ok, btw I just found that db2011 has read_only=0 (m2)
[17:07:44] if read_only is because of pt-diff, +1
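
A minimal sketch of the kind of "while true" read_only watch mentioned above, assuming passwordless login via ~/.my.cnf; the host list is illustrative (db2011 is the host named in the log, the others are just examples) and this is not the actual check that was set up:

#!/bin/bash
# Poll a list of hosts once a minute and warn if any replica reports
# read_only=0. Host list and ~/.my.cnf login are assumptions for the sketch.
HOSTS="db1064 db1068 db2011"

while true; do
    for h in $HOSTS; do
        ro=$(mysql -h "$h" -BN -e "SELECT @@GLOBAL.read_only" 2>/dev/null)
        if [ "$ro" != "1" ]; then
            echo "$(date -Is) WARNING: $h reports read_only=${ro:-unreachable}"
        fi
    done
    sleep 60
done
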