[06:29:11] think i did a similar thing just now on db1070, but something to do with dropping a table and thrashing the table cache, rather than metadata locking
[06:30:13] in other news, the s5 rolling schema change gerrit 228756 is done. doing some checksumming
[06:41:45] springle: legoktm was wondering in another channel how we should store floats in mysql..
[06:41:54] he's doing a varchar now which I guess isn't the best idea :D
[06:49:05] YuviPanda: depends how much you care about precision vs accuracy. choose between DOUBLE or DECIMAL really
[07:08:09] springle: ah, cool. I don't think we care too much about either past two digits of accuracy, but it's good to not use varchar or something stupid
[07:12:18] yep
[07:13:23] well varchar == more space, more space == bad
[07:13:42] but for some strange reason, timestamps are stored as characters on wikimedia
[07:13:50] as a rule
[07:14:08] yeah, i once tried to have the timestamps argument
[07:14:18] and why?
[07:14:45] only thing I can think of is decimals, which only arrived at 5.6/10
[07:15:05] but then you have decimals
[07:15:31] my first "decimal" meant decimal digits; the second, the type
[07:16:31] another thing I do not understand is the rule of having table names in singular, something I have never seen as a standard anywhere
[07:17:22] but hey, I have greater wars to fight
[07:21:11] springle, important proposed schema changes, take a look: https://phabricator.wikimedia.org/T107610#1506876
[07:23:03] iirc, binary(14) was a pre-mysql-4.1 choice, back when TIMESTAMP defaults were harder to control. And at some point, the string 'infinity' also crept in, along with milli/micro seconds in some extensions
[07:23:21] infinity, lol
[07:23:26] i suppose we might be in a better situation now, and could have the DATETIME discussion again
[07:23:29] heh
[07:23:32] yeah
[07:23:40] NaN probably too?
[07:23:51] i don't recall. probably
[07:24:05] it was messy and painful enough to try to change en masse
[07:24:13] it is ok
[07:24:22] binary for texts is not a huge issue
[07:24:55] but for new, independent features maybe it can be pushed
[07:26:51] yeah, i agree with you on T107610#1506876. new hardware is easy. compression and flow changes can wait
[07:27:35] oh, I agree with that, but that is not what is written there
[07:27:56] that is what they want to do after hw provisioning
[07:28:30] they wanted a new shard for flow; convinced them they only needed a separate table
[07:28:36] i know. i agree with you. the last bit was more opinion, intended to show that i give flow changes low priority :)
[07:28:47] ok perfect
[07:28:58] sorry, will be more specific next time
[07:29:44] wanted to check because technically I "took a decision" without asking you first
[07:30:17] but in this case it was obvious to me; it took me a while to understand what they wanted
[07:30:37] (a config change, not actually new servers)
[07:31:33] no problem. you should indeed feel free to make the obvious decisions!
[07:31:46] I told mark in the last meeting that I would like to work more hand in hand with the developers, before they implement crazy things
[07:31:57] but sadly, many times it is impossible
[07:32:46] for me, it was amusing that it took them several iterations and a formal meeting to accept my suggestion of normalizing a table
[07:33:14] didn't you and i argue a bit about working with devs? :) when we spoke about me getting cc'd on all SQL gerrit commits?
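For the floats question above, a minimal sketch of the two options (hypothetical table and column names, not anything in the actual schema):

    CREATE TABLE metric_sample (
      ms_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      ms_ratio DOUBLE NOT NULL,         -- approximate binary float; fine when only ~2 digits of accuracy matter
      ms_price DECIMAL(10,2) NOT NULL   -- exact fixed-point; pick this when values must round-trip exactly
    );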
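And a sketch of the timestamp convention being debated (illustrative tables only; the 'infinity' special value mentioned above is one reason a straight conversion is painful, since DATETIME has no equivalent for it):

    -- the wikimedia convention: a 14-character string such as '20150805071345'
    CREATE TABLE ts_as_binary   (log_timestamp BINARY(14) NOT NULL);
    -- the native alternative; 5 bytes in 5.6/10, plus optional fractional seconds
    CREATE TABLE ts_as_datetime (log_timestamp DATETIME NOT NULL);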
glad to see you're doing it ;)
[07:33:27] it won't be much fun sometimes
[07:33:47] but well worth it
[07:33:59] well, I got to reduce the revision table size
[07:34:04] that is a big win
[07:34:17] although a pain in the ass to roll out
[07:35:00] did they decide to do anything about the content model fields in the page table too?
[07:35:16] yeah, it is part of the same package
[07:35:19] nice
[07:35:26] I fought to normalize that
[07:35:34] id + small table
[07:35:48] for content ids
[07:36:03] to be fair lego* on this channel did all the work
[07:36:15] watch out for Special page queries that soon will try to join/filter/order on that table
[07:36:30] interesting
[07:36:39] i guarantee someone will try it, in some horrible way, probably with LIMIT
[07:36:59] like the bot we recently had
[07:37:06] LIMIT 5000000000000, 51
[07:37:20] ori finally disabled the page completely
[07:37:50] that won't fix it for good. we've disabled similar pages and API queries over time, with the same silly limits
[07:38:03] funny thing is that on other special pages, paging is done ok, based on id
[07:38:20] it's coming from some code path and appearing in edge cases
[07:38:31] and usually only bots find it
[07:39:41] so how are things going for you, more in general?
[07:40:45] well enough. still trying to organize leave through safeguard
[07:40:52] so i can't give you actual dates yet
[07:41:04] np
[07:41:23] not loving the three layers of bureaucracy
[07:41:59] I got approved for Percona, independently of your state, but not 100% sure if I can attend
[07:42:21] how're you going? noticed your sleep pattern is getting more like everyone else around here ;)
[07:42:25] oh ok
[07:42:33] Hehe
[07:42:47] Sleep 'pattern' :)
[07:42:55] not 100% sure? because of permission? or availability?
[07:43:07] well, a bit more stressed, probably due to external factors
[07:43:16] let me leave it a bit vague
[07:43:25] yeah, i suppose 'pattern' is a strong word implying some uniformity over time
[07:43:35] np
[07:44:31] I also want to help with fw, and rob with purchasing, and maps, checksums etc. but there are lots of interruptions from servers and other people
[07:44:51] difficult to focus on long tasks
[07:45:29] that's one of the benefits i have: the timezone enforcing asynchronicity
[07:45:35] you don't get that benefit :(
[07:45:51] i did wonder if it would unfairly push load onto you
[07:45:56] yep, most people understand that
[07:46:03] not your issue
[07:46:27] unfortunately, there isn't much we can do, except make sure you feel ok saying "wait" or even "no"
[07:46:31] but some say "I need help from DBAs, oh Jaime is connected, will poke him"
[07:47:14] oh, I've made a rule
[07:48:05] anyone who comes to me in a bad tone "because you DBAs/OPs take too much time" can wait
[07:48:45] :)
[07:50:57] then the only thing we can probably do is to assign small subtasks within the longer ones
[07:51:17] so they get some attention
[07:52:13] jynus: so, possible incendiary left-field question ;) what do you think about TokuDB for simple traffic like External Storage with mostly primary key access and without multi-source replication, instead of recompressing at the application row level?
[07:52:46] I think application-level compression is always better
[07:52:54] really? why?
[07:53:05] it load-balances the workload
[07:53:18] CPU load and page duplication?
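A rough sketch of the "id + small table" normalization mentioned above for content ids (names are illustrative, not the actual MediaWiki tables):

    -- small lookup table holding each distinct content model once
    CREATE TABLE content_model (
      cm_id   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      cm_name VARBINARY(64) NOT NULL,
      UNIQUE KEY cm_name (cm_name)
    );
    -- the wide table then carries only the small integer id instead of repeating
    -- the model name on every row, e.g. rev_content_model SMALLINT UNSIGNED NULL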
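For the paging point above, a sketch of the difference between the bot's offset-based query and the id-based paging done on the other special pages (columns are illustrative; the real Special page queries are more involved):

    -- offset paging: the server still scans and discards every skipped row
    SELECT page_id, page_title FROM page ORDER BY page_id LIMIT 5000000000000, 51;
    -- id-based (keyset) paging: resume from the last id seen, so only ~51 rows are read
    -- (12345 is just a placeholder for the last page_id returned by the previous batch)
    SELECT page_id, page_title FROM page WHERE page_id > 12345 ORDER BY page_id LIMIT 51;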
[07:53:23] fair point
[07:53:33] no, actual work by developers
[07:53:49] but yes, also application servers
[07:54:21] it also results in potential 6-month tasks with sustained load, like the recompression
[07:54:22] and I think pk access with innodb is the greatest thing ever
[07:54:46] true
[07:54:57] so you also would not like innodb page compression?
[07:54:58] it is not "append only"
[07:55:13] they are also critical servers
[07:55:34] I do not trust toku that much yet
[07:55:50] well, yes, i'm only speaking of a possibility
[07:55:52] and throwing hardware at it has to be done already in any case
[07:55:53] would need a trial
[07:56:01] well
[07:56:08] if we are talking theoretically
[07:56:14] yeah, good point re hardware
[07:56:36] toku + SSDs or innodb compression + SSDs are ok
[07:56:38] yet new hardware also gives us time to try other options
[07:56:47] as in sweet options
[07:57:09] nod
[07:57:45] it is the "long term storage" that makes me uneasy
[07:57:58] cool. interesting points. i'm not sold on Toku in critical servers yet either, but percona buying them makes it *possibly* have a good future
[07:58:09] a single storage engine makes binary copies
[07:58:14] more difficult
[07:58:28] so with horizontal deployments
[07:58:31] I'm ok
[07:58:49] right now it is a pain to repopulate dbstores from non-dbstores
[07:59:06] or the other way round
[07:59:30] yes. possibly we should just increase dbstore disks
[07:59:35] so, theoretically, I would be very much for compression
[07:59:37] and stick with inno
[07:59:50] to reduce io
[08:00:13] or inno + compression, which is fairly stable in 10.0 (whatever xtradb that is)
[08:00:17] but both innodb compression and toku have some issues
[08:00:26] maybe only on the versions we use
[08:00:43] we have innodb compression issues?
[08:00:51] I haven't followed the latest developments
[08:00:53] where did this happen
[08:01:05] there were some issues on early 5.6
[08:01:39] yes, correct. and we had some issues with innodb compression in early 10.0 trials. but i'm fairly sure that after about 10.0.11 it's been uneventful
[08:01:57] as in, the equivalent of memory leaks, but with compression failures that reduced performance
[08:02:11] I think fb made patches
[08:02:25] do not know if they went upstream or to percona/mariadb
[08:02:42] ssds, out of budget?
[08:03:10] for dbstore? or ES
[08:03:15] for core
[08:03:43] probably in budget by now, for core
[08:04:02] they were in budget for cassandra hosts, which have a similar data set magnitude
[08:04:16] if that is the price we have to pay, I am ok with trying it
[08:04:54] we'd have to ask RobH for a quote with current Save specs and at least 2T SSD
[08:05:04] s/Save/Slave/
[08:05:08] foo, bbl
[08:05:11] food, even
[08:05:26] * springle must need to eat; typing skills are degrading...
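A sketch of what the "inno + compression" option above looks like at the table level (illustrative table; in the 5.6/10.0 era ROW_FORMAT=COMPRESSED also needs innodb_file_per_table and the Barracuda file format):

    -- engine-level compression, as opposed to compressing the blobs at the application layer
    CREATE TABLE blobs_compressed (
      blob_id   BIGINT UNSIGNED NOT NULL PRIMARY KEY,
      blob_text MEDIUMBLOB NOT NULL
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;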
[10:21:28] I will give you that the lazy ALTER TABLE for adding columns in TokuDB is cool
[13:03:14] with 10 we have lost the ALTER TABLE progress report for online changes
[13:03:37] one-liner to substitute for it:
[13:03:47] echo `du -m '#sql'*'.ibd' 2>/dev/null|cut -f 1` MB written out of `du -m image.ibd |cut -f 1` MB
[13:27:15] The wikidatawiki.wb_items_per_site schema change is needed for testwikidata as well
[13:27:30] oh
[13:27:50] the table has fewer than 3k rows there, so it shouldn't be painful
[13:27:55] let me check; if it has a trivial size, we may be able to do it quickly
[13:28:34] in our defense, I think it was not explicit in the commit
[13:28:50] true :/
[13:31:24] "testwikidatawiki", easy name
[13:35:39] testwikidatawiki, wb_items_per_site, CHANGE ips_site_page ips_site_page varbinary(310) NOT NULL
[13:35:56] hoo ^ can you double-check the change
[13:37:45] yeah, that looks right
[13:38:44] ok done
[13:39:03] the main issue is not actually the size, but the traffic
[13:39:42] There aren't many writes to that table, I guess
[13:39:57] actually, selects are the blocker
[13:40:58] it's complicated (metadata locking)
[13:41:15] I thought that would be "just" table lock
[13:41:19] * a
[13:42:29] no: reads can happen, and in 10 also writes, but the ongoing ones have to finish in order to reload the table definition
[13:43:23] it can be summarized as: under high traffic, many things that usually are not a problem can go wrong
[13:44:35] I will let you check it, and I will comment on the issue
[14:40:52] duplicate key on db1056, depooling
[15:33:03] I suspect out-of-band changes: if it was a master replication issue, all slaves would fail (with a specific auto_increment) or all would have worked (without an auto_increment)
[15:34:09] that leads to 2 possibilities: pt-online-schema-change breaks replication, or we have 3rd parties writing to slaves
[15:51:34] oh, I am having an orgasm right now: I think I solved the mystery problem!
[15:52:35] statement binlog + isolation level + insert select = duplicate errors
[15:52:42] and I have the proof
[15:53:29] which is nice because it rules out me, you and the pt tools as the source of the problems
[15:53:43] also the mysql version
[15:54:01] but it makes the problem generalized site-wide
[15:56:39] "proof that is too large to fit in the margin", but we will talk tomorrow
[15:56:51] row-based replication FTW
[16:32:28] I'm leaving pt-osc running on db1019 from iron; s4 is not yet complete
[19:53:16] *db1042, not db1019. This is all for T90300
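Written out as a full statement, the testwikidatawiki change quoted above would presumably be something like this (assuming it was applied as a plain ALTER, which the trivial table size allows):

    ALTER TABLE wb_items_per_site
      CHANGE ips_site_page ips_site_page VARBINARY(310) NOT NULL;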
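The actual proof behind "statement binlog + isolation level + insert select = duplicate errors" wasn't pasted into the channel, but a minimal sketch of the general hazard being described (hypothetical tables; INSERT ... SELECT into an auto-increment table is a classic unsafe statement for statement-based replication):

    CREATE TABLE src (s_id INT PRIMARY KEY, s_val INT);
    CREATE TABLE dst (d_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, s_val INT);
    -- under binlog_format=STATEMENT the slave re-executes this literally; with no ORDER BY,
    -- and depending on how the isolation level locks the source rows, the slave may see a
    -- different row set/order, assign different auto_increment ids, and a later statement
    -- that succeeded on the master can then fail on the slave with a duplicate key error
    INSERT INTO dst (s_val) SELECT s_val FROM src WHERE s_val > 0;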