[07:31:49] 10DBA: Investigate (and if possible drop _counters) - https://phabricator.wikimedia.org/T145487#2639278 (10Marostegui) I have renamed `_counters` to `TO_DROP__counters` in all the codfw hosts: ``` dbstore2001.codfw.wmnet dbstore2002.codfw.wmnet db2036.codfw.wmnet db2043.codfw.wmnet db2050.codfw.wmnet db2057.cod...
[07:33:57] 10DBA, 06Operations, 10ops-eqiad: db1082 hardware check - https://phabricator.wikimedia.org/T145607#2639280 (10Marostegui) Note: Replication was started and it went well. The host was powered off a bit after for a memtest
[07:57:51] 10DBA, 06Operations: Drop database table "email_capture" from Wikimedia wikis - https://phabricator.wikimedia.org/T57676#2639339 (10Marostegui) a:03Marostegui
[08:05:16] 07Blocked-on-schema-change, 06Community-Tech, 13Patch-For-Review, 07Schema-change, 05WMF-deploy-2016-09-13_(1.28.0-wmf.19): Add local_user_id and global_user_id fields to localuser table in centralauth database - https://phabricator.wikimedia.org/T141951#2639369 (10Marostegui) After getting it done in co...
[08:11:50] marostegui, let me loop you in on MCR
[08:12:16] but I will try to be as neutral as possible so you can give your honest opinion
[08:12:26] jynus: MCR?
[08:12:52] it is a new architecture redesign of mediawiki
[08:13:13] give it a read: https://www.mediawiki.org/wiki/Multi-Content_Revisions
[08:13:39] Ah, ok
[08:13:59] Going to give it a read
[08:17:09] do not read it all
[08:17:17] it would be a large waste of time
[08:17:28] or read the introduction
[08:17:39] and focus on https://www.mediawiki.org/wiki/Multi-Content_Revisions/Content_Meta-Data
[08:19:52] ah, ok
[08:20:01] much better XD
[08:28:35] jynus: I am probably missing lots of context, but from the initial read, it scares me how we can make this scalable
[08:29:02] I am already biased from my conversation with tim
[08:29:15] so I will just tell you my thoughts straight
[08:29:34] we already have issues with large/tall tables such as revision and logging
[08:29:39] jynus: There is also something that scares me, that "text" table
[08:29:49] that is a non-issue
[08:30:02] remember we do not store actual content on s* hosts
[08:30:14] only pointers to external storage
[08:30:21] which, btw, is sharded
[08:30:24] :-)
[08:30:45] phew :)
[08:31:01] the main issue is revision ends up being a "things" table
[08:31:31] and at least now, we control text on that table, and revisions for other things, like images, separately
[08:31:57] the content/content_revisions would eventually store every single object
[08:32:28] which is the pattern many failed pieces of software (like drupal) follow
[08:32:43] good example
[08:32:48] will it be less pretty if we have image_revisions, template_revisions?
[08:32:54] of course
[08:33:08] but at least it will be more scalable and maintainable
[08:33:26] imagine now that instead of dropping tables, like you are doing
[08:33:40] you have to delete rows of the most accessed table
[08:33:46] of the server
[08:33:56] no, that is totally mad
[08:34:16] it would also mean that that table can become too large to even operate
[08:34:18] I sometimes am not good with words
[08:34:27] and I wouldn't mind
[08:34:36] if there was a scalability plan
[08:34:41] but partitioning sucks
[08:34:54] and a logical partitioning is my proposal
[08:34:56] Yes, partitioning is the worst invention ever to scale things
[08:35:04] image_revisions
[08:35:11] i would rather have 1000 tables partitioned by modwhatever than 1000 mysql partitions
[08:35:12] category_revisions
[08:35:13] and different kinds of objects might need additional fields over time
[08:35:21] exactly!
[08:35:24] you nailed it
[08:35:36] image revisions may need a sha1 hash
[08:35:45] others don't
[08:35:54] I am very bad at words
[08:35:56] yeah, whatever we decide on is better than partitioning at the mysql level
[08:36:10] so if you help me express it when someone asks
[08:36:15] Of course! :)
[08:36:31] the main issue is not that "I will have to work more"
[08:37:00] it is that it will be hell to make it fast and to maintain
[08:37:19] the biggest issue is that "things" table
[08:37:22] it will be impossible if it grows like they predict it will on that page
[08:37:35] oh, I think those are optimistic
[08:37:39] I mentioned that to tim
[08:37:52] now, a single page edit does 1 revision
[08:38:04] later, by editing different contents
[08:38:13] you would generate 1 revision each
[08:38:57] I see
[08:39:15] so again, I think the idea is nice
[08:39:24] I think the implementation is flawed
[08:39:27] that table would be just madness; it will also be almost impossible (or basically lots of work) to shard if needed in a few years' time
[08:39:35] and let me give you a precedent
[08:39:53] wikidata started growing like crazy (which is good)
[08:39:55] the idea is nice, the implementation needs a bit of tweaking from the mysql side indeed, to make it scalable and fast
[08:40:11] everybody said "do not worry, it will slow down with time"
[08:40:38] haha
[08:40:50] 4 years later the speed of edits has done nothing but increase
[08:41:02] which is good, do not get me wrong
[08:41:29] wikidata is the best thing that the wiki-world has produced in a long time
[08:41:46] but it also has scalability problems
[08:42:12] we want to make sure a design is scalable and robust
[08:42:25] And operable
[08:42:39] This multi revision is clearly something that will grow a lot
[08:42:53] some people complain "why does it take so much time to do schema changes?"
[08:43:28] do you think they will be happy when we say that those changes will literally make changes slower?
[08:44:01] so, again, MCR is good, we want that
[08:44:23] Or if the table is big enough that at some point you cannot even operate it (we had a few tables like that back at my previous job)
[08:44:24] but let's implement it properly
[08:44:29] yes, totally agreed
[08:44:50] And I think our proposal of having multiple tables, hashed or whatever, isn't that much of a change
[08:45:13] but that can always be discussed
[08:48:02] the idea is that image_revision almost already exists
[08:48:35] same for flow_revisions, etc.
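To make the "logical partitioning" discussed above concrete, here is a minimal DDL sketch; every table and column name in it is hypothetical, for illustration only, and not the actual MediaWiki or MCR schema:

```sql
-- Hypothetical sketch of "logical partitioning": one narrow revision
-- table per content type instead of a single shared "things" table.
CREATE TABLE image_revisions (
  imgrev_id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  imgrev_image     INT UNSIGNED NOT NULL,   -- the image this revision belongs to
  imgrev_timestamp BINARY(14) NOT NULL,
  imgrev_sha1      VARBINARY(32) NOT NULL,  -- type-specific field: only image
                                            -- revisions need a sha1 hash
  KEY imgrev_image_time (imgrev_image, imgrev_timestamp)
) ENGINE=InnoDB;

CREATE TABLE category_revisions (
  catrev_id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  catrev_category  INT UNSIGNED NOT NULL,
  catrev_timestamp BINARY(14) NOT NULL,
  -- no sha1 here: each table carries only the fields its type needs
  KEY catrev_cat_time (catrev_category, catrev_timestamp)
) ENGINE=InnoDB;

-- Retiring a content type then becomes a metadata-only operation...
-- DROP TABLE category_revisions;
-- ...instead of deleting millions of rows from the hottest table on the server.
```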
[08:48:53] so it would be uglier, but even easier to make the changes necessary
[08:49:21] to integrate those in an all-around-MCR design
[08:50:29] Yes, I think it needs to be clear that from a mysql point of view what they're thinking about now is very inefficient, but the good thing is that there are ways to make it efficient and fast; it just requires some changes, and we can still achieve the same goal
[09:42:12] 10DBA, 06Operations: Drop database table "email_capture" from Wikimedia wikis - https://phabricator.wikimedia.org/T57676#2639920 (10Marostegui) This table currently exists at: S1 (enwiki - ie: db1053) S3 (testwiki - ie: db1044) As it has been said, it has not been written in a long time. ``` root@db1053:/sr...
[10:00:27] marostegui, I will wait for your thoughts on 310530, and deploy everything otherwise
[10:00:39] I mean https://gerrit.wikimedia.org/r/310530
[10:00:48] checking
[10:01:41] It looks good to me
[11:43:36] 10DBA, 10MediaWiki-Page-deletion, 06Operations, 07Performance: Cannot delete two pages with large histories even having the appropriate permissions to do so - https://phabricator.wikimedia.org/T145630#2640244 (10MarcoAurelio) Thank you. Is it possible to grant more limits to stewards when performing bigdel...
[12:50:12] 10DBA, 06Operations: Drop database table "email_capture" from Wikimedia wikis - https://phabricator.wikimedia.org/T57676#2640375 (10Marostegui) ``` db1053.eqiad.wmnet MariaDB PRODUCTION s1 localhost enwiki > rename table email_capture to TO_DROP_email_capture; Query OK, 0 rows affected (0.23 sec) db1044.eqia...
[12:58:08] jynus: memtest came back w/ zero errors
[13:00:28] 10DBA, 06Operations, 10ops-eqiad: db1082 hardware check - https://phabricator.wikimedia.org/T145607#2640397 (10Cmjohnson) performed a memtest, test came back with zero errors
[13:01:08] cmjohnson1, sorry :-(
[13:02:06] not a problem
[13:02:20] 10DBA, 06Operations: Investigate db1082 crash - https://phabricator.wikimedia.org/T145533#2640407 (10jcrespo)
[13:03:12] Aren't "cosmic rays" supposed to change memory bits 1 in a trillion?
[13:03:28] maybe we got one of those :-D
[13:05:49] with ECC it shouldn't
[13:06:07] :-)
[13:06:13] :-P
[13:06:16] volans, do not take me seriously
[13:06:44] if software is involved, it is always the software
[13:06:54] (except when it isn't)
[13:07:14] cmjohnson1: Thanks - I have never seen memtest report errors, as you said yesterday
[13:07:29] I wonder if someone ever did :)
[13:07:55] hahaha yeah probably not
[13:08:05] marostegui, you take care of the repool?
[13:08:10] yep
[13:08:15] good
[13:08:25] cmjohnson1: Can you bring the server back up or is that something we do from the ilo?
[13:08:35] should be up
[13:08:50] i didn't watch it post but it rebooted.
[13:09:05] Ah right, I will check, thanks :)
[13:28:49] 10DBA, 06Operations, 10ops-eqiad: db1082 hardware check - https://phabricator.wikimedia.org/T145607#2640460 (10Marostegui) Thanks Chris - I will close this ticket and we will keep updating the upstream.
[13:31:00] cmjohnson1: Looks like the server doesn't have network (ping from neodymium doesn't reply) and I cannot ssh to it; however, looking at the ILO, the server has booted up with this error: [FAILED] Failed to start LSB: ferm firewall configuration.
[13:31:18] I don't know the root password so I cannot troubleshoot it, so someone with the password would need to check :)
[13:31:46] may just need another reboot ...the memtest may have put it in an awkward state
[13:32:08] cmjohnson1: I just did :(
[13:32:52] k...give me a minute I will plug the cart in and check it
[13:33:12] No rush, thanks
[13:35:31] cmjohnson1: It is now replying to ping :)
[13:35:32] marostegui: fixed
[13:35:39] what was it?
[13:35:41] forgot to plug the cable back in
[13:35:51] haha :)
[13:35:53] Easy fix!
[13:35:54] Thanks
[13:36:02] yw
[13:36:27] ferm
[13:36:30] ?
[13:36:44] maybe volans did some change recently?
[13:37:15] jynus: to ferm? no I didn't
[13:37:43] the cable looks more low-level than ferm :-P
[13:40:31] 10DBA, 06Operations: Investigate db1082 crash - https://phabricator.wikimedia.org/T145533#2640476 (10Marostegui) After the memtest (no errors found) the server is back and catching up with the master. Once it caught up, we will pool it back and slowly give it some weight in the LB.
[13:40:33] sorry, didn't see that
[13:40:40] :-(
[14:16:10] marostegui: ah, you're not yet in pwstore
[14:16:21] do you have a PGP key for your wikimedia identity?
[14:16:46] the root password is stored in pwstore
[14:24:15] moritzm: I generated one the first day, I remember, but I think in the end I was told to leave it till the offsite as there will be a key signing party :)
[14:24:38] Which is fine, as I have not really needed the root password :)
[14:25:28] ok, we can also do that at the offsite
[14:25:36] sure
[14:25:53] it doesn't seem to be uploaded to the keyserver network, though? gpg --search-key marostegui@wikimedia.org doesn't give any results
[14:26:10] moritzm: Ah, no I didn't do anything with it
[14:26:16] I can upload it though
[14:27:05] yeah, you can do that already, it's no problem if it's around without any sigs for a while
[14:28:27] moritzm: Cool then, should I follow this? https://wikitech.wikimedia.org/wiki/PGP_Keys ?
[14:28:54] yes, please
[14:29:27] Cool, wilco
[14:29:29] thanks
[14:42:44] 10DBA, 10MediaWiki-extensions-ORES, 06Revision-Scoring-As-A-Service: Ensure ORES data violating constraints do not affect production - https://phabricator.wikimedia.org/T145356#2640661 (10Halfak)
[14:49:45] moritzm: I pushed it 10 minutes ago, it should be there now I believe, again no rush
[14:56:37] soon to merge https://gerrit.wikimedia.org/r/310564
[14:58:52] let's see
[15:01:17] I will want to break db1082, so tell me if you are doing something with it
[15:01:30] (start it and stop it several times)
[15:01:59] jynus: Nope, I am not doing anything, it is catching up with the master, but no worries
[15:02:13] jynus: Feel free to break it
[15:41:55] brion, I am here
[15:42:02] \o/
[15:43:20] so we are indeed trying to introduce an abstraction layer, which can be a bit funky indeed :)
[15:43:31] i want to make sure that major issues you might foresee are something we take into account
[15:43:42] I do not mind indirection and normalization
[15:43:55] in fact, I like the ideas
[15:44:02] it is the implementation that worries me
[15:44:19] i am definitely concerned about how differing access patterns might affect performance
[15:44:54] consider that accessing the revision table could easily be 80% of our main db requests
[15:44:58] yep :)
[15:45:09] my thought is that we have basically two major access patterns on revision
[15:45:11] not 80% of the performance, because it is usually very fast
[15:45:18] one is bulk fetches --
[15:45:34] the history lists, contribs lists, rc & watchlist, and of course dumps
[15:45:45] the other is individual fetches or updates --
[15:45:47] yes, key-value access
[15:45:53] is not a concern here
[15:45:57] such as when rendering a page, or editing it
[15:45:58] *nod*
[15:46:11] even if it is the most common access
[15:46:16] obviously adding a couple extra tables to a join is a concern :D
[15:46:21] on the bulks
[15:46:22] not really
[15:46:28] ah, ok
[15:46:29] many people say that
[15:46:58] but in reality, if all tables were smaller than the original one
[15:47:22] we would get a huge boost because hotter keys will have more space to be in memory
[15:47:27] nice
[15:47:33] ok that's good to know :D
[15:47:44] so for many intents and purposes, consider accessing a table by key O(1)
[15:47:50] we were speculating earlier (me & daniel) that if we can compact the rows it might help but wanted to confirm with you that makes sense
[15:47:59] and accessing a single value of a few tables O(1), too
[15:49:03] (many of the bulk fetches won't actually need to dive into the content table, though it def will for dumps. either way if we know that's not inherently inefficient that's good :D)
[15:49:05] brion, that is very generic, I suppose there is a plan there, feel free to send me the idea
[15:49:45] so one thing we noticed is there's a couple varchars/varbinaries in revision
[15:49:56] but if it involves json columns, it is normally not a good idea (but it depends on the access pattern)
[15:50:01] hehe
[15:50:11] some json usage is good
[15:50:20] json stuff should only get fetched on a full render or dump, fortunately, and then just to fetch it
[15:50:26] the one, I do not remember if it is ES or PC, is ok
[15:50:30] luckily we're not doing anything damnfool like joining on json values :D
[15:50:37] yeah
[15:50:50] can I give you an example
[15:50:54] so revision has rev_user_text which we think we could factor more efficiently
[15:51:03] that is separate from this
[15:51:06] and rev_comment which feels like it could be broken out but i'm not sure that's a good idea
[15:51:06] sure
[15:51:27] so that we do not talk about the same topic, so it is not "controversial"
[15:51:32] :D
[15:51:39] daniel kind of agreed on this one
[15:51:45] and I think it was not even original
[15:52:01] (and I am not proposing to do this now, it is just an analogy)
[15:52:07] ok :)
[15:52:13] templatelinks, pagelinks, etc.
[15:52:15] imagelinks
[15:52:24] very tall tables
[15:52:40] usage very different than revision
[15:52:49] but they will work for what I want to say
[15:53:09] in some cases, like templatelinks on commons or one of the wiktionaries
[15:53:18] larger than most other tables
[15:53:24] for obvious reasons
[15:53:49] typical way to solve performance problems: "let's move usage outside of mysql"
[15:53:54] heh
[15:53:58] I mean, yes, maybe
[15:54:10] but why not use better designs (*)
[15:54:25] e.g. we create an entity "title"
[15:54:32] no longer a weak entity
[15:54:54] but a strong one shared by page, image, and *link tables
[15:55:11] without compression, you would get a huge reduction in size on those tables
[15:55:18] you will need extra joins? yes
[15:55:22] *nod*
[15:55:28] will it make queries more complex?
[15:55:29] yes
[15:55:33] is it ideal
[15:55:34] no
[15:55:42] but it could be considered
[15:55:46] because smaller tables
[15:55:51] easier to maintain
[15:55:57] less performance issues
[15:56:13] users other than mediawiki can do full scans faster
[15:56:24] is it the same issue as revision?
[15:56:26] no
[15:56:45] nice yeah, that deduplicates a lot of large strings and lets the main association table be super compact
[15:56:51] but it is a way, with a (maybe) better design, to solve a pure implementation problem
[15:57:06] again, that was an example, let's not talk about that
[15:57:22] but in spirit, it is the idea of smaller tables
[15:57:41] though it's a relevant one, we thought about similar ways to compactify rev_user_text into a more compact reference :D
[15:57:43] I think it was tim that had the idea of separating the comments out
[15:57:51] of the table
[15:57:55] yeah there's a lot of duplication in comments, many similar ones :D
[15:57:58] and at the same time allow 2GB comments
[15:58:02] hehe
[15:58:09] in that case deduplication would not be a huge win
[15:58:17] heh
[15:58:22] but in most cases, you only need 1 of the 2 tables
[15:58:37] while reading both would have almost no impact
[15:59:03] jynus: another quick check while I'm thinking of it... fixed-size rows: something we should strive for or not a big deal?
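As an illustration of the title-as-a-strong-entity example above, a hypothetical sketch follows; the real templatelinks stores a namespace and a title string per row, and the names below (`title`, `title_id`, `templatelinks_norm`) are invented for this example:

```sql
-- Hypothetical sketch: promote "title" to a strong entity so that very
-- tall tables like templatelinks stop repeating the same strings.
CREATE TABLE title (
  title_id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title_namespace INT NOT NULL,
  title_text      VARBINARY(255) NOT NULL,
  UNIQUE KEY title_ns_text (title_namespace, title_text)
) ENGINE=InnoDB;

-- A normalized templatelinks is then two integers per row instead of
-- an integer plus a namespace and a title string:
CREATE TABLE templatelinks_norm (
  tl_from  INT UNSIGNED NOT NULL,  -- page id of the linking page
  tl_title INT UNSIGNED NOT NULL,  -- title.title_id of the target
  PRIMARY KEY (tl_from, tl_title),
  KEY tl_title_from (tl_title, tl_from)
) ENGINE=InnoDB;

-- The extra join is cheap key-value access; the win is that the tall
-- table shrinks, so more of its hot keys fit in the buffer pool:
-- SELECT t.title_namespace, t.title_text
--   FROM templatelinks_norm tl
--   JOIN title t ON t.title_id = tl.tl_title
--  WHERE tl.tl_from = 12345;
```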
[15:59:07] I sent tim a summary of production problems we currently have
[15:59:19] usually we have a bunch of integers which are fixed size and then a varchar or two
[15:59:30] ah great i'll read those over too
[15:59:38] brion, on modern mysql, do not go for fixed
[15:59:49] ok good to know
[15:59:54] innodb has been tuned to be very efficient dynamically
[16:00:03] * brion learned mysql in 3.x days, i may still have bad ideas ;)
[16:00:11] no, and at that time
[16:00:17] myisam fixed was a huge win
[16:00:24] yeah
[16:00:29] now oracle had to concede that utf8 was here
[16:00:38] i remember we had to keep some tables in one format and others in another, it was awful days :D
[16:00:39] and optimize for variable size
[16:00:53] so, to summarize
[16:01:02] and they finally support 4-byte chars thanks to emoji ;)
[16:01:10] you have here an ally
[16:01:16] for more normalization
[16:01:21] and compaction
[16:01:22] etc.
[16:01:47] but let's try to, without sharding or partitioning
[16:01:52] have smaller tables
[16:02:02] excellent!
[16:02:13] and to be fair, if this was for one of the 800 smaller wikis
[16:02:13] let's definitely aim for that
[16:02:24] I would not even care
[16:02:32] yeah it's gotta work for enwiki or we can't do it
[16:02:44] but enwiki, commons, wikidata have scalability problems
[16:03:08] and I would like mediawiki to take care of those, instead of the hacks we currently have at the mysql level
[16:03:19] *nod*
[16:03:20] just for those larger wikis
[16:03:35] it is a huge burden for me to have "special slaves"
[16:03:45] yeah, and i don't want us over in archcom land to be 'architecture astronauts' with no idea what's going on on the ground too ;)
[16:03:59] your advice is very valuable!
[16:04:01] I do not think you are
[16:04:33] ok so i think there's two major avenues we can tweak the existing MCR plan in...
[16:04:57] one is making sure the content table (the new table with the per-rev+per-slot info) is compact so we're not introducing new pain
[16:05:08] the other is changing the revision table to be more compact while we're fiddling with it
[16:05:59] is there anything on the content table that looks particularly worrying to you?
[16:06:31] slot id and content model can be integer references instead of strings (i think we already made that change)
[16:07:05] we might be able to compactify the storage reference, but not sure an int+bigint pair or something would really be a big change
[16:07:23] ugh where's my bookmark for the wiki page
[16:07:53] https://www.mediawiki.org/wiki/Multi-Content_Revisions/Content_Meta-Data#Re-using_Content_Rows there we are
[16:08:52] that is only part of the pain
[16:09:02] what about extensions writing to that table?
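A hedged sketch of the content-table compaction being discussed: the column names below are guesses for illustration, not the agreed MCR schema, and the lookup table simply replaces repeated model strings with small integers. The row format is set explicitly to show the point that variable-size rows are fine on modern InnoDB:

```sql
-- Hypothetical compact content meta-data row: the model name lives in
-- a tiny lookup table and the content row stores only integers plus a
-- short external-storage address.
CREATE TABLE content_models (
  cm_id   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cm_name VARBINARY(64) NOT NULL,  -- e.g. 'wikitext', 'json'
  UNIQUE KEY cm_name (cm_name)
) ENGINE=InnoDB;

CREATE TABLE content (
  cont_id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cont_model   SMALLINT UNSIGNED NOT NULL,  -- content_models.cm_id
  cont_address VARBINARY(255) NOT NULL,     -- pointer into external storage
  cont_size    INT UNSIGNED NOT NULL,
  cont_sha1    VARBINARY(32) NOT NULL
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;  -- variable-length columns are handled
                                     -- efficiently; no MyISAM-era fixed-row
                                     -- tricks needed
```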
[16:09:22] and how to handle extensions being deleted (and their content having to be deleted, too)
[16:10:19] so for contents+slots, extensions should be writing to it only by passing an additional slot with their custom data in during edit, so they should never be 'orphaned'
[16:10:31] in the way that extensions writing blobs directly to ES currently orphan their data
[16:11:21] if we kill an extension that was actively saving data then we'll basically have to put in some kind of stub handler for that type i guess
[16:12:00] the old blobs would stay there just like other obsolete text data :)
[16:12:10] :-(
[16:12:22] that is the difference
[16:12:36] now, we can just drop entire tables
[16:12:56] later, we will have to do complex operations on precisely a large and busy table
[16:13:14] well the theory is we'd still want the _data_, it's historical/archival page source
[16:14:03] just like we keep old versions of files that have been deleted from mediawiki source code in git
[16:14:35] ah that reminds me --
[16:15:07] -- there was some thought originally of storing some extension-specific derived data in content slots as well, but we're pulling back on that
[16:15:10] so you may be happy to hear that :D
[16:15:45] we might, or might not, want a more consistent 'way to store your derived blobs' for extensions
[16:15:57] but it's looking like content slots are the wrong place to put them
[16:17:16] sounds like you'd prefer extension-specific tables for derived data as it's easier to drop them w/o fiddling with a giant shared table
[16:17:46] plus it just keeps the access patterns separate
[16:18:01] brion, let me give you the concrete example
[16:18:07] of why I prefer that
[16:18:18] https://phabricator.wikimedia.org/T54921
[16:18:31] it is not the only option, though
[16:18:49] partitioning the table in a sane way
[16:18:58] heh
[16:19:07] but I would like to avoid that at all costs
[16:19:38] that is why I prefer a "logical" partitioning
[16:19:52] the how or the details, I do not care much about
[16:20:02] there is room for several options
[16:20:25] and I think both options are not exclusive
[16:20:32] let me give you a concrete example
[16:20:37] ok
[16:20:44] we create the image_revision and image tables
[16:21:03] that will be easier to later integrate into revision/content
[16:21:13] if we go that way
[16:21:23] and it will be useful at the same time
[16:21:43] (note that almost everybody agreed to that change a few weeks ago)
[16:22:05] I said in an email that I am pragmatic
[16:22:10] I want something now
[16:22:19] :)
[16:22:21] even if it is a small improvement
[16:22:30] rather than a huge improvement in a year
[16:23:01] so I think my opinions may be wrong
[16:23:18] but they are not radical, nor trying to oppose for the sake of opposing
[16:23:27] I only do not like a specific part of the proposal
[16:23:36] I love most of the others
[16:23:41] *nod*
[16:23:51] it just happens that the specific part is the core of the thing :-)
[16:23:55] haha
[16:24:27] so what'd be the key things you'd recommend as an alternative?
[16:25:44] i won't take it personally, honest :)
[16:27:15] * AaronSchulz tries to skim backscroll
[16:29:44] so there is an "alternative I would do"
[16:30:00] and an "if you get something like this, I will be happy"
[16:30:06] :)
[16:30:18] it's good to have options!
[16:30:25] the second is: give me smaller tables
[16:30:39] and stop saying "these are already small tables"
[16:30:46] (not you)
[16:31:01] they are not; revision, as it is now, is not small
[16:31:15] however you do it, you will make me happy
[16:31:24] also take into account the maintenance
[16:31:43] to be done in the future: removing data, schema changes, etc.
[16:32:01] how would I do it?
[16:32:26] same thing, but separate tables for separate entities - keep as much compatibility as possible
[16:32:34] with the past
[16:32:59] but make adjustments to existing tables (image, etc.) to make them closer to the proposal
[16:33:21] and use multiple-table inheritance to avoid very large tables
[16:33:45] it is more of a "put some patches here and there"
[16:33:53] * brion ponders possibilities
[16:33:53] but it will be faster
[16:34:05] and we are really in need of maintenance-like features
[16:34:12] you can still do
[16:34:19] structured data support
[16:34:28] e.g. infoboxes on tables
[16:34:36] but why not create new tables?
[16:34:49] similar to the ones proposed
[16:35:04] you link back to page or revision, as you do with images
[16:35:12] *nod*
[16:35:21] again, I am not sure I would do that
[16:35:24] main thing is to make sure we have a way to know which tables to join on
[16:35:36] I concede that is an issue
[16:35:46] but we have a content-type
[16:36:00] the problem is, future uses are not yet clear
[16:36:01] i don't feel like i have a good handle on how to do that cleanly without something like the slots table :D
[16:36:26] if future uses are not yet clear, why do an "everything goes"?
[16:36:32] go the other way around
[16:36:50] you want structured-data infoboxes, how would you do that?
[16:37:05] you want structured data categories, how would you do that?
[16:37:08] yeah, one trick i think is that we want to be able to do .... supplementary data? not sure proper term, but a page might have "wikitext" only or it might have "wikitext and also some infobox data and also some category lists"
[16:37:21] so it's not just the core type, but also additional types bundled with it
[16:37:43] if it were just 'this type is wikitext, that type is a widget with js & css blobs' that'd be easier
[16:37:45] you want per-section editing, how would you do that?
[16:37:57] a "things" table
[16:37:57] *nod*
[16:37:59] is a problem
[16:38:07] and if it was you
[16:38:09] and tim
[16:38:12] and daniel
[16:38:23] and a few people the only committers
[16:38:35] I would not be worried
[16:38:55] but "if it can be abused, it will be" when you have 100000 contributors
[16:38:59] haha yeah
[16:39:01] not because they are bad
[16:39:21] but because there will be 100000 ways we think of using that
[16:39:38] I gave drupal as an example
[16:39:39] now one thing to consider is that we have some distinction between 'source' and 'live data', with some overlap in the way we use the revision table
[16:39:59] multi-content revs is in theory mostly about making additional 'source' blobs available
[16:40:00] sorry, I do not get that
[16:40:15] before and after parsing?
[16:40:30] eg if we made 'the list of categories' on a page live in a special category content slot instead of in the wikitext, we'd still have a live categorylinks relation
[16:40:33] right
[16:41:01] models and programming, and interfaces, and all that, I do not get into
[16:41:05] it may be that we're being dumb in that actually :D or that might be helping to reduce pressure on the 'source' blobs
[16:41:11] I just stick to the storage model
[16:41:14] :)
[16:41:33] we do though end up storing things both from pre-parse and post-parse, so you get to deal with it :D
[16:41:45] that is not bad
[16:41:52] it is called denormalization
[16:41:56] not really
[16:41:56] yep
[16:42:10] but I hope the analogy is understood
[16:42:25] I cannot get into what the model should be
[16:42:40] I have not thought about that
[16:42:54] and probably am the wrong person to be asked
[16:43:11] I think that features that fail fast
[16:43:15] and can fail fast
[16:43:20] are better
[16:43:25] *nod*
[16:43:35] than "this will solve all problems"
[16:43:45] * brion thinks a little about a table-per-slot model and what that would look like
[16:43:50] I think that is better for an open source project like ours
[16:44:07] table per slot sounds bad
[16:44:19] and it is an antipattern by itself
[16:44:27] yeah, and any time we're talking about changing the core storage that's hard to make revertable :D
[16:44:28] "table-per"
[16:44:34] hmm
[16:44:43] for me the model I think about is
[16:44:52] "store different content on different tables"
[16:44:55] well by that i mean 'one table for wikitext revisions, one for image revisions, one for infobox revisions'
[16:45:04] here of course, content is only metadata
[16:45:19] but the main servers only have metadata, really
[16:45:25] right
[16:45:32] brion, yes, I understood you, I just wanted to rephrase it
[16:45:35] ok :D
[16:45:49] on paper
[16:46:21] now one advantage of separate tables is you can drop them ;) another is that you can customize them -- image revs could contain more info like width/height and type in their revision metadata, if that's useful . but that complicates how we hook it up to the abstract model
[16:46:27] it would be something like page <--- (tree here)--> (page_)revision /
[16:46:42] image_revision / infobox_revision / category_revision
[16:46:46] my other concern is how we'd do the joins
[16:47:00] brion, yes
[16:47:11] if i don't know which slot types to check, i have to potentially join against them all
[16:47:14] I didn't say that my counter-proposal was perfect
[16:47:25] sure, just talking/thinking it out in more detail
[16:47:32] that info has to be on the page
[16:47:44] *nod*
[16:47:46] something something content_model
[16:48:01] the trick being content_model may be multiple values...
[16:48:05] it all depends
[16:48:09] on the actual usages
[16:48:11] ...which leads me back to a table mapping them
[16:48:35] maybe all regular pages will have a single infobox and a single "category box"
[16:48:51] and special pages like image will have image_revision and image_metadata
[16:49:16] ahhhhh yeah i think i see a possibility
[16:49:40] maybe we can have an intermediary table like content_revision
[16:49:40] ...so if we have an overarching type that specifies which slot types are valid...
[16:49:50] but that is in code
[16:49:55] rather than in the db
[16:50:10] power to the developers, db is dumb!
[16:50:11] ...then we can easily distinguish between 'text page will have a wikitext slot and maybe an infobox slot' and 'image page will have a wikitext slot and an image slot' etc
[16:50:16] and I mean it
[16:50:26] :D
[16:50:39] the idea is, if you support everything
[16:50:48] (as a logical criticism)
[16:50:58] you can end up easily with corrupt data
[16:51:11] just because the wrong id was inserted
[16:51:19] true
[16:51:21] so let the db be very strict
[16:51:27] hard to enforce referential integrity when super generic!
[16:51:32] exactly
[16:51:32] s/hard/impossible/
[16:51:38] also, self joins
[16:51:44] what is the image for this metadata
[16:51:46] ?
[16:51:52] that would require a self join
[16:51:57] those get messy
[16:52:01] eek
[16:52:10] because it is a very vertical design
[16:52:15] instead of a very flat one
[16:52:30] this is the metadata table, this is the image table, etc.
[16:52:38] yeah
[16:52:44] someone starts inserting a lot of images?
[16:52:57] it does not affect the performance of regular text inserting
[16:53:21] of course, multiple tables will not be as easy to handle
[16:53:38] that is what you have to pay
[16:53:54] in exchange for not having hyper-normalization
[16:54:16] jynus: ok i'm going to think a bit about an alternate model built on separate tables for typed slots and just see if that can be handled sanely enough on the code end
[16:54:39] for dumps in particular i want to make sure we can efficiently pull all the data in such a case
[16:54:44] yes
[16:54:53] that is another topic
[16:55:01] go to a more append-only model
[16:55:15] dumps and analytics would love you for that
[16:55:22] hehe yeah
[16:55:32] but that is orthogonal to this discussion
[16:55:35] i gotta catch up with apergos about dumps plans some day
[16:55:43] I will have to go soon
[16:55:51] I was happy to get this read
[16:55:56] ok. thanks for the chat, i found it very helpful!
[16:56:13] i'll poke you tomorrow or monday with more ideas and thoughts probably
[16:56:29] I will be off tomorrow, but on monday no problem
[16:56:46] see you
[16:56:48] great
[16:56:49] enjoy :)
[16:59:10] for DBAs on this channel: I do not trust db1082 in its current state
[16:59:33] do not worry if you do not pool it tomorrow
[16:59:56] there could still be ongoing hardware/kernel issues
[17:00:47] wanna run a stress test?
[17:00:53] (I am only saying that based on the icinga checks)
[17:01:01] volans, check them
[17:01:21] probably manuel will ask you tomorrow in case they are not there anymore
[17:01:38] jynus: got it
[17:01:59] jynus: have you seen something bad again?
[17:02:03] we had a similar case with another machine
[17:02:22] in which those 2 checks failed and it was randomly crashing
[17:02:41] look for es2* past incidents / ask papaul
[17:02:51] will do
[17:02:59] I am logging out; it is depooled, so there are no ongoing issues
[17:03:06] we can also check icinga history from now on to see if there are soft failures
[17:03:10] jynus: enjoy your day off!!
[17:03:16] have a great weekend
[17:03:33] marostegui, enjoy your day without parents!
[17:03:39] same, bye
[17:03:40] hahaha
[17:03:43] bye!
[23:56:48] 10DBA, 10MediaWiki-extensions-Babel, 10Wikimedia-General-or-Unknown: Create and populate babel database table on Wikimedia wikis - https://phabricator.wikimedia.org/T145366#2642470 (10Legoktm)
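To round out the per-type slot model that brion and jynus converge on in the afternoon discussion, here is one hypothetical sketch; every name in it is invented for illustration, and it is only one way the idea could be laid out:

```sql
-- Hypothetical sketch: page records its overarching type, application
-- code maps that type to the set of *_revision tables that may hold
-- slots for it, and each table stays narrow, strict, and individually
-- droppable.
CREATE TABLE page_types (
  pt_id   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  pt_name VARBINARY(64) NOT NULL,  -- e.g. 'text+infobox', 'file'
  UNIQUE KEY pt_name (pt_name)
) ENGINE=InnoDB;

-- Each slot table links back to the parent revision directly, so there
-- are no generic self joins ("what is the image for this metadata?"):
CREATE TABLE infobox_revision (
  ibrev_rev     BIGINT UNSIGNED NOT NULL,  -- parent revision id
  ibrev_address VARBINARY(255) NOT NULL,   -- external-storage pointer
  PRIMARY KEY (ibrev_rev)
) ENGINE=InnoDB;

CREATE TABLE image_revision (
  imgrev_rev     BIGINT UNSIGNED NOT NULL,  -- parent revision id
  imgrev_address VARBINARY(255) NOT NULL,
  imgrev_width   INT UNSIGNED NOT NULL,     -- type-specific metadata that
  imgrev_height  INT UNSIGNED NOT NULL,     -- other slot tables do not need
  PRIMARY KEY (imgrev_rev)
) ENGINE=InnoDB;

-- Application code decides which tables to join based on the page type:
-- a 'text+infobox' page touches infobox_revision, a 'file' page touches
-- image_revision, so heavy image inserts never slow down text inserts,
-- and per-table foreign keys keep the db strict rather than generic.
```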