[05:51:27] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207#3374033 (Marostegui) db1026 is done: ``` root@neodymium:/home/marostegui# for i in `cat s5_tables`; do echo $i; mysql --skip-ssl -hdb1026 wikid...
[05:51:36] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207#3374034 (Marostegui)
[06:01:42] Blocked-on-schema-change, DBA: Apply schema change to add 3D filetype for STL files - https://phabricator.wikimedia.org/T168661#3371212 (Marostegui) p:Triage>Normal
[06:03:23] DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3374039 (jcrespo) > I propose Monday after ops meeting (17:00 UTC). I am sorry, but that is outside of my working hours.
[06:03:28] Blocked-on-schema-change, DBA: Convert unique keys into primary keys for some wiki tables on s7 - https://phabricator.wikimedia.org/T166208#3374040 (Marostegui) codfw master finished.
[06:38:30] Blocked-on-schema-change, DBA: Apply schema change to add 3D filetype for STL files - https://phabricator.wikimedia.org/T168661#3374065 (Marostegui) a:Marostegui As Jaime pointed out in the changeset, it should be a fairly straight schema change as it is only modifying an ENUM at the end of the list....
[06:38:45] DBA, Operations, Traffic: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141#3374067 (Dzahn)
[06:39:04] DBA, Operations, Traffic: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141#3187493 (Dzahn) a:Dzahn>None
[06:46:52] DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3374077 (madhuvishy) @jcrespo no problem! let me know what time works for you :) I can do earlier on Monday too. Would 14:00 UTC work? Feel free to propose a suitable time if not. Tha...
[07:24:20] DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#2867045 (Marostegui) >>! In T153033#2867070, @Reedy wrote: > No rush to remove this one, but it should eventually. Need to check if the data has any use for anyone (Analytics or research, maybe?) before dropping it completely...
[07:30:51] DBA: DBA-related homes on tin - https://phabricator.wikimedia.org/T132696#3374096 (Marostegui) Open>Resolved These two are empty now - I guess they were cleaned up when tin was reimaged (T144578#2738301)
[07:52:09] DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3374119 (jcrespo) So there are several things here- the dns change and the actual reboot. There should be a time between them. I say you (as in, anyone on your team) change dns day 1...
[08:04:45] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207#3374137 (Marostegui)
[08:05:02] Blocked-on-schema-change, DBA: Convert unique keys into primary keys for some wiki tables on s1, s2, s4, s5 and s7 (eqiad) - https://phabricator.wikimedia.org/T164185#3374140 (Marostegui)
[08:05:05] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207#3288328 (Marostegui) Open>Resolved s5 is done
[08:33:24] using tendril?
[08:34:11] ^marostegui
[08:35:35] yes
[08:35:49] but you can do whatever you want to it
[08:35:52] I am not using it all the time
[08:35:55] just from time to time :)
[08:35:57] so go ahead
[08:35:58] of course
[08:36:04] I just wanted to deploy
[08:36:09] go for it :)
[08:36:18] https://gerrit.wikimedia.org/r/#/c/359372/2
[08:36:26] which comes with the chance of a breakage
[08:36:50] so actually, you should use it now and see if it breaks and report immediately
[08:36:55] ok?
[08:37:08] sure :)
[08:37:13] also, there are times where something is ongoing that would be a bad time
[08:37:25] I hope it is not the case (nothing too critical ongoing)
[08:37:28] nope
[08:37:31] not from my side
[08:37:33] ok, doing
[08:38:13] ok
[08:43:10] you reverting then?
[08:45:04] yes
[08:45:43] yeah, saw the cert error
[08:47:18] tendril back
[09:09:32] I will try once again, I need to fail the dns, then enable it
[09:09:53] okay
[09:10:02] It may take more time
[09:10:06] because puppet
[09:16:10] DBA, Operations: puppet stopped mysqld using orphan pid file from puppet agent - https://phabricator.wikimedia.org/T86482#3374297 (Marostegui) Open>Resolved After having a chat with Filippo we believe this hasn't happened again so we are closing it for now. Feel free to reopen if this is seen again.
[09:24:40] can I retry again? it may take some minutes to put it back, depending on puppet and the let'sencrypt script?
[09:24:47] go ahead!
[09:28:19] also dns has to first catch up internally
[09:28:31] so it is not precisely a highly available system
[09:35:01] it should be working now
[09:35:50] BTW, if you are running alter tables somewhere, you better comment on the labsdb restart thread :-)
[09:38:31] hehe yeah
[09:38:37] I was planning not to run anything :)
[09:38:41] until that thread was done
[09:39:01] tendril isn't running for me btw
[09:39:17] isn't?
[09:39:18] now it does
[09:39:19] took ages
[09:39:20] it is for me
[09:39:22] but it finally loaded
[09:39:23] yeah, cache
[09:39:29] I also tried to logout+login to make sure all worked
[09:39:31] so all good!
[09:39:32] I think it is more aggressive on tendril
[09:39:42] yeah, I checked we didn't lose authentication
[09:39:48] :)
[09:49:20] ηευ
[09:49:22] er
[09:49:23] hey :)
[09:49:36] let's discuss goals here?
[09:50:52] jynus: ^
[09:51:51] ok
[09:52:03] you know I was against that goal, right?
[09:52:35] uh?
[09:53:02] see my comments below
[09:53:35] I briefly saw two DBA goals yesterday
[09:53:51] and I saw mark asked you to draft one
[09:53:53] Yeah, I had a chat with m*rk today and I removed it
[09:53:58] ah
[09:54:21] Like, it will probably be added for Q2, but we can do that when we are discussing q2 goals :)
[09:54:21] can we write something up that merges the two and is something that makes you both comfortable with it?
[09:54:54] it would be impossible to have both goals for q1, it is a loooong goal, so we split it into two pieces, software (if you wanna call it that way) and the storage
[09:55:10] the storage needs large dedication of alex
[09:55:14] yeah, it's obviously impossible to do two goals
[09:55:19] which he has already 2 goals
[09:55:24] *who
[09:55:29] *has
[09:55:49] but now there's only one now, but jynus still said he's against it, so that's a problem :)
[09:56:30] I think what can be realistically done is not goal-worthy
[09:56:43] "ability to, in the future, maybe have something"
[09:57:38] how did we start from two goals and end up with nothing goal-worthy? :P
[09:57:48] because of feedback
[09:57:58] ok, let's get into the substance of it, because I don't know much about all that
[09:58:18] what is it that you guys think we should implement long-term?
[09:58:35] manuel, show the backup etherpad :-)
[09:59:02] See key results, Not part of this goal party
[09:59:06] *part
[09:59:18] and the part that was deleted
[10:00:28] just send the part that was deleted to paravoid
[10:00:53] I just did
[10:00:57] ah :)
[10:01:10] i did too :)
[10:01:27] s/send/sent :)
[10:08:06] how is the multi-instance stuff related to backups?
[10:09:00] Because our dbstore servers are hard to scale and to operate (ie: we cannot enable GTID) with multi source
[10:09:00] dbstore1001/2001 are the backups dumping hosts
[10:09:14] ah
[10:09:42] and basically right now are not in a format that is easily recoverable
[10:09:54] what does that mean?
[10:10:18] remember when you gave me a hard time to buy more disk for those?
[10:10:40] ?
[10:10:46] we were trying to set them up so in case of disaster
[10:10:56] we could recover them quickly
[10:11:09] we have not yet fixed those
[10:11:16] paravoid: to give you a small hint, we use engine=tokudb in dbstore1001 and in production we do not
[10:11:43] that means that right now recovering from those takes 2-4 days
[10:12:08] but they use less disk
[10:12:31] guys, don't be so high-level :)
[10:12:52] and going back to multi instance, right now we cannot use gtid with multisource because of an upstream bug, meaning that if a server crashes we might corrupt data, whereas if we use gtid that wouldn't happen, so the only solution is to use multi instance
[10:13:02] I'm not a DBA, but I can understand a few more details on a technical level :)
[10:13:23] if I was talking to manuel, I would have used the same terms
[10:13:49] so
[10:13:57] we do not know the details yet
[10:14:17] right now we're pushing all shards to a single mariadb instance on dbstores, using mariadb's multi-source replication
[10:14:28] correct
[10:14:36] I removed references to mydumper because I wasn't 100% sure that is what I was going to use
[10:14:54] but there is an upstream bug that prevents us from using gtid + multisource, so we want to switch away from that and set up separate mariadb instances, one per each shard
[10:14:57] paravoid: but if it crashes, all data is lost
[10:15:06] paravoid: it is more than a bug
[10:15:22] right now the single instance mariadb that we run on dbstores is running with TokuDB, which we know to be unstable
[10:15:24] I would say it cannot be solved, probably, based on their reaction
[10:15:29] paravoid: yes, and also because if we corrupt or break something, we are breaking all shards at once, whereas with multi instance we would only break one shard
[10:15:31] (are there any other problems with it?)
[10:15:37] paravoid: https://jira.mariadb.org/browse/MDEV-12012 this is the bug
[10:15:53] basically, we are moving away from the monolithic approach
[10:16:12] and tokudb because it is buggy
[10:16:18] but that is orthogonal to the goal
[10:16:28] this is about dumps
[10:16:34] and all of this is related to backups, because we take backups out of dbstores (presumably to avoid the extra I/O load on our regular slaves)
[10:16:45] yep!
[10:16:47] it happens to be there
[10:17:03] but I do not think it is related otherwise
[10:17:05] how does that all affect our recovery times?
[10:17:16] by having one separate shard
[10:17:16] the 2-4 days you mentioned above
[10:17:24] we can just copy- it is as fast as netcat
[10:17:29] we cannot do that now
[10:17:42] because it is mostly a huge monolithic blob
[10:17:49] and then we have to convert all the tables to innodb
[10:17:53] yes
[10:18:05] we take backups with xtrabackup now, right?
[10:18:05] so basically, dbstore is one slave more
[10:18:09] no
[10:18:20] that doesn't work with mariadb
[10:18:26] we lost that when we migrated
[10:18:26] oh
[10:18:44] I thought we had that, I'm not up to speed clearly :)
[10:18:45] can you understand that most mariadb-specific features are buggy?
[10:18:59] and why we are disappointed with that?
[10:19:10] we are not going to do any change, but that would give context
[10:19:18] of why we complain on every meeting
[10:19:42] so how are we backing up the dbstores now?
[10:19:42] I think manuel has a different idea for this goal
[10:19:54] so I will express my focus
[10:20:03] ok :)
[10:20:04] and then he should express his
[10:20:07] agreed
[10:20:21] mine has very little to do with dbstores
[10:20:40] those backups are not well set up
[10:20:42] that's a good idea, but in order to help you guys out, I need to understand the problem space first :)
[10:20:47] (hence my questions)
[10:20:48] let's say not perfectly set up
[10:20:53] but it works
[10:20:54] it backs up
[10:21:04] and we use them almost every month
[10:21:08] how are we backing up things now?
[10:21:08] for small incidents
[10:21:13] mysqldump
[10:21:22] oh really
[10:21:26] with little space available, we have no other choice
[10:21:39] no binlogs or anything?
[10:21:56] I think you are asking how we can do recovery
[10:22:02] which is a different question
[10:22:08] I answered that on an email
[10:22:47] paravoid search: [Ops] Watchlist Feature Backup/Retention
[10:22:56] the bullet points
[10:23:37] every week, a full backup of all wiki database data is performed. It is generated on our database recovery server, and then compressed, encrypted and stored long-term offsite (secondary datacenter) on our backup storage server. The retention time for that backup is currently 30 days.
[10:23:44] Other than full database backups, there are 2 means of recovering old database data:
[10:23:51] * binary logs (database writes produced in the last month, which allows to do point-in-time-recovery starting from a full backup) and
[10:23:58] * delayed slaves (databases delayed 24 hours to prevent accidental data deletions).
[10:24:04] yeah
[10:24:09] so we keep binlogs somewhere too?
[10:24:15] not somewhere
[10:24:25] we have 160 copies of them on every server for 30 days
[10:24:40] ok, good :)
[10:25:50] alright
[10:25:53] I don't want to sound rude, but I will be direct- if I get questions like that, I myself fall back to "high level" mode
[10:26:05] because they are quite basic
[10:26:41] so, to summarize
[10:26:49] we have point in time recovery
[10:27:01] humour me and don't
[10:27:05] we have delayed slaves, which are the dbstore1001 and 2
[10:27:21] 2001
[10:27:37] and we have what I call "backups"
[10:27:46] which is anything related to bacula
[10:27:55] we have a huge space problem there
[10:28:54] so I asked mark for some king of hardware to provision increased disk space for database backups
[10:28:57] *kind
[10:29:11] ok
[10:29:24] so this is the starting point, with one exception
[10:29:52] when we talk mysql, normally one thinks of s* hosts or m* hosts
[10:30:01] basically, db* hosts
[10:30:15] s* or m* groups
[10:30:27] there are other machines running mysql
[10:30:32] for example pc* hosts
[10:30:44] contains parsercache data from mediawiki, no need for backups
[10:30:50] following me?
[10:30:59] I am aware, yes
[10:31:07] and ok no backups, right?
[10:31:19] I'm not entirely sure if we could survive without pc* at all, but ok
[10:31:32] with a blank slate for pc* I mean
[10:31:32] yes, but the data is disposable
[10:31:38] we actually tested that
[10:31:48] and recently deleted one pc host
[10:31:54] yeah, as long as no one writes a salt/cumin command that wipes them all at once :)
[10:32:03] we had a 25% increase on load on HHVM
[10:32:16] but notice it is only one cache layer
[10:32:21] memcache is on top
[10:32:22] I know, but still
[10:32:28] in any case
[10:32:33] I think it is more disposable than any case
[10:32:35] it's not urgent
[10:32:35] yes
[10:32:43] the other part, is es* server
[10:32:57] I'd say that ideally we'd cover that too, but we're far from ideal, we can tackle that (much) later
[10:32:59] they are not databases, because they contain key-value store
[10:33:10] lol, ok
[10:33:12] well, this is the whole point of this goal, for me
[10:33:32] and that is something that manuel may disagree
[10:33:38] or apparently, even you
[10:33:48] No, i don't disagree in that sense
[10:33:55] disagree with what?
[10:33:57] I think your focus is different
[10:34:07] with that we should make a priority to backup es* hosts
[10:34:11] exactly, maybe my opinion is different on what needs to be tackled first
[10:34:28] So, my opinion is:
[10:34:29] current backups work, they are bad, but they work, even if it takes 2 days to recover
[10:34:37] we should fix those and make them better
[10:34:47] but only once we have full database coverage
[10:34:50] and I let you speak
[10:34:52] haha
[10:34:53] how are we backing up external store right now?
[10:34:58] paravoid: we are not
[10:35:03] /dev/null
[10:35:03] oh
[10:35:04] oops :)
[10:35:12] my turn! :p
[10:35:16] nah, basically
[10:35:17] that is my argument :-)
[10:35:23] yes, please go marostegui :)
[10:35:54] I totally agree that es* need to be backed up, but right now if we have a disaster, we would recover from dbstore servers (as it is less likely that es servers will be dropped entirely)
[10:36:06] and dbstores are a pain and I don't even think we trust their data
[10:36:35] that is why I think we would need to tackle dbstore servers first, because they are unreliable, easily corruptible and a pain to operate (or rebuild if needed)
[10:37:12] So
[10:38:02] To me it all comes to decide whether we think it is more likely: if we have an outage and need the dbstores to recover data or if we have an outage with es servers and lose their data
[10:38:14] Both are important, no doubt, we just need to prioritize them
[10:38:24] I am fine either way, as both need to be addressed at some point
[10:38:29] right
[10:38:31] so my question is
[10:38:37] finished?
[10:38:40] I just disagree a bit with jynus because I have been working a lot with dbstores lately, and they are a pain
[10:38:41] do we all agree on what the long-term goal is?
[10:38:49]
[10:38:51] where do we want to be, say, a year from now?
[10:39:03] yes
[10:39:08] I think so yes
[10:39:08] and not just on the broader objective, but how it will look like technically as well
[10:39:23] we just showed you the 1-year plan
[10:39:25] for backups
[10:39:31] you did?
[10:39:32] where? :)
[10:39:42] it is the 2 goals that used to be there
[10:39:52] dbstores+es backups+new storage :)
[10:39:54] fix dbstores, fix backups, fix bacula
[10:40:03] fix bacula?
[10:40:43] bacula has so many issues, that alex told me he may not be able to attend to our needs regarding es
[10:41:05] so we either need to work around that or set up something in addition
[10:41:05] what kind of issues?
[10:41:14] es store is purely incremental
[10:41:27] no sense to backup the same thing over and over
[10:41:46] there is no such system in place to send incremental bits to it
[10:42:03] bacula is file-based
[10:42:31] we want "object storage" model, and send only the changes since the last backup
[10:42:59] we cannot store on the filesystem a complete copy
[10:43:17] and right now things older than 30 days are deleted on bacula
[10:43:33] and we are talking here 20TB of data
[10:43:47] maybe less, compressed
[10:44:07] that is all
[10:44:28] because this needs help from "backup team"
[10:44:44] the first part is to "create the ability to backup"
[10:44:58] that funny-sounding sentence
[10:45:26] You are the backup team :)
[10:45:28] aka a script to retrieve and package portions of the database
[10:45:33] I know
[10:46:36] then we setup the hardware, then the backend/storage of the backup
[10:46:42] That sounds like custom complexity
[10:47:21] so are we going to backup 20TB every week?
[10:47:30] we do not have the means
[10:47:41] Perhaps we should
[10:47:52] so
[10:48:08] can we please use your pad to write the long-term changes needed for backups?
[10:48:09] We also don't have the means to develop custom solutions
[10:48:16] just collect them, copy/paste them in one place
[10:48:25] sure
[10:48:30] Indeed
[10:48:32] then we can prioritize them, and split the work throughout the year
[10:48:43] mark, I am not suggesting to bring down bacula
[10:49:04] but it's currently hard to follow it all, especially when apparently there are small disagreements on every point :)
[10:49:05] I am suggesting to create a script, and then use bacula with a different policy
[10:50:11] Yeah, make sure the full plan is described in the etherpad
[10:50:37] but that was exactly what we were doing
[10:50:55] with (software) and then (storage)
[10:51:07] yeah
[10:51:13] As quarterly goals
[10:51:19] put the full plan ahead, knowing not all would be done
[10:51:30] I think jynus and myself only disagree on what needs to be tackled first
[10:51:40] yeah
[10:51:45] within the software part
[10:51:49] actually,
[10:52:23] my disagreement is that what can be done in 3 months, given the constraints, is not very useful
[10:52:45] and I was told quarters were no longer that important
[10:53:18] let's collect it all in sufficient detail, we can split it up next
[10:53:20] I think the software part is a 4-5 month task
[10:53:29] alex wasn't sure what half of these things meant either when i asked him yesterday
[10:53:30] and the storage another 3, maybe more
[10:53:40] it's hard to make estimates when we're lacking common understanding of the work needed
[10:54:07] we already spoke with him
[10:54:25] and said his role on the software part was just to advise on the purchase
[10:54:36] in preparation for the second part
[10:54:50] at most, ask questions about ideas
[10:54:59] I'm not sure if the software/storage split makes sense either
[10:55:14] we tried to avoid backup terminology
[10:55:15] let's just collect it all in one place, then find ways to slice it up in order to make sense
[10:55:43] ok, but it was done like that because if not, we need hard dependency on backup team
[10:55:53] and that definitely cannot be done this Q
[10:56:09] stop talking about backup teams please
[10:56:14] you know as well as I do that there isn't one
[10:56:15] I giggle every time you say backup team
[10:56:17] yeah that :)
[10:56:27] I do not want to mention 1 person
[10:56:39] please mention exactly what you mean
[10:56:44] if that's alex, say alex :)
[10:57:39] i do think it makes sense to split software/storage, as the storage is "useless" until we have adapted everything to use it
[10:57:55] plus, there is no hardware yet
[10:58:02] ok, can we do what I asked now? :)
[10:58:19] finish === Long term backups requirements ===
[10:58:39] in the DBA-sync pad
[10:58:41] paravoid: sure, mostly is copying it from what it is in the goals etherpad :)
[10:59:32] great
[11:06:19] I think the confusion is, what we are really going to do is
[11:06:36] "program a bunch of scripts and test the hell out of them"
[11:06:46] and nothing to do with bacula for a while
[11:07:04] the confusion is that this isn't really written down in detail yet
[11:07:06] after that, we will say- ok, we have something, let's buy a machine and put them into use
[11:07:13] other than a few bullets in goal language form
[11:07:20] oh
[11:07:22] actually
[11:07:23] it is
[11:07:44] https://phabricator.wikimedia.org/T138562
[11:07:50] let's link to that then
[11:07:59] we have been talking about this for years/months
[11:08:16] we==DBAs
[11:08:31] there is a meta ticket, and very concrete actionables
[11:08:35] as subtickets
[11:08:54] you can't possibly think that someone other than you two can understand what that task description is about :)
[11:09:19] ok, so you ask, which is what you are doing :-)
[11:09:25] haha
[11:09:34] yeah those tickets are just very vague bullets :)
[11:10:24] i think you two should work together on writing a detailed plan in a single document
[11:11:08] if you have been talking about it for a year and mostly agree, that should not cost a lot of time, but have the benefit that everyone else will understand
[11:11:16] (and then later large parts of it can serve as documentation)
[11:12:41] comments?
[11:12:45] oh, I totally agree
[11:13:09] the existing tickets are mostly pointers to stuff in your heads ;)
[11:13:14] that is why partially I wasn't sure about the goal
[11:13:24] because we were not yet ready
[11:13:38] e.g. the priority of each thing
[11:13:52] Yeah, that is the thing that is holding us back I think, what needs to be addressed first
[11:13:52] we can set the priority after we all know what the problems and potential solutions are
[11:14:01] so let's forget about that for now
[11:14:10] please work together on writing the full picture
[11:14:26] explain rationale, explain how you want to implement things, what is firm and what is being considered
[11:14:30] then we can discuss this again next week
[11:15:14] yeah, sounds good to me
[11:15:18] for now I think we'll establish that we're going to work on improving db backups next quarter :)
[11:15:25] Ok, I will start a new clean and fresh etherpad, with the current situation, and ideas to improve, I will send it to jynus to complete/modify/remove and then once that is clear between jaime and myself, I will send it to you guys
[11:15:29] and what exactly that will be, maybe next week we'll know better
[11:15:33] does that make sense?
[11:15:45] google doc may make it easier for others to comment
[11:15:49] IIRC, the only documentation we had about this is some archeology I did here: https://wikitech.wikimedia.org/wiki/MariaDB/Backups
[11:15:58]