[02:13:47] 10DBA: Is WikimediaMaintenance/ourUsers.php still used - https://phabricator.wikimedia.org/T204184 (10Reedy)
[02:15:55] 10DBA, 10Technical-Debt: Is WikimediaMaintenance/ourUsers.php still used - https://phabricator.wikimedia.org/T204184 (10Reedy) p:05Triage>03Low
[02:20:35] 10DBA, 10MediaWiki-extensions-WikimediaMaintenance, 10Technical-Debt: Is WikimediaMaintenance/ourUsers.php still used - https://phabricator.wikimedia.org/T204184 (10Legoktm)
[05:10:11] 10DBA, 10Datacenter-Switchover-2018: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10Marostegui) >>! In T204127#4578456, @jcrespo wrote: >>>! In T204127#4578185, @Marostegui wrote: >> We could probably reclone one of these hosts (for example db2054) from an eqiad slave, and then move i...
[05:23:18] 10DBA, 10MediaWiki-extensions-WikimediaMaintenance, 10Technical-Debt: Is WikimediaMaintenance/ourUsers.php still used - https://phabricator.wikimedia.org/T204184 (10Marostegui) We, DBAs, don't use it.
[05:31:49] 10DBA, 10Datacenter-Switchover-2018: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10Marostegui)
[05:33:49] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: Execute the schema change for Partial Blocks - https://phabricator.wikimedia.org/T204006 (10Marostegui) Thanks
[06:19:32] jynus: I would like to start disconnecting replication eqiad -> codfw
[06:25:19] isn't it scheduled for 1 hour?
[06:25:26] *in
[06:25:38] Don't know, I picked a random hour
[06:25:52] I wanted to do some sanity checks, unless that is a blocker for you
[06:26:01] (first)
[06:26:12] can you give me a few minutes? 0:-)
[06:26:18] nope, not a blocker :)
[06:26:28] no rush, take your time :)
[06:32:33] so we have some, but not worrying, contention on the masters (s1, s5) due to user-related locks when editing
[06:32:58] we already had some of that before
[06:33:10] but people are editing, saving preferences
[06:33:26] and because of locks they fail, as one can only do 1 at a time
[06:33:40] But those were also present in eqiad (maybe a bit less) - not saying it is good, I am saying it is not a lot worse with the failover
[06:34:06] not an issue, maybe it can be slightly more frequent because reads could be slower too
[06:35:12] the first replica with bad query perf
[06:35:15] is db2062
[06:35:19] mostly Title::getFirstRevision
[06:36:43] https://phabricator.wikimedia.org/P7541
[06:36:55] I am checking if there is some underlying reason
[06:37:03] different schema maybe?
[06:37:51] it is strange
[06:38:00] because that query takes only 0.24 seconds for me
[06:38:07] not hugely fast, but not slow
[06:38:44] that's one of the old hosts, no?
[06:39:03] Maybe we are just too used to SSDs and 512GB RAM?
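The actual statement behind the paste above is in P7541 and is not reproduced here; the following is only a hedged sketch, with an approximated query shape and a made-up rev_page value, of how the Title::getFirstRevision read and the revision indexes could be compared between db2062 and another s1 replica (db2055 is an arbitrary choice).

```
# Hedged sketch only: the real query is in P7541 (not shown); the statement
# below approximates the Title::getFirstRevision read with a made-up rev_page
# value, and db2055 is just an arbitrary second s1 replica for comparison.
for host in db2062 db2055; do
    echo "== ${host} =="
    # Compare the execution plan (rows examined, filesort) on both hosts.
    mysql.py -h "${host}" -e "EXPLAIN SELECT rev_id, rev_timestamp FROM enwiki.revision WHERE rev_page = 12345 ORDER BY rev_timestamp ASC LIMIT 1\G"
    # And check the "different schema maybe?" hypothesis by diffing the indexes.
    mysql.py -h "${host}" -e "SHOW INDEX FROM enwiki.revision"
done
```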
[06:39:05] And we are biased
[06:39:36] it still scans 22K rows and does a filesort
[06:39:44] so I can see how under stress it can be slow
[06:41:24] Yeah, check the disk latency of that host vs a powerful one https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&from=now-24h&to=now&var-dc=codfw%20prometheus%2Fops&var-server=db2062&var-port=9104 vs https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2088&var-port=13311
[06:42:30] so now that there are almost no connection errors
[06:42:40] I would like to focus on performance optimization
[06:42:49] things may not be so bad that connections fail
[06:43:00] but we need to balance resources
[06:43:06] mostly load vs CPU
[06:43:20] and number of connections to optimize fast response
[06:43:55] Yes, but I wouldn't change weights much now, we need to see how things go now that the servers are not cold anymore
[06:45:34] https://grafana.wikimedia.org/dashboard/db/prometheus-cluster-breakdown?orgId=1&var-datasource=codfw%20prometheus%2Fops&var-cluster=mysql&var-instance=All&from=1536810324634&to=1536821124634
[06:46:32] I am looking to see if there is an outlier in load
[06:47:18] Good dashboard
[06:47:21] for example, db2093 seems lightly loaded, but it is probably a misc host :-)
[06:47:53] db2093 is the tendril host if I remember correctly
[06:49:33] ok, technically misc
[06:49:36] :-D
[06:49:43] xdd
[06:49:53] db2036 seems at 60% disk capacity
[06:50:14] also 43 and 45
[06:50:34] 50 51
[06:50:52] 55 57 58
[06:51:30] sorry
[06:51:34] those are old hosts I guess with 3.3 instead of 4?
[06:51:35] I am looking at disk usage
[06:51:42] I wanted to look at utilization
[06:51:44] my bad
[06:52:07] 41 is high
[06:52:25] 47 50
[06:52:35] 55
[06:52:49] 57
[06:53:00] (I think you are fixing that one?)
[06:53:13] 63
[06:53:16] which one?
[06:53:40] 57?
[06:53:44] or is it another one?
[06:53:44] Nope, 54
[06:53:47] ah
[06:53:58] meanwhile the SSDs are almost idle
[06:54:11] and they have quite a bunch of load
[06:54:21] in the end the rc slaves are getting a lot more load than we predicted
[06:54:56] looking at the 24 hour trends
[06:55:01] I would definitely tune down 41
[06:55:38] it has weight 100 in s2
[06:56:00] so maybe give the rc hosts 50 more each so they reach 200 (db2088 and db2091)
[06:56:16] as well as 55 62 and 63
[07:04:11] https://gerrit.wikimedia.org/r/460207
[07:04:21] I am not 100% sure about that ^
[07:04:27] checking
[07:04:47] How are db2088 and db2091 doing?
[07:04:52] maybe those can eat a bit more?
[07:05:23] let me see
[07:06:12] they are ok
[07:06:18] there are others with less load:
[07:06:25] maybe they can get 50 more each?
[07:06:26] 89
[07:07:06] 76
[07:07:43] 80
[07:12:12] I made further adjustments
[07:12:19] let me see
[07:12:40] s5 has issues?
[07:12:42] interesting
[07:12:44] no
[07:12:48] on the other side
[07:12:53] it is so comfortable
[07:13:09] that I optimized it because it needs no main load on rcs
[07:13:24] all hosts there are SSDs
[07:13:36] so some hosts are idle
[07:13:42] on s5 you gave more load to rcs
[07:13:49] did I?
[07:13:53] yeah
[07:13:54] sorry
[07:13:58] I meant s8
[07:13:59] and you removed it from s8 (which makes sense)
[07:14:29] on s5 indeed the rcs are mostly idle
[07:14:40] so I wanted to give the SSDs more load
[07:15:19] small change based on which hosts I saw could take more load
[07:15:27] but s5 wasn't an issue
[07:15:28] but is s5 in trouble?
[07:15:32] mostly s2
[07:17:04] and a bit s6
[07:17:27] +1ed, let's see the effect
[07:18:00] I made an extra change
[07:18:16] s6 rcs more load
[07:18:35] looks good
[07:18:55] I think we are missing an SSD host on s2
[07:19:13] yeah, one at least
[07:19:30] and we need enwiki on all SSDs
[07:19:37] we could maybe take one from s8 to s2
[07:19:43] but let's not make many changes at the same time
[07:19:57] push your change and then we can evaluate
[07:20:09] I don't really think we need it NOW but purchase it this upcoming quarter
[07:20:17] oh yeah, that, definitely
[07:20:23] add at least 1 SSD to s2 and convert s1 to full SSDs
[07:20:29] s8 was also not an issue
[07:20:36] because it was all big hosts
[07:20:47] despite the high load of that section
[07:31:26] good morning
[07:38:24] hi
[07:39:13] what can I do for the databases today - regarding yesterday's switchover? Or shall I get back to the marvellous world of Debian packaging? :)
[07:41:08] So, we'd need to see how we organize ourselves, I would like to start getting some of the schema changes through already
[07:41:20] I will take care of the s7 hosts
[07:41:27] Then we should disconnect eqiad -> codfw
[07:41:34] And maybe we can also start upgrading masters
[07:43:27] T-21d for maintenance window ending
[07:43:34] xdddddd
[07:45:23] banyek: you could take care of https://phabricator.wikimedia.org/T203565 once you get taught to do so, as that one isn't in a rush (unlike s7, kinda)
[07:47:23] so I appreciate a lot you offering to do the cloning - but volans has a point
[07:47:41] you are the master of schema changes - and while you shouldn't do those on your own
[07:47:58] maybe I can suggest working on a plan to start those ASAP
[07:48:04] sounds good
[07:48:08] db2054 is halfway through though
[07:48:10] :)
[07:48:17] not for you to do those on your own, eh!
[07:48:32] I will finish db2054 and leave db2068 for you or banyek? :)
[07:48:33] just to set up a schedule for the ones you want to do or ones that can be done at the same time
[07:48:43] yeah, that is the other part
[07:48:52] I would like to show one of those to banyek
[07:49:01] and then he can do the one you proposed on his own
[07:49:11] sounds good! Maybe take db2068 once db2054 is finished?
[07:49:22] I will ping you once it is done
[07:49:32] being the most valued DBA, I would like you to work on the most important tasks :-)
[07:49:37] hahaha
[07:49:49] the most valued DBA is volans
[07:50:05] you can leave the cloning to us 2 peasants :-)
[07:50:09] maybe you can upgrade some masters in eqiad? and I can run schema changes on the ones you don't take today?
[07:50:13] let me find the docs about recloning
[07:50:21] marostegui: propose something
[07:50:26] and I will do it
[07:50:28] * banyek digs into the wikis
[07:50:29] haha
[07:50:34] rotfl, wasn't I the most veteran just yesterday? :-P
[07:50:41] ok, let me write something quickly on our etherpad
[07:50:49] volans: veteran != valued
[07:51:07] (I hope you understand I am joking here!?)
[07:51:17] we love you volans
[07:51:18] (yes, totally ;) )
[07:51:24] that is why we meme about you
[07:54:40] jynus banyek (and I know volans will do too) check line 53 of our etherpad
[07:54:44] What about that menu for today?
[07:54:56] marostegui: which etherpad? :D
[07:55:00] hahaha
[07:56:44] sounds like a plan
[07:56:54] looks ok
[07:57:10] although I am not sure we will finish it all in 1 day
[07:57:19] yeah, that can extend through tomorrow of course
[07:57:22] and it probably will
[07:58:56] banyek: while there is https://wikitech.wikimedia.org/wiki/Setting_up_a_MySQL_replica
[07:59:10] we have packaged most things nicely in a script
[07:59:14] so, we need to run stop slave; reset slave all; on codfw masters s1-s8, x1, es2, es3
[07:59:17] shall I do that?
[07:59:19] well, not nicely
[07:59:58] banyek: I propose you listen to me and then redo that and link it from the MariaDB page as an assignment
[08:00:10] marostegui: please do
[08:00:15] \o/
[08:00:29] jynus: ok
[08:00:42] you will need to do the reset slave all quickly after the stop, or alerts may go off
[08:00:52] or we can downtime the replica lag checks
[08:00:55] it will be done at the same time
[08:05:10] I have added the host list to the etherpad if you want to double-check it
[08:06:53] 10DBA, 10Operations, 10Epic, 10Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (10Marostegui) Replication has been disconnected from eqiad to codfw: ``` root@neodymium:/home/marostegui# for i in db2048 db2035 db2043 db2051 db2052 db2039 db2040 db2045 d...
[08:07:54] while read host; do mysql.py -h $host -e "show slave status\G"; done < masters.txt -- looks good to me
[08:09:29] marostegui: so green light to go with maintenance?
[08:09:37] jynus: yep!
[08:09:40] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review, 10Schema-change: Perform schema change to add externallinks.el_index_60 to all wikis - https://phabricator.wikimedia.org/T153182 (10Marostegui)
[08:09:42] marostegui: thanks
[08:10:05] banyek: one sec and we do a quick video meetup
[08:11:57] ok
[08:16:41] marostegui: did you do a full reimage of the other codfw host or just a reprovision?
[08:17:57] 10Blocked-on-schema-change, 10DBA, 10Wikidata, 10Patch-For-Review, 10Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (10Marostegui) 05stalled>03Open p:05Low>03Normal
[08:18:05] jynus: no, I just did an apt full-upgrade to get the new kernel and mariadb
[08:18:08] and a reboot
[08:18:12] ok
[08:18:16] thanks
[08:34:56] 10DBA, 10Core-Platform-Team, 10Patch-For-Review, 10Schema-change: Fix WMF schemas to not break when comment store goes WRITE_NEW - https://phabricator.wikimedia.org/T187089 (10Marostegui)
[08:36:32] 10DBA, 10Schema-change, 10Tracking: [DO NOT USE] Schema changes for Wikimedia wikis (tracking) [superseded by #Blocked-on-schema-change] - https://phabricator.wikimedia.org/T51188 (10Marostegui)
[08:36:35] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui) 05stalled>03Open
[08:36:58] 10DBA, 10Core-Platform-Team, 10Patch-For-Review, 10Schema-change: Fix WMF schemas to not break when comment store goes WRITE_NEW - https://phabricator.wikimedia.org/T187089 (10Marostegui) 05stalled>03Open
[08:46:21] 10DBA, 10Core-Platform-Team, 10Patch-For-Review, 10Schema-change: Fix WMF schemas to not break when comment store goes WRITE_NEW - https://phabricator.wikimedia.org/T187089 (10Marostegui)
[09:04:39] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui)
[09:05:10] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui)
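A minimal sketch of the eqiad -> codfw replication disconnect described above (stop slave; reset slave all on the codfw masters), reusing the masters.txt host list from the verification loop at [08:07:54]. It assumes mysql.py passes a multi-statement -e string through to the mysql client; the actual run may have been done per host or with the checks downtimed first.

```
# Hedged sketch: issue STOP SLAVE and RESET SLAVE ALL back to back on every
# codfw master listed in masters.txt, so the replication-lag alerts do not
# fire in between (alternatively, downtime the lag checks beforehand).
while read host; do
    mysql.py -h "${host}" -e "STOP SLAVE; RESET SLAVE ALL;"
done < masters.txt

# Same verification as pasted in the channel: an empty SHOW SLAVE STATUS on
# every codfw master means replication from eqiad is fully disconnected.
while read host; do
    echo "== ${host} =="
    mysql.py -h "${host}" -e "SHOW SLAVE STATUS\G"
done < masters.txt
```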
[09:18:34] jynus banyek db2054 has been recloned and it is catching up, you can start with db2068 whenever you wish
[09:21:50] thanks
[09:21:54] about to finish explanations
[09:21:58] and will get to that
[09:22:05] great!
[09:25:29] would there be an easy way to track which db hosts are being cloned from which sources?
[09:25:40] preferably without any additional manual steps to do that? :)
[09:27:14] I guess we could use zarcillo and get transfer.py to write to a table to say there is a transfer going from $source to $destination
[09:27:21] zarcillo DB I mean
[09:27:29] what is zarcillo?
[09:28:01] It is a new DB we have placed inside the tendril hosts, which is being used for backups and in the future it might replace or complement tendril as a source of truth
[09:28:15] yes, that would work
[09:28:23] and probably also add it to Server Admin Log when it's happening
[09:28:44] indeed
[09:28:48] good idea
[09:29:07] Not every transfer needs to be a cloning, but yeah
[09:29:24] well
[09:29:34] ideally as we automate things, cloning would be one automated step, right
[09:29:48] yeah, part of the big provisioning project
[09:30:08] you could make that a separate operation
[09:30:18] the fact that it includes a transfer as part of its implementation is secondary
[09:36:31] anything against me merging https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/459994/ ??
[09:36:46] (this is the removal of the require in mariadb::service)
[09:37:17] elukey: good from my side, I think that should do the trick as I mentioned a couple of days ago
[09:41:25] any objection to start enabling GTID on eqiad masters?
[09:41:43] +1
[09:41:58] I forgot about that
[09:42:00] ok, I will get that done
[09:42:17] I can also do it, I think I have scheduled its reimage
[09:42:22] 10Blocked-on-schema-change, 10DBA, 10Wikidata, 10Patch-For-Review, 10Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (10Marostegui)
[09:42:38] liars, you told me you'd disable GTID and now you're enabling it! :-P
[09:42:52] * volans joking ofc ;)
[09:42:52] volans: GTID has to be enabled
[09:42:54] but they are passive masters now!
[09:42:54] :)
[09:42:59] because of InnoDB transactionality
[09:43:08] it is just horrible to use as replication control
[09:43:14] 10DBA, 10Schema-change, 10Tracking: [DO NOT USE] Schema changes for Wikimedia wikis (tracking) [superseded by #Blocked-on-schema-change] - https://phabricator.wikimedia.org/T51188 (10Marostegui)
[09:43:17] 10Blocked-on-schema-change, 10DBA, 10Wikidata, 10Patch-For-Review, 10Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (10Marostegui) 05Open>03Resolved All done
[09:43:43] mark: our transfer.py script is the first step towards automatic provisioning
[09:44:03] eventually hosts would discover themselves and be part of the reimaging process
[09:44:25] but we need the backup servers set up first ;-)
[09:44:37] yes
[09:44:42] but there are small things we can do before that, like this
[09:45:03] what do you mean by "this"?
[09:45:25] logging the source of a transfer somewhere
[09:45:31] so we have that in a single, easy-to-look-up place
[09:45:40] sure, we can add that to transfer.py
[09:45:48] even if it's not perfect, and not the way we end up doing things, it could still help and it's probably not hard to add
[09:46:52] so it gets logged to the tendril-replacement database, like we do with backup generation
[09:47:11] on backups we log where they are and where they were taken from
[09:47:32] we could mimic the same thing - for now we are mostly using !log to track the clonings
[09:48:02] and most should come from the "provisioning/db backup generation" server from now on
[09:51:10] 10DBA, 10Operations, 10Epic, 10Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (10Marostegui) GTID enabled on all eqiad masters but db1071 (s8) and db1068 (s4) as they are currently running a big alter.
[09:55:39] 10DBA, 10Operations, 10Epic, 10Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (10Marostegui) db1068 enabled
[09:59:02] db1071 will be showing up in tendril with long "show global status" or "show slave status" queries, as they get stuck: there is a long-running query and I tried to stop the slave and it got stuck there, so I killed it, and the thread is still cleaning up
[09:59:15] Not worrying, just saying in case you see them in tendril's activity
[10:05:26] 10DBA, 10Datacenter-Switchover-2018, 10Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10Marostegui)
[10:05:57] 10DBA, 10Datacenter-Switchover-2018, 10Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10Marostegui) db2054 has been recloned, it is catching up. Once it has sync'ed with its master, I will remove its downtime and repool it into s7
[10:17:29] Was the query killer deployed? Asking because I am seeing a query that has been running for 9 hours on db2055
[10:18:02] Ah, it is vslow
[10:21:50] no it wasn't
[10:21:55] it still just runs from a screen
[10:22:29] (if we are talking about the same one)
[10:22:48] No, you are talking about the labs one
[10:22:53] (we don't have labs in codfw)
[10:22:58] I am talking about the core query killer
[10:24:02] do a show events on the "ops" database
[10:25:26] ah, soorry
[10:25:30] *sorry
[10:27:20] Don't be sorry! :)
[10:27:31] You have to ask questions!
[10:29:31] Going to deploy this: https://gerrit.wikimedia.org/r/460304
[10:31:33] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui)
[11:45:17] marostegui: heads up, I will kill 3 2017 screens on db1075
[12:02:19] 10DBA, 10Analytics, 10Patch-For-Review: mariadb::service and managed services don't play well on Stretch - https://phabricator.wikimedia.org/T204074 (10elukey) First issue solved, now I can see the following one: ``` elukey@hadoop-coordinator-2:~$ sudo puppet agent -tv Info: Using configured environment 'pr...
[12:25:36] if you guys have time for a puppet consult, --^ looks strange to me
[12:25:44] but I might be missing something trivial
[12:34:11] 10DBA, 10MediaWiki-extensions-WikimediaMaintenance, 10Technical-Debt: Is WikimediaMaintenance/ourUsers.php still used - https://phabricator.wikimedia.org/T204184 (10Reedy) 05Open>03Resolved a:03Reedy
[12:47:06] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui)
[12:47:57] 10Blocked-on-schema-change, 10Wikibase-Quality, 10Wikibase-Quality-Constraints, 10Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (10Marostegui) a:03Marostegui
[12:49:19] 10DBA, 10Operations, 10Epic, 10Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (10Marostegui) db1071 GTID enabled
[12:52:28] yeah no idea where my puppet code calls mediawiki::state
[13:01:43] 10Blocked-on-schema-change, 10Wikibase-Quality, 10Wikibase-Quality-Constraints, 10Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (10Marostegui)
[13:02:21] 10Blocked-on-schema-change, 10Wikibase-Quality, 10Wikibase-Quality-Constraints, 10Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (10Marostegui) The table has been imported in eqiad hosts. ``` root@neodymium:/...
[13:03:28] jynus: I think we should decrease the weight for db2055 in s1
[13:03:43] It is vslow and has 50 as main traffic weight, and has had a query running for 11h already
[13:03:59] oh
[13:04:02] interesting
[13:04:03] We should probably give it 0 weight
[13:04:33] not disagreeing, the problem is s1 was quite busy yesterday
[13:04:39] yeah :(
[13:05:02] send a patch if you want and I can think with that in front of me and see alternatives
[13:05:07] but I am not sure if that long-running query can affect performance and hence the small % of main traffic it gets
[13:05:24] well, mostly it is the history
[13:05:27] check that on the host
[13:05:38] one day is normally ok
[13:05:47] if it is once a month or so
[13:05:54] Yeah, I don't really want to touch weights in s1
[13:05:59] So I am not totally convinced
[13:06:05] let's leave it for now
[13:06:07] if you have a proposal, send it
[13:06:13] That is the problem, I don't! :)
[13:06:14] I see the issue
[13:06:22] but I don't have a good alternative
[13:06:30] Let's leave it for now, we are in a good state
[13:06:30] we can depool some hosts from elsewhere
[13:07:11] I would monitor the performance
[13:07:24] yeah, I will do that
[13:07:36] icinga should be able to show connection and query latency
[13:07:47] or it will, once I deploy the read-only check
[13:07:52] that may help?
[13:08:22] would you be ok with me working on that?
[13:08:28] it should be non-paging
[13:08:35] yeah!
[13:08:36] so worst case scenario, only IRC spam
[13:08:51] s3 maintenance takes a while due to mysql_upgrade :-)
[13:08:55] so I have time
[13:09:00] hahaha
[13:09:03] take a nap even XD
[13:09:18] https://grafana.wikimedia.org/dashboard/db/mysql?panelId=11&fullscreen&orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2055&var-port=9104&from=now-24h&to=now
[13:09:32] yeah, that's it mostly
[13:09:42] the actual impact depends on the amount of writes
[13:09:52] e.g. very large for labsdb/toolsd
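A hedged sketch of the kind of inspection "check that on the host" refers to: list long-running queries on the vslow host and look at the InnoDB history list length (the "history" cost of keeping long transactions open). The one-hour threshold and column list are arbitrary, and mysql.py is assumed to pass -e straight through to the mysql client.

```
# Hedged sketch, arbitrary threshold: show queries that have been running for
# more than an hour on db2055, which vslow/dump traffic is expected to produce
# from time to time.
mysql.py -h db2055 -e "SELECT id, user, db, time, LEFT(info, 80) AS query FROM information_schema.processlist WHERE command = 'Query' AND time > 3600 ORDER BY time DESC\G"

# The purge backlog those long transactions create ("mostly it is the history"):
mysql.py -h db2055 -e "SHOW ENGINE INNODB STATUS\G" | grep -i "history list length"
```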
[13:10:49] root@sarin:~$ wmfmariadbpy/wmfmariadbpy/check_health.py -h db2055
[13:10:57] "query_latency": 0.0029032230377197266
[13:11:01] The host isn't doing badly compared to a host with the same weight as others
[13:11:03] "connection_latency": 0.06639242172241211
[13:11:15] Yeah, it is not bad, I am just being over-careful :)
[13:11:16] do we have a similar host, for comparison?
[13:11:20] db2062
[13:11:23] which is api too
[13:11:29] 50 of main weight + api
[13:11:34] "connection_latency": 0.06807661056518555
[13:11:39] "query_latency": 0.0021581649780273438
[13:11:45] so it is the same
[13:11:47] of course, it needs some stats
[13:11:52] *statistics
[13:12:07] yeah, but from a quick glance at grafana for query response, it is more or less the same
[13:12:10] so we are good
[13:12:10] and the queries affected may not be representative of reality
[13:12:15] I saw issues mostly on dbstores
[13:12:15] let's keep an eye on it from time to time
[13:12:28] when alter tables and backups made it go up all the time
[13:13:05] can I take db1066 (s2 master)?
[13:13:12] yes for me
[13:13:20] I am only touching db1075
[13:13:22] I will add it to the list in the etherpad
[13:13:23] cool
[13:13:23] dbstore2001
[13:13:26] and db2068
[13:13:37] let's sync on masters mostly
[13:13:55] Yeah, I am touching db1068, db1071 and now db1066
[13:14:36] will do db1062 next
[13:14:52] probably in a few minutes (or hours, depending on s3 :-))
[13:14:53] sounds good
[13:15:03] I won't touch any other master apart from db1068, db1071 and db1066 today
[13:15:15] neither will I
[13:15:26] aside from the 2 I mentioned
[13:15:52] cool
[13:23:40] 10DBA, 10Operations, 10Research, 10Services (designing): Storage of data for recommendation API - https://phabricator.wikimedia.org/T203039 (10Joe) >>! In T203039#4574768, @Pchelolo wrote: >> don't have libraries and abstractions for accessing MySQL from our nodejs services. Is that correct? > > That's th...
[13:48:23] 10DBA, 10Analytics, 10Patch-For-Review: mariadb::service and managed services don't play well on Stretch - https://phabricator.wikimedia.org/T204074 (10elukey) 05Open>03Resolved a:03elukey Of course if I don't grep the latest operations/puppet code I will not find what I am looking for: ``` git grep m...
[14:00:32] 10Blocked-on-schema-change, 10DBA, 10Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (10Marostegui)
[14:41:52] jynus: you done with db1075?
[14:42:01] actually, nevermind
[14:42:06] I won't touch it today
[14:46:58] almost, latest checks, lag recovering
[15:14:20] I may have to restart db1075
[15:14:27] the installer may have installed the wrong kernel
[15:14:41] +1
[15:15:01] looks like moritz.m talked to you already! :-)
[15:15:03] because it is getting late, I may have to do the other one I mentioned tomorrow
[15:15:12] no worries from my side
[16:27:16] 10DBA, 10Datacenter-Switchover-2018, 10Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10jcrespo)
[16:27:26] 10DBA, 10Analytics, 10Growth-Team, 10Notifications: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623 (10Milimetric) p:05Triage>03High
[16:27:39] 10DBA, 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Notifications: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623 (10Milimetric) a:03elukey
[16:27:48] 10DBA, 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Notifications: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623 (10Milimetric) p:05High>03Normal
[16:28:07] 10DBA, 10Datacenter-Switchover-2018, 10Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (10jcrespo) a:03jcrespo db2068 has been recloned, but needs time to catch up replication and then be slowly repooled with the above patch.
[16:36:53] 10Blocked-on-schema-change, 10DBA, 10Data-Services, 10Dumps-Generation, 10Patch-For-Review: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569 (10Bstorm)
[21:10:49] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn) another place where the mysql module is used is "m" in "role(simplelamp)" `modules/role/manifests/simplelamp.pp: class { '...
[21:11:34] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10zhuyifei1999)
[21:14:52] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn) quarry switched now to mariadb, subtask resolved. updating the list in the ticket description here. thanks to @zhuyifei1999 fo...
[21:15:13] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn)
[21:16:58] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn)
[21:17:18] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn)
[21:26:54] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn)
[21:28:03] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn)
[21:30:40] 10DBA, 10Operations, 10Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (10Dzahn) i think for the ones in the statistics and wikimetrics we should ask Analytics how they feel about converting that to mariadb....
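The search mentioned in the T204074 and T162070 comments above is truncated ("git grep m..."); for illustration only, a hedged sketch of the kind of grep that would surface remaining users of the old mysql puppet module in an operations/puppet checkout, so they can be converted to the mariadb module. The exact patterns and paths are assumptions, not the command actually run.

```
# Hedged sketch (the real command in the task comment is truncated above):
# look for remaining references to the old mysql puppet module before
# converting them to the mariadb module.
cd operations/puppet
git grep -n "mysql::" -- modules manifests hieradata
git grep -n "class { 'mysql" -- modules manifests hieradata
```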