[05:59:27] DBA, Operations, Patch-For-Review, codfw-rollout: codfw API slaves overloaded during the 2017-04-19 codfw switch - https://phabricator.wikimedia.org/T163351#3196540 (Marostegui) db2062 and db2069 are in the same state as yesterday (but with a lot less IOPS than after the initial peak) so that is...
[06:18:11] DBA, Operations, codfw-rollout: Pool db2071? - https://phabricator.wikimedia.org/T163413#3196542 (Marostegui)
[06:18:20] DBA, Operations, codfw-rollout: Pool new server db2071? - https://phabricator.wikimedia.org/T163413#3196556 (Marostegui)
[07:00:21] do I setup db1071?
[07:00:27] db2071?
[07:00:40] yes
[07:00:42] If you want to, yes, it is already rebooted
[07:00:46] and looks good
[07:00:58] so I would say it can be a great addition to s1 :)
[07:03:47] DBA, Operations, codfw-rollout: Pool new server db2071 - https://phabricator.wikimedia.org/T163413#3196599 (jcrespo)
[07:04:13] DBA, Operations, ops-codfw, Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3196601 (jcrespo)
[07:04:16] DBA, Operations, codfw-rollout: Pool new server db2071 - https://phabricator.wikimedia.org/T163413#3196542 (jcrespo)
[07:11:47] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3196615 (Marostegui) db1040 oldimage table ready: ``` root@neodymium:~# mysql --skip-ssl commonswiki -hdb1040 -e "show cr...
[07:31:08] db1067 ?
[07:31:31] I was going to use it for cloning
[07:37:55] Oh
[07:38:01] I haven't done anything with it
[07:38:03] yet
[07:38:06] I can pick another host
[07:38:09] with no issues
[07:38:20] whatever you prefer
[07:38:23] no, I already moved to db1080
[07:38:30] ok :)
[07:38:40] but please send me the review so I can see it in advance
[07:38:56] you do not even have to wait for me to +1
[07:39:03] ah sorry
[07:39:06] I didn't want to spam you
[07:39:08] but add me so I can see it on email
[07:39:13] no, I prefer to be spammed
[07:39:18] sure, you asked for it! :)
[07:39:20] so that I know what is going on
[07:40:25] are all enwiki hosts pending the revision changes?
[07:40:45] because if they are, we may prefer to wait to clone?
[07:40:50] nope
[07:40:58] just 3 (including the master)
[07:41:38] I think db1080 is done
[07:41:49] db1080 is done yes
[07:41:56] only pending: 65,67 and the master
[07:42:34] 67 is only missing an index, the PK is correct
[07:44:03] ok, I will deploy https://gerrit.wikimedia.org/r/349162 now
[07:44:14] ok!
[07:46:35] I am glad you didn't use db1067, because it has a mess of indexes :)
[07:46:48] it was the old master
[07:46:52] it makes sense
[07:47:26] maybe we can run ANALYZE on revision at some point on all servers
[07:47:33] on revision and logging
[07:47:47] yes
[07:47:57] I will run it on the hosts I am changing today
[07:48:17] and then we can compare it with the top slow queries on codfw
[07:50:30] https://phabricator.wikimedia.org/P5295
[08:03:44] BTW, be careful with db2019 lag- it gets computed from itself
[08:04:01] but it still replicates from s4-master-eqiad
[08:04:39] but it shouldn't get any, no?
[08:05:05] not on icinga, but yes on SHOW SLAVE STATUS
[08:05:30] or for example, if you restart the passive master, the active master complains
[08:05:32] I don't see it there either
[08:05:50] ah, because the alter blocks replication
[08:05:58] but not pt-heartbeat
[08:06:33] ah
[08:06:39] I am just telling you this because I have generated alerts in the past with the passive datacenter
[08:06:43] yeah yeah
[08:06:51] good to know indeed
[08:07:05] I first altered the smallest of the two tables (11G) to see what would happen
[08:07:08] as I was doubting myself
[08:07:17] but it went well, so I went ahead with the big one
[08:07:22] but yes, good to know
[08:09:22] I am going to sleep a lot better once we have db2071 there serving
[08:10:51] wow, cloning from SSDs to SSDs is FAST (I think this is the first time I do that)
[08:11:03] transfer rate?!
[08:11:05] 300MiB/s
[08:11:16] wow!
[08:11:18] on a 1 Gb ethernet
[08:11:44] because of compression and despite encryption
[08:20:51] DBA, Operations, Patch-For-Review, codfw-rollout: Pool new server db2071 - https://phabricator.wikimedia.org/T163413#3196712 (Marostegui) db2072 is also now ready to be used if needed. All went fine after rebooting it.
[08:20:53] jynus ^
[08:27:19] why do they have puppet default stuff?
[08:27:28] is there a role applied by default?
[08:27:38] I am running it manually :)
[08:27:51] to be able to ssh to it and all that
[08:27:51] no
[08:28:04] what I mean is, what do you run?
[08:28:06] puppet run?
[08:28:08] ah, just puppet
[08:28:09] yes
[08:28:27] because it is not part of a spare role
[08:28:45] Ah, I get what you mean
[08:29:03] but it gets added to icinga, etc.
[08:29:12] and sets up ssh accounts
[08:29:30] true true
[08:29:42] how is that possible?
[08:29:46] especially the accounts
[08:30:14] ah
[08:30:46] No, nevermind
[08:30:52] that is weird yes
[08:31:06] is it defined elsewhere and we are duplicating the role?
[08:31:16] I cannot find it on the repo
[08:31:24] is there an automagically spare role?
[08:31:32] not even by grepping for: "|72" and stuff like that
[08:31:41] The spare role at least months ago, you had to manually add it
[08:31:43] to make it spare
[08:32:39] the only commits in gerrit for db2071 are the ones you added
[08:32:42] and db2072 has none
[08:34:17] but see it on icinga
[08:34:36] yes
[08:34:38] it just came up
[09:09:54] DBA, Operations, Patch-For-Review, codfw-rollout: Pool new server db2071 - https://phabricator.wikimedia.org/T163413#3196888 (Marostegui) More servers are now online and ready to be used if needed, I will post it on the original tracking task (T162159) so I don't hijack this one.
[09:13:31] DBA, Operations, ops-codfw, Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3196923 (Marostegui) @Papaul the following servers were ready to get puppet enabled and all that, so I did so, and rebooted them db2071 db2072 db2073 db2075 db2076 db2079 db2...
[09:16:52] marostegui: db2071 will be the first mysql server with Linux 4.9, I don't expect any problems, but if something is weird, please let me know
[09:17:33] moritzm: will do, thanks :)
[09:33:58] will we have enough space on db1040?
[09:35:01] I checked one of the hosts with file per table, and the table was 135G
[09:35:08] *is
[09:35:40] There is still 163G to go on db1040 and the alter has been running for 2h already
[09:35:42] so I think we will
[09:36:01] do you know how long it will take, based on the other alters?
[09:36:12] the other alters were taking around 4-5h
[09:36:18] good
[09:36:36] https://img.memesuper.com/08fdae7898ea66c240e710db5b8069d1_yes-let-the-hate-flow-through-you-emperor-palpatine-quickmeme-sidious-good-meme_494-358.jpeg
[09:36:43] xdddddddd
[09:39:11] [09:39:04] marostegui@db2071:~$ df -hT /srv/
[09:39:11] Filesystem Type Size Used Avail Use% Mounted on
[09:39:11] /dev/mapper/tank-data xfs 3.6T 1.3T 2.4T 34% /srv
[09:39:13] almost done?
[09:39:16] :)
[09:39:24] almost
[11:17:53] marostegui, I will go to have lunch, I will write to ticket later
[11:18:04] jynus: sure thing!
[11:18:06] or you can do it if you want to do it before
[11:18:08] enjoy
[11:18:18] i will head for food in 5 minutes too
[11:18:37] db2071 is working but replicating from eqiad master
[11:18:53] I will switch it now to codfw master
[11:19:14] cool, will you stop eqiad master and get the position and then change it to the same one?
[11:19:23] that was not very well explained, but I think you get what I meant
[11:19:25] https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2071&from=now-3h&to=now
[11:19:33] CHANGE MASTER TO with GTIDs :-)
[11:19:45] oh yes, we have gtids!!
[11:19:46] automatic, no need for the script
[11:19:48] I forgot
[11:19:52] :)
[11:19:55] that is why I started as is
[11:19:57] is it using SSL already?
[11:20:13] of course, no cross-dc should do it without ssl
[11:20:21] :)
[11:20:28] all replication channels use it, in fact
[11:20:46] we should enforce it when we setup the better SSL monitoring
[11:24:11] STOP SLAVE; CHANGE MASTER TO MASTER_HOST='db2016.codfw.wmnet'; START SLAVE;
[11:24:18] it worked seamlessly
[11:24:25] <3
[11:24:31] gtid!
\o/
[11:25:34] we can pool it already
[11:25:57] i can do it before going for lunch
[11:26:04] with some small weight
[11:26:07] yeah
[11:26:15] it hasn't finished loading the buffer pool
[11:26:20] yet
[11:26:22] I will give it 20 or so
[11:26:28] but I get it will be faster now
[11:26:36] than the other servers
[11:26:40] for sure
[11:28:36] https://gerrit.wikimedia.org/r/#/c/349202/
[11:28:58] as main or as api?
[11:29:03] both
[11:29:07] 20 as main and 1 as api
[11:29:15] good
[11:30:00] we can probably leave it with 50 as main and then more api than the others
[11:30:12] and maybe even depool one of the others to run analyze over the weekend
[11:30:45] I am starting 10.0.30 on 1080
[11:30:56] cool
[11:31:32] I put mysql_upgrade on path on the latest packages
[11:31:41] yeah
[11:31:53] and i loved it when i set up the last s5 hosts
[11:35:01] deployed
[11:38:38] https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2071
[11:40:37] The new server is filesorting too
[11:41:47] https://phabricator.wikimedia.org/P5293#28339
[11:41:47] it wasn't on the original one
[11:42:09] but now it does
[11:42:20] which one did you use in the end?
[11:42:22] db1080?
[11:42:25] yes
[11:42:29] because it does filesort too :|
[11:42:35] it didn't before
[11:42:37] maybe after restarting mysql the plan goes crazy?
[11:42:45] *optimizer
[11:45:30] at least a server with ssd filesorting is still better than the other two with more weight: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=db2071&var-network=eth0
[11:46:04] going to get lunch
[12:31:58] maybe we can try analyzing table on db1080, then checking if the filesort is gone, and then restarting mysql to see if we can reproduce that error
[12:41:11] <_joe_> so I just saw this in #-staff
[12:41:12] <_joe_> 14:07 < Trizek> I have a user who wants to open this link: https://commons.wikimedia.org/w/index.php?title=Special:NewFiles&dir=prev&offset=20160520124303&limit=500. I
[12:41:16] <_joe_> don't know how the offset value has been set, but it is apparently what is broken.
[12:41:19] <_joe_> 14:08 < Trizek> However, https://commons.wikimedia.org/w/index.php?title=Special:NewFiles&dir=prev&offset=20060730182009&limit=500 works. I'm confused. Any idea?
[12:41:35] <_joe_> &offset=20160520124303 seems like the kind of parameter that causes a very slow query
[12:41:52] <_joe_> am I wrong?
[12:43:44] DBA, Operations, Patch-For-Review, codfw-rollout: codfw API slaves overloaded during the 2017-04-19 codfw switch - https://phabricator.wikimedia.org/T163351#3197434 (Marostegui) db2071 has now been serving traffic for around 1 hour: https://grafana.wikimedia.org/dashboard/file/server-board.json?...
[12:45:28] DBA, Operations, Patch-For-Review, codfw-rollout: codfw API slaves overloaded during the 2017-04-19 codfw switch - https://phabricator.wikimedia.org/T163351#3197438 (Marostegui) Maybe with this extra server, we can now depool one of the other "old" ones and let them run analyze over the weekend,...
[12:46:06] _joe_: I am not completely sure how that works, but it doesn't look like something nice to the DB indeed, no
[12:47:42] DBA, Operations, Patch-For-Review, codfw-rollout: Pool new server db2071 - https://phabricator.wikimedia.org/T163413#3197446 (Marostegui) I have pooled db2071 with the same main traffic as the other api servers (50) but with weight 2 in API, instead of 1 as the other servers. We'll see how it goe...
[12:50:53] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3197453 (Marostegui) db1040 is done for the image table too: ``` root@neodymium:~# mysql --skip-ssl -hdb1040 commonswiki...
[12:58:17] DBA, Epic, Patch-For-Review, codfw-rollout: Database maintenance scheduled while eqiad datacenter is non primary (after the DC switchover) - https://phabricator.wikimedia.org/T155099#3197496 (Marostegui)
[12:58:20] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197497 (Marostegui)
[12:58:23] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3197495 (Marostegui) Open>Resolved
[13:16:29] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197527 (jcrespo) > This tracking task could be converted into a project (eg #wmf-schema-change) but I'll leave that to @Springle and @jcrespo. I am about to deprecate this tracking...
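The effect of pooling db2071 with API weight 2 while the other API replicas stay at 1 can be sketched numerically. This is a minimal sketch, assuming traffic is split in proportion to weight among the pooled hosts of a group, and assuming (hypothetically) that the API group holds only db2062, db2069 and db2071; neither assumption is confirmed in the log, and `traffic_share` is an illustrative helper, not part of any real tool.

```python
def traffic_share(weights):
    """Return each host's expected fraction of requests.

    Assumes requests are distributed proportionally to weight,
    which is an assumption about the load balancer, not a confirmed fact.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one host needs a positive weight")
    return {host: weight / total for host, weight in weights.items()}


# Hypothetical API group: two older replicas at weight 1, db2071 at weight 2.
api_weights = {"db2062": 1, "db2069": 1, "db2071": 2}
shares = traffic_share(api_weights)

for host, share in sorted(shares.items()):
    print(f"{host}: {share:.0%} of API traffic")
```

Under these assumptions the new host would take about half of the API queries, which fits the stated goal of relieving the two overloaded replicas.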
[13:17:09] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197528 (jcrespo)
[13:23:40] db1040 alter finished, we'll see how long it takes to reclaim the disk space back after the alter
[13:24:24] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3197549 (Addshore)
[13:24:28] DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, TCB-Team, and 2 others: Allow setting the watchlist table to read-only on a per-wiki basis - https://phabricator.wikimedia.org/T160062#3087424 (Addshore) Open>declined
[13:25:09] DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, TCB-Team, and 2 others: Allow setting the watchlist table to read-only on a per-wiki basis - https://phabricator.wikimedia.org/T160062#3087424 (Addshore) Looks like we are not going to do this, as it will not be needed as wl_id should be added d...
[13:26:06] DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, TCB-Team, and 2 others: Allow setting the watchlist table to read-only on a per-wiki basis - https://phabricator.wikimedia.org/T160062#3197568 (Marostegui) I still think it would be nice to have that feature there if needed
[13:27:23] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3197573 (Addshore)
[13:27:29] DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, TCB-Team, and 2 others: Allow setting the watchlist table to read-only on a per-wiki basis - https://phabricator.wikimedia.org/T160062#3197571 (Addshore) declined>Open >>! In T160062#3197568, @Marostegui wrote: > I still think it would b...
[13:27:42] :)
[13:27:49] thanks!
[13:28:00] if it is already done, i think it would be nice to have.
You never know what the future brings!
[13:28:21] very true :)
[13:29:01] in a dream world i guess you could mark individual tables in mediawiki as read-only / to be handled correctly ;)
[13:30:09] haha that would be ideal, you think we can have that by 2030? :)
[13:30:26] who knows! :P
[13:32:39] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3197582 (jcrespo)
[13:34:30] DBA: Update change tag indexes - https://phabricator.wikimedia.org/T42867#3197586 (jcrespo)
[13:34:32] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197587 (jcrespo)
[13:34:55] DBA: Update change tag indexes - https://phabricator.wikimedia.org/T42867#467053 (jcrespo) Probably done, but still pending to check.
[13:37:29] DBA, Schema-change: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190#553002 (jcrespo) Probably pending to apply. If that is the case, it should be added to the #blocked-on-schema-change project. If done within a week, it may be able to done quicky, otherwise, it may have to wait another...
[13:37:38] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197594 (Anomie)
[13:37:45] DBA, Schema-change: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190#3197596 (jcrespo)
[13:37:47] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197597 (jcrespo)
[13:38:43] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3197600 (Anomie)
[13:40:24] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3197607 (jcrespo)
[13:41:57] DBA, Schema-change: Dropping rc_moved_to_title/rc_moved_to_ns on wmf databases - https://phabricator.wikimedia.org/T51191#553101 (jcrespo) Probably done? Needs checking. Otherwise, adding the #blocked-on-schema-change.
[13:42:21] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#853220 (jcrespo)
[13:42:23] DBA, Schema-change: Dropping rc_moved_to_title/rc_moved_to_ns on wmf databases - https://phabricator.wikimedia.org/T51191#3197611 (jcrespo)
[14:55:46] interesting: http://mysqlhighavailability.com/more-metadata-is-written-into-binary-log/
[14:58:08] that is nice
[14:58:16] I wonder how much is that going to increase the binlog size
[14:59:40] I read this yesterday: https://mydbops.wordpress.com/2017/04/13/binlog-expiry-now-in-seconds-mysql-8-0/
[14:59:43] which is nice too
[15:00:10] bah
[15:00:16] I liked more the percona way
[15:00:24] expiration based on size
[15:00:40] yes, that is a great feature
[15:01:02] is it not in mysql 8?
:(
[15:02:18] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197864 (Nirmos)
[15:02:29] not really that important, that is one that can be simulated manually
[15:04:43] I don't get at all what that ticket means, that that user got his watchlist reset I guess?
[15:05:24] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197864 (jcrespo) I can give the user (and only himself) a list of watched articles 24 hours ago, and maybe he can import it, but we will not recover it automatically, as that is not easy or practical to do.
[15:06:19] he wants a backup- I should have denied it, because if we offer it once, other people will ask
[15:06:28] and we cannot attend 1000000 users
[15:06:56] Yeah I am seeing that
[15:07:12] I was doing the recovery exercise myself and I think I got the data :)
[15:07:20] for learning purposes
[15:07:31] I just was feeling like working until midnight today
[15:07:46] xddddd
[15:07:51] actually, learning the schema meaning
[15:08:03] is something that probably will eventually help you
[15:08:10] Yeah, that is why I did it
[15:08:17] And i think I got it :)
[15:08:19] but note that is private data
[15:08:21] For this ticket I mean
[15:08:24] Oh yes, of course
[15:08:29] we cannot publish it
[15:08:40] for example, imagine the request is fake
[15:08:49] No no, I was just doing it for me to resolve the ticket in my mind
[15:08:58] Yeah, how can we verify the user actually?
[15:09:04] user has to provide a private way to do it
[15:09:24] not sure how, wiki admins maybe can figure out a way
[15:11:21] I did mysql -h dbstore1002.eqiad.wmnet svwiki -e "SELECT wl_namespace, wl_title FROM watchlist JOIN user ON user_id = wl_user WHERE user_name = 'I99pema' "
[15:11:28] yes
[15:11:30] Same thing :)
[15:11:44] but if it has only 20 items may be too late
[15:12:51] I see 46, but yes
[15:12:56] we will see what he says
[15:14:09] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197894 (I99pema) Open>Resolved a:I99pema A list would be of great help!
[15:14:18] :|
[15:14:32] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197897 (jcrespo) Resolved>Open
[15:14:35] User Since
[15:14:35] Thu, Apr 20, 17:10 (3 m, 41 s)
[15:14:47] let's check the user is joined with the global account
[15:15:05] then we can maybe paste it on phab with permissions only for NDA and him
[15:15:49] I would ask for the user to identify itself
[15:15:53] yes
[15:15:57] on wiki
[15:16:11] I would ask him to edit his wiki page
[15:16:12] exactly
[15:16:25] saying he is the same user on phabricator
[15:16:51] do you want to handle that?
[15:17:00] or Do I?
[15:17:13] I will
[15:18:13] export it in plain text, it will be easier (mysql instead of mysqldump)
[15:18:24] sure
[15:18:38] oh, you did
[15:18:45] the .sql was fake
[15:18:48] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197864 (Marostegui) Hello @I99pema, we first need to make sure that you are who you said you are before handling private data to you. Can you please edit your personal wikipage affirming that you are the same person on sv.wikipedia.org...
[15:18:57] yes, I tend to use .sql for all the stuff coming out from mysql
[15:19:19] buuuuuh
[15:19:27] xddddddd
[15:26:30] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197956 (I99pema) Like this?
https://sv.wikipedia.org/w/index.php?title=Användare%3AI99pema&type=revision&diff=39623432&oldid=38428127
[15:28:00] jynus ^ that looks good to me
[15:28:03] same user
[15:28:07] and edited with the same user
[15:29:58] technically, he linked phab to mediawiki, and not the other way round, but let's give it the ok
[15:32:32] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197972 (Marostegui) Thanks This is what you had in your watchlist: https://phabricator.wikimedia.org/P5297
[15:34:44] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3197976 (I99pema) No, that's what has been added to the list after it was deleted earlier today
[15:35:06] oh, you used dbstore1002?
[15:35:13] just realised :)
[15:36:30] he's got 10k ones :)
[15:36:45] you can add; SELECT max(rev_timestamp) FROM revision; or SELECT max(ts) FROM heartbeat.heartbeat
[15:41:15] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3198022 (Marostegui) Sorry, went to the wrong server. Check again the new update.
[15:59:43] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3198109 (I99pema) Thats the correct one. I have now restored my Watch list and the case can be closed. Thank you so much for your help and quick response!
[16:00:26] DBA: I99pema reset watchlist - https://phabricator.wikimedia.org/T163450#3198117 (Marostegui) Open>Resolved a:I99pema>Marostegui Great!
[16:06:25] DBA, Patch-For-Review: Reclone db1068 to become a slave in s4 - https://phabricator.wikimedia.org/T163110#3198153 (Marostegui) Server is recloned. Catching up SSL enabled GTID using: slave_pos
[16:07:29] DBA, Operations, ops-codfw, Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3198158 (Papaul) @Marostegui Thanks for for update.
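The watchlist export and the freshness check suggested at 15:36:45 can be sketched together. This is a minimal sketch, assuming a toy in-memory SQLite database in place of the real MariaDB replica: the table and column names follow the query quoted in the log, but the sample rows, the `ExampleUser` account, the flattened `heartbeat` table (the log refers to `heartbeat.heartbeat`) and the helper names are all hypothetical.

```python
import sqlite3


def export_watchlist(conn, user_name):
    """Return (namespace, title) rows for one user and the newest heartbeat.

    The heartbeat timestamp indicates how current the copy being read is,
    which helps notice when the export came from the wrong server.
    """
    rows = conn.execute(
        "SELECT wl_namespace, wl_title FROM watchlist "
        "JOIN user ON user_id = wl_user WHERE user_name = ? "
        "ORDER BY wl_namespace, wl_title",
        (user_name,),  # bound parameter instead of string interpolation
    ).fetchall()
    (latest_ts,) = conn.execute("SELECT max(ts) FROM heartbeat").fetchone()
    return rows, latest_ts


def build_toy_db():
    """Create a tiny stand-in schema with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE watchlist (wl_user INTEGER, wl_namespace INTEGER, wl_title TEXT);
        CREATE TABLE heartbeat (ts TEXT);
        INSERT INTO user VALUES (1, 'ExampleUser'), (2, 'OtherUser');
        INSERT INTO watchlist VALUES
            (1, 0, 'Stockholm'), (1, 1, 'Stockholm'), (2, 0, 'Uppsala');
        INSERT INTO heartbeat VALUES
            ('2017-04-19T15:00:00'), ('2017-04-20T15:00:00');
        """
    )
    return conn


conn = build_toy_db()
rows, latest_ts = export_watchlist(conn, "ExampleUser")
print(f"-- data as of {latest_ts}")
for namespace, title in rows:
    print(namespace, title, sep="\t")
```

Printing the timestamp alongside the plain-text rows makes it easier to tell a pre-reset copy from a live one before anything is handed to the user.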
[16:08:12] DBA, Operations, ops-codfw, Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3198161 (Papaul)
[16:15:21] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3198202 (Marostegui) db1065 is done: ``` root@neodymium:~# mysql --skip-ssl -hdb1065 enwiki -e "show create table revision\G" *************************** 1. row ****...
[16:25:18] DBA, Operations, ops-codfw, Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3198283 (Papaul) @Marostegui none of the systems were ready.
[16:35:58] DBA, Operations, Patch-For-Review, codfw-rollout: codfw API slaves overloaded during the 2017-04-19 codfw switch - https://phabricator.wikimedia.org/T163351#3198322 (Marostegui) Just for the record, I finished an alter table on the revision table on db1065, and it is not filesorting. ``` root@db...
[17:38:16] hey guys
[18:16:20] DBA: Network maintenance on row D (databases) - https://phabricator.wikimedia.org/T162681#3198923 (Marostegui) db1068 has been promoted to master (we had to do it kind unexpectedly today) on s4 - it is affected by the recabling but no for the server move. So that is good.
[18:16:48] DBA, Patch-For-Review, codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3198924 (Marostegui) db1068 is now s4 master.
[18:17:58] DBA, Patch-For-Review, codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3198929 (Marostegui)
[18:21:00] DBA, Patch-For-Review, codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3198937 (Marostegui)
[18:22:40] DBA, Patch-For-Review, codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3153482 (Marostegui)
[18:22:43] DBA, Patch-For-Review: Reclone db1068 to become a slave in s4 - https://phabricator.wikimedia.org/T163110#3198940 (Marostegui) Open>Resolved db1068 is now serving as a master see: T162133
[18:24:18] DBA, Patch-For-Review: Reclone db1068 to become master in s4 - https://phabricator.wikimedia.org/T163110#3198956 (Marostegui) a:jcrespo
[18:48:30] DBA, Patch-For-Review: Reclone db1068 to become master in s4 - https://phabricator.wikimedia.org/T163110#3199105 (jcrespo) For the record, I ran: ``` ./repl.pl --switch-child-to-sibling --parent=db1040.eqiad.wmnet --child=dbstore1002.eqiad.wmnet --child-set="default_master_connection='s4'" /repl.pl --sw...
[18:49:12] db1047
[21:49:39] DBA, Operations, Phabricator: Intermitten outage on phabricator, needs investigation - https://phabricator.wikimedia.org/T163507#3199906 (Aklapper) p:Unbreak!>High This is intermittent so I don't see why this should be Unbreak Now
[21:50:02] DBA, Operations, Phabricator: Intermittent DB connectivity problem on phabricator, needs investigation - https://phabricator.wikimedia.org/T163507#3199908 (Aklapper)
[23:58:26] DBA, Schema-change: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190#3200317 (tstarling) {P5302} Tested locally.