[00:59:42] DBA, Community-Tech, cloud-services-team, Security: create production ip_changes table for RangeContributions - https://phabricator.wikimedia.org/T173891#3578646 (kaldari) @MusikAnimal: Do we want this table (minus the revdeleted content) available on Labs? (It will require creating a specialized...
[06:24:55] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579007 (Marostegui) I just realised that db1089 (enwiki) does not have those indexes. db1083 does. The table size differences are as follows: no...
[06:40:08] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579030 (Marostegui) The ALTER itself takes seconds to run (less than 10 seconds on a SSD host). ``` root@PRODUCTION s1 slave[enwiki]> show create...
[09:21:58] jynus: hey, do you think you can do this? https://gerrit.wikimedia.org/r/#/c/375741/
[09:39:18] how much faster is that?
[09:53:42] Amir1 ?
[09:54:06] jynus: 10-20%
[10:45:18] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3579430 (jcrespo)
[11:05:16] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3579482 (jcrespo) A first evaluation would point to **nova** as the cause, but I only have indirect metrics saying that, so I am not 100% sure.
[11:08:37] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3579491 (jcrespo) The other only possible candidate would be testreduce_0715 cc @ssastry
[11:20:13] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3579530 (jcrespo) We do not have detailed monitoring on db1009, but I can see an increased number of UPDATEs and INSERTs creating contention among themselves and bl...
[11:33:49] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579566 (Marostegui) This is template links after the defragmentation: ``` root@db1083:/srv/sqldata/enwiki# ls -lh templatelinks.ibd -rw-rw---- 1 my...
[11:36:47] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579567 (jcrespo) > Almost half the size. Probably 30 GB comes from the index like in the above comparisons, the rest from defragmenting/compression...
[11:38:01] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579568 (jcrespo) Maybe we should do a quick "index deletion" online and then track here the defragmentation?
[11:38:50] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579574 (Marostegui) >>! In T174509#3579568, @jcrespo wrote: > Maybe we should do a quick "index deletion" online and then track here the defragment...
[11:41:26] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579581 (jcrespo) > You mean drop the indexes quickly and leave the defragmentation for later? Yeah, thinking of making the schema compatible ASAP...
[11:44:54] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3579582 (Marostegui) Sounds good to me. I will finish db1083 as the defragmentation is already started on pagelinks. Let's leave db1083 and db1...
[11:57:44] DBA: Remove ReaderFeedback tables from wikis - https://phabricator.wikimedia.org/T174586#3579639 (Marostegui) Tables dropped from s2
[12:27:38] DBA: Remove ReaderFeedback tables from wikis - https://phabricator.wikimedia.org/T174586#3579712 (Marostegui) Tables dropped from s3
[12:28:47] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#3579716 (Marostegui)
[12:28:49] DBA: Remove ReaderFeedback tables from wikis - https://phabricator.wikimedia.org/T174586#3579713 (Marostegui) Open>Resolved a:Marostegui Tables dropped from: s5, s6 and s7. All done
[12:47:54] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3579784 (chasemp) I'm wondering if this is related: https://phabricator.wikimedia.org/T170492#3579682 I think @hashar issues a command that tried to purge all no...
[14:11:10] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3580106 (Marostegui) Actually, we could also optimize the table without depooling as it is an INPLACE operation (and I actually remember optimizing...
[14:24:28] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580140 (ssastry) >>! In T175002#3579491, @jcrespo wrote: > The other only possible candidate would be testreduce_0715 cc @ssastry There is no round trip test run...
[14:27:31] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580151 (jcrespo) @ssastry - I agree, I think Chase's comments are the best fit right now, but I had to ask around to all users of such database.
[14:31:41] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580165 (jcrespo) @andrew Is there something that could be done to reduce the amount of connections per service? every one precreates 40 or so, and among them they...
[14:57:52] DBA: Drop pr_index from wikis where ProofreadPage isn't enabled - https://phabricator.wikimedia.org/T174782#3580246 (Marostegui) Open>Resolved a:Marostegui All dropped.
[14:57:55] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#3580250 (Marostegui)
[15:04:46] DBA, Data-Services, Patch-For-Review: `pr_index` to be replicated to Labs public databases - https://phabricator.wikimedia.org/T113842#3580271 (jcrespo) With T174782 solved, the next step is to, for each of the 7 shards: 1) stop replication on sanitarium 2) copy the tables, on existing wikis (wikiso...
[15:06:36] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580282 (Andrew) >>! In T175002#3580165, @jcrespo wrote: > @andrew Is there something that could be done to reduce the amount of connections per service? In theor...
[15:08:32] DBA, Community-Tech, cloud-services-team, Security: create production ip_changes table for RangeContributions - https://phabricator.wikimedia.org/T173891#3580291 (MusikAnimal) >>! In T173891#3578646, @kaldari wrote: > @MusikAnimal: Do we want this table (minus the revdeleted content) available on...
[15:09:25] DBA: Drop old devwikiinternal and rel13testwiki from s3 - https://phabricator.wikimedia.org/T118764#1808750 (Marostegui) So, devwikiinternal and rel13testwiki and last write happened on 2015. I have taken a mysqldump from those two databases and placed them at: ``` root@dbstore1001:/srv/tmp/T118764# pwd /...
[15:15:19] DBA, cloud-services-team: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580318 (jcrespo) > if you can feed me instructions for how to check the number of connections for a given service. ```lang=bash root@neodymium:~$ mysql --defaults...
[15:29:46] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3580368 (Marostegui) And pagelinks: ``` root@db1083:/srv/sqldata/enwiki# ls -lh pagelinks.ibd -rw-rw---- 1 mysql mysql 146G Sep 5 15:29 pagelinks.i...
[15:42:45] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580425 (Marostegui) I was thinking that maybe we can reclone db1092 (for example) using db1049's data, (old master), so we can just decommission it...
[16:16:19] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580556 (jcrespo) Cool with me, but I would actually keep db1092 untouched, and clone it to a new hosts, as I think the easiest way to setup s8 would be to clone s5 and later delete databases, so we can overprovision s5 for a while w...
[16:23:10] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580610 (Marostegui) >>! In T161294#3580556, @jcrespo wrote: > Cool with me, but I would actually keep db1092 untouched, and clone it to a new hosts, as I think the easiest way to setup s8 would be to clone s5 and later delete databa...
[16:28:43] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580641 (jcrespo) > We can reclone db1071 from db1049 for instance then :) No no, I meant a brand new one from this list- T172679. db1071 will probably be kept, we will have an exact copy of db1049 (which will be decommed) and we wi...
[16:32:14] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580655 (Marostegui) >>! In T161294#3580641, @jcrespo wrote: >> We can reclone db1071 from db1049 for instance then :) > > No no, I meant a brand new one from this list- T172679. db1071 will probably be kept, we will have an exact c...
[16:38:51] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580679 (jcrespo) Well, of the new hosts, we will decide later which one goes to s8 and which ones go to s5. Let's just mark a new host as master (because it is a copy of db1049), and later, before the database drop we decide if and...
[16:40:49] DBA: run pt-tablechecksum on s5 - https://phabricator.wikimedia.org/T161294#3580687 (Marostegui) >>! In T161294#3580679, @jcrespo wrote: > Well, of the new hosts, we will decide later which one goes to s8 and which ones go to s5. Let's just mark a new host as *old* master (because it is a copy of db1049), an...
[16:49:57] DBA, Analytics, Data-Services, Research, cloud-services-team (Kanban): Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#3580724 (Halfak)
[16:56:57] DBA, Analytics, Data-Services, Research, cloud-services-team (Kanban): Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#3530850 (bd808)
[16:57:36] DBA, Analytics, Data-Services, Research, cloud-services-team (Kanban): Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#3530850 (bd808)
[16:59:13] DBA, cloud-services-team, Patch-For-Review: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580822 (jcrespo) a:Andrew This can be closed now for me, until we have negative feedback.
[16:59:22] DBA, cloud-services-team, Patch-For-Review: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580824 (jcrespo) p:Triage>Normal
[17:06:09] DBA, Operations, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3580853 (jcrespo) > Going to remove the DBA tag from this task as our part is done, but I will remain subscribed as once the wiki is created, we'd need to do...
[17:09:36] DBA, cloud-services-team, Patch-For-Review: db1009 (m5, used primarily for cloud services) unresponsive for minutes - https://phabricator.wikimedia.org/T175002#3580873 (Andrew) If nova proves happy with this change then we can further reduce the workers if needed. I'd like to give it a few weeks tho...
[17:22:41] DBA, Data-Services, Quarry: CHAR_LENGTH does not return the character count - https://phabricator.wikimedia.org/T174543#3580920 (jcrespo) Open>Resolved a:jcrespo See: **https://quarry.wmflabs.org/query/21367 which counts by characters, not by bytes.** Mediawiki on WMF-hosted wikis uses B...
[17:28:16] DBA, MediaWiki-Database, MediaWiki-Maintenance-scripts: Create MW Schema Diff maintenance script - https://phabricator.wikimedia.org/T174648#3569019 (jcrespo) For me this is a duplicate of T104459, or at most, a subtask. Short term, what I would do is maintain a tables.sql on "operations side", mean...
[17:31:41] DBA, MediaWiki-Database, MediaWiki-Maintenance-scripts: Create MW Schema Diff maintenance script - https://phabricator.wikimedia.org/T174648#3580990 (jcrespo) Most of the tools you mention are unsuitable for WMF hosts because they cannot handle the hundreds of thousands of objects of s3 or the large,...
[17:32:49] DBA, MediaWiki-Database, MediaWiki-Maintenance-scripts: Create MW Schema Diff maintenance script - https://phabricator.wikimedia.org/T174648#3581009 (Reedy) Subtask maybe, yeah. I wanted to do something for non WMF wikis, as we know how of skew Wikimedia wikis get, due to missing patches. So people...
[17:39:19] DBA, Patch-For-Review: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679#3581041 (Marostegui)
[17:41:10] DBA, Operations, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3581045 (Marostegui) >>! In T174370#3580853, @jcrespo wrote: >> Going to remove the DBA tag from this task as our part is done, but I will remain subscribed...
[17:44:46] DBA, MediaWiki-Database, MediaWiki-Maintenance-scripts: Create MW Schema Diff maintenance script - https://phabricator.wikimedia.org/T174648#3581054 (jcrespo) Sorry I misunderstood you- because you added us #dba s and used WMF as a context, I thought you wanted a WMF-only solution, not a mediawiki o...
[17:51:29] DBA: truncate l10n_cache table on WMF wikis - https://phabricator.wikimedia.org/T150306#2781385 (Reedy) Truncating these tables should be an easy win, and makes applying {T146591} across the cluster easier too
[17:53:42] DBA: truncate l10n_cache table on WMF wikis - https://phabricator.wikimedia.org/T150306#3581083 (Marostegui) >>! In T150306#3581067, @Reedy wrote: > Truncating these tables should be an easy win, and makes applying {T146591} across the cluster easier too Indeed!
[17:54:27] DBA: truncate l10n_cache table on WMF wikis - https://phabricator.wikimedia.org/T150306#3581086 (Marostegui) You think we need to backup them (in case they are not empty?)
[17:56:08] DBA, Data-Services, Quarry: CHAR_LENGTH does not return the character count - https://phabricator.wikimedia.org/T174543#3565363 (Base) @jcrespo , is this why I also fail to get normal results while attempting to match title against a regex? https://quarry.wmflabs.org/query/21026
[17:56:39] DBA: truncate l10n_cache table on WMF wikis - https://phabricator.wikimedia.org/T150306#3581095 (Reedy) Nope, very much not so :) They're full of ooold data that has no value :)
[18:12:56] DBA, Data-Services, Quarry: CHAR_LENGTH does not return the character count - https://phabricator.wikimedia.org/T174543#3581128 (jcrespo) > is this why I also fail to get normal results while attempting to match title against a regex? I cannot say, I would tell you to try if it helps :-) Some of the...
[18:15:18] DBA: Drop "trackbacks" table on all wikis that have it - https://phabricator.wikimedia.org/T175051#3581131 (demon)
[18:17:05] DBA: truncate l10n_cache table on WMF wikis - https://phabricator.wikimedia.org/T150306#3581163 (Marostegui) This table exists on: s1: `enwiki` s2: ``` bgwiki bgwiktionary cswiki enwikiquote enwiktionary eowiki fiwiki idwiki itwiki nlwiki nowiki plwiki ptwiki svwiki thwiki trwiki zhwiki ``` s3: ``` aawiki...
[18:28:34] DBA: Drop "trackbacks" table on all wikis that have it - https://phabricator.wikimedia.org/T175051#3581131 (Marostegui) Thanks for the initial list! This is the complete list of wikis where it exists: s1: Nothing as originally mentioned by @demon s2: ``` bgwiki bgwiktionary cswiki enwikiquote enwiktionary...
[19:27:05] DBA, Operations, ops-eqiad: Degraded RAID on db1059 - https://phabricator.wikimedia.org/T174857#3581561 (Cmjohnson) Replaced the disk and it's rebuilding Enclosure Device ID: 32 Slot Number: 6 Drive's position: DiskGroup: 0, Span: 3, Arm: 0 Enclosure position: 1 Device Id: 6 WWN: 5000C5006821E074 S...
[19:29:12] DBA, Operations, ops-eqiad: Degraded RAID on db1059 - https://phabricator.wikimedia.org/T174857#3575136 (jcrespo) Thank you very much, will monitor when it finishes and close the ticket.
[19:30:41] DBA, Operations, Phabricator: Decom db1048 (BBU Faulty - slave lagging) - https://phabricator.wikimedia.org/T160731#3581600 (Cmjohnson) p:Triage>Low
[19:34:58] DBA, Operations, Phabricator: Decom db1048 (BBU Faulty - slave lagging) - https://phabricator.wikimedia.org/T160731#3581627 (jcrespo) @Cmjohnson We are going to decom db1048 (but we are not ready yet), please do not take any action here, we will just clone it and ask you to unrack it. Opened for DBA...
[19:39:44] DBA: Run pt-table-checksum on s4 (commonswiki) - https://phabricator.wikimedia.org/T162593#3581648 (jcrespo) I thought it was going to be fast, but db1056 took a whole day. db1053 is next. Hopefully the rest will not take as much.
[19:41:30] DBA, Operations, Phabricator: Decom db1048 (BBU Faulty - slave lagging) - https://phabricator.wikimedia.org/T160731#3581657 (Cmjohnson) no worries, I was just moving it to a lower priority for me..I am a couple of weeks away from tackling decoms
[20:03:53] DBA, Cloud-Services: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#3581790 (bd808)
[20:03:57] DBA, Cloud-Services, Cloud-VPS, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#3581789 (bd808)
[20:04:16] DBA, Cloud-Services: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#2546917 (bd808)
[20:04:18] DBA, Cloud-Services, Cloud-VPS, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#2475959 (bd808)
[20:05:08] DBA, Cloud-Services: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#2546917 (bd808)
[20:05:11] DBA, Cloud-Services, Cloud-VPS, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#2475959 (bd808)
[20:10:18] DBA, Data-Services, cloud-services-team: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#3581840 (bd808)
[20:20:55] DBA, Operations, Scoring-platform-team, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3581887 (bd808)
[20:20:58] DBA, Data-Services, cloud-services-team: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#3581886 (bd808)
[20:23:44] DBA, Operations, Phabricator, hardware-requests: Decom db1048 (BBU Faulty - slave lagging) - https://phabricator.wikimedia.org/T160731#3581897 (Peachey88)
[21:11:40] DBA, MediaWiki-Database, Patch-For-Review, PostgreSQL, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3582158 (Reedy)
[22:25:42] DBA, Data-Services, cloud-services-team (Kanban): Create and announce timeline for shutting down labsdb100[13] - https://phabricator.wikimedia.org/T175086#3582412 (bd808)
[23:08:17] DBA, Data-Services, cloud-services-team: Identify tools hosting databases on labsdb100[13] and notify maintainers - https://phabricator.wikimedia.org/T175096#3582626 (bd808)