[02:12:45] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3045390 (kaldari) p:Triage>Normal
[02:14:58] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3045392 (kaldari) @jcrespo: Does the approach in https://gerrit.wikimedia.org/r/339109 look sane?
[02:15:11] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3045393 (kaldari) a:kaldari
[02:20:13] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3045401 (Bawolff) If there are issues with changing the field size, we could potentially just hash it if it doesn't fit.
[06:42:10] DBA: Install and reimage dbstore1001 as jessie - https://phabricator.wikimedia.org/T153768#3045551 (Marostegui) Once the backups from today have been finished I will copy them to dbstore2001 as well and start the transfer of /srv/sqldata to one of the esXX servers within the same DC
[07:04:12] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3045611 (Marostegui) db2055 is done: ``` Table: revision Create Table: CREATE TABLE `revision` ( `rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT, `rev_pag...
[07:33:29] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3045680 (Marostegui) >>! In T132416#3043576, @EBernhardson wrote: > T158454 is about the rev_timestamp index on db1065's (vslow) enwiki database. Basically the Cirrus...
[07:46:02] DBA, fundraising-tech-ops: fundraising database tuning - https://phabricator.wikimedia.org/T158446#3045687 (Marostegui) >>! In T158446#3044038, @Jgreen wrote: > innodb_buffer_pool_size yep, I've increased that to 75-80% of system RAM. > > The tables are all innodb but the config dates back before we wer...
[09:25:17] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3045933 (Marostegui) I am running the following command now for s2 for the first iteration of testing with real data and real tables (I tested it first with just on...
[09:49:30] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3046055 (Marostegui) >>! In T130067#3043617, @jcrespo wrote: > There is one thing that worries me more- in your #...
[11:04:43] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3046174 (Marostegui) The first iteration of the above run finished: - No lag generated - 1 error with db1047 ``` 02-22T10:20:40 Skipping table nlwiki.protected_tit...
[11:05:52] DBA: db1045 disk space - compression needed - https://phabricator.wikimedia.org/T155399#3046175 (jcrespo) Open>Resolved All wiki tables have been compressed.
[11:06:44] ^ \o/
[12:56:32] DBA, Operations, ops-eqiad: Degraded RAID on db1049 - https://phabricator.wikimedia.org/T158761#3046470 (Marostegui)
[13:00:31] DBA, Operations, ops-eqiad: Degraded RAID on db1049 - https://phabricator.wikimedia.org/T158761#3046481 (Marostegui) p:Triage>High a:Cmjohnson This is correct, that disk is broken: ``` Enclosure Device ID: 32 Slot Number: 4 Drive's position: DiskGroup: 0, Span: 2, Arm: 0 Enclosure positio...
[13:11:22] DBA, Monitoring: m1 slaves all broke replication due to bacula.DelCandidates temporary table - https://phabricator.wikimedia.org/T158764#3046529 (jcrespo)
[13:11:26] ^
[13:40:51] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3046584 (Marostegui) So, I have done some testing with the data itself. - Deleted 22 rows from the master and th...
[13:43:57] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3046585 (Marostegui) For the record: Replication got broken on db1069 (sanitarium) with: ``` Last_Error: Error 'Table 'nlwiki.pr_index' doesn't e...
[14:00:38] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3046640 (jcrespo) Maybe we should ignore all edits on '%.__wmf_checksums' ? After all, if writes happen on the master, this method will not catch differences due to...
[14:20:33] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3046687 (Marostegui) >>! In T154485#3046640, @jcrespo wrote: > Maybe we should ignore all edits on '%.__wmf_checksums' ? After all, if writes happen on the master,...
[14:33:07] marostegui, jynus: ok to restart apache on dbmonitor* for the openssl update? AFAICT it's just a few internal users, right?
[14:34:09] moritzm: works for me
[14:34:31] k, doing that now
[14:34:53] done
[14:36:38] :)
[14:37:25] it is not in production, I think
[15:28:14] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3046849 (Marostegui) I have restarted db1069 and db1095 and they have the new filter applied: ``` %.__wmf_checksums ```
[15:54:46] DBA, fundraising-tech-ops: fundraising database tuning - https://phabricator.wikimedia.org/T158446#3046880 (Jgreen) >>! In T158446#3045687, @Marostegui wrote: >>>! In T158446#3044038, @Jgreen wrote: >> innodb_buffer_pool_size yep, I've increased that to 75-80% of system RAM. >> >> The tables are all inn...
[15:54:59] DBA, fundraising-tech-ops: fundraising database tuning - https://phabricator.wikimedia.org/T158446#3046881 (Jgreen) p:Triage>Normal
[15:55:50] DBA, Operations, ops-eqiad: Degraded RAID on db1049 - https://phabricator.wikimedia.org/T158761#3046882 (Cmjohnson) The disk has been swapped and is rebuilding Enclosure Device ID: 32 Slot Number: 4 Drive's position: DiskGroup: 0, Span: 2, Arm: 0 Enclosure position: N/A Device Id: 4 WWN: 5000C5005E8...
[15:56:19] DBA, Operations, ops-eqiad: Degraded RAID on db1049 - https://phabricator.wikimedia.org/T158761#3046883 (Marostegui) Thanks Chris, I will keep an eye on it and close the ticket once it is finished!
[16:19:15] backups will finish soon-ish. I would stop the db when it finally does and start copying it away so it is done during the night
[16:19:30] yep
[16:19:33] that was the plan :)
[16:20:33] will copy the db during the night and those backups in the morning
[16:20:34] it is going by ss
[16:20:46] by ss?
[16:20:51] sswiki
[16:20:54] ah
[16:20:59] alphabetically
[16:21:01] yeah, I checked earlier and it was with nowiki i believe
[17:27:51] DBA, Monitoring: m1 slaves all broke replication due to bacula.DelCandidates temporary table - https://phabricator.wikimedia.org/T158764#3047236 (jcrespo) a:jcrespo
[17:36:46] DBA, Labs, Labs-Infrastructure, Operations: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3047251 (jcrespo) The user impact/dependencies are not 100% clear for this maintenance, which will be the long one (maybe a couple of days), so I requested help of some...
[18:01:51] DBA, Operations, ops-eqiad: Degraded RAID on db1049 - https://phabricator.wikimedia.org/T158761#3047356 (Marostegui) Open>Resolved All good now! Thanks! ``` root@db1049:~# megacli -PDRbld -ShowProg -PhysDrv [32:4] -aALL Device(Encl-32 Slot-4) is not in rebuild process Exit Code: 0x00 root@d...
[18:03:30] DBA, Operations, ops-eqiad, Patch-For-Review: Replace BBU for db1060 - https://phabricator.wikimedia.org/T158194#3047361 (Marostegui) Thanks @Cmjohnson - the BBU now looks good! ``` root@db1060:~# megacli -AdpBbuCmd -aAll BBU status for Adapter: 0 BatteryType: BBU Voltage: 3937 mV Current: 468...
[18:03:45] DBA, Operations, ops-eqiad, Patch-For-Review: Replace BBU for db1060 - https://phabricator.wikimedia.org/T158194#3047362 (Marostegui) Open>Resolved
[18:47:20] DBA, Operations, ops-eqiad, Patch-For-Review: Replace BBU for db1060 - https://phabricator.wikimedia.org/T158194#3047552 (Marostegui) Repooled db1060 with less weight (and still not serving API again) so it can warm up a bit.
[20:10:56] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3047844 (matmarex) I think this is way overkill. We only support a couple dozen collations. Even if we supported multiple collations f...
[20:19:52] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3047897 (thiemowmde) Just hash it, please, as I already suggested on the original patch. MD5 is perfectly fine for this use case. Secu...
[20:55:08] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3048045 (Bawolff) > Security does not matter here (no user input but configuration) and collisions are non-existing for such a small s...
[21:17:15] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3048124 (kaldari) Hashing it seems like a bad idea to me. We would be obfuscating the data on 100% of wikis in order to accommodate th...
[21:44:16] DBA, MediaWiki-Categories, Community-Tech-Sprint, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3048220 (Bawolff) We could just hash only in the case that the key is too long. I think ICU minor versions could cause the sort orde...