[05:50:04] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for id_internalwikimedia - https://phabricator.wikimedia.org/T196748#4270895 (Marostegui) a:Marostegui>None All sanitarium hosts have been restarted and they've picked up the filters. This is now fine fr...
[05:53:39] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#4270908 (Marostegui)
[05:53:46] DBA, Operations, Goal, Patch-For-Review: Convert all sanitarium hosts to multi-instance and increase its reliability/redundancy - https://phabricator.wikimedia.org/T190704#4270902 (Marostegui) Open>Resolved a:Marostegui Everything has been fine for more than a week now (including the...
[05:55:13] DBA, Operations, Goal, Patch-For-Review: Convert all sanitarium hosts to multi-instance and increase its reliability/redundancy - https://phabricator.wikimedia.org/T190704#4270909 (Marostegui)
[05:56:02] DBA, Operations, Goal, Patch-For-Review: Convert all sanitarium hosts to multi-instance and increase its reliability/redundancy - https://phabricator.wikimedia.org/T190704#4245083 (Marostegui)
[05:57:55] DBA, MediaWiki-API, MediaWiki-Database, MW-1.29-release-notes, and 4 others: ApiQueryExtLinksUsage::run query has crazy limit - https://phabricator.wikimedia.org/T59176#4270911 (Marostegui)
[06:05:17] DBA, MediaWiki-Platform-Team (MWPT-Q4-Apr-Jun-2018), Patch-For-Review, Schema-change: Schema change to make archive.ar_rev_id NOT NULL - https://phabricator.wikimedia.org/T191316#4270912 (Marostegui)
[06:05:38] DBA, Multi-Content-Revisions, Patch-For-Review, Schema-change: Schema change to drop archive.ar_text and archive.ar_flags - https://phabricator.wikimedia.org/T192926#4270913 (Marostegui)
[06:05:54] Blocked-on-schema-change, MediaWiki-Change-tagging, MediaWiki-Database, Patch-For-Review, Wikidata-Ministry-Of-Magic:
Schema change for ct_tag_id field to change_tag - https://phabricator.wikimedia.org/T195193#4270914 (Marostegui)
[06:34:06] DBA, Commons, MediaWiki-API, MediaWiki-Database, and 2 others: API: Contributions: Database query error - https://phabricator.wikimedia.org/T131065#4270947 (Marostegui) Open>Resolved a:Anomie >>! In T131065#2495366, @gerritbot wrote: > Change 280047 merged by jenkins-bot: > API: Use r...
[06:46:26] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4270961 (Marostegui) @Bstorm please try again on labsdb1010, I have added the `GRANT` grant ``` | GRANT ALL PRIVILEGES ON `%wik%\_p`.* TO 'maintainview...
[07:30:24] DBA, MediaWiki-Platform-Team (MWPT-Q4-Apr-Jun-2018), Patch-For-Review, Schema-change: Schema change to make archive.ar_rev_id NOT NULL - https://phabricator.wikimedia.org/T191316#4270981 (Marostegui)
[07:30:34] DBA, Multi-Content-Revisions, Patch-For-Review, Schema-change: Schema change to drop archive.ar_text and archive.ar_flags - https://phabricator.wikimedia.org/T192926#4270982 (Marostegui)
[07:30:41] Blocked-on-schema-change, MediaWiki-Change-tagging, MediaWiki-Database, Patch-For-Review, Wikidata-Ministry-Of-Magic: Schema change for ct_tag_id field to change_tag - https://phabricator.wikimedia.org/T195193#4270983 (Marostegui)
[07:30:46] Blocked-on-schema-change, DBA, Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737#4270984 (Marostegui)
[07:32:48] DBA, MediaWiki-Platform-Team (MWPT-Q4-Apr-Jun-2018), Patch-For-Review, Schema-change: Schema change to make archive.ar_rev_id NOT NULL - https://phabricator.wikimedia.org/T191316#4270987 (Marostegui) s4 eqiad progress [] labsdb1009 [] labsdb1010 [] labsdb1011 [] db1102 [] db1125 [] dbstore1002...
[07:32:52] DBA, Multi-Content-Revisions, Patch-For-Review, Schema-change: Schema change to drop archive.ar_text and archive.ar_flags - https://phabricator.wikimedia.org/T192926#4270988 (Marostegui) s4 eqiad progress [] labsdb1009 [] labsdb1010 [] labsdb1011 [] db1102 [] db1125 [] dbstore1002 [] db1081 []...
[07:32:56] Blocked-on-schema-change, MediaWiki-Change-tagging, MediaWiki-Database, Patch-For-Review, Wikidata-Ministry-Of-Magic: Schema change for ct_tag_id field to change_tag - https://phabricator.wikimedia.org/T195193#4270989 (Marostegui) s4 eqiad progress [] labsdb1009 [] labsdb1010 [] labsdb1011...
[07:32:58] Blocked-on-schema-change, DBA, Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737#4270990 (Marostegui) s4 eqiad progress [] labsdb1009 [] labsdb1010 [] labsdb1011 [] db1102 [] db1125 [] dbstore1002 [] db1081 [] db1084 [] d...
[07:33:20] Blocked-on-schema-change, DBA, Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737#4270991 (Marostegui)
[07:33:38] Blocked-on-schema-change, MediaWiki-Change-tagging, MediaWiki-Database, Patch-For-Review, Wikidata-Ministry-Of-Magic: Schema change for ct_tag_id field to change_tag - https://phabricator.wikimedia.org/T195193#4270992 (Marostegui)
[07:34:01] DBA, Multi-Content-Revisions, Patch-For-Review, Schema-change: Schema change to drop archive.ar_text and archive.ar_flags - https://phabricator.wikimedia.org/T192926#4270993 (Marostegui)
[07:34:16] DBA, MediaWiki-Platform-Team (MWPT-Q4-Apr-Jun-2018), Patch-For-Review, Schema-change: Schema change to make archive.ar_rev_id NOT NULL - https://phabricator.wikimedia.org/T191316#4270994 (Marostegui)
[08:09:46] DBA, Patch-For-Review: Failover s2 primary master - https://phabricator.wikimedia.org/T194870#4271051 (Marostegui) The draft with the steps
and the patches is now done. @jcrespo please review them! Thanks
[08:12:19] DBA, Patch-For-Review: MariaDB missing logrotate for error and slow logs - https://phabricator.wikimedia.org/T127636#4271053 (Marostegui) Open>declined After my chat with @Volans this is no longer needed as with the migration to stretch logs are no longer stored on `/srv/` as they are being handl...
[08:44:09] DBA, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for id_internalwikimedia - https://phabricator.wikimedia.org/T196748#4271117 (Marostegui) Removing cloud-services as there is nothing for them to do here as it is private.
[09:59:44] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#4271324 (elukey) Just wanted to confirm that the Analytics team will go forward with the replacement of dbstore1002 with a multi instance set up during the upcoming fiscal year. I'll...
[10:00:20] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#4271328 (Marostegui) Awesome news! Thanks for the heads up!
[10:57:04] DBA, Analytics, WMDE-Analytics-Engineering: Replicate wikitech wikis to analytics-store.eqiad.wmnet - https://phabricator.wikimedia.org/T126218#4271566 (Addshore) Open>stalled Scripts like this https://github.com/wikimedia/analytics-wmde-scripts/blob/9a4c17cd89000eb52c50ed5c50dbad92b7c8adf8/s...
[11:06:34] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4271595 (Marostegui)
[11:07:39] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4271598 (Paladox) This https://phabricator.wikimedia.org/D1067 will fix it so no more new notedb refs are cloned.
[11:09:11] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4271605 (Marostegui) >>! In T196840#4271598, @Paladox wrote: > This https://phabricator.wikimedia.org/D1067 will fix it so no more new notedb refs are cloned. When are you p...
[11:09:54] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4271614 (Paladox) Need someone to approve it, merge it and then i think @mmodell would have to deploy it.
[11:10:27] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4271615 (Marostegui) Excellent - thanks! :)
[13:01:14] DBA, Analytics, WMDE-Analytics-Engineering: Replicate wikitech wikis to analytics-store.eqiad.wmnet - https://phabricator.wikimedia.org/T126218#4271939 (Marostegui) stalled>declined As per our IRC chat
[14:15:20] hi! I was catching up on meeting notes today, FYI mydumper upstream is on github now so https://bugs.launchpad.net/mydumper/+bug/1558164 might go unnoticed
[14:16:50] I think jaime also opened it on github
[14:17:31] https://github.com/maxbube/mydumper/issues/110
[14:17:33] he reused that
[14:17:45] he actually talked to the maintainer I reckon
[14:19:20] ah! even better, thanks :))
[14:30:04] DBA, Operations, ops-codfw: replace bad disk in db2059 - https://phabricator.wikimedia.org/T196709#4272315 (Papaul) a:Papaul>Marostegui Disk replaced
[14:31:04] DBA, Operations, ops-codfw: replace bad disk in db2059 - https://phabricator.wikimedia.org/T196709#4272318 (Marostegui) Thanks!
``` physicaldrive 1I:1:12 (port 1I:box 1:bay 12, SAS, 600 GB, Rebuilding) ``` Will report back once it is done
[15:35:26] DBA, Operations, ops-eqiad: Bad disk on db1063 - https://phabricator.wikimedia.org/T196804#4272583 (Marostegui) Disk replaced by @Cmjohnson and RAID rebuilding: ``` root@db1063:~# megacli -PDRbld -ShowProg -PhysDrv [32:3] -aALL Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 2% in 1 Min...
[15:39:32] DBA, Operations, ops-eqiad: Bad disk on db1065 - https://phabricator.wikimedia.org/T196806#4272605 (Marostegui) Disk replaced by @Cmjohnson and now rebuilding: ``` root@db1065:~# megacli -PDRbld -ShowProg -PhysDrv [32:1] -aALL Rebuild Progress on Device at Enclosure 32, Slot 1 Completed 1% in 1 Minu...
[15:54:41] DBA, Operations, ops-eqiad: Bad disk on db1063 - https://phabricator.wikimedia.org/T196804#4272710 (Marostegui)
[15:55:15] DBA, Operations, ops-eqiad: Bad disk on db1065 - https://phabricator.wikimedia.org/T196806#4272715 (Marostegui)
[16:09:12] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272778 (Bstorm) I removed the idwikimedia_p database on labsdb1010, then I ran the script with debug mode on (so I am damned sure where it dies next t...
[16:11:41] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272802 (Marostegui) >>! In T193187#4272778, @Bstorm wrote: > I removed the idwikimedia_p database on labsdb1010, then I ran the script with debug mode...
[16:12:48] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272805 (Bstorm) I certainly can.
Running it for T196362
[16:12:57] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272806 (Marostegui) Grants for `idwikimedia_p` look fine as I created that one manually, let's try with another wiki only on labsdb1010.
[16:14:03] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272807 (Bstorm) *sigh* It failed on CREATE. I have no idea how or why: ``` 2018-06-11 16:13:01,612 DEBUG Removing 0 dbs as sensitive 2018-06-11 16:...
[16:15:01] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272808 (Bstorm) It should have created the grants for it, though :)
[16:16:40] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272842 (Marostegui) The grants look good: ``` | GRANT SELECT, SHOW VIEW ON `sahwikiquote\_p`.* TO 'labsdbuser' ``` I did a `FLUSH PRIVILEGES` just in...
[16:19:10] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272846 (Bstorm) Fascinating! It died on the GRANT this time (possibly since the grant is already in place?). The additional backslash should just be...
[16:21:35] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272849 (Marostegui) Yeah, I was wondering how you'd handle the fact that the grant is already added. No other grants were added, so can you try to ski...
[16:22:43] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4272850 (Bstorm) Yeah, that's easy enough to add. Let me try that.
[16:39:34] DBA, Analytics, EventBus, MediaWiki-Categories, and 3 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4272921 (Anomie) Digging into this deeper, I see an increase in database deadlocks at the end of...
[16:41:09] DBA, Operations, ops-eqiad: Bad disk on db1065 - https://phabricator.wikimedia.org/T196806#4272925 (Marostegui) The disk finished its rebuild, but unfortunately has lots of errors and a SMART alert too, so we need a new one :( ``` Predictive Failure Count: 1 Last Predictive Failure Event Seq Number: 6...
[16:45:49] DBA, Operations, ops-eqiad: Bad disk on db1063 - https://phabricator.wikimedia.org/T196804#4272932 (Marostegui) Open>Resolved All looking good! ``` Drive has flagged a S.M.A.R.T alert : No Drive has flagged a S.M.A.R.T alert : No Drive has flagged a S.M.A.R.T alert : No Drive has flagged a S....
[16:53:55] DBA, Operations, ops-codfw: replace bad disk in db2059 - https://phabricator.wikimedia.org/T196709#4272971 (Marostegui) Open>Resolved All went good! ``` logicaldrive 1 (3.3 TB, RAID 1+0, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 600 GB, OK) physicaldrive 1I:1:2 (p...
[17:59:01] DBA, Gerrit, Operations, Phabricator: Massive increase of writes in m3 section - https://phabricator.wikimedia.org/T196840#4273195 (mmodell) I'm going to stop phd and attempt to clear out the backlog from the queue (it's a lot of useless updates that we don't need to write to the db ultimately)
[18:20:00] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4273229 (Marostegui) @Bstorm and myself are trying to debug the issue and it is all very weird: sometimes commands work, sometimes they don't, this is a...
[18:21:22] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4273272 (Marostegui) More food for thought: 1) Create the grant as root 2) connect as maintainviews and play with the role (revoking+granting) works 3)...
[18:24:19] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4273281 (Marostegui) Looks like a bug?: ``` root@labsdb1010:~# mysql --skip-ssl -uroot Welcome to the MariaDB monitor. Commands end with ; or \g. Yo...
[18:38:29] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4273356 (Bstorm) Views and indexes are all set for this one.
[18:39:47] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4273361 (Marostegui) Open>Resolved We are troubleshooting a view creation issue (https://phabricator.wikimedia.org/T193187) but for this one, we have done it manually running: ```...
[18:48:30] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4253485 (Urbanecm) @Marostegui Are you sure it is working?
``` urbanecm@tools-bastion-02 ~ $ sql --cluster=analytics bnwikivoyage Could not find requested database Make sure to ask for a d...
[18:49:17] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4273381 (Bstorm) Sorry, I have one more step. Let me run that. There's an additional script I haven't yet run.
[18:52:15] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4273388 (Bstorm) There, the last script triggers a few more actions, relating to DNS. Give it a few minutes and try again.
[20:17:55] DBA, Cloud-Services, Patch-For-Review, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4273679 (Marostegui) I have been doing more tests with this and it really looks like something weird is going on, so I have filed a MariaDB bug: https:...