[06:16:52] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2691692 (Marostegui) Hey @Cmjohnson Looks like the disk failed again (same slot), could this be the disk bay or even worse...the controller itself? ``` Adapter #0 Enclosure Device ID: 32 Slot Number: 0...
[06:19:55] DBA, Operations, ops-eqiad: db1065: Degraded RAID - https://phabricator.wikimedia.org/T147396#2691693 (Marostegui)
[06:25:44] DBA, MediaWiki-General-or-Unknown, Operations, Patch-For-Review: img_metadata queries for PDF files saturates s4 slaves - https://phabricator.wikimedia.org/T147296#2691706 (Marostegui) Thanks @aaron - once it is pushed I will keep an eye on the graphs to see if this mitigate the spikes
[06:59:25] DBA, Operations, ops-codfw: db2017 failed disk (degraded RAID) - https://phabricator.wikimedia.org/T145844#2691717 (Marostegui) Thanks @Papaul, everything looks good now! ``` Device Present ================ Virtual Drives : 1 Degraded : 0 Offline...
[06:59:37] DBA, Operations, ops-codfw: db2017 failed disk (degraded RAID) - https://phabricator.wikimedia.org/T145844#2691718 (Marostegui) Open>Resolved
[07:10:00] DBA: Unify commonswiki.revision - https://phabricator.wikimedia.org/T147305#2691723 (Marostegui) a:Marostegui
[07:16:27] DBA, Analytics-EventLogging, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2691729 (Marostegui) I am tempted to close this ticket as the only table that is still in use (39 records now) is: ImageMetricsCorsSu...
[08:25:03] DBA, Operations, Patch-For-Review: db1019: Decommission - https://phabricator.wikimedia.org/T146265#2691818 (Marostegui) It still needs to be deleted from DNS now that I think about it. We can probably do this at the same time it gets delete from the array of hosts of db-equiad|codfw.php files.
[09:55:33] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2692088 (Marostegui) I have been trying to debug the issue to see if there is a disk bay or controller problem. Unfortunately the drac isn't giving much information after checking every single checkable tar...
[11:15:21] DBA, Operations, Patch-For-Review: db1019: Decommission - https://phabricator.wikimedia.org/T146265#2692192 (jcrespo) >>! In T146265#2688549, @Marostegui wrote: > @jcrespo can you confirm if it can be deleted from there without breaking the site as the header states? :-) Create a CR, add me as a rev...
[12:05:57] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2692272 (Cmjohnson) More than likely it's the disk. These are repurposed disks, I will replace again.
[12:07:08] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2692274 (Marostegui) Thanks!
[12:14:42] DBA, MediaWiki-Database, Patch-For-Review, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#2692285 (MarcoAurelio)
[12:27:59] marostegui: about?
[12:28:32] chasemp: o/
[12:28:45] heyoo! quick ping on https://phabricator.wikimedia.org/T147413
[12:28:48] checking
[12:28:55] simple but wanted to catch your attention
[12:31:21] chasemp: It would make sense to drop the view from the scripts (so we are sure they are dropped everywhere) and of course amend them as you said
[12:31:50] Can someone access those views now?
[12:31:58] ie: leak content?
[12:32:04] marostegui: I'm not in a place where I feel comfortable running either script against the setup, so I was hoping one of you gents might drop that manually on the two in question and I'll amend the script
[12:32:08] yes afaik
[12:32:21] two servers in question: labsdb1001 and 1003
[12:32:26] let me check
[12:38:54] So they are only in jamwiki and adywiki?
[12:41:01] What is the name of the view the script creates?
[12:43:04] I am running show full tables in both databases and I cannot see a view (labsdb1001) so I am sure I am missing something :)
[12:43:48] Aaaah they are called jamwiki_p and adywiki_p
[12:43:51] right, I see the views there
[12:45:03] chasemp: I can drop them from jamwiki_p and adywiki_p if you want, but are we sure that is not going to break anything?
[12:47:22] marostegui: anything it breaks shouldn't exist I think
[12:47:30] (sorry was distracted for a minute)
[12:47:37] No worries
[12:47:38] also this view was purged on all other wiki's
[12:48:17] Ok, so I can drop jamwiki_p.abuse_filter_history and adywiki_p.abuse_filter_history
[12:48:27] yes
[12:48:45] ok, give me a sec
[12:49:52] done in labsdb1001
[12:49:54] going to *3 now
[12:50:11] done
[12:50:40] sweet thanks
[12:50:53] this highlights the lack of sanity around this process historically I think
[12:51:20] XDD
[12:51:26] Will you take care of the scripts then?
[12:51:30] sure will
[12:51:43] can you add commentary to the task on what you did etc?
[12:51:48] I just did :)
[12:51:54] nice
[12:52:00] let me know if you want more detail :)
[12:52:04] in case something does come out of it it's all on the up and up
[12:56:22] marostegui: with a "manual" intervention even if on the task it's probably legit to "!log" it in -ops, I imagine dropping view/tables manually isn't too common and the System Admin Log (SAL) is the first place ppl will check in case it started a slow fire for whatever reason
[12:56:41] I only mention it as an fyi
[12:56:52] chasemp: I did that too :-)
[12:57:00] totally slipped past me
[12:57:04] well played
[12:57:06] No worries, it is your morning :)
[12:57:14] very early yes
[12:57:15] Are you recovered from jetlag already by the way?
[12:57:34] last night I fell down dead at 9 pm and woke up at 3:30 am so probably the answer is 'no'
[12:57:54] hahahaha
[12:58:27] I'm getting back about an hour each day
[12:58:35] tomorrow 4:30 am here I come
[12:58:49] not too bad though
[12:58:50] You will be ready by the weekend then, well done
[12:59:14] Plus you get to overlap a bit more with the DBAs :p
[12:59:26] 6 am is a hard wakeup time w/ 2 small children under 10
[12:59:39] if someone isn't banging on something by then it's a miracle
[12:59:51] hahahaha
[14:34:17] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2692766 (Cmjohnson) swapped the disk (again)
[14:48:06] DBA, Operations, ops-eqiad: db1055: degraded array - https://phabricator.wikimedia.org/T147172#2692931 (Marostegui) Thanks - I will check tomorrow if it built successfully
[16:59:33] DBA, Labs, Labs-Infrastructure, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2693879 (chasemp) I feel comfortable that https://gerrit.wikimedia.org/r/#/c/295607/ is a replication of https://github.com/wi...
[17:17:31] marostegui: still about?
https://gerrit.wikimedia.org/r/#/c/314305/
[18:52:03] DBA, MediaWiki-Database, Patch-For-Review, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#2694367 (greg)
[20:18:34] DBA, Labs, Labs-Infrastructure, Blocked-on-Operations: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2694905 (AlexMonk-WMF) a:AlexMonk-WMF>chasemp Chase is working on figuring out what else we need to do before we can run the script. https:/...