[07:26:19] DBA, Patch-For-Review: Decommission db1039 - https://phabricator.wikimedia.org/T184262#3881874 (Marostegui)
[07:27:08] DBA, Operations, ops-eqiad, Patch-For-Review: Decommission db1039 - https://phabricator.wikimedia.org/T184262#3881876 (Marostegui) a:Marostegui>Cmjohnson db1039 is now ready to be decommissioned by @Cmjohnson
[08:04:58] DBA: Decommission db1030 - https://phabricator.wikimedia.org/T184397#3881935 (Marostegui)
[08:05:19] DBA: Decommission db1030 - https://phabricator.wikimedia.org/T184397#3881946 (Marostegui) p:Triage>Normal
[08:05:52] DBA: Decommission db1030 - https://phabricator.wikimedia.org/T184397#3881935 (Marostegui)
[08:05:56] DBA, Operations, Goal, Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3881949 (Marostegui)
[08:10:08] DBA, Operations, hardware-requests, ops-eqiad, Patch-For-Review: Decommission db1039 - https://phabricator.wikimedia.org/T184262#3881976 (Marostegui)
[08:32:31] DBA: db1011 possibly faulty BBU - https://phabricator.wikimedia.org/T184401#3882006 (Marostegui)
[08:32:42] DBA: db1011 possibly faulty BBU - https://phabricator.wikimedia.org/T184401#3882016 (Marostegui) p:Triage>Normal
[08:34:44] es1017 is at almost 90% disk usage, I am going to see why
[08:35:10] maybe it has backups?
[08:36:19] ah yeah, that srv/tmp
[09:13:52] DBA, Operations, ops-eqiad: db1059 possibly BBU issues - https://phabricator.wikimedia.org/T184160#3882079 (Marostegui) `˜/icinga-wm 10:13> PROBLEM - MegaRAID on db1059 is CRITICAL: CRITICAL: 1 LD(s) must have write cache policy WriteBack, currently using: WriteThrough`
[09:14:42] hello people! As FYI maintenance on db1107 is completed and Eventlogging is back to push data normally. Today I'll start the el purging/sanitization script again on db1107 and it should work fine this time
[09:19:28] did you try BBU force on db1011 already?
[09:19:56] I see now you did
[09:20:01] I will ack the alert
[09:20:16] thanks :)
[09:23:59] did you also see my update on meltdown?
[09:24:35] I think that makes upgrade to stretch a hard dependency
[09:25:27] The last one I read was one that says around 10% regression, is that the last?
[09:25:54] yes
[09:26:03] yeah, read it
[09:26:11] still a lot better than what I was afraid of
[09:26:41] well, people discussed it wouldn't be the worst case scenario
[09:30:48] The archeology on enwiki.archive looks like it is going to be a fun one :-(
[09:31:14] At least it is a goal so it has priority :-)
[09:39:04] we have a 4.9 kernel on jessie as well (and it also supports PCID)
[09:39:23] so this only leaves trusty in the dark (but those are being phased out anyway)
[09:40:25] oh, that is good news - although I would try to upgrade if we have to do a reboot anyway
[09:42:55] sure, makes sense
[10:13:44] DBA, Data-Services: Create backups of user tables from decommissioned database servers - https://phabricator.wikimedia.org/T183758#3882264 (Marostegui) I have backed up all the `_p` databases listed on T183758#3879024 ``` ls -lh labsdb1003/ total 4.8G -rw-r--r-- 1 root root 5.9M Jan 8 08:48 p50380g50491...
[10:39:53] <_joe_> I was about to ask you people if you were looking at https://phabricator.wikimedia.org/T12331, but I see you're already on it :)
[10:45:40] DBA, Patch-For-Review: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807#3882309 (Marostegui) First iteration reveals drifts on: ``` archive change_tag oldimage ores_classification tag_summary text user_newtalk
[10:46:08] Blocked-on-schema-change, DBA, Data-Services, Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3882310 (Marostegui) s7 master is done
[10:46:41] Blocked-on-schema-change, DBA, Data-Services, Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3882311 (Marostegui)
[15:22:57] "Timed out waiting on db1073 pos db1052-bin.005274/302521201"
[15:23:23] that is unrelated to wikidata
[15:23:30] something weird is going on
[15:35:48] could that be a "normal" thing
[15:35:51] as in, we just didn't notice it?
[15:36:28] it could be pre-existing
[15:36:35] but I am not sure it is normal
[15:36:47] yeah, normal as "it has been there we just didn't notice"
[15:51:53] I am going to do some full table scans on the s8 hosts
[16:03:29] DBA, Data-Services, Dumps-Generation, MediaWiki-Platform-Team: Configure Toolforge replica views and dumps for the new MCR tables - https://phabricator.wikimedia.org/T184446#3883142 (Anomie) p:Triage>Normal
[16:12:11] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: MCR schema migration stage 0: create tables - https://phabricator.wikimedia.org/T183486#3883178 (Anomie) #DBA: Since creating tables isn't classified as a schema change actually needing DBA intervention, do you have an...
[16:13:17] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: MCR schema migration stage 0: create tables - https://phabricator.wikimedia.org/T183486#3854821 (Marostegui) >>! In T183486#3883178, @Anomie wrote: > #DBA: Since creating tables isn't classified as a schema change actu...
[16:22:41] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: MCR schema migration stage 0: create tables - https://phabricator.wikimedia.org/T183486#3883251 (Anomie) Ok, I put it on the deployment page for [[https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180109...
[17:31:43] db2060 failed disk, I guess the RAID task will arrive soon
[17:35:03] yep, saw it
[17:36:33] lots of db2050+ disks failed?
[17:36:50] 2054, 2055 and 2060, are they from the same batch?
[17:36:57] probably
[17:37:04] I will check how old they are
[17:37:09] but probably not too old
[17:37:21] couple of years at most
[17:37:26] I think robh checked db2055 and the warranty expires 18th Jan or something like that
[17:38:11] db2060 expires on 14th jan
[17:38:48] I will put the task (when it arrives) with high priority so we can get it ordered before it expires :)
[17:41:57] I didn't know so soon
[17:42:09] ah, it makes sense, I set them up
[17:42:13] haha
[17:42:22] but I think they had been idle for some time without me knowing
[17:42:25] for a while
[17:42:29] when I started here
[17:44:21] DBA, Operations, ops-codfw: Degraded RAID on db2060 - https://phabricator.wikimedia.org/T184464#3883611 (Marostegui)
[18:58:35] DBA, Operations, ops-codfw: Degraded RAID on db2055 - https://phabricator.wikimedia.org/T184285#3883874 (Papaul) a:Papaul>Marostegui Disk replacement complete.
[19:59:42] DBA, Operations, ops-codfw: Degraded RAID on db2060 - https://phabricator.wikimedia.org/T184464#3884113 (Papaul) Dear Mr Papaul Tshibamba, Thank you for contacting Hewlett Packard Enterprise for your service request.
This email confirms your request for service and the details are below. Your reque...
[20:55:21] DBA, Data-Services: Create backups of user tables from decommissioned database servers - https://phabricator.wikimedia.org/T183758#3884267 (bd808) >>! In T183758#3882264, @Marostegui wrote: > However, I would like also to leave it on a server that #cloud-services-team own, so it is there too. Which one s...