[06:13:27] DBA: Install 1 buster+10.4 host per section - https://phabricator.wikimedia.org/T246604 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1098.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202003040613_marostegui_252219.log`.
[06:17:33] DBA: Install 1 buster+10.4 host per section - https://phabricator.wikimedia.org/T246604 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1098.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202003040617_marostegui_252857.log`.
[06:36:44] DBA: Install 1 buster+10.4 host per section - https://phabricator.wikimedia.org/T246604 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1098.eqiad.wmnet'] ` and were **ALL** successful.
[06:45:48] DBA: Install 1 buster+10.4 host per section - https://phabricator.wikimedia.org/T246604 (Marostegui)
[06:47:31] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui)
[06:48:35] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui)
[08:10:15] o/
[08:10:31] marostegui: if it's okay with you i'll start cache warming, and do a config deploy after you do your things? :)
[08:11:00] sounds good
[08:11:09] grand!
[09:46:31] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui) es2 after read_only is set on es1015 and pt-heartbeat stopped: ` root@...
[10:07:23] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui)
[10:34:24] going from 20 -> 25 million in the next mins
[10:34:41] ok
[10:38:16] now at 25 million
[10:39:07] this was the load decrease as the cache warming came to a stop for that 5 million btw
[10:39:08] https://usercontent.irccloud-cdn.com/file/sHa97Sa3/image.png
[10:40:58] all looks good to me
[10:41:04] I'll start warming 25-30 million
[10:43:28] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui) es2 was set to read-only successfully. We will be deploying es5 on T...
[10:47:41] I've just executed "insert into masters values ('es4', 'eqiad', 'es1020'), ('es4', 'codfw', 'es2020');" on zarcillo too, I forgot to do that yesterday
[10:47:57] it wasn't there?
[10:48:00] ah, on the masters table
[10:48:07] I will do that for es5 too
[10:48:08] thanks
[10:50:43] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (jcrespo) Prometheus config after updating the script and the db: `lines=10 - labe...
[10:51:33] Down to around 20k ops on the wb_terms table now (for reads) :D
[10:52:37] nice!!
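(Editor's note on the 10:47 exchange: the manual "insert into masters" was a step forgotten the day before and had to be repeated for es5. A minimal sketch of an idempotent version of that step follows. It is not the team's tooling: the column names section, dc and instance and the helper ensure_masters are hypothetical, and an in-memory SQLite database stands in for the real inventory database, whose schema is not shown in this log.)

```python
import sqlite3


def ensure_masters(conn, rows):
    """Insert (section, dc, instance) rows that are missing; return how many were added.

    The table layout is an assumption made for illustration only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS masters ("
        "section TEXT NOT NULL, dc TEXT NOT NULL, instance TEXT NOT NULL, "
        "PRIMARY KEY (section, dc))"
    )
    added = 0
    for section, dc, instance in rows:
        # INSERT OR IGNORE skips rows whose (section, dc) key already exists,
        # so re-running the step after a forgotten or partial run is harmless.
        cur = conn.execute(
            "INSERT OR IGNORE INTO masters (section, dc, instance) VALUES (?, ?, ?)",
            (section, dc, instance),
        )
        added += cur.rowcount
    conn.commit()
    return added


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    es4 = [("es4", "eqiad", "es1020"), ("es4", "codfw", "es2020")]
    print(ensure_masters(conn, es4))  # rows added on the first run
    print(ensure_masters(conn, es4))  # rows added on a repeat run
```

(On MariaDB the rough equivalent of INSERT OR IGNORE would be INSERT IGNORE, which likewise depends on a unique key covering the section and datacenter.)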
[12:47:42] snapshotting: 0 failures since fix
[13:18:09] I fixed db1098's "Check systemd state"
[13:20:39] DBA, Cleanup: Drop DB tables for now-deleted fixcopyrightwiki from production - https://phabricator.wikimedia.org/T246055 (Marostegui) In order to avoid this alert from firing, I have dropped this database on db1112 and db2074 (sanitarium masters) with replication enabled, so it has been dropped from san...
[13:20:46] jynus: oh, you merged the change?
[13:22:31] not yet, did it manually so not to lose metrics
[13:22:43] there is also one issue with the package
[13:23:07] which enables prometheus-mysqld-exporter (without @) automatically after install
[13:23:10] with the exporter package?
[13:23:13] and it fails
[13:23:15] ah
[13:23:16] yes
[13:23:19] on multisource
[13:23:23] for that I just manually disable it
[13:23:25] I think we saw that some time ago
[13:23:26] yeah
[13:23:30] and then reset failed
[13:23:35] we could force it on puppet
[13:23:41] but never bothered
[13:23:52] yeah, I think that's what we did last time, disable too
[13:23:56] for multi instance
[13:25:44] so all this is a pet peeve of mine of having as less alerts for db hosts as possible
[13:26:00] *few
[13:26:11] so I can catch when there are new ones
[13:26:39] I may research later to make the read only (and by extension, wmfmariadbpy) on percona
[13:26:50] make the read only?
[13:27:26] sorry, there is only 1 red db-related icinga check
[13:27:44] db1114 MariaDB read only test-s1
[13:27:48] that is my fault
[13:28:00] will have a look at it at some point
[13:28:16] ah, sorry, didn't get what you meant
[13:28:31] yeah, just fix that one if you are bored, not in a rush at all for that one
[13:28:51] as I said, it is a personal fight of mine, indeed not important
[13:28:54] 0:-)
[13:29:15] haha
[13:29:19] I hope you win
[13:29:53] it makes me win time, because otherwise I see icinga and have to remember why things are red, every morning
[13:37:16] s/win/save/
[14:26:39] marostegui: where did you see the deadlocks? in logstash?
[14:26:47] yeah
[14:27:41] aaha yeah, i found them!
[14:27:51] I was going to share an example with you
[14:32:32] so, only locks on wbt_text ?
[14:32:48] that i can see
[15:25:24] DBA, Data-Services, WMF-Legal, cloud-services-team (Kanban): Expose ar_content_format and ar_content_model columns of archive table on Labs replicas - https://phabricator.wikimedia.org/T89741 (Bstorm) Open→Resolved This is definitely exposed on replicas, however it appears like it is actu...
[16:21:35] * addshore watches another wikidata edit spike and looks for deadlocks
[16:32:51] There are still a few deadlocks
[16:32:57] * addshore has made and will probably backport https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/576883/
[21:25:29] DBA, Cloud-Services, Core Platform Team, CPT Initiatives (Developer Portal): Prepare and check storage layer for dev.wikimedia.org - https://phabricator.wikimedia.org/T246946 (WDoranWMF)
[21:26:56] DBA, Data-Services, cloud-services-team (Kanban): Add systemd timer to run `maintain-meta_p` daily on all Wiki Replica servers - https://phabricator.wikimedia.org/T246948 (bd808)
[21:44:52] DBA, Data-Services, cloud-services-team (Kanban): Prepare and check storage layer for ngwikimedia - https://phabricator.wikimedia.org/T240772 (bd808) Open→Resolved a:bd808 `lines=10 $ sql ngwikimedia_p -- show tables More than one argument given; joining SQL query words with spaces. +----...
[22:06:32] DBA, Data-Services, cloud-services-team (Kanban): Add systemd timer to run `maintain-meta_p` daily on all Wiki Replica servers - https://phabricator.wikimedia.org/T246948 (Bstorm) I know that some tools (or at least one because I was pinged about it last time) will start trying to poll the databases...