[05:12:14] DBA, Wikimedia-Site-requests, Patch-For-Review: advisorswiki is not in any s?.dblist - https://phabricator.wikimedia.org/T202904 (Marostegui) >>! In T202904#4550768, @Framawiki wrote: > Can this task be closed ? @Marostegui From the DB side it is all done but I will leave that question up to @Anomie...
[05:48:53] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui) Thanks! We will try to get this done today or tomorrow!
[07:14:44] Blocked-on-schema-change, MediaWiki-extensions-Translate, Language-2018-July-September: Apply schema change to translate_reviews in WMF - https://phabricator.wikimedia.org/T201011 (Marostegui) p:Triage>Normal
[07:17:12] Blocked-on-schema-change, MediaWiki-extensions-Translate, Language-2018-July-September: Apply schema change to translate_reviews in WMF - https://phabricator.wikimedia.org/T201011 (Marostegui) Quick note: this table is small enough (10MB on commons and wikidata) that this can probably be done directl...
[07:43:03] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui) I have done this on codfw (testwikidatawiki and wikidatawiki). S...
[07:43:58] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui)
[08:12:10] DBA, Cloud-Services: Prepare and check storage layer for fixcopyright.wikimedia.org - https://phabricator.wikimedia.org/T202820 (Marostegui) There is one row now there (I just checked the user table) and it is filtered correctly. I would, however, as Jaime said, wait a bit more until we have a few more r...
[08:46:27] DBA, Operations, ops-codfw: db2042 RAID battery failed - https://phabricator.wikimedia.org/T202051 (Marostegui) Open>stalled p:Triage>Normal I am marking this as Stalled and if no one objects I think we should proceed with T202051#4541285 leaving the RAID controller with WB enforced.
[08:50:07] DBA, MediaWiki-Database, PostgreSQL, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441 (Marostegui)
[08:50:13] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui)
[09:48:09] <_joe_> hi! is tendril still eqiad-only?
[09:48:55] what do you mean read only?
[09:49:08] you mean the DB?
[09:49:08] _joe_: yes replication makes it impossible to make it work crossdc
[09:49:20] tendril replacement will be both dcs
[09:49:29] but probably active-passive
[09:49:51] <_joe_> ok
[09:50:02] <_joe_> just checking https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/456639/2/cookbooks/sre/switchdc/mediawiki/08-update-tendril.py
[11:12:40] dear DBAs, would you like to add anything to this comment?
[11:12:42] https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/456510/2/cookbooks/sre/switchdc/mediawiki/03-set-db-readonly.py#15
[11:18:20] volans: I don't know about that comment- being a theoretical scenario- I don't think I have any argument before or after
[11:18:58] we are going to deploy soon tests for read only on core hosts https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/450228/
[11:19:08] so I guess we could do it only on from hosts?
[11:20:08] I think when it was first done, we said, what should happen, and I said "all hosts should be in read only"
[11:20:37] how we get there is not important- so do whatever is simpler, I guess?
[11:20:59] we can skip mysql.set_core_masters_readonly(args.dc_to)
[11:21:01] last time to be honest we set only the dc_from one but we verified both
[11:21:07] ok
[11:21:11] I'll skip it then
[11:21:22] it would be nice to verify both still?
[11:21:26] sure
[11:21:28] is it possible
[11:21:30] ?
[11:21:41] everything is possible :)
[11:21:44] :-)
[11:21:55] my point is that we have read only checks
[11:22:00] *will have
[11:22:09] that will prevent those issues in the first place
[11:22:24] however, those alerts will complain during failover
[11:22:32] we may need some custom icinga commands?
[11:22:49] (it doesn't have to happen for this switch however)
[11:23:40] some icinga helpers are surely in the TODO for spicerack in my vision, let's see if we can have something in time
[11:24:01] low prio
[11:24:11] for this, we will manually handle most of them
[11:25:15] it would be nice if we could add to the wiki page the list of icinga alerts to downtime during the switchover
[11:28:42] yep, I have not worked on that yet
[11:29:03] because we have some changes since last time
[11:29:15] for the most part, I think no db alerts should happen
[11:29:27] as the read only check has only been implemented on non-masters
[11:29:28] nice!
[11:29:47] but I need to recheck
[11:30:08] of course, we will have alerts if we have overloads
[11:30:13] but that is expected
[11:30:56] sure, we just don't want unnecessary noise
[13:17:11] DBA: Add support for socket path to redact_sanitarium.sh - https://phabricator.wikimedia.org/T203394 (Marostegui)
[13:17:38] banyek: if you want to take a look at ^, maybe I can give you an introduction of sanitarium+labs environment and you can get that small task?
[13:17:48] I already did that
[13:17:52] oh!
[13:17:54] <3
[13:18:00] DBA: Add support for socket path to redact_sanitarium.sh - https://phabricator.wikimedia.org/T203394 (Marostegui) p:Triage>Normal
[13:18:01] although we didn't go into details about how sanitarium works
[13:18:10] because the issue
[13:18:21] DBA: Add support for socket path to redact_sanitarium.sh - https://phabricator.wikimedia.org/T203394 (Marostegui)
[13:18:25] I preferred for him to solve the pt-kill-wikimedia stuff first
[13:18:35] Sure
[13:19:42] DBA, Schema-change: Drop externallinks.el_from_namespace on wmf databases - https://phabricator.wikimedia.org/T114117 (Marostegui) s7 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1002 [] db1125 [x] db1101 [] db1098 [] db1094 [] db1090 [] db1086 [] db1079 [] db1062
[13:19:47] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping rc_moved_to_title/rc_moved_to_ns on wmf databases - https://phabricator.wikimedia.org/T51191 (Marostegui) s7 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1002 [] db1125 [x] db1101 [] db1098 [] db1094...
[13:19:52] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping rc_cur_time on wmf databases - https://phabricator.wikimedia.org/T67448 (Marostegui) s7 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1002 [] db1125 [x] db1101 [] db1098 [] db1094 [] db1090 [] db1086 [...
[13:19:56] DBA: Add support for socket path and/or port (multiinstance support) to redact_sanitarium.sh - https://phabricator.wikimedia.org/T203394 (jcrespo)
[13:23:40] DBA, Cloud-Services: Prepare and check storage layer for fixcopyright.wikimedia.org - https://phabricator.wikimedia.org/T202820 (Marostegui) a:jcrespo>None I talked to Jaime about that existing user, and when he first sanitized it, the user wasn't there, so it means it was created after the trigg...
[13:23:52] DBA, Data-Services: Prepare and check storage layer for fixcopyright.wikimedia.org - https://phabricator.wikimedia.org/T202820 (Marostegui)
[13:26:21] DBA, Schema-change: Drop externallinks.el_from_namespace on wmf databases - https://phabricator.wikimedia.org/T114117 (Marostegui)
[13:26:38] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping rc_moved_to_title/rc_moved_to_ns on wmf databases - https://phabricator.wikimedia.org/T51191 (Marostegui)
[13:27:15] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping rc_cur_time on wmf databases - https://phabricator.wikimedia.org/T67448 (Marostegui)
[13:38:15] Blocked-on-schema-change, DBA, Wikidata, Patch-For-Review, Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (GoranSMilovanovic) @Marostegui I see that in `enwiki` there is no more `eu_touched` in the `wbc_entity_usage` table. This made one of my crucia...
[13:40:24] Blocked-on-schema-change, DBA, Wikidata, Patch-For-Review, Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (Marostegui) >>! In T144010#4553520, @GoranSMilovanovic wrote: > @Marostegui I see that in `enwiki` there is no more `eu_touched` in the `wbc_ent...
[13:41:37] Blocked-on-schema-change, DBA, Wikidata, Patch-For-Review, Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (GoranSMilovanovic) @Marostegui Thank you!
[13:51:07] marostegui: sure, sorry I missed that
[13:52:55] No worries! If you have time and want only - I don't want to overwhelm you!
[13:55:40] everything which helps me to get to the context is welcome
[13:57:52] DBA: Add support for socket path and/or port (multiinstance support) to redact_sanitarium.sh - https://phabricator.wikimedia.org/T203394 (Banyek) a:Banyek
[13:58:30] banyek: so what we can do is I can go with you over how sanitariums work with some more details tomorrow morning on a call and then you can take a look at that script once you've gotten the context?
[13:58:37] Does that sound good?
[13:58:45] that sounds perfect
[13:58:57] ok, I will send you a calendar request
[13:59:05] 👍
[14:31:57] DBA, Cloud-VPS, Patch-For-Review: VPS puppet enc 'prefix' field size too small - https://phabricator.wikimedia.org/T203104 (Marostegui) >>! In T203104#4542953, @bd808 wrote: > The "ERROR 1709 (HY000): Index column size too large." error means that you are hitting the InnoDB table type maximum index l...
[14:32:24] DBA, Cloud-VPS, Patch-For-Review: VPS puppet enc 'prefix' field size too small - https://phabricator.wikimedia.org/T203104 (Marostegui) p:Triage>Normal
[15:48:35] is anyone working with instances in db2078 ?
[15:48:52] not touching db2078
[15:49:00] what's wrong?
[15:49:14] prometheus errors for 13323,22,21
[15:49:24] https://grafana.wikimedia.org/dashboard/db/mysql-aggregated?orgId=1&var-dc=codfw%20prometheus%2Fops&var-group=core&var-shard=All&var-role=All&from=now-1h&to=now
[15:49:59] could be just a misconfiguration, I rarely look at codfw
[15:50:29] dbstore2001:13320 I think is a prometheus config bug
[15:50:44] (x1 was deleted but not updated on config)
[15:50:49] I am checking db2078
[15:51:07] ok!
[15:51:12] let me know if you need help
[15:51:35] sure, just wanted to make sure you were not doing a reboot or something
[15:51:36] thank you
[16:09:17] caches are quite cold- getting the history of a page now is taking 20 seconds for the first time
[16:09:37] DBA, Cloud-VPS, Patch-For-Review: VPS puppet enc 'prefix' field size too small - https://phabricator.wikimedia.org/T203104 (Andrew) Done. Thanks all!
[16:09:43] then it halves for every subsequent run
[16:11:01] DBA, Cloud-VPS, Patch-For-Review: VPS puppet enc 'prefix' field size too small - https://phabricator.wikimedia.org/T203104 (Andrew) Open>Resolved a:Andrew
[16:35:54] DBA, Operations, cloud-services-team, wikitech.wikimedia.org, Release-Engineering-Team (Watching / External): Move some wikis to s5 - https://phabricator.wikimedia.org/T184805 (jcrespo) Wikitech is currently only m5- however, on switchover to codfw, it will point to db2037. However, m5-master w...
[16:37:35] DBA, Data-Services, cloud-services-team, wikitech.wikimedia.org: Move wikitech and labstestwiki to s5 - https://phabricator.wikimedia.org/T167973 (jcrespo) T184805#4554020
[17:21:44] DBA, Data-Services, cloud-services-team, wikitech.wikimedia.org: Move wikitech and labstestwiki to s5 - https://phabricator.wikimedia.org/T167973 (Marostegui) Should we maybe change db-codfw.php to get db1073 instead of db2037 until we come up with a better solution or that wouldn't fix the issue?