[02:40:19] DBA, Labs, MediaWiki-extensions-Babel, Security-Team, WMF-Legal: Replicate babel db table on Labs - https://phabricator.wikimedia.org/T160713#3108521 (Bawolff) All this is public info (available already via API action=query&meta=babel, or just user page). It is totally ok to make this public...
[06:13:57] DBA, Labs, MediaWiki-extensions-Linter: Make "linter" table available on Labs - https://phabricator.wikimedia.org/T160611#3105094 (Marostegui) The table is already replicated to labs, we'd only need the view to be created. ``` mysql:root@localhost [enwiki]> select @@hostname; +------------+ | @@host...
[06:29:40] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3132871 (Marostegui) The last host of eqiad (apart from the primary master) is now running the alter. Once it is done, I...
[06:34:38] DBA, Analytics-EventLogging, Analytics-Kanban, ImageMetrics, Patch-For-Review: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#3132872 (Marostegui) So far so good, but let's give it a couple of more days: ``` root@EV...
[06:47:29] DBA, Labs: Expose ar_content_format and ar_content_model columns of archive table on Labs replicas - https://phabricator.wikimedia.org/T89741#1044220 (Marostegui) These two columns are replicated in labs, but missing from the view. ``` mysql:root@localhost [enwiki]> select @@hostname; +------------+ | @@...
[07:10:16] DBA, Wikidata, Patch-For-Review, Performance, and 3 others: Implement ChangeDispatchCoordinator based on RedisLockManager - https://phabricator.wikimedia.org/T151993#3132931 (jcrespo) It just occurred to me an extra reason to avoid using a db master- master failover is a relative frequent operati...
[08:18:57] we should grep mediawiki-core in case there is a FORCE(key_name) somewhere
[08:19:41] DBA, MediaWiki-Database, Patch-For-Review, PostgreSQL, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3133036 (Marostegui) I have started with db1089 (depooled) running (via osc_host): ``` alter table cate...
[08:19:53] jynus: Good idea - will do
[08:45:41] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3133052 (jcrespo) I think this will work with no issue and no impact on production, wh...
[09:01:33] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3133089 (Volans) I'm not saying it will not work, I just suggested to monitor it, beca...
[09:04:28] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3133092 (jcrespo) > If so I would consider not having it for the cross-DC or gather so...
[09:31:34] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3133166 (Volans) Right, I was forgetting the details of the implementation, I agree th...
[10:20:15] jynus: given that we're on topic, there is another place where the DB depends on mw_primary and is the replication_is_critical boolean for the icinga check
[10:21:22] that can stay like that
[10:21:37] even fully-asynchronous
[10:22:16] as long as puppet is eventually executed, we do not care if it happens 30 minutes after the failover
[10:22:40] yeah, do you want to force a puppet run right after the switchover?
[10:22:49] no need
[10:22:53] *do you want me to add a step in the procedure to do that
[10:22:58] no need
[10:23:00] ok :D
[10:23:11] thanks
[10:23:19] if something goes bad during it, we will notice immediately
[10:23:30] and better not having a page storm
[10:23:35] like last time
[11:32:28] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3133548 (jcrespo) This is now deployed, only applying to shards other than s6 is left....
[11:48:32] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3133565 (Marostegui) db1053 is done: ``` root@neodymium:~# mysql...
[11:49:59] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3133566 (Marostegui) db1053 is done: ``` root@neodymium:~# mysql --skip-ssl -hdb1053 commonswiki -e "show create table im...
[12:46:41] DBA: Run pt-table-checksum on es2 - https://phabricator.wikimedia.org/T161510#3133751 (Marostegui)
[14:30:07] volans, there is actually 1 thing that would be nice to have on failover
[14:30:27] jynus: all ears
[14:30:47] tendril currently has its own manual way of saying which host is the master
[14:31:03] (that will change when etc, etc)
[14:31:19] but doing 1-time query for now would be nice
[14:31:47] not high priority, though
[14:31:50] sure, just give me the query/commands to execute and I'll add them
[14:37:24] marostegui, I am going to break tendril's tree/dbtree for a second for a quick test
[14:39:05] done
[14:39:57] sure
[14:41:28] volans, is that enough? https://phabricator.wikimedia.org/P5138
[14:42:14] sure
[14:42:42] maybe you want to run like the other commands from localhost on db1011
[14:43:19] yes, I'll probably do that, for the host i'll get it from role::mariadb::tendril
[14:43:24] so I don't have to hard code it
[14:43:24] if you create a placeholder
[14:43:30] I can do the actual action
[14:43:35] *write
[14:44:26] this can be done after we are back in read/write?
[14:44:40] yes
[14:44:43] great
[14:44:46] it is mostly cosmetic
[14:44:56] yeah, but useful for you
[14:45:02] while checking if everything is ok ;)
[14:45:03] for the tree to start on the real master
[14:46:00] you can even test it (I did with new_shard_master_fqdn=db2016.codfw.wmnet;)
[14:46:11] eheheh perfect! :D
[14:46:12] it will just put the root of the tree there
[14:46:22] no functionality changes
[14:48:46] while deploying semisync replication I discovered a breaking bug on our topology, by the way
[14:48:59] maybe not breaking, but annoying
[14:49:27] which one?
[14:51:18] we were not replicating ES from codfw -> eqiad
[14:51:37] uf
[14:51:43] i am glad you found it now XD
[14:51:44] not a huge issue
[14:52:06] because we would have missed it on tendril
[14:54:12] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3134069 (Marostegui) db1040 (primary master) is done: ``` root@ne...
[14:57:24] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3134078 (Marostegui) Open>Resolved All done: ``` root@neo...
[15:30:44] it is not clear to me how I can execute something on all mysql masters with cumin
[15:31:23] jynus: given our current puppetization is not the easiest task but can be achieved with:
[15:31:40] I am trying to get accustomed to it as joe suggested
[15:31:47] I do not mind if it is not perfect now
[15:31:54] I just want to use it
[15:31:57] 'R:Class = Role::Mariadb::Groups and R:Class%mysql_group = core and R:Class%mysql_role = master'
[15:35:37] escaping hell:
[15:35:43] cumin 'R:Class = Role::Mariadb::Groups and R:Class%mysql_group = core and R:Class%mysql_role = master' "mysql --skip-ssl -e \"SHOW GLOBAL STATUS like 'Rpl_semi_sync_master_clients'\""
[15:36:08] yes, or the other way around :D
[15:36:44] Not sure I understand the output
[15:37:07] does it group same outputs?
[15:37:23] yes
[15:37:29] ah, nice
[15:37:29] and tell you which hosts
[15:37:33] how many and which ones
[15:37:37] gave you that output
[15:39:56] I think feature wise is complete to me
[15:40:50] is it possible to format the output differently?
[15:41:21] not yet, part of the TODO, you want a more easily greppable output :D
[15:41:26] yes
[15:41:32] that is actually the only request
[15:41:39] current is ok for humans
[15:42:53] sure, and also we'll push to use it as a library to make tools for the more common uses
[15:48:00] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3134267 (jcrespo) Open>Resolved Semi sync is deployed on all masters, independ...
[15:53:46] DBA, Operations, DC-Switchover-Prep-Q3-2016-17, Patch-For-Review, Wikimedia-Multiple-active-datacenters: Decouple Mariadb semi-sync replication from $::mw_primary - https://phabricator.wikimedia.org/T161007#3134344 (Marostegui) >>! In T161007#3134267, @jcrespo wrote: > > We could consider add...
[15:56:28] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134352 (Tpt) @DBA Change https://gerrit.wikimedia.org/r/#/c/328543/ contains a maintenance script to migrate old Index: pages to the new content mode...
[15:56:45] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134355 (Tpt)
[15:57:55] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134296 (jcrespo) How much activity do you predict to have (I do not need an accurate count, just if it is thousands or millions or revisions).
[16:01:17] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134398 (Tpt) The biggest Wikisources have less than 20,000 Index: pages each.
So on all wikis combined it is something around ~100,000 rows.
[16:06:56] DBA, Operations, ops-eqiad, Patch-For-Review: db1057 does not react to powercycle/powerdown/powerup commands - https://phabricator.wikimedia.org/T160435#3134411 (Cmjohnson) I am not sure which one you think we can pull from? All the db's that are being decom'd are different server types and olde...
[16:07:58] DBA, Operations, ops-eqiad, Patch-For-Review: db1057 does not react to powercycle/powerdown/powerup commands - https://phabricator.wikimedia.org/T160435#3134420 (Marostegui) >>! In T160435#3134411, @Cmjohnson wrote: > I am not sure which one you think we can pull from? All the db's that are > be...
[16:09:24] DBA, Operations, ops-eqiad, Patch-For-Review: db1057 does not react to powercycle/powerdown/powerup commands - https://phabricator.wikimedia.org/T160435#3134427 (Cmjohnson) We may be able to salvage data.....I can try moving the raid card and disks to a decom R510
[16:59:30] DBA: Run pt-table-checksum on es2 - https://phabricator.wikimedia.org/T161510#3134612 (Marostegui) The first 700 wikis have been checksummed with no differences so far.
[17:12:01] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134646 (jcrespo) A first look from the database point of view looks fine -activity is batched, and wait for slaves is run after every batch. Please a...
[17:24:20] DBA, Labs, Labs-Infrastructure, Patch-For-Review: Provision with data the new labsdb servers and provide replica service with at least 1 shard from a sanitized copy from production - https://phabricator.wikimedia.org/T147052#3134731 (madhuvishy)
[17:24:22] DBA, Labs, Labs-Infrastructure, Patch-For-Review: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#3134729 (madhuvishy) Open>Resolved Closing this :)
[17:38:03] DBA, ProofreadPage, Patch-For-Review: Implements ContentHandler abstraction for ProofreadPage Index: pages - https://phabricator.wikimedia.org/T161524#3134757 (Tpt) > Please allow me a few extra days to give a closer look at the query Ok! Thank you!
[23:50:01] DBA, Wikimedia-Site-requests: Disable miser mode ($wgMiserMode) on small wikis (wikis in small.dblist) - https://phabricator.wikimedia.org/T48098#3135801 (Krinkle)
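
Editor's note on the 15:35 "escaping hell" exchange: one way to sidestep hand-written nested quotes is to let a small script build the command line. The sketch below is only an illustration. The query string and the SQL statement are copied from the log, the `cumin <query> <command>` argument order is taken from the command pasted at 15:35:43, and the helper names `build_remote_command` and `build_cumin_command` are hypothetical and not part of cumin.

```python
import shlex

# Host selection query and SQL statement, both copied verbatim from the log.
CUMIN_QUERY = (
    "R:Class = Role::Mariadb::Groups and R:Class%mysql_group = core "
    "and R:Class%mysql_role = master"
)
SQL = "SHOW GLOBAL STATUS like 'Rpl_semi_sync_master_clients'"


def build_remote_command(sql: str) -> str:
    """Return the shell command string that each selected host should run.

    Hypothetical helper: shlex.join quotes the SQL so its inner single
    quotes survive the remote shell.
    """
    return shlex.join(["mysql", "--skip-ssl", "-e", sql])


def build_cumin_command(query: str, remote_command: str) -> str:
    """Return the local shell line, quoting the query and the remote command.

    Hypothetical helper: assumes the positional 'cumin <query> <command>'
    form seen in the log.
    """
    return shlex.join(["cumin", query, remote_command])


if __name__ == "__main__":
    remote = build_remote_command(SQL)
    # Print the line to paste into a shell; nothing is executed here.
    print(build_cumin_command(CUMIN_QUERY, remote))
```

The generated line uses different quote characters than the hand-written one, but a POSIX shell should split it into the same three arguments. `shlex.join` requires Python 3.8 or later.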