[07:18:28] DBA, Labs, Tool-Labs, Patch-For-Review: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2810051 (Marostegui) a:Marostegui Thanks @yuvipanda I will leave this ticket open and ping you in two weeks! Thanks again!
[08:10:48] DBA, Patch-For-Review: Unify commonswiki.revision - https://phabricator.wikimedia.org/T147305#2810093 (Marostegui) Running on db1069: ``` ./software/dbtools/osc_host.sh --host=db1069.eqiad.wmnet --port=3314 --db=commonswiki --table=revision --method=ddl --no-replicate "add key page_user_timestamp (rev_p...
[08:42:59] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810119 (Marostegui) enwikivoyage database has been imported. I have placed the following flag in my.cnf as a temporary one: `replicate-do-db=enwikivoyage` (not added to puppet yet) Replication i...
[09:12:34] Blocked-on-schema-change, DBA, Wikimedia-Site-requests, Wikisource, and 2 others: Schema change for page content language - https://phabricator.wikimedia.org/T69223#2810195 (Nemo_bis) Except Meta-Wiki (and secondarily Wikidata and Commons), those wikis don't need `$wgPageLanguageUseDB = true`. Ca...
[09:41:24] Blocked-on-schema-change, DBA, Wikimedia-Site-requests, Wikisource, and 2 others: Schema change for page content language - https://phabricator.wikimedia.org/T69223#2810228 (jcrespo) > those wikis don't need $wgPageLanguageUseDB = true. Can the wmgUseTranslate wikis other than those three proceed...
[09:43:35] DBA, Patch-For-Review: Unify commonswiki.revision - https://phabricator.wikimedia.org/T147305#2810240 (Marostegui) db1069 is done: ``` root@neodymium:~# mysql -hdb1069 -P3314 -A commonswiki -e "show create table revision\G" *************************** 1. row *************************** Table: rev...
[10:17:12] DBA, Patch-For-Review: Unify commonswiki.revision - https://phabricator.wikimedia.org/T147305#2810270 (Marostegui) Running on db1068: ``` ./software/dbtools/osc_host.sh --host=db1068.eqiad.wmnet --port=3306 --db=commonswiki --table=revision --method=ddl --no-replicate "add key page_user_timestamp (rev_...
[10:23:43] doing anything with db2065? I want to deploy a schema change there
[10:23:50] nope
[10:23:52] go ahead
[10:24:10] I will block replication there
[10:24:20] ok
[10:59:45] DBA, Patch-For-Review: Unify commonswiki.revision - https://phabricator.wikimedia.org/T147305#2810306 (Marostegui) db1068 is done ``` root@neodymium:~# mysql -hdb1068 -A commonswiki -e "show create table revision\G" *************************** 1. row *************************** Table: revision Cr...
[12:07:24] DBA, Operations, ops-codfw: db2041: Disk RAID predictive failure - https://phabricator.wikimedia.org/T151203#2810386 (Marostegui)
[12:21:11] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810431 (Marostegui) `replicate-do-db` has been changed in for: `replicate-wild-do-table = enwikivoyage.%`
[14:01:17] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810597 (Marostegui) @jcrespo looking at my notes, I have that the script to sanitize is: `redact_standard_output.sh` Before running it tomorrow I will ping you anyways so we are on the same page...
[14:04:27] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810600 (jcrespo) Yes, but there are private tables that have to be dropped (they are on the replication filters). Also, non-regular wikis have a different process.
[14:06:35] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2802738 (Krenair) Non-regular wikis? What does that refer to?
[14:10:11] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810611 (jcrespo) @Marostegui, is is replicating from s3-master? We should think a strategy to test ROW-based replication triggers (from dbstore1002? from another slave?).
[14:10:19] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810612 (Marostegui) Thanks - as per the replication filters, these tables woulld need to be dropped from `enwikivoyage` then: ``` click_tracking cu_changes cu_log echo_email_batch echo_event ec...
[14:11:11] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810616 (Marostegui) >>! In T150960#2810611, @jcrespo wrote: > @Marostegui, is is replicating from s3-master? We should think a strategy to test ROW-based replication triggers (from dbstore1002?...
[14:15:42] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810621 (Marostegui) I have enabled GTID on it right now, so maybe we can move it under another slave in codfw (so we do not need to mess with eqiad) with ROW based replication.
[14:17:07] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810622 (jcrespo) >>! In T150960#2810612, @Marostegui wrote: > Thanks - as per the replication filters, these tables woulld need to be dropped from `enwikivoyage` then: That is the part I do not...
[14:19:35] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 - https://phabricator.wikimedia.org/T150960#2810624 (jcrespo) > I have enabled GTID on it right now, so maybe we can move it under another slave in codfw While that could be possible (I think it could too much overhead when we get more sh...
[14:24:06] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 (temporary db1069 - sanitarium replacement) - https://phabricator.wikimedia.org/T150960#2810631 (jcrespo)
[14:24:37] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 (temporary db1069 - sanitarium replacement) - https://phabricator.wikimedia.org/T150960#2810632 (Marostegui) >>! In T150960#2810622, @jcrespo wrote: >>>! In T150960#2810612, @Marostegui wrote: >> Thanks - as per the replication filters, th...
[14:57:48] DBA, Labs, Labs-Infrastructure: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#2810718 (chasemp) I was in a position this week to work through a few issues with create-d...
[15:00:09] DBA, Labs, Labs-Infrastructure: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#2810721 (chasemp) I think what this boils down to is: if you guys #DBA's can verify the ro...
[15:13:48] DBA, Labs, Labs-Infrastructure: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#2810735 (jcrespo) Yes, only 2 comments- * Try to avoid looping over created accounts. MyS...
[15:19:32] DBA, Labs, Labs-Infrastructure: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#2810744 (Marostegui) >>! In T149933#2810735, @jcrespo wrote: > >> Make it the canonical s...
[15:42:21] DBA, Labs, Labs-Infrastructure: Initial data tests for db1095 (temporary db1069 - sanitarium replacement) - https://phabricator.wikimedia.org/T150960#2810791 (Marostegui) I have changed db1095 to replicate from db2057 which has ROW based binlogs: ``` MariaDB PRODUCTION s3 localhost (none) > show glo...
[15:43:08] DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2810793 (Papaul) Open>Resolved The old disks are wiped. Good to close this task.
[15:43:27] DBA, Operations, ops-codfw: db2041: Disk RAID predictive failure - https://phabricator.wikimedia.org/T151203#2810797 (Papaul) p:Triage>Normal
[15:44:32] DBA, Labs, Labs-Infrastructure: Migrate existing labs users from the old servers, if possible using roles and start maintaining users on the new database servers, too - https://phabricator.wikimedia.org/T149933#2810800 (chasemp) That would probably be best as the first question in dealing w/ a static...
[15:52:47] DBA, Operations, ops-codfw: Several es20XX servers keep crashing (es2017, es2019, es2015, es2014) since 23 March - https://phabricator.wikimedia.org/T130702#2810811 (Papaul)
[15:52:49] DBA, Operations, ops-codfw, Patch-For-Review: es2015 crashed with no os logs (kernel logs or other software ones) - it shuddenly went down - https://phabricator.wikimedia.org/T147769#2810809 (Papaul) Open>Resolved It has been a month now this system hasn't reported the same error after sw...
[17:50:54] DBA, Labs, Labs-Infrastructure, Operations, and 3 others: Move dbproxy1010 and dbproxy1011 to labs-support network, rename them to labsdbproxy1001 and labsdbproxy1002 - https://phabricator.wikimedia.org/T149170#2811283 (Cmjohnson) I can move these to rack c5. Can these be moved anytime or do the...
[17:57:22] DBA, Edit-Review-Improvements-RC-Page, Collaboration-Team-Triage (Collab-Team-Q2-Oct-Dec-2016), Patch-For-Review: Implement functionality for RC page 'Experience level' filters - https://phabricator.wikimedia.org/T149637#2811347 (SBisson) It was suggested on the patch that we make the user experi...
[18:02:19] DBA, Edit-Review-Improvements-RC-Page, Collaboration-Team-Triage (Collab-Team-Q2-Oct-Dec-2016), Patch-For-Review: Implement functionality for RC page 'Experience level' filters - https://phabricator.wikimedia.org/T149637#2811363 (jmatazzoni) I had imagined only that people would shift the definit...
[18:16:36] DBA, Operations, ops-codfw: db2041: Disk RAID predictive failure - https://phabricator.wikimedia.org/T151203#2811422 (Papaul) a:Papaul>Marostegui Disk replacement complete.
[18:19:52] DBA, Labs, Labs-Infrastructure, Operations, and 3 others: Move dbproxy1010 and dbproxy1011 to labs-support network, rename them to labsdbproxy1001 and labsdbproxy1002 - https://phabricator.wikimedia.org/T149170#2811430 (jcrespo) They can be moved at any time, I will schedule downtime now on icinga.
[18:31:10] DBA, Operations, ops-codfw: db2041: Disk RAID predictive failure - https://phabricator.wikimedia.org/T151203#2811460 (Papaul) a:Marostegui>Papaul Wrong task the disk replacement was for db2035 not db2041. Discard this for now.
[18:32:55] DBA, Operations, ops-codfw: db2035: RAID disk about to fail - https://phabricator.wikimedia.org/T150511#2811467 (Papaul) Disk replacement complete.
[18:57:09] DBA, Patch-For-Review: db2034: investigate its crash and reimage - https://phabricator.wikimedia.org/T149553#2811528 (Papaul) The HP Tech didn't show up.
[19:18:49] DBA, Edit-Review-Improvements-RC-Page, Collaboration-Team-Triage (Collab-Team-Q2-Oct-Dec-2016), Patch-For-Review: Implement functionality for RC page 'Experience level' filters - https://phabricator.wikimedia.org/T149637#2811612 (jmatazzoni) > It was suggested on the patch that we make the user e...
[19:50:34] DBA, MediaWiki-API, MediaWiki-Database, Performance, Schema-change: ApiQueryExtLinksUsage::run query has crazy limit - https://phabricator.wikimedia.org/T59176#2811780 (Anomie)
[19:55:26] DBA: Media errors on db1048 are creating lag - https://phabricator.wikimedia.org/T151039#2805396 (mmodell) >>! In T151039#2805577, @jcrespo wrote: > BTW, this is the best graph to see this issues (assuming the load has not changed): https://grafana.wikimedia.org/dashboard/db/mysql?var-dc=eqiad%20prometheus%2...
[22:58:28] DBA, Phabricator, Upstream: Editing a recurring event overrides all past instances - https://phabricator.wikimedia.org/T151228#2812646 (mmodell) How much data was lost? We can probably restore the past descriptions from a snapshot with the help of a dba. @jcrespo?
[23:57:58] DBA: db1092 crash - https://phabricator.wikimedia.org/T151272#2812799 (RobH)
[23:59:46] DBA, Operations, Patch-For-Review: db1092 crash - https://phabricator.wikimedia.org/T151272#2812814 (RobH)