[02:54:42] DBA, Toolforge: rc_actor in recentchanges table refers to no one record in actor table for edits from IP - https://phabricator.wikimedia.org/T276738 (MBH)
[02:55:46] DBA, Toolforge: rc_actor in recentchanges table refers to no one record in actor table for edits from IP - https://phabricator.wikimedia.org/T276738 (MBH) p:Triage→Unbreak!
[02:56:37] DBA, Toolforge: rc_actor in recentchanges table refers to no one record in actor table for edits from IP - https://phabricator.wikimedia.org/T276738 (Reedy) Presumably a dupe of {T276698}
[03:01:33] Blocked-on-schema-change, DBA: Drop default of rc_timestamp - https://phabricator.wikimedia.org/T276156 (Ladsgroup) The tests failure doesn't have anything to do with this work. They inject fake rc entries to then query later with ores filtering (I wrote that extension and those faulty tests) and the tes...
[05:56:06] DBA, SRE, ops-eqiad: db1162 crashed - https://phabricator.wikimedia.org/T275309 (Marostegui) >>! In T275309#6879207, @Cmjohnson wrote: > This has been moved to this coming Friday at 10am local time (1500UTC) Was this done past Friday in the end? Thanks
[05:56:27] Blocked-on-schema-change, DBA: Drop default of rc_timestamp - https://phabricator.wikimedia.org/T276156 (Marostegui) So is this ok to keep proceeding?
[06:02:25] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) @razzi is this host ready for getting data on it?
[06:05:44] DBA, Patch-For-Review: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui) db2145 is now replicating (built from a logical dump) and catching up
[06:19:06] DBA: Check all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[06:19:18] DBA: Check all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui) p:Triage→Medium
[06:21:41] DBA: Check all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[06:29:57] DBA: Check all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[06:31:07] DBA: Check for errors on all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[06:45:47] DBA, Patch-For-Review: Evaluate the impact of changing innodb_change_buffering to inserts - https://phabricator.wikimedia.org/T263443 (Marostegui) All parsercache hosts have been changed to `innodb_change_buffering = none`
[07:01:08] DBA: Check for errors on all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui) On-going checks and fixes: - db1166 - db1168
[07:01:35] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (elukey) >>! In T269211#6890919, @Marostegui wrote: > @razzi is this host ready for getting data on it? @Marostegui we only quickly checked that the /srv part...
[07:02:22] DBA, Patch-For-Review: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui)
[07:07:59] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) It should be fine to add instances while I do the first copies. All that will happen is that puppet will attempt to create /srv/sqldata.sX and if...
[07:10:08] DBA, Patch-For-Review: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui)
[07:36:17] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) For the record I just ran: ` root@clouddb1021:/srv# pvs PV VG Fmt Attr PSize PFree /dev/sda3 tank lvm2 a-- 13.92t <4.83t root@cl...
[07:36:59] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) Transfer from clouddb1013 (s1 and s3) to clouddb1021 is now on-going
[07:38:52] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) Removed labsdb1012 from tendril and zarcillo
[09:46:52] DBA, Analytics-Clusters, Patch-For-Review: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) @elukey @razzi the data for s1 and s3 has been transferred and I have moved their data directories to their final location. I am not going to copy...
[12:03:05] DBA: Check for errors on all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[12:13:01] jynus: btw, loading s1 logical dump took around 3d, just letting you know as we both weren't too sure about it past friday. Now we have some new datapoint
[12:16:03] that was compressed, right?
[12:16:29] yep
[12:16:49] and with the default number of thread (I believe that's 18)
[13:15:44] Blocked-on-schema-change, DBA: Drop default of rc_timestamp - https://phabricator.wikimedia.org/T276156 (Ladsgroup) Yup
[14:22:39] DBA, Parsoid, Parsoid-Tests: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (Marostegui) >>! In T245408#6176995, @ssastry wrote: > Please do not drop the testreduce_vd database! :) Only testreduce_0715. We are yet to make a decision where and how we will run ou...
[14:46:43] DBA, Parsoid, Parsoid-Tests: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (ssastry) We switched over to local databases on testreduce1001 around end of Jan and looks like everything is stable at this point. So, it is safe to drop both the testreduce and testr...
[14:47:12] DBA, Parsoid, Parsoid-Tests: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (Marostegui) Sweet! Thank you!
[14:48:35] DBA: Drop testreduce and testreduce_vd from m5 master - https://phabricator.wikimedia.org/T276787 (Marostegui)
[14:48:51] DBA: Drop testreduce and testreduce_vd from m5 master - https://phabricator.wikimedia.org/T276787 (Marostegui) p:Triage→Medium
[14:49:33] ^I believe there was no configured backups for those
[14:49:45] yep, no backups as far as I remember
[14:49:55] let me double check on the repo
[14:50:56] yep: https://gerrit.wikimedia.org/r/c/operations/puppet/+/657801/9/modules/profile/templates/mariadb/grants/dumps-eqiad-m5.sql.erb
[14:51:04] yep as in "no backups"
[14:51:08] excellent
[14:51:48] DBA, Parsoid, Parsoid-Tests: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (Marostegui)
[14:51:50] DBA: Drop testreduce and testreduce_vd from m5 master - https://phabricator.wikimedia.org/T276787 (Marostegui)
[15:01:48] DBA: Check for errors on all tables on some hosts - https://phabricator.wikimedia.org/T276742 (Marostegui)
[16:30:54] DBA, SRE, ops-eqiad: db1162 crashed - https://phabricator.wikimedia.org/T275309 (Cmjohnson) The motherboard was swapped on friday but did not fix the issue. The Dell tech did more troubleshooting and it was determined the backplane is bad. Waiting on the part and tech to schedule a time with me to r...
[16:32:19] DBA, SRE, ops-eqiad: db1162 crashed - https://phabricator.wikimedia.org/T275309 (Marostegui) Thanks!
[19:09:04] Hi, curious if kormat / marostegui are around, I have a question for clouddb1021: if I enable the role wmcs::db::wikireplicas::dedicated::analytics_multiinstance for the sections that have been populated (s1 and s3), are there any commands I need to run for mariadb to recognize the populated data?
[19:10:10] razzi: no, don't do anything
[19:10:23] I would need to bring MySQL up tomorrow and configure everything
[19:10:38] so you can apply the puppet role and no need to do anything, I will take care of it tomorrow
[19:11:02] ok, sounds good! Thanks for the help :)
[19:11:28] thanks! :)
[19:12:01] thank you!
[19:15:16] razzi: once applied the puppet change please comment on the task so I can see it tomorrow and know whether I should or shouldn't proceed :-)
[19:15:46] Sounds good marostegui, will do
[19:15:49] Merging now
[19:15:56] excellent thanks
[20:48:44] PROBLEM - MariaDB sustained replica lag on db1160 is CRITICAL: 19.4 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1160&var-port=9104
[20:49:54] RECOVERY - MariaDB sustained replica lag on db1160 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1160&var-port=9104
[21:19:08] DBA, Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, Wikimedia-Rdbms, and 2 others: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255 (Krinkle)