[06:41:44] DBA, Patch-For-Review: Defragment echo_event tables on x1 - https://phabricator.wikimedia.org/T217591 (Marostegui)
[06:42:28] DBA, Patch-For-Review: Defragment echo_event tables on x1 - https://phabricator.wikimedia.org/T217591 (Marostegui) Open→Resolved This is all done!
[08:27:13] backups are all going well this week
[08:27:28] nice!
[08:27:52] nice work on x1
[08:28:00] did the master behave well?
[08:28:29] (I guess less traffic than on SX?
[08:28:32] )
[08:29:09] yes, way less traffic
[08:38:28] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) Open→Resolved I am going to close this as resolved. Everything is essentially already done as we h...
[08:38:39] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:25:42] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[10:51:28] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui) s3 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1124 [] db1123 [x] db1095 []...
[12:03:31] DBA, Operations: Predictive failures on disk S.M.A.R.T. status - https://phabricator.wikimedia.org/T208323 (jcrespo)
[14:52:47] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.
[14:57:05] ?
[14:57:22] test host db1114, after an improper shutdown
[14:57:27] aaaah
[14:57:34] you scared me!
[14:57:35] not good
[14:57:48] don't say those things without indicating the hostname!
[14:57:53] sorry
[14:58:12] :)
[14:58:16] I thought you knew I was working on test-s1
[14:58:19] * marostegui goes back to normal heart rate
[14:58:23] sorry
[14:58:27] :)
[14:58:48] is that 10.1?
[14:58:51] or 10.3?
[14:58:57] 10.1 on buster
[14:59:01] gtid?
[14:59:05] yes
[14:59:36] which surprisingly (buster) doesn't break anything, even the mariadb package works without a rebuild
[14:59:42] was that a crash?
[14:59:56] or you rebooted without stopping mysql or similar?
[15:00:00] I think I restarted the server without stopping the instance
[15:00:05] on purpose
[15:00:14] as it was a reimage
[15:00:30] yeah, interesting…maybe worth trying a few more times to see if it is a recurrent thing
[15:00:42] but we couldn't reimage because blockers and was told to upgrade in place for now
[15:05:21] actually, I just figured out how to install from scratch despite the current blocker, so for the next test host we can do that as well :-)
[15:13:18] was a few dbstores with failing prometheus metrics, although right now it is only dbstore1003:s1
[15:13:25] *there
[15:13:36] will look at it later
[15:44:53] I fixed a bad grant on dbstore1003:s1 for metrics collection
[15:45:25] ah, thanks :)
[16:09:50] So we have a working buster host: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1114&var-port=9104&from=1551884971305&to=1551888571305
[16:10:15] lovely!!!!
\o/
[16:10:32] without a mariadb upgrade, work on the package next
[16:12:54] buster has OpenSSL 1.1.1, let me know if you run into any issues building mariadb with it
[16:13:05] yeah, that is pending
[16:13:16] in theory, 10.2 made the work
[16:13:39] I am guessing the official debian links to 1.1 already
[16:13:53] so all work done (in theory)
[16:14:15] unfortunately not
[16:14:26] ah, yes, yassl or whatever is called now
[16:14:31] yeah :-/
[16:14:36] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921488
[16:14:40] wolfssl?
[16:15:37] but before that happened, the packages were using OpenSSL and Otto merged a patch for OpenSSL 1.1 in 10.3.11-1:
[16:15:46] Update SSL/TLS keys as OpenSSL since 1.1.0 rejects weak keys by default
[16:16:06] possibly only related to the test suite, but we can have a closer look if there are actual issues
[16:16:22] I've been fighting upstream at https://jira.mariadb.org/browse/MDEV-12811
[16:16:42] but in theory 10.2+ works : https://jira.mariadb.org/browse/MDEV-10332
[16:16:56] since a year ago
[16:17:23] there are also some workarounds I have to undo because older systemd version and other things
[16:18:29] I was more worried about puppet and buster, but I am surprised that went quite smoothly
[16:19:10] probably because trusty -> jessie had many issues, and jessie -> stretch had many updates + systemd for mariadb
[16:37:58] DBA, Operations, monitoring, Patch-For-Review: MySQL metrics monitoring - https://phabricator.wikimedia.org/T143896 (jcrespo)
[16:38:06] DBA, Operations, Patch-For-Review, User-fgiunchedi: Upgrade mysqld_exporter in production - https://phabricator.wikimedia.org/T161296 (jcrespo) Open→Stalled a:jcrespo→None Fixed configuration for buster, but with no additional metrics (same metrics as before). We can think of ena...
[18:32:50] DBA, wikitech.wikimedia.org: Rename database labswiki to wikitech - https://phabricator.wikimedia.org/T171570 (JAllemandou)
[18:32:54] DBA, wikitech.wikimedia.org: Move wikitech and labstestwiki to s5 - https://phabricator.wikimedia.org/T167973 (JAllemandou)
[19:15:48] DBA, Analytics, Operations, ops-eqiad, and 2 others: rack/setup/install labsdb1012.eqiad.wmnet - https://phabricator.wikimedia.org/T215231 (elukey)
[19:16:54] DBA, Analytics, Operations, ops-eqiad, and 2 others: rack/setup/install labsdb1012.eqiad.wmnet - https://phabricator.wikimedia.org/T215231 (elukey) @jcrespo @Marostegui thoughts? What would it be best in your opinion? I'd prefer another dbproxy-based domain but not sure how complicated to create/...
[20:27:18] in case some db people are around, is there any interest in reporting Lock wait timeout errors?
[20:28:37] a poor deferred linkupdate tries to UPDATE `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_title = 'Flickr_images_reviewed_by_FlickreviewR_2'
[20:28:43] from Function: WikiPage::updateCategoryCounts
[20:29:06] i guess there are bunch of jobs running in parallel to update it ;)
[20:29:57] yeah 2 million + members in that category :/
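
Editor's note on the Lock wait timeout above: many parallel jobs each running their own `cat_pages + 1` UPDATE against the same `category` row will queue on that row's lock. One common way to reduce that contention is to sum the pending increments per category first and issue a single UPDATE per row. The sketch below is only an illustration of that idea, not how MediaWiki's WikiPage::updateCategoryCounts works; the function names `coalesce_category_deltas` and `build_updates` and the `%s` placeholder style are assumptions made for the example.

```python
from collections import defaultdict


def coalesce_category_deltas(deltas):
    """Sum (pages, files) deltas per category title.

    `deltas` is an iterable of (cat_title, pages_delta, files_delta) tuples,
    a hypothetical shape for queued counter changes. Categories whose deltas
    cancel out to zero are dropped, since they need no UPDATE at all.
    """
    totals = defaultdict(lambda: [0, 0])
    for cat_title, pages_delta, files_delta in deltas:
        totals[cat_title][0] += pages_delta
        totals[cat_title][1] += files_delta
    return {
        title: (pages, files)
        for title, (pages, files) in totals.items()
        if (pages, files) != (0, 0)
    }


def build_updates(totals):
    """Build one parameterized UPDATE per category, in a stable title order.

    Returns a list of (sql, params) pairs; executing them is left to
    whatever database client is in use.
    """
    sql = (
        "UPDATE category SET cat_pages = cat_pages + %s, "
        "cat_files = cat_files + %s WHERE cat_title = %s"
    )
    return [
        (sql, (pages, files, title))
        for title, (pages, files) in sorted(totals.items())
    ]
```

Applying updates in a stable title order is a deliberate choice here: when several writers touch the same set of rows, a consistent order makes deadlocks between them less likely.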