[00:40:06] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Papaul) @Marostegui in most cases the CPU1/CPU2 Machine check error detected is caused from outdated BIOS. I will recommend that we first update the BIOS. The sys...
[05:08:02] DBA, Analytics, Analytics-Kanban, Data-Services, and 2 others: Not able to scoop comment table in labs for mediawiki reconstruction process [EPIC} - https://phabricator.wikimedia.org/T209031 (Nuria) Open→Resolved
[05:24:48] DBA, Operations, Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (Nuria)
[06:01:46] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) >>! In T214840#4953022, @Papaul wrote: > @Marostegui in most cases the CPU1/CPU2 Machine check error detected is caused from outdated BIOS. I will rec...
[06:07:19] DBA, Operations, ops-eqiad, Patch-For-Review: db1114 crashed - https://phabricator.wikimedia.org/T214720 (Marostegui) @Cmjohnson should we also try to exchange the DIMM modules listed at T214720#4937872 and see if they fail again?
[06:51:46] so only db1063 and db1065 are left to restart
[07:04:15] did you see my edit of the bacula expansion state?
[07:24:37] backups went all fine this week
[07:35:28] Yeah, I did see that
[08:01:20] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) After the FW and BIOS upgraded I have rebooted db1106 a number of times with 4.9.0-8 and this is the result: 1st reboot: OK 2nd reboot: OK 3rd reboot...
[08:16:03] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (jcrespo) > After the FW and BIOS upgraded I have rebooted db1106 a number of times with 4.9.0-8 and this is the result yay {meme, src="goat-for-it"}
[09:25:42] buster: mariadb 10.2, 10.3, 10.4 or 8.0?
[09:25:59] 10.3 and 8.0 I would say
[09:26:16] buster is likely to have 10.2
[09:26:25] and 10.4 is likely to be released soon
[09:26:30] FYI
[09:26:45] it has 10.3: https://packages.qa.debian.org/m/mariadb-10.3.html
[09:26:51] oh, really?
[09:27:10] 10.2 is already removed
[09:27:17] https://packages.qa.debian.org/m/mariadb-10.2.html
[09:27:23] :)
[09:27:25] I think that makes it more firm
[09:28:26] apparently I have already a wmf-mariadb103 package, but I think it needs tweaking on the notify systemd type
[09:29:03] do you know if there are any API compat issues on the client side between 10.3 and 8? buster only ships the libmariadbclient from 10.3 instead of libmysqlclient, so I'm wondering if there are any compat issues if used against 8.0?
[09:29:16] there was
[09:29:22] not sure if it has been corrected
[09:29:31] 8 has a new authentication method
[09:29:42] which requires connector (and client) upgrade
[09:29:56] ack, ok
[09:30:09] it is not a huge issue because you can force the old, insecure method
[09:30:30] and I would guess mariadb implemented it, but I would have to check
[09:31:45] funny, php did upgrade, but go decided to not support it https://github.com/go-sql-driver/mysql/issues/785
[09:32:55] so apparently the C connector got the features: https://mariadb.com/kb/en/library/authentication-plugin-sha-256/
[09:35:43] I don't know from which version that got into the mariadb-client package
[09:43:48] moritzm: I guess we can upload already packages to wikimedia-buster ?
[09:48:50] yes, there's already quite a few in there
[09:49:14] BTW, I noticed that you already prepared some DB puppet code for buster back in 2018
[09:49:46] this currently uses os_version to detect buster, but ATM "lsb -t" (on which os_version relies) in buster still uses "testing")
[09:50:14] if you want to use these classes before buster is final you can switch to comparing $::lsbdistcodename == buster, that works fine
[10:43:23] " I noticed that you already prepared some DB puppet code for buster back in 2018"
[10:43:31] wow nice, jynus from the past, thank you!
[10:44:01] you welcome, jynus from the future!
[10:57:35] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[11:47:14] jynus: indeed :-) your commit to modules/mariadb/manifests/config.pp even dates back to April 2018
[14:20:07] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Papaul) @Marostegui this can be done anytime today. Just let me know when the server is down. Thanks
[14:20:44] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) @Papaul thanks - I am going to put it down now. Will ping you on IRC once it is down Thanks!
[15:10:55] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Papaul) a:Papaul→Marostegui Upgrade BIOS from 2.4.3 to 2.9.1 IDRAC from 2.40. to 2.60
[15:12:06] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) Thank you! I will delete the idrac logs and start testing
[15:44:40] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) Open→Resolved Reboot tests with db2085 4.9.0-8 after getting the BIOS and FW upgraded by Papaul (T214840#4954418) 1st reboot: OK 2nd reboot:...
[15:46:29] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (MoritzMuehlenhoff) Are there other servers of that batch beside db1106 and db2085?
[15:51:58] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (jcrespo) yeah, I would like to see this applied to similar servers- while not in a hurry, I prefer this done rather than suffering after a crash or an emergency r...
[16:19:44] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) All eqiad servers from the same batch as db1106 are running 4.9.0-8 already db1096-db1106 All codfw servers from the same batch as db2085 are running...
[16:21:39] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (jcrespo) My suggestion would be to take one or 2 codfw servers, reboot it a few times and see if it suffers the same issues. Maybe I just got lucky and the next t...
[16:22:44] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (jcrespo) It is really easy to reboot codfw servers, and I can take care of that if you want me to.
[16:23:06] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) Sure - go ahead :-)
[16:23:55] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (jcrespo) Creating a separate ticket for that, will refer here.
[16:25:21] DBA, Operations, ops-codfw, Patch-For-Review: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (MoritzMuehlenhoff) JFTR, the next Stretch update (this weekend) will update the kernel to 4.9.144-2, so that can be piggybacked.
[20:33:31] DBA, MediaWiki-Database, MediaWiki-Special-pages: Special:ProtectedPages times out on enwiki for Module namespace - https://phabricator.wikimedia.org/T216183 (colewhite) p:Triage→High
[20:34:25] DBA, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-production-error: Special:ProtectedPages times out on enwiki for Module namespace - https://phabricator.wikimedia.org/T216183 (Zoranzoki21)