[03:09:39] DBA, Data-Persistence, Growth-Structured-Tasks, Growth-Team: Add a link engineering: Determine format for accessing and storing link recommendations - https://phabricator.wikimedia.org/T261411 (Tgr)
[05:45:55] DBA, Data-Persistence: Enable replication eqiad -> codfw and other checks - https://phabricator.wikimedia.org/T261914 (Marostegui) I have disabled GTID on eqiad masters (sX, x1, esX). The rest of the slaves have been checked, and they all have GTID enabled.
[05:46:43] DBA, Data-Persistence: Enable replication eqiad -> codfw and other checks - https://phabricator.wikimedia.org/T261914 (Marostegui)
[06:11:22] DBA, Data-Persistence, Growth-Structured-Tasks, Growth-Team: Add a link engineering: Determine format for accessing and storing link recommendations - https://phabricator.wikimedia.org/T261411 (Marostegui) Thanks for the detailed explanation @kostajh! That helps a lot, much appreciated. Having o...
[06:24:26] DBA, wikitech.wikimedia.org: Rename database 'labswiki' to 'wikitechwiki' - https://phabricator.wikimedia.org/T171570 (Marostegui)
[06:24:29] DBA, Operations: Rename be_x_oldwiki database to be_taraskwiki - https://phabricator.wikimedia.org/T127570 (Marostegui)
[06:25:26] DBA, Operations, Wikimedia-Site-requests: script & docs to rename wiki databases - https://phabricator.wikimedia.org/T83609 (Marostegui) Stalled→Declined I am going to close this. I don't think we'll really work on this. First of all, renaming a database isn't possible on MySQL (there is no `...
[08:01:18] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (jcrespo) I have one question before everything else- does the parsercache expansion mean like a new "cluster/service" in parallel to the...
[09:26:07] marostegui: I am checking s2 on eqiad now, do you remember if you saw issues on eqiad or codfw, to prioritize checking (I will check all eventually)
[09:26:14] jynus: codfw
[09:26:28] thanks, then will start doing the check in parallel there
[09:26:34] thanks
[09:27:42] huh. just stumbled across https://gerrit.wikimedia.org/r/c/operations/puppet/+/480750
[09:28:37] yeah, we still want this, I think, but may be too old to reuse?
[09:30:24] well it's definitely ambitious
[09:30:59] yeah, "want this" as the final product
[09:31:07] some kind of merge
[09:34:23] marked it as abandoned (with a reason)
[09:34:48] +1
[09:35:08] honestly, I didn't know that existed, it was created while I was on vacation, I think
[09:35:33] the only reason i found it was because gerrit marked it as conflicting with https://gerrit.wikimedia.org/r/c/operations/puppet/+/635253
[09:36:48] that's a nice feature
[09:37:07] I am very happy with all the incremental improvements you are doing, kormat
[09:37:15] i'm glad :)
[09:37:22] safe but moving forward, always
[09:37:28] that's cool
[10:08:32] DBA, Operations, Traffic, Patch-For-Review: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976 (Marostegui)
[10:08:56] DBA, Operations, Traffic, Sustainability: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141 (Marostegui) Stalled→Declined Closing this as we won't be really working on this anymore, but on deprecating tendril in favour of something e...
[10:41:03] he he: https://phabricator.wikimedia.org/T160229#3097930
[10:49:26] DBA, Operations, Patch-For-Review, User-Kormat: Convert role::mariadb::misc to profile - https://phabricator.wikimedia.org/T265900 (Kormat) Open→Resolved
[10:49:28] DBA, Operations, Patch-For-Review, User-Kormat, User-jbond: Refactor mariadb puppet code - https://phabricator.wikimedia.org/T256972 (Kormat)
[10:53:49] DBA, Operations, User-Kormat: Puppetize orchestrator - https://phabricator.wikimedia.org/T265990 (Kormat) p:Triage→Medium
[10:54:09] DBA, Operations, User-Kormat: Puppetize orchestrator - https://phabricator.wikimedia.org/T265990 (Kormat)
[10:55:16] kormat: would it be useful to provide the "role" of a host for replication alerts in addition to the section?
[10:55:35] so one can immediately say if it is a mw core or a backup host or something?
[10:55:52] (on alert msg)
[10:56:48] yeah that sounds useful
[10:57:03] i don't know how much work it is, though
[10:57:16] I can give it a look later
[10:57:35] maybe it is just changing 1 profile where the alert is defined
[11:03:34] I just reported https://bugs.mysql.com/bug.php?id=101239
[11:04:39] hopefully it doesn't go into the folder of the other bugs I reported: "Verified (2773 days)"
[12:28:17] I checked and making the change is easy, the problem is applying it everywhere as we would have to change every single profile
[12:28:32] so I would wait until more refactoring is done
[12:30:36] 👍
[12:51:41] DBA, Data-Persistence: Enable replication eqiad -> codfw and other checks - https://phabricator.wikimedia.org/T261914 (Marostegui)
[12:54:01] DBA, Operations, User-Kormat: orchestrator: integrate promotion rules into puppet - https://phabricator.wikimedia.org/T266002 (Kormat)
[12:55:28] DBA, Operations, User-Kormat: orchestrator: integrate promotion rules into puppet - https://phabricator.wikimedia.org/T266002 (Marostegui) p:Triage→Medium It is especially important to specify hosts that should never be masters
[12:56:54] DBA, Operations, User-Kormat: orchestrator: Puppetize - https://phabricator.wikimedia.org/T265990 (Kormat)
[13:07:05] DBA, Operations, User-Kormat: orchestrator: Select backend database solution - https://phabricator.wikimedia.org/T266003 (Kormat)
[13:07:14] DBA, Operations, User-Kormat: orchestrator: Select backend database solution - https://phabricator.wikimedia.org/T266003 (Kormat) p:Triage→Medium
[13:17:44] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (Pchelolo) Thank you for the answers! > I have one question before everything else- does the parsercache expansion mean like a new "clus...
[13:18:41] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (Marostegui) >>! In T263587#6562289, @Pchelolo wrote: > I guess we have to begin here. > > TLDR of the problem is that we will not have...
[13:25:32] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (jcrespo) Small addendum: Note that parsercache functionality is memcached + MySQL, not just MySQL. In fact the MySQL part was a later ad...
[13:30:19] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (jcrespo) Another small correction: > it could bring us capability to write into the ParserCache from the secondary DC, which we don't cu...
[13:30:38] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (ArielGlenn) >>! In T263587#6563095, @jcrespo wrote: > I have one question before everything else- does the parsercache expansion mean li...
[13:32:45] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (Joe) Cassandra is not absent of its own issues, and it has a much higher cost per GB than parsercache currently has (I did no research,...
[13:35:41] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (Marostegui) @ArielGlenn the current parsercache hosts run SSDs.
[13:36:13] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (jcrespo) >>! In T263587#6564251, @ArielGlenn wrote: > I'm going by the Dell quotes for the hw, backtracking from the racking task. If th...
[13:47:21] DBA, MediaWiki-Parser, Parsoid, serviceops, Platform Team Workboards (Green): CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (Marostegui) >>! In T263587#6564281, @jcrespo wrote: >> I'm going by the Dell quotes for the hw, backtracking from the racking task. If t...
[14:17:50] DBA, Operations, User-Kormat: orchestrator: Get packages into WMF apt - https://phabricator.wikimedia.org/T266023 (Kormat)
[16:05:18] DBA, Operations: Puppetize grants for mysql hosts that are the source of recovery (dbstore, passive misc) - https://phabricator.wikimedia.org/T111929 (jcrespo) @LSobanski I think Manuel and/or I requested to document what grants are needed to setup a backup host. The problem is there is no good way to...
[16:07:45] DBA, Operations: Puppetize grants for mysql hosts that are the source of recovery (dbstore, passive misc) - https://phabricator.wikimedia.org/T111929 (jcrespo) In other words, this is a subtask of the bigger issue T146149, specific to the backup-related hosts.
[16:54:36] DBA, Operations, serviceops, Goal, Patch-For-Review: Strengthen backup infrastructure and support - https://phabricator.wikimedia.org/T229209 (jcrespo)
[17:26:14] DBA, Operations, Sustainability (Incident Followup), Wikimedia-Incident: S5 replication issue, affecting watchlist and probably recentchanges - https://phabricator.wikimedia.org/T263842 (jcrespo) As a last comment, I thought at first it was 1, but after some analysis, I believe there are more cha...
[19:31:19] PROBLEM - MariaDB sustained replica lag on db1081 is CRITICAL: 7.4 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1081&var-port=9104
[19:32:39] RECOVERY - MariaDB sustained replica lag on db1081 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1081&var-port=9104
[20:01:05] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (wiki_willy) a:Jclark-ctr→RobH Hi @RobH - since John is still out and Chris is knee deep with installs, can you see if you're able to work with HP remotely, in getting a replacement...
[21:07:26] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH)
[21:12:01] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH) This task has a number of issues, starting with: * There has been a [[ https://phabricator.wikimedia.org/maniphest/task/edit/form/55/ | hardware troubleshooting form available ]] on...
[21:12:08] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH) a:RobH→jcrespo
[21:13:16] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH)
[21:28:54] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH) I'm waiting on the very slow HPE site upload to parse the AHS file I downloaded for this, and I also noticed that via https interface (https://db1139.mgmt.eqiad.wmnet/) that it has a...
[21:30:01] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH) DIMM Failure - Uncorrectable Memory Error (Processor 2, DIMM 5) is the actual failure from the log. Once the HPE site parses, I'll try to get a new dimm dispatched. They will likel...
[21:54:28] DBA, Operations, ops-eqiad: db1139 memory errors on boot 2020-08-27 - https://phabricator.wikimedia.org/T261405 (RobH) Case ID: 5350976764 opened, requesting a new mainboard and any/all migration directions to be dispatched to eqiad to @Cmjohnson's attention. (He is currently out sick, but is projec...
[23:14:36] DBA, Operations, Wikimedia-Site-requests: create script & docs to rename wiki databases - https://phabricator.wikimedia.org/T83609 (Dcljr)
[23:20:18] DBA, Operations, Wikimedia-Site-requests: create script & docs to rename wiki databases - https://phabricator.wikimedia.org/T83609 (Dcljr) Took the liberty of changing the task name, to try to clarify what exactly was being discussed here. It would be nice to have a better task description, but maybe...