[05:51:22] DBA, Operations, ops-codfw: Degraded RAID on db2033 - https://phabricator.wikimedia.org/T217301 (Marostegui) Open→Resolved Thank you! It looks good now ` logicaldrive 1 (3.3 TB, RAID 1+0, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 600 GB, OK) physicaldrive 1I:1:2...
[05:56:03] DBA, Patch-For-Review: Implement a proof of concept of a snapshot cycle automation for a mediawiki section database - https://phabricator.wikimedia.org/T210292 (Marostegui) a:jcrespo Assigning to Jaime as he is currently working on it
[05:57:24] DBA, Operations, ops-eqiad, Patch-For-Review: db1114 crashed (HW memory issues) - https://phabricator.wikimedia.org/T214720 (Marostegui) No problem! let's leave the loop there for a few days to see if it crashes Thank you!
[06:03:51] DBA, Operations, ops-eqiad: dbproxy1012 power supply without power - https://phabricator.wikimedia.org/T217394 (Marostegui)
[06:04:06] DBA, Operations, ops-eqiad: dbproxy1012 power supply without power - https://phabricator.wikimedia.org/T217394 (Marostegui) p:Triage→Normal
[06:44:16] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[06:44:26] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[07:25:19] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[08:04:32] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui) Open→Resolved Everything looks good!
[08:26:24] DBA, Operations: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (Marostegui)
[08:26:39] DBA, Operations: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (Marostegui)
[08:26:46] DBA, Operations, ops-eqiad, Patch-For-Review, User-Marostegui: rack/setup/install db11[26-38].eqiad.wmnet - https://phabricator.wikimedia.org/T211613 (Marostegui)
[08:27:06] DBA, Operations: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (Marostegui) p:Triage→Normal
[08:33:16] DBA, Operations, Patch-For-Review, Performance-Team (Radar): Increase parsercache keys TTL from 22 days back to 30 days - https://phabricator.wikimedia.org/T210992 (Marostegui) Open→Resolved
[08:43:23] DBA, Data-Services: Discrepancies with logging table on different wikis - https://phabricator.wikimedia.org/T71127 (Marostegui) a:Marostegui I am going to try to work on this, as this has already bitten us a few times, the most recent time on a global user rename. This is in a not very good...
[08:43:36] Blocked-on-schema-change, MediaWiki-Database, MW-1.32-notes (WMF-deploy-2018-07-17 (1.32.0-wmf.13)), Schema-change: Add index log_type_action - https://phabricator.wikimedia.org/T51199 (Marostegui) a:Marostegui
[08:44:44] jynus: when you transferred the data from one labs host to another, I recall you had to delete relay-log.info and master.info, no?
[08:44:48] elukey: ^
[08:45:56] something like that, it will be on a ticket
[08:46:12] ok, I will look for it, thanks!
[08:47:32] thanks!
[08:47:55] there is also (at least for me) the question mark about users/roles deployed to labsdb1012
[08:47:56] Blocked-on-schema-change, MediaWiki-Database, MW-1.32-notes (WMF-deploy-2018-07-17 (1.32.0-wmf.13)), Schema-change: Add index log_type_action - https://phabricator.wikimedia.org/T51199 (Marostegui)
[08:48:08] elukey: we can clean those up after the transfer
[08:48:16] (or decide if we want to keep them)
[08:50:19] ah okok
[08:50:34] because I was reading role::labs::db::maintain_dbusers
[10:30:39] DBA, MediaWiki-Database, Performance-Team (Radar), User-Marostegui: Replace parsercache keys to something more meaningful on db-XXXX.php - https://phabricator.wikimedia.org/T210725 (Marostegui) As we have already finished {T210992} I would like to resume work on this task. I believe we all agree...
[10:37:55] DBA, MediaWiki-Database, Performance-Team (Radar), User-Marostegui: Replace parsercache keys to something more meaningful on db-XXXX.php - https://phabricator.wikimedia.org/T210725 (jcrespo) > T210992 Shouldn't we wait for the ttl to effectively increase before doing more operations? (aka wait 22...
[10:44:25] DBA, MediaWiki-Database, Performance-Team (Radar), User-Marostegui: Replace parsercache keys to something more meaningful on db-XXXX.php - https://phabricator.wikimedia.org/T210725 (Marostegui) Definitely - I was not going to do it next week, it was a re kickoff of this :-) We should also change...
[10:46:06] DBA, Performance-Team (Radar), User-Marostegui: Replace parsercache keys to something more meaningful on db-XXXX.php - https://phabricator.wikimedia.org/T210725 (Marostegui)
[10:47:04] DBA, MediaWiki-Cache, Performance-Team (Radar), User-Marostegui: Replace parsercache keys to something more meaningful on db-XXXX.php - https://phabricator.wikimedia.org/T210725 (Marostegui)
[11:37:31] hi all, plan to reboot labsdb1005 in 10 minutes
[13:25:01] jbond42: labsdb1004 paged yesterday evening for mysql not being up, was that related to your reboot earlier that day?
[13:27:30] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui) s4 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [x] dbstore1002 [] db1125 [] db1121 [] db1103 [...
[13:35:33] marostegui: i just checked and it seems the mysql alert went out at ~19:41, however the system was rebooted at 2019-02-28 15:55; further, i had hal.fak online to confirm things start correctly, so i don't think so
[13:36:08] jbond42: No, I mean, when I logged in, mysql was down, I started it right after the page
[13:36:29] So my question is…was it put down for the reboot earlier that day?
[13:36:43] yes it was rebooted at 15:55 yesterday
[13:37:32] so I think it was like what happened with labsdb1005: mysql doesn't start automatically, the host came back from downtime, and that's why it paged
[13:41:08] yes that's possible, just checking: my intention was to only set downtime for 10 minutes but i could have made a typo
[13:42:14] yep, I think the issue was that mysql doesn't start automatically, which not everyone knows :)
[13:42:42] so the downtime expired and mysql was still down hence the page
[13:42:54] yes it looks like that was the case, my apologies.
i have added a note to the service restarts page to indicate that mysql doesn't start automatically on a reboot
[13:43:15] nothing to apologize for, I was just checking that I understood what happened yesterday correctly :)
[13:44:36] long story short, we don't start mysql automatically on a reboot just to be on the safe side of things: ie: corrupted storage, ie2: a master crashes (powers down), we do a master failover and the other master comes back up from death, so at least mysql will remain stopped :)
[13:44:40] and so forth!
[13:45:53] yes sounds good, will know for next time :D
[17:31:18] DBA, MediaWiki-API, Performance: list=logevents slow for users with last log action long time ago - https://phabricator.wikimedia.org/T71222 (Anomie) Looking back at this, db1100 still uses the `times` index for the original query, and it uses the right plan when told to ignore `log_user_type_time`....
[17:37:39] DBA, MediaWiki-API, Performance: list=logevents slow for users with last log action long time ago - https://phabricator.wikimedia.org/T71222 (Marostegui) I can try to run an `analyze` on Monday (the table is 71G) and see if that changes the plan. What is the query you are using, @Anomie?
[18:00:36] DBA, MediaWiki-API, Performance: list=logevents slow for users with last log action long time ago - https://phabricator.wikimedia.org/T71222 (Anomie) ` lang=sql explain SELECT log_id,log_type,log_action,log_timestamp,log_deleted FROM `logging` LEFT JOIN `user` ON ((user_id=log_user)) LEFT JOIN `page...