[05:03:06] DBA, AbuseFilter, Patch-For-Review: Drop abuse_filter_log.afl_log_id in production - https://phabricator.wikimedia.org/T226851 (Marostegui) >>! In T226851#5367805, @Marostegui wrote: >>>! In T226851#5367028, @MusikAnimal wrote: >> The `abuse_filter_log` table is apparently missing on the replicas, pr...
[05:03:44] DBA, AbuseFilter, Patch-For-Review: Drop abuse_filter_log.afl_log_id in production - https://phabricator.wikimedia.org/T226851 (Marostegui)
[05:13:18] DBA, Operations, ops-codfw: pc2010 possibly broken memory - https://phabricator.wikimedia.org/T227552 (Marostegui) Another crash just happened: ` [Mon Jul 29 04:55:14 2019] mce: [Hardware Error]: Machine check events logged [Mon Jul 29 04:55:14 2019] mce: Uncorrected hardware memory error in user-acc...
[05:14:38] DBA, Operations, decommission, Patch-For-Review: decommission db1072.eqiad.wmnet - https://phabricator.wikimedia.org/T228956 (Marostegui)
[05:15:52] DBA, Operations: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (Marostegui)
[06:25:47] DBA, AbuseFilter: Drop abuse_filter_log.afl_log_id in production - https://phabricator.wikimedia.org/T226851 (Marostegui) s7 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1003 [] db1136 [] db1125 [] db1116 [] db1101 [] db1098 [] db1094 [] db1090 [] db1086 [] db1079 []...
[06:26:21] DBA, AbuseFilter: Drop abuse_filter_log.afl_log_id in production - https://phabricator.wikimedia.org/T226851 (Marostegui)
[08:50:42] DBA, Operations, serviceops: Strengthen backup infrastructure and support - https://phabricator.wikimedia.org/T229209 (akosiaris)
[08:50:59] DBA, Operations, serviceops: Strengthen backup infrastructure and support - https://phabricator.wikimedia.org/T229209 (akosiaris) p:Triage→Normal
[09:09:01] DBA, Operations, serviceops, Goal: Strengthen backup infrastructure and support - https://phabricator.wikimedia.org/T229209 (Marostegui)
[09:24:33] DBA, AbuseFilter: Drop abuse_filter_log.afl_log_id in production - https://phabricator.wikimedia.org/T226851 (Marostegui)
[09:34:34] marostegui: Hey, I want to move forward with https://phabricator.wikimedia.org/T225053 this basically moves half of reads on wb_terms table to the normalized structure, it can cause performance regressions. When do you think it's fine to deploy it?
[09:34:55] Amir1: can we do it after tomorrow?
[09:35:01] I am doing a s8 failover tomorrow at 5AM UTC
[09:35:20] Sure thing. Not tomorrow or not before the failover?
[09:35:39] I would wait at least 24h after the failover
[09:35:44] Just to avoid mixing issues (if any)
[09:36:08] Amir1: will you do it in batches or it needs to be done 0-100%?
[09:37:07] for properties, it'll be 0-100% because there are not many of them
[09:37:17] it's not much data but it's being read a lot
[09:37:32] it can be rolled back if needed?
[09:41:04] marostegui: yes
[09:41:09] good
[09:41:17] let's coordinate wednesday?
[09:41:22] is that ok with your plans?
[09:41:25] Sure
[09:41:46] :)
[09:41:55] I don't want it to be so late in the week (to not collide in the weekend)
[09:42:05] yeah
[09:42:08] Makes sense
[09:42:20] Wednesday should be fine, right? we'd have almost 3 days
[09:42:22] to monitor
[09:44:47] yeah, that's good
[14:54:36] DBA, Operations, ops-codfw: pc2010 possibly broken memory - https://phabricator.wikimedia.org/T227552 (Papaul) Swapped DIMM B1 with DIMM A1 to see if we have the same problem on DIMM A1 if we do, we will have to replace he main-board. @Marostegui Please let me know it the system crash again . Thanks
[14:56:20] DBA, Operations, ops-codfw: pc2010 possibly broken memory - https://phabricator.wikimedia.org/T227552 (Marostegui) >>! In T227552#5372881, @Papaul wrote: > Swapped DIMM B1 with DIMM A1 to see if we have the same problem on DIMM A1 if we do, we will have to replace he main-board. > @Marostegui Plea...
[15:35:16] DBA, Operations, ops-codfw, Goal, Patch-For-Review: rack/setup/install db21[21-30].codfw.wmnet - https://phabricator.wikimedia.org/T227113 (Papaul)
[15:35:48] DBA, Operations, ops-codfw, Goal, Patch-For-Review: rack/setup/install db21[21-30].codfw.wmnet - https://phabricator.wikimedia.org/T227113 (Papaul) a:Papaul→Marostegui @Marostegui all yours
[15:54:58] DBA, Operations, ops-codfw, Goal, Patch-For-Review: rack/setup/install db21[21-30].codfw.wmnet - https://phabricator.wikimedia.org/T227113 (Marostegui) db2127 looking good! ` root@db2127:~# free -g ; df -hT /srv total used free shared buff/cache available M...
[15:55:14] DBA, Operations, ops-codfw, Goal, Patch-For-Review: rack/setup/install db21[21-30].codfw.wmnet - https://phabricator.wikimedia.org/T227113 (Marostegui) Open→Resolved
[16:08:43] DBA, Goal: Productionize db21[21-30} - https://phabricator.wikimedia.org/T228969 (Marostegui)
[16:51:58] DBA, Operations, ops-eqiad: rack/setup/install db2131.codfw.wmnet - https://phabricator.wikimedia.org/T229251 (RobH) p:Triage→Normal
[16:52:05] DBA, Operations, ops-eqiad: rack/setup/install db2131.codfw.wmnet - https://phabricator.wikimedia.org/T229251 (RobH)
[16:53:05] DBA, Operations, ops-eqiad: (2019-08-31)rack/setup/install db2131.codfw.wmnet - https://phabricator.wikimedia.org/T229251 (RobH)
[16:54:04] DBA, Operations, ops-eqiad: (2019-08-31)rack/setup/install db2131.codfw.wmnet - https://phabricator.wikimedia.org/T229251 (RobH)
[16:54:25] DBA, Operations, ops-eqiad: (2019-08-31)rack/setup/install db2131.codfw.wmnet - https://phabricator.wikimedia.org/T229251 (RobH)