[05:04:42] DBA, Collaboration-Team-Triage, Operations, StructuredDiscussions, WorkType-Maintenance: Setup separate logical External Store for Flow in production - https://phabricator.wikimedia.org/T107610#4110960 (jcrespo)
[05:04:45] DBA, Operations, Patch-For-Review: Create a full backup of all external storage records that would be easy to restore/setup a temporary delayed slave - https://phabricator.wikimedia.org/T153440#4110958 (jcrespo) Open>Resolved es1 is in es2003/es2003 as a binary copy. es2 and es3 is in es2002...
[05:07:02] DBA, Collaboration-Team-Triage, Operations, StructuredDiscussions, WorkType-Maintenance: Setup separate logical External Store for Flow in production - https://phabricator.wikimedia.org/T107610#4110961 (jcrespo) Let's proceed with this, also let's create new clusters to avoid over-sized table...
[05:20:10] DBA, Operations, ops-codfw: db2069 RAID with predictive failure - https://phabricator.wikimedia.org/T191593#4110966 (Marostegui)
[05:20:22] DBA, Operations, ops-codfw: db2069 RAID with predictive failure - https://phabricator.wikimedia.org/T191593#4110978 (Marostegui) p:Triage>Normal
[06:06:02] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4111009 (Marostegui)
[06:26:15] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4111023 (Marostegui) For s7: db2047 Same HW and it will be in a different row once db2040 is moved to A3 (T191193)
[07:26:12] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4111079 (Marostegui)
[08:09:35] db1114 is the first core server running 10.1.32 fyi
[08:09:38] I just upgraded it
[08:10:17] which section?
[08:10:34] s1 - api
[08:10:43] it was running 10.1.31
[08:10:49] so not a massive change :)
[08:10:55] but I thought I would let you know
[08:10:57] :)
[08:11:37] we should build with 10.1 by default and if long maintenance happens, upgrade already
[08:12:56] not as a massive reimage, but if we do not start slowly, there will be blockers on the deepest replicas
[08:13:35] yeah
[08:14:14] it should be "easy" as we can reimage without formatting /srv already :-)
[08:14:26] if it will take 6 month to do a dc failover, we can transition to 10.1 then
[08:14:58] note a lot of work will happen at the same time than decom
[08:15:06] so not really a huge effort
[08:18:58] By the end of the FY, it wouldn't also be (almost) unthinkable to have all of core on SSDs
[08:20:31] yeah, only like 9 hosts pending no?
[08:20:36] the ones we have to decomm at some point
[08:20:42] from db1051-db1060 I believe?
[08:22:42] no, older ones until 73 are hd
[08:22:47] but those could go to non-core
[08:23:02] yeah, that is what I am saying, that only those are not SSDs
[08:23:08] yes, sorry
[08:23:16] we should do some planning
[08:23:40] of what we can actually do this quarter, of all things we would want
[08:23:44] I have neglected to commit a db-eqiad.php decom plan :(
[08:23:58] also for arzhel
[08:24:18] yeah, I agreed to talk to him after our meeting yesterday with m*rk
[08:24:31] think today and on monday morning and we can draft one on monday meeting
[08:24:32] to chat about row C so we and him can move forward a bit :)
[08:24:56] (I will think, too)
[09:11:57] ok to upgrade apache on dbmonitor1001/tendril?
[09:12:10] yes, go on
[09:12:13] ack
[09:12:31] not necessary but I would do the non-active first
[09:12:38] and only need to ask for the active one
[09:12:39] yeah, did that before
[09:12:44] great!
[09:13:00] upgraded, tendril.wikimedia.org works fine
[09:13:11] thanks
[10:58:54] DBA, MediaWiki-API: Database query error (internal_api_error_DBQueryError) while getting list=allrevisions - https://phabricator.wikimedia.org/T123557#1932315 (Marostegui) I guess this was solved when we finished: T132416?
[11:05:17] DBA, MediaWiki-API: Database query error (internal_api_error_DBQueryError) while getting list=allrevisions - https://phabricator.wikimedia.org/T123557#4111576 (jcrespo) Yes resolved from on our side.
[11:42:20] marostegui: jynus wrt https://gerrit.wikimedia.org/r/#/c/424300/ I just wanted to say I ran the code several times in terbium to make sure it doesn't put strain on the database. I will also keep an eye for issues anyway after this gets deployed
[11:43:04] probably better to deploy it on Monday rather than today?
[11:43:40] No rush, just wanted to say it
[12:43:12] DBA, MediaWiki-extensions-WikibaseRepository, Wikidata: Remove term_entity_type from wb_terms - https://phabricator.wikimedia.org/T191626#4111869 (Lucas_Werkmeister_WMDE)
[14:13:33] DBA, MediaWiki-API: Database query error (internal_api_error_DBQueryError) while getting list=allrevisions - https://phabricator.wikimedia.org/T123557#4112108 (Anomie) Open>Resolved a:Anomie Let's close this then. If similar queries are still timing out, it'll be a different cause than was di...
[14:13:46] DBA, MediaWiki-API: Database query error (internal_api_error_DBQueryError) while getting list=allrevisions - https://phabricator.wikimedia.org/T123557#4112111 (Anomie) a:Anomie>None
[15:18:12] DBA, Collaboration-Team-Triage, Operations, StructuredDiscussions, WorkType-Maintenance: Setup separate logical External Store for Flow in production - https://phabricator.wikimedia.org/T107610#4112346 (Anomie) This is probably repeating what everyone already knows, but just in case I believe...