[10:03:15] DBA, Patch-For-Review: Import S2,S6,S7,m3 and x1 to dbstore2001 and dbstore2002 - https://phabricator.wikimedia.org/T151552#2898442 (jcrespo) p:High>Normal a:jcrespo>Marostegui I have now fixed dbstore2001, and enwiki is catching up with the new, larger, buffer pool. {F5141533} We need to...
[11:41:31] I am now packaging mariadb 10.1.20
[12:30:32] jynus: thanks for the quick followup on the ferm refactoring, I'll make a writeup on existing users of the ferm service, continuing to plan this via email seems best
[13:52:26] DBA, Labs, Tool-Labs: Tool Labs queries die - https://phabricator.wikimedia.org/T127266#2898982 (Dispenser) The Toolserver actually tried this, a "fast" server that would kill queries after 60 seconds. It was abandoned as tool authors wouldn't rewrite scripts to use it. The servers were converted t...
[14:28:28] DBA, Labs, Tool-Labs: Tool Labs queries die - https://phabricator.wikimedia.org/T127266#2899018 (jcrespo) > The Toolserver actually tried this, a "fast" server that would kill queries after 60 seconds. It was abandoned as tool authors wouldn't rewrite scripts to use it. The servers were converted to...
[15:48:48] jynus: what is the difference between the columns pl_from_namespace and pl_namespace in table pagelinks?
[15:50:41] one is the namespace of the referring item
[15:50:47] the other is of the referred one
[15:51:23] pages in mediawiki are couples of (namespace, title)
[15:51:51] *pairs
[15:52:09] there is more or less good documentation about the tables on mediawiki.org
[15:52:40] doctaxon, https://www.mediawiki.org/wiki/Manual:Pagelinks_table
[15:54:42] but with pl_from = page_id, pl_from_namespace is obsolete, isn't it?
[15:54:55] sorry, again?
[15:56:49] if you mean that you do not need it, that's ok
[15:57:09] I know some queries that use it as a denormalized field
[15:57:13] in mediawiki
[15:57:40] pl_from is an index id, there is not another page with the same id in another namespace
[15:58:05] yes, that is correct
[15:59:10] if you want to know why that is there, you should check the release notes of 1.24, normally there is a performance reason for it
[15:59:26] by duplicating data, some queries can run faster
[16:00:09] remember I do not design the schema, I only make the servers that run it not fail :-)
[16:00:18] pagelinks is a gigantic table, running through it is almost impossible
[16:00:26] oh, on that we agree
[16:00:29] MySQL server has gone away
[16:00:34] I have a proposal to shrink it
[16:00:44] but will have to convince people
[16:02:08] it would be good if columns like pl_is_redirect were added
[16:02:32] or pl_is_missing (the redlink)
[16:03:37] well, the idea is to take away rows if possible, not add more
[16:03:54] I would move titles to a separate table
[16:04:04] we could have those there, if necessary
[16:04:17] not repeated millions of times
[16:04:41] okay
[16:05:00] sounds good
[16:06:59] jynus: do you want to read the SQL query from our last talk?
[16:41:27] DBA, Operations, ops-codfw: db2060 crashed, probably RAID controller - https://phabricator.wikimedia.org/T154031#2899146 (jcrespo)
[16:49:26] DBA, Operations, ops-codfw: db2060 crashed, probably RAID controller - https://phabricator.wikimedia.org/T154031#2899146 (RobH) When I first connected to the serial console, it wasn't accepting input, but scrolled the following: [27858755.642012] INFO: task jbd2/sda1-8:385 blocked for more than 120...
[16:51:27] DBA, Operations, ops-codfw: db2060 crashed, probably RAID controller - https://phabricator.wikimedia.org/T154031#2899165 (RobH) On post, it also scrolled past: 1719-Slot 0 Drive Array - A controller failure event occurred prior to this power-up. (Previous lock up code = 0x13)
[17:10:41] DBA, Patch-For-Review: db2034: investigate its crash and reimage - https://phabricator.wikimedia.org/T149553#2899223 (jcrespo) ``` Dec 23 16:36:10 db2060 kernel: [ 6.793108] ata2.01: failed to resume link (SControl 0) Dec 23 16:36:10 db2060 kernel: [ 6.829120] ata2.00: SATA link down (SStatus 0 SCo...
[17:13:33] DBA, Operations, ops-codfw: db2060 crashed, probably RAID controller - https://phabricator.wikimedia.org/T154031#2899241 (jcrespo) From the OS logs: ``` Dec 23 16:36:10 db2060 kernel: [ 6.793108] ata2.01: failed to resume link (SControl 0) Dec 23 16:36:10 db2060 kernel: [ 6.829120] ata2.00: SA...
[17:41:48] DBA, Operations, ops-codfw: db2060 crashed, probably RAID controller - https://phabricator.wikimedia.org/T154031#2899288 (jcrespo) ``` HP ProLiant System ROM 08/02/2014 HP ProLiant System ROM - Backup 08/02/2014 HP ProLiant System ROM Bootblock 03/05/2013 HP Smart Array P420i Controller 6.00 iLO 2.03...
[17:42:13] DBA, Operations, ops-codfw: db2060 crashed (RAID controller) - https://phabricator.wikimedia.org/T154031#2899289 (jcrespo)
[17:45:36] DBA, Operations, ops-codfw: db2060 crashed (RAID controller) - https://phabricator.wikimedia.org/T154031#2899292 (jcrespo) p:Triage>Low Leaving this open for @Marostegui and @Papaul to see, there is not much else to do except maybe "upgrading the bios" so that next time it happens that cannot...
[17:45:52] DBA, Operations, ops-codfw: db2060 crashed (RAID controller) - https://phabricator.wikimedia.org/T154031#2899295 (jcrespo) a:jcrespo
[18:11:17] doctaxon, sorry I could not help, we had a bit of a crisis
[18:11:25] ^
[18:11:39] please send the question to the labs-list
[18:11:54] maybe someone else can help, or even me if I find the free time
[21:49:46] DBA, Operations, ops-codfw: db2060 crashed (RAID controller) - https://phabricator.wikimedia.org/T154031#2899774 (Marostegui) I haven't checked in much detail, but from the logs it looks like just a controller crash indeed. We can upgrade the BIOS once we have some spare time now that it is easy to d...
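
Note on the pagelinks exchange (15:48 to 15:59): the point that pl_from_namespace is a denormalized copy of the linking page's namespace, kept so some queries can skip a join, can be illustrated with a small sketch. This is not the real MediaWiki schema. The tables below are a cut-down assumption with only the columns named in the chat plus a minimal page table, the sample rows are made up, and the helper function names are hypothetical. It runs against an in-memory SQLite database, not MariaDB.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for page and pagelinks (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id        INTEGER PRIMARY KEY,
            page_namespace INTEGER NOT NULL,
            page_title     TEXT    NOT NULL
        );
        CREATE TABLE pagelinks (
            pl_from           INTEGER NOT NULL,  -- page_id of the linking page
            pl_from_namespace INTEGER NOT NULL,  -- copy of that page's namespace
            pl_namespace      INTEGER NOT NULL,  -- namespace of the link target
            pl_title          TEXT    NOT NULL   -- title of the link target
        );
        """
    )
    # Made-up sample data: two pages in namespace 0, one in namespace 1.
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, 0, "Foo"), (2, 0, "Bar"), (3, 1, "Foo")],
    )
    conn.executemany(
        "INSERT INTO pagelinks VALUES (?, ?, ?, ?)",
        [
            (1, 0, 0, "Bar"),
            (1, 0, 1, "Foo"),
            (3, 1, 0, "Foo"),
            (2, 0, 0, "Missing"),  # target has no page row (a redlink)
        ],
    )
    return conn


def links_from_namespace_join(conn, ns):
    """Find links whose source page is in namespace ns by joining to page."""
    return conn.execute(
        """
        SELECT pl_from, pl_namespace, pl_title
        FROM pagelinks
        JOIN page ON page_id = pl_from
        WHERE page_namespace = ?
        ORDER BY pl_from, pl_namespace, pl_title
        """,
        (ns,),
    ).fetchall()


def links_from_namespace_denormalized(conn, ns):
    """Same question answered from pagelinks alone, using the duplicated column."""
    return conn.execute(
        """
        SELECT pl_from, pl_namespace, pl_title
        FROM pagelinks
        WHERE pl_from_namespace = ?
        ORDER BY pl_from, pl_namespace, pl_title
        """,
        (ns,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(links_from_namespace_join(conn, 0))
    print(links_from_namespace_denormalized(conn, 0))
```

Both functions return the same rows as long as the copy is kept in sync. The second never touches the page table, which is the kind of saving the chat attributes to the duplicated field. The cost is a wider row repeated for every link.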