[07:50:45] DBA: Compress wikisahred.cx_corpora on x1 hosts - https://phabricator.wikimedia.org/T240325 (Marostegui)
[07:56:52] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[08:33:36] DBA: Compress wikisahred.cx_corpora on x1 hosts - https://phabricator.wikimedia.org/T240325 (Marostegui)
[08:40:39] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[09:19:10] jynus: I think backup2001 is an R440, no? This might be the same as T238305
[09:19:11] T238305: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305
[09:19:36] You might want to list it on that task too, to keep track of those weird issues with the R440
[09:19:55] So far, regarding DBs we've had only one crash, but we've had many cp hosts already
[09:21:56] yes
[09:22:51] If it has new firmware and bios and everything...that's probably worth mentioning on that "tracking" task
[09:23:00] because we are in a "new" situation
[09:27:29] DBA: Compress wikisahred.cx_corpora on x1 hosts - https://phabricator.wikimedia.org/T240325 (Marostegui)
[09:27:37] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo)
[09:28:03] DBA: Compress wikisahred.cx_corpora on x1 hosts - https://phabricator.wikimedia.org/T240325 (Marostegui) Open→Resolved a:Marostegui All hosts compressed: ` root@cumin1001:/home/marostegui# ./section x1 | while read host port; do echo $host:$port; mysql.py -h$host:$port wikishared -e "show create...
[09:30:17] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo)
[09:31:51] DBA, Core Platform Team, MW-1.34-notes (1.34.0-wmf.24; 2019-09-24), Performance Issue, mariadb-optimizer-bug: Review special replica partitioning of certain tables by `xx_user` - https://phabricator.wikimedia.org/T223151 (Marostegui) Open→Resolved a:Marostegui I am going to consid...
[09:31:52] DBA, Core Platform Team, Epic, Tracking-Neverending: Tracking task for mariadb optimizer misbehaviours - https://phabricator.wikimedia.org/T233579 (Marostegui)
[09:35:06] do you know how to read KVMCAPTURE .dvc files?
[09:35:46] nope :(
[09:36:02] it doesn't seem to be text based or compressed
[09:36:02] aren't those the files sent to dell?
[09:36:56] might be the same as the diag, that can only be read by the vendor
[09:38:27] no, it can be played on the web console, but requires java on the browser that I don't have
[09:38:57] but what's that file supposed to have?
[09:39:00] more logs?
[09:39:18] boot capture, characters on boot
[09:39:34] I can try
[09:41:34] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo)
[09:47:45] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo)
[09:48:19] jynus: Nah, it doesn't work either :(
[09:50:42] do you know how I can upload the logs so only papaul or us can see them?
[09:51:00] on a private paste maybe?
[09:51:26] or mail I guess?
[09:58:19] nope, even on an NDA paste they are still public
[09:58:52] you can paste them on a paste that is only viewable by subscribers to that specific one
[09:58:53] the link is not seen, but the whole point is to prevent download, so it doesn't work
[09:59:01] else...maybe a gdoc or mail?
[10:02:09] I am going to leave it as a link, I cannot even delete the uploaded files
[10:03:21] it is not like there is anything private there
[10:03:40] but definitely phab is not the place to upload private data
[10:04:10] I don't know if HW logs are meant to be private
[10:04:43] I yes, I can delete them
[10:06:24] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) Log and boot: https://drive.google.com/file/d/1YL-j3M9fMFGq9EkHxxOL6kVtOf4uyM-e/view?usp=sharing https://drive.google.com/file/d/1E-5dZ_fitSE5TW0RmrrYRZsGt4DTitFn/view?usp=sharing
[11:26:54] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) Moritz points out at T238305#5731421 that maybe it is the same issue as: https://www.dell.com/community/PowerEdge-OS-Forum/Random-Reboot-R740/td-p/5169703/page/3 ` root@backup2001:~$ cat /sys/devices/system/...
[13:10:26] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (Marostegui) >>! In T240177#5731486, @jcrespo wrote: > Moritz points at T238305#5731421 that maybe it is the same issue as: https://www.dell.com/community/PowerEdge-OS-Forum/Random-Reboot-R740/td-p/5169703/page/3 >...
[13:22:14] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) > You've changed it to `powersave` or it is originally set to `powersave`? Didn't change anything, I pasted it as it is now. Most servers, including non-crashing backup1001, seem to be in that mode.
[14:20:53] jynus: marostegui: so 'ExtensionStore shard1' is known variously as 'extension1' and also 'x1' -- do you have a preference for what dbctl should call it?
[14:21:18] I would prefer x1
[14:21:30] As the others are s1,s2....x1 makes sense
[14:39:36] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (MoritzMuehlenhoff) We're setting the governor via cpufrequtils class and cp/lvs hosts are already configured to use "performance", so I'd suggest to test that setting on one of the affected cp* hosts to gain addit...
[14:51:33] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) > disable C / C1E settings without changing to performance profile? Do you know by any chance if performance governor sets that automatically (only needs to be changed) or it is a (potential) requirement...
[14:51:47] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) a:Papaul→jcrespo
[14:56:06] marostegui: ok! then we'll have x1, plus es1/2/3
[14:58:03] cdanis: be ready for es4 and es5, but not setup yet
[14:58:12] jynus: sure, an easy change :)
[14:58:20] just FYI
[14:59:04] I believe the new setup was the reason why manuel asked about it
[15:03:27] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (MoritzMuehlenhoff) >>! In T240177#5732180, @jcrespo wrote: >> disable C / C1E settings without changing to performance profile? > > Do you know by any chance if performance governor sets that automatically (only...
[15:38:38] DBA, Operations: backup2001 crashed 2019-12-08 - https://phabricator.wikimedia.org/T240177 (jcrespo) regarding performance_governor, with T225713 combined with this ticket it seems unclear what the best option is.
[16:44:11] I think offsite backups just finished
[16:45:01] doing a manual run to see what happens
[21:08:24] DBA, TechCom-RFC: MediaWiki database policy and/or guidelines (2019) - https://phabricator.wikimedia.org/T220056 (Krinkle) >>! In T220056#5713343, @Milimetric wrote: > Moving to last call, @Krinkle or @Nikerabbit to incorporate comments from T220056#5644705. To clarify, the last call is two weeks. Endin...