[05:38:51] DBA, Gerrit, Patch-For-Review: Switch to mariadb java connector - https://phabricator.wikimedia.org/T176164#4107439 (Marostegui) Hi, What would you need from the DBAs?
[05:48:00] DBA, MediaWiki-Database, MediaWiki-Special-pages, Security, Wikimedia-log-errors: Wikimedia\Rdbms\Database::tableName: use of subqueries is not supported this way. - https://phabricator.wikimedia.org/T191116#4107456 (Marostegui) Open>Resolved a:Anomie And this is gone now after th...
[06:07:36] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4107468 (Marostegui)
[06:29:07] DBA: Delete prefstats tables - https://phabricator.wikimedia.org/T154490#4107502 (Marostegui) I have dumped these tables on the following path: `/srv/tmp/prefstats` for the following section/host s1: db1089 s2: db1090 s3: db1072 s4: db1091 s5: db1082 s6: db1113 s7: db1094 s8: wikidata doesn't have that tabl...
[06:30:40] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4107503 (Marostegui) For s5: db2038 Same HW, different ROW and old master.
[06:46:13] DBA, Operations, ops-eqiad, Patch-For-Review: dbstore1001 crashed: Multibit ECC errors were detected on the RAID controller. - https://phabricator.wikimedia.org/T186596#4107519 (jcrespo) a:jcrespo Don't worry, I can boot into RAID manager and do it myself. Thanks!
[06:48:29] DBA, Operations, ops-eqiad, Patch-For-Review: dbstore1001 crashed: Multibit ECC errors were detected on the RAID controller. - https://phabricator.wikimedia.org/T186596#4107530 (jcrespo) I moved you into "Blocked" because I don't see a better option (you do not have, like us an "All is done in ou...
[06:58:22] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4107545 (Marostegui)
[07:02:09] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4107548 (Marostegui) For s6: db2053 Same HW, different row
[07:20:11] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4107605 (Marostegui)
[08:34:17] I am testing dump_section.py in interactive mode, and it gives very useful error messages
[08:34:41] can I see an example? :)
[08:34:42] e.g. if there are permission errors, it says there was an error, and the log shows which one
[08:35:14] I just overwrote the error
[08:35:49] python3 dump_section.py --host=es2019.codfw.wmnet --user=root --pass=$pass --backup-dir=/srv/backups es3
[08:35:51] Error while performing backup of es3
[08:35:57] cat log.es3
[08:36:06] 2018-04-05 08:30:31 [ERROR] - Error connecting to database: Access denied for user X
[08:36:42] nice one :)
[08:37:19] if it is a warning, it continues
[08:37:50] BTW- the backup is not fully online
[08:38:29] it has to block stuff to get coords at the beginning, creating 100-200 seconds of lag
[08:38:50] at least for es, will be worse on s3, I guess
[08:39:13] yeah, s3 will be a party
[08:39:35] just FYI, that you may want to depool on start
[08:40:00] and leave it pooled later if it is needed (but with lots of threads reading from it)
[08:40:03] yeah, it is a good point
[08:40:08] :wq
[08:40:09] gah
[08:40:29] e.g. es hosts are only limited by bandwidth
[08:40:47] (so they would be highly degraded)
[08:41:42] for es backups, I have modified the script to avoid compression
[08:41:48] as it is already compressed
[08:41:52] ah, that makes sense
[08:41:55] and will save time
[08:43:29] if you are curious I have left the done and ongoing copies at es2002:/srv/backups
[08:44:17] let's seeee
[08:45:12] hehe I like how less warns you that the .sql might be a binary file
[08:45:28] yes
[08:46:02] how long is it taking for those 4T?
[08:46:05] I don't think logical copy is the right way, but it is the easiest way for now
[08:46:08] 12 hours
[09:29:25] I can't make our noon meeting today
[09:29:44] Would later work instead?
[09:30:05] would work for me anytime after 3.30
[09:30:51] jynus: ^
[09:30:58] yes, that is ok too
[09:31:04] Ok
[09:31:11] mark: send us an invite with whichever time works for you then :)
[09:31:30] Will do :)
[09:31:46] oh, I changed it
[09:31:50] haha
[09:32:01] but tell me whatever works for you
[09:32:17] or send a separate meeting invite and I will cancel this
[09:32:32] yeah, let's do that, let's cancel this and wait for mark's new invite :)
[09:32:40] ok
[09:44:38] DBA, MediaWiki-Page-deletion, Operations, Performance: Cannot delete two pages with large histories even having the appropriate permissions to do so - https://phabricator.wikimedia.org/T145630#2636396 (Graham87) I know this bug is old and resolved, but I stumbled on it by accident while looking f...
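The error behaviour described for dump_section.py earlier in the log (a short "Error while performing backup of es3" on the console, the detailed cause in log.es3, and warnings that do not stop the run) can be sketched roughly as follows. This is not the actual dump_section.py code: `BackupWarning`, `BackupError` and `run_backup` are hypothetical names invented for illustration, and only the console message and the log line format are taken from the chat.

```python
import logging


class BackupWarning(Exception):
    """Hypothetical: a recoverable problem; the backup keeps going."""


class BackupError(Exception):
    """Hypothetical: a fatal problem; the backup stops."""


def run_backup(section, steps, log_path):
    """Run each step in order; return True if every step completed."""
    logger = logging.getLogger("dump." + section)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.FileHandler(log_path)
    # Same shape as the line quoted in the chat:
    # 2018-04-05 08:30:31 [ERROR] - Error connecting to database: ...
    handler.setFormatter(logging.Formatter(
        "%(asctime)s [%(levelname)s] - %(message)s", "%Y-%m-%d %H:%M:%S"))
    logger.addHandler(handler)
    try:
        for step in steps:
            try:
                step()
            except BackupWarning as warning:
                # Detail goes to the log file and the run continues.
                logger.warning(str(warning))
            except BackupError as error:
                # Detail goes to the log file; the console only gets a summary.
                logger.error(str(error))
                print("Error while performing backup of " + section)
                return False
        return True
    finally:
        logger.removeHandler(handler)
        handler.close()
```

Keeping the console message short and pushing the cause into a per-section log file is what makes the interactive run readable while still leaving something to `cat` afterwards.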
[09:46:14] DBA, Datasets-General-or-Unknown, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), MW-1.28-release-notes, and 2 others: Select of revisions for stub history files does not explicitly order revisions - https://phabricator.wikimedia.org/T29112#4108059 (ArielGlenn)
[11:13:08] DBA, Gerrit, Patch-For-Review: Switch to mariadb java connector - https://phabricator.wikimedia.org/T176164#4108272 (Paladox) Hi, we only need dba ok to do this (it requires dba approval because it’s touching dba related stuff) ie replacing a MySQL driver with a mariadb one.
[11:26:34] i think 3.30 invite is best actually
[11:33:55] Blocked-on-schema-change, DBA, User-Ladsgroup, Wikidata-Ministry-Of-Magic: Review and deploy 421372 in WMF production - https://phabricator.wikimedia.org/T191519#4108322 (Ladsgroup) p:Triage>Low
[13:11:00] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4108583 (Marostegui) For s7: db2054 Same HW, different row
[13:18:03] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4108603 (Marostegui) For s6, db2053 needs to be reverted as candidate master as with the master's move, they'd end up in the same row (T...
[13:25:47] because we really do not have proper hardware, I have set up dbstore1001 as a backup host
[13:26:07] I think we discussed this but never detailed, as this is all temporary
[13:26:14] yeah, we did :)
[13:26:18] it wasn't a surprise
[13:26:34] just so you know, it is not necessarily the end of the discussion
[13:26:44] but better prepared to do that than doing nothing
[13:26:51] totally! :)
[13:27:07] It is also good to generate some load on it
[13:27:07] we only have 2 servers to perform backups
[13:27:10] to see how the storage reacts
[13:27:15] so that is what I set up for now
[13:27:48] we could set up some mysql instances on it or something
[13:27:55] if needed
[13:28:27] Blocked-on-schema-change, DBA, User-Ladsgroup, Wikidata-Ministry-Of-Magic: Schema change for rc_namespace_title_timestamp index - https://phabricator.wikimedia.org/T191519#4108649 (Anomie)
[13:28:39] it will create packages but not send them to bacula
[14:56:44] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4108939 (Marostegui) @RobH can you let us know when the switch is ready so we can move db2039? Thanks!
[15:00:31] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4108959 (RobH) I've gone ahead and enabled asw-d1-codfw ge-1/0/14, and left asw-c6-codfw ge-6/0/6 online for now. Once the system is fully moved, we'll remove the port info f...
[15:34:37] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all codfw database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T191275#4109086 (Marostegui)
[15:38:17] jynus: marostegui Hey, the commonswiki community wants to increase its usage of wikidata, which means wbc_entity_usage will get way bigger; is that okay? On the other hand, I can deploy and stop logging autopatrol actions (which stops the logging table from growing); that stop will happen next week anyway, though
[15:39:23] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109104 (Papaul) Racktables update. moved db2039 from C6 to D1
[15:40:07] Amir1: what does way bigger mean? you've got an estimation?
[15:40:19] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109106 (Papaul) Please update the task with the next server to move so I can can the rack ready. Thanks
[15:40:30] probably twice as big in one month or so
[15:40:48] And then it will keep growing at that rate or...?
[15:40:58] get slower
[15:41:01] speed of growth can be controlled
[15:41:10] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109110 (Marostegui) >>! In T191193#4109106, @Papaul wrote: > Please update the task with the next server to move so I can can the rack ready. Thanks let's go for db2040 as n...
[15:41:22] That table is now 16G on commons
[15:42:00] If it goes to around 40GB, that is not a problem for the server, disk-wise I mean
[15:43:27] Amir1- did you inform them of the watchlist problem?
[15:44:12] there is no problem if they know about that ("informed consent")
[15:44:30] marostegui: can you tell me how big the logging table is there? 72% of that table can be deleted too (I know it doesn't free up space unless you shrink it but at least, it stops growing)
[15:45:28] jynus: we haven't enabled rc integration for commons yet, we are probably going to enable it but with the xkill, the ratio of rc entries from wikidata won't be more than 20%
[15:46:28] Amir1: 206G on commons
[15:46:46] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109121 (Papaul)
[15:47:20] Amir1: I say we tell them, and then add whatever you want "we fixed it/we will make sure it doesn't break, etc."
[15:47:37] basically so they can inform us quickly in case it happens again
[15:47:49] okay
[15:47:52] wait
[15:47:53] Noted :)
[15:47:58] so what are you going to enable?
[15:48:05] if it is not the rc integration?
[15:48:30] ah, you mean they want to start using it more
[15:48:36] ok, same thing applies
[15:49:13] it's more wikidata usages meaning wbc_entity_usage will grow
[15:49:15] as long as someone is there in case things go wrong- no problem for me
[15:49:28] okay noted. Thank you!
[15:49:35] which I understand they are lower, but the point is to be around
[15:49:41] just in case
[15:54:56] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109139 (Papaul) switch port information when ready to move db2040. db2040 was on asw-c6-codfw ge-6/0/7 and now will be on asw-a3-codfw ge-3/0/ 27 new ip address will be :...
[15:56:54] DBA, Operations, ops-codfw, Patch-For-Review: Move masters away from codfw C6 - https://phabricator.wikimedia.org/T191193#4109149 (Marostegui) >>! In T191193#4109139, @Papaul wrote: > switch port information when ready to move db2040. > > db2040 was on asw-c6-codfw ge-6/0/7 and now will be on a...
[16:06:07] DBA, Operations, ops-eqiad, Patch-For-Review: dbstore1001 crashed: Multibit ECC errors were detected on the RAID controller. - https://phabricator.wikimedia.org/T186596#4109181 (jcrespo) Open>Resolved dbstore1001 is back in use. Thanks for everyone that helped upgrading it and recover fro...
[16:07:54] DBA: Failover DB masters in row D - https://phabricator.wikimedia.org/T186188#3936392 (jcrespo) We should throw a plan for this, but for all rows.
[17:50:03] Blocked-on-schema-change, DBA, User-Ladsgroup, Wikidata-Ministry-Of-Magic: Schema change for rc_namespace_title_timestamp index - https://phabricator.wikimedia.org/T191519#4109798 (Krinkle)
[18:23:29] Amir1: I am curious to see how long that drop column takes
[18:23:42] Same here
[18:26:44] marostegui: btw. the logging table in commonswiki will now grow at one tenth of the speed it used to
[18:27:40] And deleting rows for fun now so storage-wise it won't grow for some time
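The sizing exchange earlier in the log (wbc_entity_usage at 16G on commons, "probably twice as big in one month or so", and around 40GB not being a problem disk-wise) can be checked with a couple of lines of arithmetic. A minimal sketch, assuming a single doubling and treating the 40GB figure from the chat as a comfort level rather than a hard limit; the function names are hypothetical.

```python
def projected_size_gb(current_gb, growth_factor, periods=1):
    """Size after `periods` rounds of multiplying by `growth_factor`."""
    return current_gb * growth_factor ** periods


def headroom_gb(projected_gb, comfortable_gb):
    """Space left before the projection reaches the comfortable size."""
    return comfortable_gb - projected_gb


# Figures quoted in the chat: 16G today, roughly doubling in a month,
# around 40GB described as fine for the server.
after_one_month = projected_size_gb(16, 2)
remaining = headroom_gb(after_one_month, 40)
print(after_one_month, remaining)
```

Under those assumptions one doubling stays below the 40GB mark; a second doubling at the same rate would not, which is why the follow-up question about whether growth continues at that rate matters.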