[05:59:17] 10DBA, 10Parsing-Team: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (10Marostegui) >>! In T245408#5891429, @ssastry wrote: > Do I have permissions to delete rows from this database? But ,perfstats, stats, results are all keyed on commit hashes in the commits table. The... [06:09:36] 10DBA, 10Patch-For-Review: decommission dbproxy1002.eqiad.wmnet - https://phabricator.wikimedia.org/T245384 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: `dbproxy1002.eqiad.wmnet` - dbproxy1002.eqiad.wmnet (**PASS**) - Downtimed host on Icinga - Downti... [06:20:21] 10DBA: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (10Marostegui) [06:41:01] 10DBA: Clean up wikiadmin2 user from core hosts - https://phabricator.wikimedia.org/T243512 (10Marostegui) [06:56:33] 10DBA, 10Data-Services, 10cloud-services-team (Kanban): Prepare and check storage layer for ngwikimedia - https://phabricator.wikimedia.org/T240772 (10Marostegui) a:05Marostegui→03None [07:48:14] 10DBA: Productionize es1020-es1025, es2020-es2025 - https://phabricator.wikimedia.org/T243052 (10Marostegui) 05Open→03Stalled [07:48:16] 10DBA, 10Epic, 10Goal: Setup es4 and es5 replica sets for new read-write external store service - https://phabricator.wikimedia.org/T226704 (10Marostegui) [08:05:41] 10DBA: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Marostegui) [08:21:33] 10DBA, 10Patch-For-Review: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Marostegui) I have disabled it on puppet for 10.4 for production hosts for now. It will also be needed for multiinstance if we finally decide to go for 10.4 [08:21:45] 10DBA, 10Patch-For-Review: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Marostegui) p:05Triage→03Medium [09:01:30] 10DBA: Clean up wikiadmin2 user from core hosts - https://phabricator.wikimedia.org/T243512 (10Marostegui) [09:32:29] myload took only 10 hours, which is surprising given the only 100GB of Ram and heavy cpu usage used by the other 2 processes [09:32:47] for s3? [09:32:50] yes [09:32:54] that's pretty good [09:33:01] I was expecting a day or so [09:33:04] given the other instances were dumping data [09:33:11] I was expecting over a day [09:33:23] under normal circunstances it could be 5 hours of even less [09:34:07] will now setup replication, and when it catches up, run a thorough data check [09:38:38] good [09:44:18] affcomwiki is an empty db that myloader failed to create [09:49:09] of course :( [10:47:53] I am unable to run a host drop on tendril due to metadata locking [10:48:14] yeah, it has happened to me before :( [10:48:23] just try a few times [10:50:01] OR kill a few running queries [10:50:07] that too :) [10:50:41] apparently if the server is unavailable, things get overloaded [10:50:48] which is not a good architecture [10:52:22] I think it is because metadata gathering by tendril into tendril [11:04:29] we are going to test the new prometheus config generator [11:04:51] ok [11:05:01] any suggestion to make it fail safely? removing grant temporarilly is ok? [11:05:06] let me know if you need me, I will go for lunch in a bit [11:05:10] Yeah, that sounds good [11:05:10] not need you [11:05:12] Will you test in codfw? [11:05:16] just aware of temporary error [11:05:23] cool [11:05:40] I was only going to test on eqiad, but we can do codfw too [11:05:49] up to you :) [11:31:31] when your come back, test went ok [11:31:50] FYI, if exported generator breaks, there will be 2 alarms: Check the last execution of generate-mysqld-exporter-config [11:32:08] I am still here :) [11:32:21] and by dependency Check systemd state [11:32:26] on the prometheus hosts [11:32:31] log will be on systemd now [11:32:45] I will document somewhere the last process [11:33:35] thanks :) [11:33:47] you can see examples of alerms on -ops [11:33:54] yeah, I saw those [11:34:34] we don't have a monitoring section under MariaDB, do we? [11:34:50] I may create one [11:36:14] 10DBA, 10Operations, 10Wikimedia-Etherpad: Upgrade and restart m1 master (db1135) - https://phabricator.wikimedia.org/T244238 (10Marostegui) Window reserved on the deployment's page [11:58:16] I created WIP https://wikitech.wikimedia.org/wiki/MariaDB/monitoring [11:58:24] but at least dumped the last stuff on metrics [12:03:45] I've added a notice on https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting but it everything needs more work [12:59:06] thanks :) [14:02:34] 10DBA: Test MariaDB 10.4 in production - https://phabricator.wikimedia.org/T242702 (10Marostegui) Hello, An update on 10.4 testing: Everything has been running fine for the last week , we haven't found any major regressions, so I am going to leave db1107 now serving traffic 24h, during the week - it won't serv... [15:04:56] 10DBA: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Anomie) > ` > root@db1107.eqiad.wmnet[enwiki]> explain SELECT /* ApiQueryExtLinksUsage::run */ el_index_60, el_id, page_id, page_namespace, page_title, el_to FROM `page`, `externallinks` WHERE (page_id=el_... [15:16:07] https://twitter.com/jynus/status/1229786513258565634 [15:21:11] 10DBA: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Marostegui) We can now actually try to see what the optimizer is really doing: ` root@db1107.eqiad.wmnet[(none)]> set session optimizer_trace=1; Query OK, 0 rows affected (0.00 sec) root@db1107.eqiad.wmne... [15:22:50] Not sure if I should try to generate a backup first or a data check first on s3 source? [15:23:14] the generate backup will let you know if there are issues no? [15:23:33] so you vote for backup first? [15:23:38] yeah [15:23:42] cool, thanks [15:23:45] if it goes fine, then data check just in case no? [15:23:52] (I will do the oppsite then) [15:23:55] just kidding [15:23:57] XDDD [15:23:58] :-D [15:24:09] it makes 100% sense what you say [15:24:18] if that doesn't work, not woth checking it [15:24:30] yeah, and if it works, just check it to be 100% sure [15:24:32] plus it is done in case something goes south [15:24:48] oh, yeah, check will always be done, I wasn't sure about the order [15:34:04] 10DBA, 10cloud-services-team (Kanban): DB planning: include a writeable (?) misc DB cluster in codfw for WMCS - https://phabricator.wikimedia.org/T218570 (10Bstorm) Just to highlight it for discussion (in case it is useful for anyone's later planning): this would mean that `striker` and `labsdbaccounts` have n... [15:36:20] 10DBA, 10cloud-services-team (Kanban): DB planning: include a writeable (?) misc DB cluster in codfw for WMCS - https://phabricator.wikimedia.org/T218570 (10Marostegui) Thanks @Bstorm for the clarification! In a medium/long term I would like to re-arch and consolidate stuff in our misc infra, to have a better... [15:44:19] 🧔<--- me waiting for s3 to shutdown with low memory and fast_shutdown=0 [15:44:49] XDDDD [15:45:25] "InnoDB: Waiting for change buffer merge to complete" [16:36:27] 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: (Needed by 31st January) eqiad: rack/setup/install es102[0-5].eqiad.wmnet - https://phabricator.wikimedia.org/T241359 (10Cmjohnson) [16:58:48] 10DBA: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10Anomie) >>! In T245489#5893273, @Marostegui wrote: > We can now actually try to see what the optimizer is really doing: Cool! Bookmarked. > I don't quite fully understand why it chooses `el_index` instea... [17:02:48] 10DBA: Possibly disable optimizer flag: rowid_filter on 10.4 - https://phabricator.wikimedia.org/T245489 (10jcrespo) FYI, histograms and other query optimizer configurations that should be now available on 10.4+ should be able to help tuning filtering conditions in some cases, although I haven't checked if it is... [18:45:15] 10DBA: Compress table watchlist_expiry - https://phabricator.wikimedia.org/T245358 (10Jdforrester-WMF) [18:45:21] 10DBA: Compress table watchlist_expiry - https://phabricator.wikimedia.org/T245358 (10Jdforrester-WMF) [19:23:58] I am working on db1140:s3, it will be down for a while [19:24:28] (downtimed until tomorrow at midday) [19:32:05] 10DBA: Compress table watchlist_expiry - https://phabricator.wikimedia.org/T245358 (10Jdforrester-WMF) [19:32:22] 10DBA: Compress table watchlist_expiry - https://phabricator.wikimedia.org/T245358 (10Jdforrester-WMF) Tables now created (and empty) on all wikis. Go ahead. :-) [20:40:17] hello DBAs! I have what I hope is a quick question... is there a reason why INSERTing in batches is better than individual INSERTs in terms of performance? I'm specifically talking about this code: [20:40:18] https://github.com/wikimedia/mediawiki/blob/dd1fec96adbec084da62d130cb3fd531b2dd78b9/includes/watcheditem/WatchedItemStore.php#L770-L778 [20:41:05] if we can do individual INSERTs then I can get the wl_ids really easily, which then makes it easy to update the new watchlist_expiry table accordingly [20:41:17] ticket for this is https://phabricator.wikimedia.org/T245213 [20:47:04] https://medium.com/@benmorel/high-speed-inserts-with-mysql-9d3dcd76f723 [20:47:17] >When you need to bulk-insert many million records in a MySQL database, you soon realize that sending INSERT statements one by one is not a viable solution. [20:47:18] :P [20:47:26] https://dev.mysql.com/doc/refman/5.7/en/insert-optimization.html [20:48:28] musikanimal: Where is that used? On https://en.wikipedia.org/wiki/Special:EditWatchlist/raw ? [20:52:15] Reedy: yeah I think that's the only place [20:52:28] and actually I'm not sure what we're doing with that, since you can't specify experies [20:53:04] anyway this is going to be a nightmare if we stick with batch inserts. I have SELECT the wl_ids retroactively, then batch insert experies, I guess [20:55:56] musikanimal: strike one up for the benefit of just adding a column to the watchlist table ;P [20:56:22] well you'd still need to find some hacky way to specify experies in a textarea, next to the page titles [20:56:36] Not necessarily [20:56:47] Just make it so that you can't set expirations on that [20:56:49] :P [20:56:58] yeah exactly, that's what I'm thinking [20:56:58] Then wait for someone to request that they must have that feature... [20:57:08] I mean, you basically can't do it in text [20:57:12] MW titles can be nearly any character [20:57:17] right [20:57:32] I guess Special:EditWatchlist/raw is the only way to edit watchlists for some people, and those people just don't get experies. But that's a product decision [20:57:32] So CSV et al are almost useless [20:57:44] I think that's reasonable [20:57:55] Is there going to be a way to get an expiry retroactively? ie after you've already watched it? [20:58:13] maybe I'll just create a new method for adding a single expery, that's called only when watching a single page (which there's a public method for) [20:58:22] yes [20:58:54] but we could make it so that experies are unmodified when using Special:EditWatchlist/raw [23:14:39] 10DBA, 10Parsing-Team: testreduce_vd database in m5 still in use? - https://phabricator.wikimedia.org/T245408 (10ssastry) >>! In T245408#5891847, @Marostegui wrote: >` > | GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER, CREATE TEMPORARY TABLES ON `testreduce`.* TO 'testreduce'@'10.64.48.43' |...