[06:46:29] 10DBA, 10Wikimedia-Site-requests: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (10Marostegui) When do you want to do this?
[06:53:51] 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: db1114 crashed - https://phabricator.wikimedia.org/T214720 (10Marostegui) p:05Triage→03Normal
[07:12:24] 10DBA, 10Patch-For-Review: Audit MySQL configurations - https://phabricator.wikimedia.org/T133333 (10Marostegui)
[09:52:20] 10DBA, 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (10Marostegui) @elukey from what I can see, the research user is used to access wikis, not only for staging. So I guess w...
[09:56:44] 10DBA, 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (10elukey) Yes exactly, basically people only use research to consult wiki replicas and to store things in staging, sorry...
[14:30:38] 10Blocked-on-schema-change, 10MediaWiki-Change-tagging, 10Patch-For-Review, 10User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (10Marostegui)
[14:56:29] 10Blocked-on-schema-change, 10MediaWiki-Change-tagging, 10Patch-For-Review, 10User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (10Marostegui)
[14:56:57] 10Blocked-on-schema-change, 10MediaWiki-Change-tagging, 10Patch-For-Review, 10User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (10Marostegui) s4 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1004 [] dbstore1002 [] db1125 [] db1121...
[14:57:09] 10Blocked-on-schema-change, 10MediaWiki-Change-tagging, 10Patch-For-Review, 10User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (10Marostegui)
[15:02:21] 10DBA, 10CheckUser: Provide a strategy for testing the performance of queries needed to show the list of user-agents for each IP - https://phabricator.wikimedia.org/T212092 (10jcrespo) Checking.
[17:01:46] 10DBA, 10CheckUser: Provide a strategy for testing the performance of queries needed to show the list of user-agents for each IP - https://phabricator.wikimedia.org/T212092 (10jcrespo) ` root@db1106.eqiad.wmnet[enwiki]> EXPLAIN SELECT cuc_agent, COUNT(*) AS cnt FROM cu_changes USE INDEX (cuc_user_ip_time)...
[17:10:32] 10DBA, 10CheckUser: Provide a strategy for testing the performance of queries needed to show the list of user-agents for each IP - https://phabricator.wikimedia.org/T212092 (10Huji) To confirm: is the cu_changes table 12TB for enwiki?
[17:11:47] 10DBA, 10CheckUser: Provide a strategy for testing the performance of queries needed to show the list of user-agents for each IP - https://phabricator.wikimedia.org/T212092 (10jcrespo) Sorry, 12GB :-) We should only get tables so big on the external store servers :-) Please give me some time, these analytics...
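For readers following T212092: the EXPLAIN jcrespo pasted at 17:01:46 is truncated, so the sketch below is only an assumption about the query's general shape (a per-IP user-agent aggregation, per the task title), not the production statement. cuc_agent, cuc_ip_hex, and cuc_timestamp are real cu_changes columns; the predicate values are made up for illustration.

```sql
-- Hedged sketch of a "user-agents for one IP" aggregation on cu_changes.
-- The production query above is truncated; the WHERE values here are
-- illustrative placeholders, not the ones jcrespo actually tested.
SELECT cuc_agent, COUNT(*) AS cnt
FROM cu_changes
WHERE cuc_ip_hex = '0A000001'           -- hex-encoded IP (here 10.0.0.1)
  AND cuc_timestamp > '20190101000000'  -- stay inside the CheckUser retention window
GROUP BY cuc_agent
ORDER BY cnt DESC;
```

On a 12 GB table, the cost of a query like this is dominated by how many rows the index can pre-filter, which is presumably why the pasted EXPLAIN forces USE INDEX (cuc_user_ip_time).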
[17:22:42] Hi there! jynus here I am
[17:22:53] hi
[17:23:20] so I forgot why we are doing this, as in, in DB terms
[17:23:45] So, well, I underestimated that subquery; those 343 rows on the beta cluster made me think it was good :/
[17:23:47] Alright
[17:24:16] I sort of understand the initial goal (the description of the task) but got lost in the middle
[17:25:33] So what we want to do is query the abuse_filter table, and for each row we want a row from abuse_filter_log which should: 1) have afl_filter = af_id (row by row), and 2) within the subset from 1, be the one with the MAX timestamp (any one is fine in case of duplicates)
[17:25:45] So that, for each filter in the list, we can show the time of the last hit
[17:26:04] I see
[17:26:07] This was first implemented with a join
[17:26:19] so let me dumb it down for myself
[17:26:25] there is a large table with a log
[17:26:33] and a small one with a set of rules
[17:26:39] (This was the query with the JOIN: https://phabricator.wikimedia.org/T93564#4683733)
[17:26:41] Yes
[17:26:42] and people want to see the last time it got hit
[17:26:52] abuse_filter is the small one; it won't have more than 1000 rows
[17:26:57] I can see that being expensive
[17:27:05] And yes, abuse_filter_log is a logging table, so it'll be pretty large
[17:27:13] For sure it is
[17:27:15] af_id is the column that joins them, right?
[17:27:42] yeah, this is a similar issue to the one with logging or revision
[17:27:49] joining with them is complicated
[17:27:58] ok, I think I have all the info I need
[17:27:58] Yup, the join is on af_id = afl_filter
[17:28:14] let me think of a solution, even if it is a compromise one
[17:28:17] Handy link to the DB schema: https://phabricator.wikimedia.org/diffusion/EABF/browse/master/abusefilter.tables.sql
[17:28:19] and I will propose something
[17:28:22] thanks
[17:28:26] Anything is fine, many thanks :)
[17:28:37] one last question
[17:28:49] how "fresh" does this need to be?
[17:29:01] do people really need the very latest one,
[17:29:15] or would one that is outdated by a few seconds or minutes work?
[17:29:52] I am wondering whether some caching or a job would be out of the question
[17:30:03] I guess a cached value with a TTL of ~half a minute should be fine, at least as a compromise
[17:30:18] ok, just to know what our margins are
[17:30:23] Yep
[17:30:35] Not a long TTL, but 30 secs or so should be fine
[17:30:39] let me give it a good look; I need to gather some data from production
[17:30:42] Even a minute, I guess, but I wouldn't say more
[17:30:46] Sure, thanks
[17:30:54] and I will get back to you on the ticket
[17:30:56] thanks!
[17:31:51] Thank *you* for your help :)
[17:32:02] I'm still around in case you need other info; just ping me, please
[23:52:28] 10DBA, 10CheckUser: Provide a strategy for testing the performance of queries needed to show the list of user-agents for each IP - https://phabricator.wikimedia.org/T212092 (10Huji) I had a minor heart attack ;) *jk* I will remain patient. If there is anything else I can do to help, please let me know.
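To make the requirement stated at 17:25:33 concrete, here is a minimal sketch of one standard shape for that "latest hit per filter" lookup. This is only an illustration of the problem as described in the chat, not the proposal jcrespo later posted on the ticket. Table and column names come from abusefilter.tables.sql linked above, and the correlated subquery is only cheap if an index covering (afl_filter, afl_timestamp) exists.

```sql
-- Hedged sketch: last abuse_filter_log hit per abuse_filter row.
-- With ~1000 filters, this runs one indexed MAX() lookup per filter
-- instead of joining the whole (large) log table and grouping it.
SELECT af.af_id,
       (SELECT MAX(afl.afl_timestamp)
          FROM abuse_filter_log AS afl
         WHERE afl.afl_filter = af.af_id) AS last_hit
FROM abuse_filter AS af;
```

Per the compromise discussed above, the result set could additionally be cached with a TTL of roughly 30 seconds, so the lookup runs at most a couple of times per minute regardless of how often the filter list is viewed.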