[05:48:29] 10DBA, 10Operations: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it - https://phabricator.wikimedia.org/T57385#4167092 (10Marostegui)
[05:56:03] 10DBA, 10Operations: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it - https://phabricator.wikimedia.org/T57385#4167094 (10Marostegui)
[06:01:06] 10DBA, 10Operations: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it - https://phabricator.wikimedia.org/T57385#4167107 (10Marostegui)
[06:19:09] 10DBA, 10AbuseFilter, 10Patch-For-Review: Move AbuseFilter slow filters data from Logstash to per-filter profiling - https://phabricator.wikimedia.org/T179604#4167140 (10Daimona)
[06:19:52] 10DBA, 10Operations: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it - https://phabricator.wikimedia.org/T57385#4167143 (10Marostegui)
[06:20:14] 10DBA, 10Epic, 10Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#4167147 (10Marostegui)
[06:20:18] 10DBA, 10Operations: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it - https://phabricator.wikimedia.org/T57385#589294 (10Marostegui) 05Open>03Resolved This has been dropped everywhere
[06:21:04] 10DBA, 10Epic, 10Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#3451451 (10Marostegui)
[06:38:09] 10DBA, 10Operations, 10Chinese-Sites: Drop *_old database tables from Wikimedia wikis - https://phabricator.wikimedia.org/T54932#551990 (10Marostegui) After all the deletions that have happened lately as part of the parent ticket, this is the current status of these tables. They only exist on s3 on: ``` chw...
[07:41:50] https://gerrit.wikimedia.org/r/#/c/429727/
[07:44:11] isn't it better to use db1095?
[07:44:26] yeah, but I don't want to generate lag on labs while replication is stopped
[07:44:37] why would replication be stopped?
[07:45:10] I did some tests and using -B database1 database2 etc. changes the position in the metadata all the time
[07:45:29] like the position where it should start replicating
[07:45:45] I don't understand
[07:46:02] so, to dump all s3 databases we need -B database1 database2 database3, right?
[07:46:10] not necessarily
[07:46:20] what do you use then?
[07:46:46] you dump everything, then not import the other data
[07:47:01] but what do you mean with metadata changing?
[07:47:23] The metadata file, which contains the position you should replicate from, changes every time a database is dumped
[07:47:43] for me that is a huge bug
[07:47:56] can you try with a regex?
[07:48:10] yeah, let me try that
[07:48:32] just for a couple of small dbs
[07:48:32] ˜/jynus 9:46> you dump everything, then not import the other data -> that is a good idea but in the end you waste time with enwiki or wikidata for instance, which are not needed and are pretty big
[07:48:49] well, the whole idea is that you do that once
[07:48:58] at the start of the process :-)
[07:49:08] I will try the regex
[07:49:20] note I am interested here in mydumper
[07:49:29] because that can lead to bugs
[07:51:06] if -B is not consistent, it is important to know
[07:51:38] or at least, it should be documented, and it should write separate metadata logs
[07:52:17] I am testing the regex now
[07:54:04] alternatively, if we had time we could test mariabackup, as innobackupex allows for partial backups
[07:54:36] note I don't have anything against depooling db1011
[07:54:51] but I really would like to know issues/workarounds with the tools
[07:57:27] especially if in the future we have to create backups of these
[07:58:40] sure - but let's test the regex thingy
[08:07:53] Ah wait, this is my fault for assuming that -B accepts multiple arguments (I had in my mind it worked like mysqldump)
[08:08:54] ok, that is a mistake I would have made, but I still don't understand what was happening initially?
[08:09:20] was the metadata file being rewritten?
[08:09:36] yeah, but because I had a for loop (which I had missed first), so it is normal
[08:09:43] ah!
[08:09:53] no, it has to be a single execution
[08:10:08] yeah, I was so focused on the mydumper command that I missed the first part of the command I built XD
[08:10:25] I guess it was derived from past mysqldumps with the slave stopped
[08:10:57] I will use a regex to ignore enwiki and wikidata
[08:11:01] so we can test it for sure
[08:11:05] and dewiki?
[08:11:08] yeah
[08:11:48] sorry to be like I am, but I like to ask questions when something is not working
[08:11:52] and file bugs, etc.
[08:11:55] no
[08:11:59] you made me realise the error
[08:12:40] did you use the wrapper I created?
[08:12:52] the dump_section?
[08:12:57] yes
[08:13:04] yeah, I was going to use it now to build the command
[08:13:19] it tries to hide all the "ugly" options
[08:13:30] and allows changing only the "good" ones
[08:24:30] also it makes sure things like events, triggers, routines are also copied
[08:46:56] going to go for https://gerrit.wikimedia.org/r/#/c/429736/
[08:47:24] Did it finish? I saw it in the morning I think
[08:47:27] still running an alter
[08:47:34] ?
[08:47:59] were you running an alter on db1090?
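The exclusion regex agreed on above (skip enwiki, dewiki and wikidata in a single mydumper run instead of looping with -B) can be sanity-checked before touching any host. A minimal sketch, assuming mydumper's `--regex` option, which filters on `database.table` names using PCRE; the host in the comment is a placeholder, not the real command:

```python
import re

# Negative lookahead: keep anything that does NOT start with one of the
# big wikis followed by a dot. This is the pattern that would be passed
# to mydumper as --regex (assumption: PCRE, applied to "database.table").
EXCLUDE = r"^(?!(enwiki|dewiki|wikidata)\.)"

def keep(db_table: str) -> bool:
    """Return True if a dump with this regex would include db_table."""
    return re.match(EXCLUDE, db_table) is not None

# Quick local check before running the real dump:
for name in ("enwiki.page", "wikidata.wb_terms", "chwiki.page"):
    print(name, keep(name))

# The actual invocation would then look something like (placeholders):
#   mydumper -h <s3-replica> --regex '^(?!(enwiki|dewiki|wikidata)\.)' ...
```

Note that `enwikibooks.page` would still be kept, since the lookahead requires a literal dot right after the excluded database name.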
[08:48:08] like 300 of them
[08:48:20] I just finished now the ones that failed while replication was running
[08:48:25] yes, so what I am saying is that I saw it still running one in the morning
[08:48:42] It was a heads up mostly
[08:48:42] it finished all on saturday
[08:48:52] Ok, maybe my tendril wasn't refreshed
[08:48:52] (linter, change_tag, watchlist)
[08:49:04] no, I ran some more this morning
[08:49:08] Yeah, I saw watchlist
[08:49:14] I logged it
[08:49:55] Fine, what I am saying is: "hey, heads up, I saw an alter table running". I have no idea if it finished or not. If you want to pool it I assumed you knew it
[08:49:58] That's all
[08:50:09] ah, ok
[08:50:18] my heads up was because it could break things
[08:50:25] :)
[08:55:30] are you touching db1098:3317?
[08:55:36] no
[08:55:39] why?
[08:55:51] I started mysql yesterday to let it replicate to see if it is corrupted
[08:56:02] it is depooled, can't remember why it was like that
[08:56:04] now I do
[08:56:08] it broke
[08:56:13] during the weekend
[08:58:24] I may need a double check of https://gerrit.wikimedia.org/r/429747 (to be done later)
[08:58:56] I will first pool db1090 and see what happens
[09:00:12] sure
[09:07:53] I see queries flowing in
[09:08:06] nice
[09:08:24] no errors so far
[09:08:44] but it may take a long time, especially without the depool
[09:09:03] one thing that the previous CR doesn't have is a failover of sanitarium
[09:09:14] (sanitarium's master)
[09:09:22] wanted to coordinate with you on that
[09:11:10] db1060?
[09:11:34] I guess db1074 is the natural option?
[09:11:56] yep
[09:12:02] I think that's the one we said
[09:12:10] I remember you suggesting one
[09:12:16] but couldn't remember which
[09:12:37] db1074 I think it was
[09:12:50] or even the vslow host
[09:16:13] also I know you are doing alters, so maybe you want to depool that at the same time or something?
[09:16:28] db1074 already has the alter done
[09:16:39] db1076 has it running now
[09:16:45] db1060 is pending too
[09:16:47] ok, so should I wait?
[09:16:53] don't do it on db1060
[09:16:59] cool
[09:17:03] I will not do it there
[09:17:05] it is going to be decommissioned
[09:17:13] yep!
[09:17:14] but if you need to do it on labs
[09:17:30] you may have wanted to wait for the movement
[09:17:47] so let me know how you prefer to do it
[09:17:56] maybe it is better to run it there, because if db1074 already has it, labs will not have it
[09:18:06] so maybe it is better to run it on db1060 anyway, so it gets replicated
[09:18:11] and once done, move sanitarium to db1074
[09:18:31] ok by me
[09:18:41] so check the above patch when you can
[09:18:49] and see if that fits with what we want to do
[09:18:53] ok, let me see
[09:19:03] or we wait a bit
[09:19:26] yeah, that would work
[09:19:42] However, I will not alter db1060 today, as the alter on db1076 is running now and will take a few hours
[09:19:50] ok
[09:19:56] so I can deploy it now
[09:20:00] totally
[09:20:03] and tomorrow you run the alter
[09:20:11] and after that
[09:20:21] we do the movement
[09:20:23] ok?
[09:20:31] Actually
[09:20:36] I can leave the alter running today
[09:20:41] As it will be depooled
[09:20:44] so it is ready by tomorrow
[09:20:45] whatever you prefer
[09:20:56] Tomorrow is a public holiday and Wed is a public holiday here too
[09:21:04] But yes, I will deploy the alter today anyway
[09:21:42] well, no rush
[09:21:47] I was only asking
[09:21:55] to coordinate and not do extra work
[09:22:02] I am not in a hurry
[09:22:13] No, as it is going to be depooled, I can leave it running as soon as db1076 finishes (I think it will finish around 14:00 or so)
[09:22:16] and I almost prefer nothing ongoing
[09:22:24] during the holiday
[09:22:30] that too
[09:53:57] Hello. Is it possible to use JSON_ARRAYAGG in mysql on terbium? If I try it, I get: execute command denied to user 'research_prod'@'%' for routine 'log.JSON_ARRAYAGG'.
[09:54:04] analytics databases are very old; not our fault if we cannot get a window for upgrade
[09:54:09] What I need to do is to group by one column and to get all the results from another column in one row. I could do GROUP_CONCAT, but since I'm doing it with data that includes user input, I cannot join the values in a safe way with a separator. JSON looks safe, but apparently cannot be used.
[09:54:30] aharoni: please share a paste with what you want/need
[09:54:51] it is difficult to understand without more background
[09:56:22] JSON_ARRAYAGG is a mysql 5.7+ function, I doubt it is available on mariadb
[09:56:33] but there could be an equivalent
[09:59:05] on the other hand, json_array will only be available in mariadb 10.2: https://mariadb.com/kb/en/library/json_array/
[09:59:49] so share what you need and I may be able to give you an equivalent
[10:06:01] jynus: https://phabricator.wikimedia.org/P7053
[10:34:06] jynus: does the paste make sense?
[10:35:26] aharoni: on a meeting
[10:35:31] will answer soon
[10:35:35] thanks
[11:12:44] aharoni: that is not possible to do in mariadb
[11:13:01] I can offer you several possibilities
[11:14:00] do a bad hack by trying to "parse recursive syntax with regular expressions"
[11:14:03] like https://stackoverflow.com/a/42445841/342196
[11:14:39] not use aggregation and use proper json functions on the client side
[11:15:14] the bad hack is bad, but may work.
[11:15:19] wait for mysql to be available in the cluster, which has no start date (it is being tested)
[11:15:27] post-processing is what I do now, but it's probably very inefficient.
[11:15:41] the right way to do it is client-side functions
[11:16:01] there are no mysqls in production right now, we just recently started testing 8.0
[11:16:18] Is it possible to access the same data from stat1005? IIRC, it's not exactly mysql there.
[11:16:35] But the data is available. I just don't remember how to access it.
[11:16:54] I'm talking about the EventLogging data.
[11:17:19] eventlogging is only available on mysql 10
[11:17:23] *mariadb 10.0
[11:17:55] I am not sure if there is some plan to add it to hadoop, too
[11:18:24] there is https://phabricator.wikimedia.org/T159170
[11:18:48] you should comment on the needs
[11:19:36] OK. I thought that "execute command denied" is a matter of permissions, but the real reason is that we are just not at the right version yet?
[11:19:49] no
[11:19:57] it is literally not available
[11:20:14] no mariadb version has that function
[11:20:48] and if there were a migration to mysql, which has not been ruled out, but also not decided
[11:20:56] it would take a year or more to be effective
[11:21:43] eventlogging is also very hard to upgrade because we are not given maintenance windows and there is no proper redundancy
[11:22:19] unlike labsdbs, which we set up from the ground up to be fully redundant
[11:34:57] jynus: thanks
[11:41:03] what's the 'pif_edits' table for?
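The client-side option recommended above (skip GROUP_CONCAT entirely, select plain rows and aggregate in the client) can be sketched in a few lines. This is a minimal illustration, not the actual research query; the keys and values are made up, and in practice `rows` would come from a cursor's `fetchall()`. The point is that `json.dumps` escapes the user input, so no separator character can be confused with data:

```python
import json
from collections import defaultdict

# Rows as they would come back from a plain SELECT key_col, value_col
# (no aggregation in SQL). Values deliberately contain quotes and commas,
# the characters that make GROUP_CONCAT separators unsafe.
rows = [
    ("enwiki", 'title with "quotes"'),
    ("enwiki", "title, with, commas"),
    ("dewiki", "Titel"),
]

# Group in the client instead of with JSON_ARRAYAGG:
grouped = defaultdict(list)
for key, value in rows:
    grouped[key].append(value)

# One JSON array per group; json.dumps handles all escaping safely.
result = {key: json.dumps(values) for key, values in grouped.items()}
print(result["dewiki"])  # → ["Titel"]
```

Each `result` entry round-trips cleanly through `json.loads`, which is exactly what a separator-joined GROUP_CONCAT string cannot guarantee with arbitrary user input.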
[11:43:51] 10DBA: Drop flaggedrevs tables at metawiki - https://phabricator.wikimedia.org/T193390#4167712 (10MarcoAurelio)
[11:44:26] 10DBA: Drop flaggedrevs tables at metawiki - https://phabricator.wikimedia.org/T193390#4167723 (10MarcoAurelio)
[11:46:36] 10DBA: Drop flaggedrevs tables at metawiki - https://phabricator.wikimedia.org/T193390#4167731 (10MarcoAurelio)
[11:46:40] 10DBA: Drop flaggedrevs tables on wikis where it is not enabled - https://phabricator.wikimedia.org/T174801#4167730 (10MarcoAurelio)
[11:50:07] 10DBA: Drop flaggedrevs tables on wikis where it is not enabled - https://phabricator.wikimedia.org/T174801#4167736 (10MarcoAurelio)
[11:50:09] 10DBA: Drop flaggedrevs tables from mediawikiwiki - https://phabricator.wikimedia.org/T186865#4167735 (10MarcoAurelio)
[11:50:43] 10DBA: Drop flaggedrevs tables from mediawikiwiki - https://phabricator.wikimedia.org/T186865#3957724 (10MarcoAurelio)
[11:50:45] 10DBA: Drop flaggedrevs tables on wikis where it is not enabled - https://phabricator.wikimedia.org/T174801#3573416 (10MarcoAurelio)
[11:51:24] 10DBA: Drop flaggedrevs tables from mediawikiwiki - https://phabricator.wikimedia.org/T186865#3957724 (10MarcoAurelio)
[11:51:26] 10DBA: Drop flaggedrevs tables on wikis where it is not enabled - https://phabricator.wikimedia.org/T174801#3573416 (10MarcoAurelio)
[11:52:05] 10DBA: Drop flaggedrevs tables on wikis where it is not enabled - https://phabricator.wikimedia.org/T174801#3573416 (10MarcoAurelio)
[13:15:39] 10DBA, 10Patch-For-Review: Productionize 8 eqiad hosts - https://phabricator.wikimedia.org/T192979#4167866 (10jcrespo)
[13:18:10] 10DBA, 10Epic, 10Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#4167879 (10Marostegui)
[13:18:12] 10DBA, 10Operations, 10Chinese-Sites: Drop *_old database tables from Wikimedia wikis - https://phabricator.wikimedia.org/T54932#4167876 (10Marostegui) 05Open>03Resolved a:03Marostegui >>! In T54932#4167221, @Marostegui wrote: > After all the deletions that have happened lately as part of the parent ti...
[13:28:02] modules/role/files/mariadb/check_private_data_report may need some tuning?
[13:28:14] for the new sanitariums, no?
[13:28:21] it still has db1069 on it
[13:28:23] also
[13:28:37] Yeah, I was planning to get to it once we know the new hostnames and all that
[13:28:43] yes
[13:28:48] not saying it has to be now
[13:28:57] I just was grepping for it
[13:29:01] (db1069)
[13:29:02] yeah
[13:29:07] and found it outdated
[13:29:24] did we fix the execution to work with a socket?
[13:29:42] does check_private_data.py support socket and port?
[13:29:43] yeah
[13:29:43] that
[13:30:04] we will need to give it a look for the goal
[13:30:15] leaving it as is for now
[13:30:34] yeah
[13:46:57] I am not saving or checking db1069 because it was copied to db1090 already
[13:47:26] +1!
[13:50:22] 10DBA, 10Patch-For-Review: Decommission db1051-db1060 (DBA tracking) - https://phabricator.wikimedia.org/T186320#3940857 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by jynus on neodymium.eqiad.wmnet for hosts: ``` ['db1069.eqiad.wmnet'] ``` The log can be found in `/var/log/wmf-auto-reimage/201...
[13:51:59] Top QPS Servers
[13:52:11] db1116:13315 / s5: 435.61 K
[13:52:49] that must be wrong
[13:52:56] as that host isn't doing anything
[13:53:00] just replicating
[13:53:15] no, I think we are doing close to the maximum theoretical QPS mysql can do
[13:53:21] I am just kidding
[13:53:32] :)
[14:10:26] 10DBA, 10Patch-For-Review: Decommission db1051-db1060 (DBA tracking) - https://phabricator.wikimedia.org/T186320#4168018 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db1069.eqiad.wmnet'] ``` and were **ALL** successful.
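The open question above (does check_private_data.py support both a socket and a port, now that multi-instance sanitariums run on non-default ports like db1098:3317?) maps to a small CLI pattern. This is a hypothetical sketch of one way such a script could accept either target, not the real script's interface; the option names and default are assumptions:

```python
import argparse

def parse_target(argv):
    """Parse a connection target: either --port (TCP) or --socket (Unix).

    Hypothetical interface for illustration; a real multi-instance host
    would be reached via a per-instance port (e.g. 3317) or socket path.
    """
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--port", type=int, help="TCP port, e.g. 3317")
    group.add_argument("--socket", help="Unix socket path")
    args = parser.parse_args(argv)
    # Fall back to the standard MySQL port when neither is given:
    if args.port is None and args.socket is None:
        args.port = 3306
    return args

args = parse_target(["--socket", "/run/mysqld/mysqld.s7.sock"])
print(args.socket)
```

The mutually exclusive group makes argparse reject a call that passes both `--port` and `--socket`, which keeps the connection target unambiguous.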
[14:47:09] 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1039 - https://phabricator.wikimedia.org/T184262#4168111 (10Cmjohnson)
[14:55:03] 10DBA, 10Operations, 10hardware-requests, 10ops-codfw: Decommission db2011 - https://phabricator.wikimedia.org/T187886#4168131 (10Papaul) switch port information asw-a6-codfw ge-6/0/10
[14:56:00] 10DBA, 10Operations-Software-Development, 10Patch-For-Review: Debmonitor: request for misc DB allocation - https://phabricator.wikimedia.org/T192875#4168135 (10Volans) >>! In T192875#4156668, @jcrespo wrote: > @Volans you can speed up the process by setting some password on the private repository (and some n...
[15:16:14] 10DBA, 10Operations, 10ops-codfw, 10Patch-For-Review: db2081 crashed/rebooted, probably due to hardware failure - https://phabricator.wikimedia.org/T193325#4168203 (10Papaul) a:05Papaul>03Marostegui 1- power disconnection+connection 2- update BIOS from 2.5.5 to 2.7.1 3- update IDRAC from 2.50 to 2.52
[15:21:35] 10DBA, 10Operations, 10ops-codfw, 10Patch-For-Review: db2081 crashed/rebooted, probably due to hardware failure - https://phabricator.wikimedia.org/T193325#4168232 (10Marostegui) Thanks @Papaul - I have started MySQL to let it replicate for a couple of days before closing this. I will leave the host depool...
[16:58:42] 10DBA, 10Patch-For-Review: Test MySQL 8.0 with production data and evaluate its fit for WMF databases - https://phabricator.wikimedia.org/T193226#4168624 (10jcrespo) @Volans T193226#4166726
[17:01:49] 10DBA, 10Patch-For-Review: Test MySQL 8.0 with production data and evaluate its fit for WMF databases - https://phabricator.wikimedia.org/T193226#4168646 (10Volans) @jcrespo ack, no blocker for me, I'm actually not using it.
[18:20:14] 10DBA, 10Beta-Cluster-Infrastructure, 10Operations: Possible to run writes (e.g. UPDATE) on Beta Cluster replica - https://phabricator.wikimedia.org/T110115#1569333 (10EddieGP) Just judging from the task title, this and {T183245} look like being duplicates?
[22:57:43] 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: db1098 crashed and got rebooted - https://phabricator.wikimedia.org/T193331#4169881 (10Cmjohnson) drained flea power, updated bios and idrac f/w to and powered back on BIOS Version 2.7.1 Firmware Version 2.52.52.52