[07:38:12] DBA, Cleanup, Data-Services, cloud-services-team (Kanban): Drop DB tables for now-deleted fixcopyrightwiki from production - https://phabricator.wikimedia.org/T246055 (Marostegui) Thanks Brooke, I will drop them manually then.
[07:40:08] DBA, Cleanup, Data-Services, cloud-services-team (Kanban): Drop DB tables for now-deleted fixcopyrightwiki from production - https://phabricator.wikimedia.org/T246055 (Marostegui) Done from the wikireplicas ` root@cumin1001:~# for i in labsdb1009 labsdb1010 labsdb1011 labsdb1012; do echo $i; mysq...
[08:13:09] addshore: anything running on wikidata or migration or something
[08:13:15] db1087 is having around 1-2 seconds lag all the time
[08:13:22] And I don't see the dumps now
[08:13:25] o/ *looks*
[08:13:26] I might just depool it
[08:13:44] the migration is indeed running but it's going much slower now than it has been in the past week
[08:13:48] let me look at some graphs
[08:14:34] https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1087&var-port=9104&fullscreen&panelId=37
[08:14:49] https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1087&var-port=9104&fullscreen&panelId=3
[08:15:08] ooof, I dont think that should be the migration
[08:15:16] it looks like the edit rate on wikidata is a bit spikey though
[08:15:24] I know what it is
[08:15:32] SELECT /* SpecialMostRevisions::reallyDoQuery */ page_namespace AS `namespace`,page_title AS `title`,COUNT(*) AS `value` FROM `revision`,`page` WHERE page_namespace IN (640,146,0) AND (page_id = rev_page) AND (page_is_redirect = 0) GROUP BY page_namespace,page_title ORDER BY value DESC LIMIT 5000 0.000
[08:15:38] That one has been running for a looong time
[08:15:49] From wikiadmin
[08:15:59] That is running from mwmaint1002 so the query killer didn't kick in
[08:16:08] hmmmmmmmmmmmmmmm
[08:16:14] I reckon I opened a task about it a few months ago
[08:16:22] ok to kill it?
[08:16:33] I think so, I dont think it is us
[08:16:47] wait, SpecialMostRevisions from mwmaint?
[08:16:48] odd
[08:16:51] Killing it and then looking for the ticket
[08:16:56] I am sure this is not the first time I see this
[08:16:58] oh yes, this is the damned special page again
[08:17:16] https://phabricator.wikimedia.org/T239072
[08:17:20] this query scans 400M rows XD
[08:17:32] yeah, that one
[08:17:37] should be disabled? https://gerrit.wikimedia.org/r/#/c/553339
[08:18:26] I can double check
[08:18:29] let me comment on the task
[08:19:46] also, writing bit of the terms migration is set to be complete thursday FYI :) then just got those other steps to do
[08:19:56] oh nice
[08:23:35] addshore: https://phabricator.wikimedia.org/P10683
[08:23:38] those are enabled
[08:23:46] could any of those cause that query?
[08:24:47] crontabs/www-data:0 1 11,25 * * /usr/local/bin/mwscriptwikiset updateSpecialPages.php s8.dblist --override --only=Mostrevisions > /var/log/mediawiki/updateSpecialPages/s8@18-MostRevisions.log 2>&1
[08:24:57] it is there
[08:25:01] Running
[08:25:04] that would run that query
[08:25:05] Again, let me check the php process
[08:25:15] I might need to kill the process in mwmaint1002
[08:25:18] let me find which one it is
[08:26:09] yep, that is the one
[08:26:13] I am going to kill the php process
[08:27:09] <3
[08:29:08] ok, things are back to normal
[08:29:13] should we disable that one too?
[08:29:14] or what?
[08:34:02] marostegui: I guess https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553339/3/modules/mediawiki/manifests/maintenance/updatequerypages.pp didnt abset the cron so it stayed ?
[08:34:11] *absent
[08:34:20] but yes, I think disable that one and note it in the ticket
[08:39:49] yeah, but I would have guessed that patch would exclude s8, no?
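Editor's aside: the crontab entry quoted at 08:24:47 is in grep-style `file:entry` form. A minimal sketch, assuming that format, of how one might split such a line into its schedule and command and check which entries still run the `Mostrevisions` update. The function names `parse_grep_cron_line` and `entries_running` are hypothetical helpers for illustration, not part of any Wikimedia tooling.

```python
# Entry copied verbatim from the 08:24:47 message; grep-style "file:entry" form.
SAMPLE = (
    "crontabs/www-data:0 1 11,25 * * /usr/local/bin/mwscriptwikiset "
    "updateSpecialPages.php s8.dblist --override --only=Mostrevisions "
    "> /var/log/mediawiki/updateSpecialPages/s8@18-MostRevisions.log 2>&1"
)


def parse_grep_cron_line(line):
    """Split 'file:min hour dom mon dow command' into its parts (hypothetical helper)."""
    source, _, entry = line.partition(":")
    # Five schedule fields, then the rest of the line is the command.
    fields = entry.split(None, 5)
    if len(fields) < 6:
        raise ValueError(f"not a five-field cron entry: {line!r}")
    minute, hour, day_of_month, month, day_of_week, command = fields
    return {
        "source": source,
        "schedule": (minute, hour, day_of_month, month, day_of_week),
        "command": command,
    }


def entries_running(lines, needle):
    """Return parsed entries whose command contains the given substring."""
    parsed = (parse_grep_cron_line(line) for line in lines)
    return [entry for entry in parsed if needle in entry["command"]]


if __name__ == "__main__":
    for entry in entries_running([SAMPLE], "Mostrevisions"):
        print(entry["source"], entry["schedule"], entry["command"])
```

The schedule fields `0 1 11,25 * *` read as 01:00 on the 11th and 25th of each month, which is why the query only shows up twice a month.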
[08:39:57] the patch Amir1 made I mean
[08:44:15] I have reopened the task
[08:53:05] I dont know if that puppet patch will have actually removed the cron, or just stopped it being re added in the future / on new hosts
[09:26:40] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui) After stopping heartbeat on es3 master es1017: ` root@cumin1001:/home/...
[09:27:12] cool
[09:40:51] would it be too much to ask to (not now) document the process properly with all steps based on the ticket?
[09:41:02] so someone in 3-6 years knows what to do :-D
[09:41:07] definitely
[09:41:11] or maybe on wikitech better
[09:41:13] as in the wiki
[09:41:15] yes
[09:41:56] this is so infrequent, that is not worth automating for now
[09:42:15] but also that we cannot remember from one time to another
[09:42:26] in fact, it is the first time I see it in 5 years!
[09:42:30] haha
[09:42:36] should we consider it finished then?
[09:42:41] yes
[09:42:55] unless, you know, background things like this- documentation, etc.
[09:43:55] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal, Patch-For-Review: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui)
[09:44:05] 2500 rows on enwiki without cluster25
[09:45:09] 1260 cluster26, 1240 cluster27
[09:45:13] nice
[09:45:44] I believe the new cluster will have lower load for some time
[09:46:01] and then become much more loaded with time as they contain the latest revisions
[09:47:45] funnily, I saw that query latency is not reported on es1 hosts
[09:48:05] because the query used (heartbeat) may fail there due to lack of a heartbeat table
[09:48:48] latency-wise, however, I can see es4 and es5 with worse one, probably due to the writes
[09:49:05] that may have a higher penalty than the more reads on static content
[09:52:33] and cold buffers
[09:54:03] not really- I think
[09:54:33] because it is append only, latest writes (all content now on new clusters) should be already on the buffer pool
[09:54:55] but I am only guessing here, not sure how the growth is happening right now
[10:08:46] yoda always said it better: https://i.imgflip.com/3s4wdy.jpg
[10:09:21] hahahahaha
[10:09:32] https://wikitech.wikimedia.org/wiki/MariaDB#How_to_enable_a_new_external_storage_(es)_section
[10:09:36] Going to add now how to "close" one
[10:09:40] please edit as needed
[10:09:58] will see it
[10:13:53] I may add the "script to add tables was broken and needed fix", because probably that would be true also next time
[10:14:52] (I know you refer to it)
[10:16:35] my approach to documentation in the past is that while links are necessary, they may break or move or get lost
[10:17:38] the thing I see left is dbctl, didn't it need changes there?
[10:19:26] will go to get some coffee, will not touch until you are "done" editing to prevent conflicts
[10:25:31] jynus: oh yes, dbctl yeah
[10:25:37] That is part of setting up the servers
[10:25:40] But I will add it yeah
[10:27:34] if dbctl in general is explained elsewhere, a link would be enough
[10:27:45] but I thought this was more involved because new sections
[10:27:56] plus the whole master thing that was confusing
[10:28:07] which whole master thing?
[10:28:20] setting it to master and all that?
[10:28:27] that every section on dbctl/mw has a master
[10:28:39] but that doesn't match with the standalone hosts
[10:28:44] yeah :(
[10:28:57] going to add the RO steps now
[10:28:59] so, those things are the most interesting from my point of view
[10:47:45] marostegui: I think I know what's going on. If you drop the cronjob, it doesn't stop it, you need to mark it as absent first and then drop it
[10:47:52] addshore: ^
[10:48:22] I forgot about this "feature" of puppet
[10:54:41] Amir1: Aaaah that explains it yeah
[10:54:53] I can merge once there's a patch, just ping me
[10:55:14] marostegui: I don't know if it's easily doable to make a patch :(
[10:55:23] jynus: added https://wikitech.wikimedia.org/wiki/MariaDB#Setting_up_the_servers_with_dbctl and same for the RO status
[10:55:29] you can manually just remove it and it won't be added again
[10:55:34] I will expand the RO dbctl part in a sec
[10:55:48] Amir1: ok!
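Editor's aside: the Puppet behaviour described at 10:47:45 (dropping a cron declaration leaves the existing crontab entry in place, while marking it absent removes it) can be illustrated with a toy model. This is a sketch in plain Python, not Puppet's API; `apply_catalog` and the dictionary layout are hypothetical and only mimic the "unmanaged resources are left alone" rule.

```python
def apply_catalog(host_crontab, catalog):
    """Toy model of a declarative run (hypothetical, not Puppet code).

    host_crontab: dict of entry name -> command currently on the host.
    catalog: dict of entry name -> {"ensure": "present" | "absent", "command": str}.
    Entries that the catalog does not mention are left untouched.
    """
    result = dict(host_crontab)
    for name, resource in catalog.items():
        if resource["ensure"] == "absent":
            # Explicitly declared absent: remove it if it exists.
            result.pop(name, None)
        else:
            # Declared present: create or update the entry.
            result[name] = resource["command"]
    return result


if __name__ == "__main__":
    host = {"mostrevisions": "updateSpecialPages.php --only=Mostrevisions"}

    # Declaration simply deleted from the manifest: the entry survives.
    print(apply_catalog(host, {}))

    # Declaration kept but marked absent: the entry is removed.
    print(apply_catalog(host, {"mostrevisions": {"ensure": "absent", "command": ""}}))
```

Under this model, the two-step approach from the chat makes sense: ship the absent declaration first so hosts clean up, then delete the declaration once every host has applied it.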
[10:55:51] Thanks
[10:55:58] thank you :*
[11:07:52] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui)
[11:08:48] DBA, Core Platform Team Workboards (Clinic Duty Team), Goal: Enable es4 and es5 as writable new external store sections and set es2 and es3 as read only - https://phabricator.wikimedia.org/T246072 (Marostegui) Open→Resolved a:Marostegui Considering this all done with the documentation, wh...
[11:08:50] DBA, Epic, Goal, Patch-For-Review: Setup es4 and es5 replica sets for new read-write external store service - https://phabricator.wikimedia.org/T226704 (Marostegui)
[11:09:24] DBA, Epic, Goal, Patch-For-Review: Setup es4 and es5 replica sets for new read-write external store service - https://phabricator.wikimedia.org/T226704 (Marostegui) Open→Resolved a:Marostegui es4 and es5 are now in production es2 and es3 are now read only Thanks @jcrespo for all the...
[13:51:29] DBA, Cloud-Services, Stewards-and-global-tools, Toolforge: Throttling linkwatcher tool user as it is consuming 100% CPU - https://phabricator.wikimedia.org/T121094 (Aklapper) For the records, @Merl who commented in T121094#1876317 was [last active in 2016](https://tools.wmflabs.org/guc/?by=date&u...