[01:57:32] did we purge the parser cache for every wiki? [01:57:58] because we're at 87,831 unclaimed jobs right now, and they all appear to be parsoidCachePrewarm [03:46:52] Redis is not running jobs anymore. I don't think anything that says that would be accurate, since showJobs, the API, and Grafana are all not going to work as-is with changeprop. [05:47:08] hmm, getting a strange error when trying to rename wikis [05:47:10] [1/9] ``` Renaming simonscharacterdirectorywiki to stagapediawiki. If this is wrong, Ctrl-C now! [05:47:10] [2/9] UnexpectedValueException from line 236 of /srv/mediawiki/1.42/extensions/CreateWiki/includes/CreateWikiPhp.php: Wiki 'simonscharacterdirectorywiki' cannot be found. [05:47:11] [3/9] #0 /srv/mediawiki/1.42/extensions/CreateWiki/includes/WikiManager.php(321): Miraheze\CreateWiki\CreateWikiPhp->resetWiki() [05:47:11] [4/9] #1 /srv/mediawiki/1.42/extensions/CreateWiki/maintenance/renameWiki.php(44): Miraheze\CreateWiki\WikiManager->rename('stagapediawiki') [05:47:11] [5/9] #2 /srv/mediawiki/1.42/maintenance/includes/MaintenanceRunner.php(698): Miraheze\CreateWiki\Maintenance\RenameWiki->execute() [05:47:12] [6/9] #3 /srv/mediawiki/1.42/maintenance/doMaintenance.php(100): MediaWiki\Maintenance\MaintenanceRunner->run() [05:47:12] [7/9] #4 /srv/mediawiki/1.42/extensions/CreateWiki/maintenance/renameWiki.php(79): require_once('/srv/mediawiki/...') [05:47:12] [8/9] #5 {main} [05:47:13] [9/9] ``` [05:47:19] and it causes the original wiki to become unavailable (cc @cosmicalpha) [05:47:38] Unfortunately I've got to go in around 10 minutes as I didn't imagine there would be any issues with renames and thought I could just get all 3 done [05:47:57] ummm [05:48:25] that isn't good [05:48:27] [1/13] these are the commands that were run (as usual): [05:48:27] [2/13] ```EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "db171" cmd.run "sudo -i mysql --skip-column-names -e 'SELECT wiki_dbcluster FROM mhglobal.cw_wikis WHERE wiki_dbname = 
\"simonscharacterdirectorywiki\"'" [05:48:28] [3/13] reception@puppet181:~$ sudo python3 renamewiki_rhinosversion21.py --oldwiki simonscharacterdirectorywiki --newwiki stagapediawiki [05:48:28] [4/13] EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "db171" cmd.run "sudo -i mysql --skip-column-names -e 'SELECT wiki_dbcluster FROM mhglobal.cw_wikis WHERE wiki_dbname = \"simonscharacterdirectorywiki\"'"c [05:48:28] [5/13] ['', '', '', '', 'c4'] [05:48:29] [6/13] EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "db181" cmd.run "sudo -i mysqldump simonscharacterdirectorywiki > /home/reception/simonscharacterdirectorywiki.sql"c [05:48:29] [7/13] db181.wikitide.net: [05:48:29] [8/13] EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "db181" cmd.run "sudo -i mysql -e 'CREATE DATABASE stagapediawiki'"c [05:48:29] [9/13] db181.wikitide.net: [05:48:30] [10/13] EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "db181" cmd.run "sudo -i mysql -e 'USE stagapediawiki; SOURCE /home/reception/simonscharacterdirectorywiki.sql;'"c [05:48:30] [11/13] db181.wikitide.net: [05:48:31] [12/13] EXECUTE (type c(continue), s(kip), a(bort): salt-ssh -E "mwtask181" cmd.run "sudo -u www-data php /srv/mediawiki/1.42/extensions/CreateWiki/maintenance/renameWiki.php --wiki=loginwiki --rename simonscharacterdirectorywiki stagapediawiki Reception123"c [05:48:31] [13/13] ``` [05:48:39] well we have an SQL backup at least [05:49:29] though I'm not sure how this could happen to two wikis? Maybe it's related to recent CW changes? [05:49:39] The error doesn't give enough information to know what went wrong and how to fix it... [05:49:47] It definitely could be [05:50:04] let me get the graylog entry for the full error (if there hopefully is one) [05:50:37] Did you rename the actual database? [05:51:02] (maybe just databases.php isn't resetting is why I ask) [05:51:18] huh this is very strange... 
one of the wikis seems to have worked now and the old wiki showing the error shows a 404 [05:51:35] while the first wiki shows 404 for both [05:52:29] If both failed at that point, the one that is working will undoubtedly have issues [05:52:40] yeah I wouldn't have missed any steps as all is done with the python script from puppet [05:52:57] it doesn't seem to? I had a quick look and everything seems intact [05:53:03] It means the SQL wouldn't have updated so centralauth (I think), GNF etc... will use old values [05:53:06] as for login it's expected that there will be issues as that's usually the case [05:53:14] oh... [05:53:37] for some reason graylog isn't resolving as I wanted to see if I could get more details [05:53:37] Because it failed before this point `$this->hookRunner->onCreateWikiRename( $this->cwdb, $old, $new );` [05:55:02] hmm, now the first wiki is also resolving [05:55:17] This indicates something was done in the wrong order, as the only way this error happens is if the wiki simons.... is not in cw_wikis at all. [05:55:22] even if they're broken I don't fully understand why it took a few minutes for them to work as usually after renames it's much quicker [05:55:35] well if it wasn't the new wiki wouldn't be working [05:55:55] The error would be impossible otherwise so I am at a total loss lol [05:56:22] [1/11] The only way it happens: [05:56:23] [2/11] ```php [05:56:23] [3/11] $wikiObject = $this->dbr->selectRow( [05:56:23] [4/11] 'cw_wikis', [05:56:24] [5/11] '*', [05:56:24] [6/11] [ 'wiki_dbname' => $this->wiki ] [05:56:24] [7/11] ); [05:56:25] [8/11] if ( !$wikiObject ) { [05:56:25] [9/11] throw new UnexpectedValueException( "Wiki '{$this->wiki}' cannot be found." 
); [05:56:25] [10/11] } [05:56:25] [11/11] ``` [05:56:40] well it would also be extremely unlikely that both wikis weren't in cw_wikis [05:56:46] as then they wouldn't have been working in the first place [05:56:58] they would've only been deleted by the script itself [05:57:16] so maybe something went wrong in renameWiki.php after it deleted from cw_wikis [05:57:48] What I mean is some ordering issue where the wiki is renamed in mhglobal before that code is called, so it looks up the old wiki after the old wiki has already been renamed. That is the only thing that would make sense. [05:58:08] Hmm yeah, though that is confusing as the script itself wasn't changed [05:58:48] Maybe something unintentionally changed in the way renames have to be run. I'll do some test renames on beta later today and try to see what is happening. [05:58:54] job queue seems a bit slow, no? [05:59:07] Thanks! [05:59:25] @reception123 in the meantime don't do any renames. Put those on pause. [05:59:31] yeah, that's for sure [05:59:35] there was one left but it'll have to wait [06:00:36] Maybe... it seems jobs are still being sent to redis but not run in redis so they are sending to both redis and changeprop? [06:00:54] ah [06:01:02] I see wiki creations have hung completely [06:04:11] Uh oh, I just ran through some approvals. [06:04:47] yep, that's what alerted me to an issue [06:05:10] Need to call it a night, hopefully it can get sorted. I wasn't aware of this and had kicked off that last rename about 35 minutes ago [06:05:46] it's also completely hung so we'll have to look into it [06:07:41] I'm running jobs now [06:07:49] Thankfully it still lets me run runJobs [06:12:05] is that after paladox's update or? [06:12:20] Yes [06:12:29] minor road bump [06:12:44] @cosmicalpha can you look into it? [06:15:35] No access at this exact moment. [06:15:57] Running runJobs would fix it for now [06:16:26] Get thee run, jobs! It is commanded. 
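[Editor's note] The ordering hypothesis discussed above — the cw_wikis row being renamed in mhglobal before resetWiki() looks it up by the old dbname — can be sketched with a toy model. All names here (reset_wiki, rename_wiki, the in-memory dict) are illustrative stand-ins, not CreateWiki's real API:

```python
# Toy model of the suspected failure: if the cw_wikis row is renamed
# *before* resetWiki() runs, the lookup by the old dbname finds nothing
# and raises, mirroring the UnexpectedValueException in the traceback.
cw_wikis = {"simonscharacterdirectorywiki": {"wiki_dbcluster": "c4"}}

def reset_wiki(dbname: str) -> dict:
    """Stand-in for CreateWikiPhp->resetWiki(): look the wiki up by dbname."""
    row = cw_wikis.get(dbname)
    if row is None:
        raise ValueError(f"Wiki '{dbname}' cannot be found.")
    return row

def rename_wiki(old: str, new: str, reset_first: bool) -> None:
    """Simulate the rename with the two possible orderings."""
    if reset_first:
        reset_wiki(old)                    # works: row still under the old name
        cw_wikis[new] = cw_wikis.pop(old)  # then rename the row
    else:
        cw_wikis[new] = cw_wikis.pop(old)  # renamed too early...
        reset_wiki(old)                    # ...so this lookup now fails
```

Calling `rename_wiki(..., reset_first=False)` reproduces the "cannot be found" error while leaving the new row in place — matching the observed state where the new wiki eventually worked despite the exception.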
[06:16:59] Can someone run runJobs.php on Meta if anyone is around right now? @Technology Team [06:17:28] I already did [06:17:34] Oh thanks [06:17:39] I'm running all jobs on all wikis now [06:17:45] but that's a bandaid [06:18:30] Just do an infinite loop on it on all wikis. That only might blow up the servers. [06:18:50] lol [09:35:41] What happened? [09:36:03] I tested CW jobs on beta so it should be working. [09:46:46] Ohhh thanks @agentisai [09:47:00] np [09:47:15] not sure what you did but the new job queue works great! [09:54:22] I tuned it [09:54:42] But seems it still needs tuning as I don't know what causes that load spike. [09:55:15] Should probably get it so load balancing for mwtask happens as right now we just use mwtask171 [10:46:15] @paladox we can add more tasks to the jobrunner pool [10:46:28] Laptop is downstairs [10:47:47] What do you mean jobrunner pool? [11:28:54] @paladox so we can load balance jobrunner.wikitide.net [11:29:09] You can just do it in gdnsd as weighted for now but we should copy it to cf ready [11:33:24] we don't want public traffic accessing it as far as i'm aware. [12:38:01] @agentisai / @cosmicalpha not sure who set the sockets on kafka181 to 4 but sockets correspond to how many CPUs you've got. So we have 20 x 2 (which = 40 but in practice equals 80) [12:43:44] Hi, how is the new jobrunner/jobqueue using EventGate, Change-Propagation and Kafka going? [12:51:51] all wikis are using it. It seems to be going fine! [12:53:58] nice! [14:03:19] @agentisai / @cosmicalpha do you see the issue that you saw earlier happening rn? [14:04:08] with the job queue? [14:04:26] yeh [14:04:36] i've been monitoring it [14:04:46] Does CW work for you on meta? [14:05:20] yep, works good! [14:05:27] nothing seems to be broken [14:05:37] awesome! 
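[Editor's note] The band-aid described above — running runJobs.php across every wiki in a loop — might be scripted along these lines. The MediaWiki root and the www-data user come from the traceback and commands earlier in the log; the wiki list and the helper names are assumptions, not a known Miraheze tool:

```python
import subprocess

MW_ROOT = "/srv/mediawiki/1.42"  # taken from the stack traces above

def run_jobs_command(wiki: str) -> list[str]:
    """Build the runJobs.php invocation for one wiki, run as www-data."""
    return [
        "sudo", "-u", "www-data", "php",
        f"{MW_ROOT}/maintenance/runJobs.php",
        f"--wiki={wiki}",
    ]

def drain_all(wikis: list[str], dry_run: bool = True) -> list[list[str]]:
    """One pass over every wiki; wrap this in a loop for the 'infinite loop' band-aid."""
    cmds = [run_jobs_command(w) for w in wikis]
    if not dry_run:
        for cmd in cmds:
            # check=False: a failing wiki shouldn't abort the whole sweep
            subprocess.run(cmd, check=False)
    return cmds
```

With `dry_run=True` it just returns the command lists, which makes it easy to eyeball before unleashing it on all wikis.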
[14:05:41] it works instantly on job creation [14:05:59] only issue now, I suppose, is trying to figure out how to get these jobs out of limbo [14:06:01] https://grafana.wikitide.net/d/GtxbP1Xnk/mediawiki?orgId=1 [14:08:30] oh jobqueue_jobs is on jobchron [14:08:42] @agentisai i guess we have to update the alert but i dunno what the replacement is [14:10:37] i restarted redis on it so it cleared it out at least but we need to replace it [14:11:43] oh awesome, thanks [14:11:55] maybe wmf can give us some ideas [14:12:58] https://grafana.wikitide.net/d/ZB39Izmnz/eventgate [14:13:03] that's nice at least [14:20:50] [1/2] something strange has been proceeding on my talk page, if someone from tech can give it a peek [14:20:51] [2/2] https://meta.miraheze.org/wiki/User_talk:Raidarr#Hello%2C_I_encountered_an_issue [14:28:36] This is likely related (in part) to the rename being stuck with the mega jobqueue [14:34:02] if that can be confirmed, he'll have a track to go from even if not necessarily a quick solution [14:54:06] not even sure if there is a replacement [15:04:33] On Scruff Wiki / Manage this wiki's permissions, suddenly the sysop has lost all user rights (scruff.miraheze.org) [15:05:15] Huh [15:05:18] Anything in logs? [15:06:16] maybe a misconfigured abusefilter? [15:11:43] I can only see 'public logs' and Abuse filter isn't accessible, while SDIY Wiki seems to be okay. [15:12:15] I've not changed anything except edit main and talk spaces [15:12:35] link? I can peer and see if something stands out [15:14:07] https://sdiy.info/ sysop permissions seem to be gone there as well [15:14:32] https://scruff.miraheze.org/ [15:15:30] For the interested, Wikimedia's 6-monthly data centre switchover is now complete and writes will be served from codfw for the next 6 months. 
[15:22:56] yeah the perms got wiped, tech is going to have to peek at that I think [15:23:23] so my answer to that right now is a) reset wiki permissions or b) reapply sysop permissions based on defaults, I assume that is more desirable for scruff and synth [15:23:24] woo that's fun [15:24:04] or even, I can just add read to sysop/bureaucrat and you can configure as desired since bureaucrat is intact [15:25:06] @cosmicalpha ^ [15:26:13] take note of a double whammy: attempting to add read to either wiki produces an error [15:26:18] sample, `[5dd52c85f507005abb5bb8be] 2024-09-25 15:25:52: Fatal exception of type "TypeError"` [15:26:41] to sysop on either wiki* [15:27:17] https://discord.com/channels/407504499280707585/407537962553966603/1288518832495001691 [15:27:26] it's happened on meta as well [15:27:26] https://meta.miraheze.org/wiki/Special:ManageWiki/permissions/sysop [15:27:27] wtf [15:27:53] I'll go ahead and mention that group deletion doesn't seem to be appearing either [15:28:03] noticed that on my wiki and elsewhere a little while back [15:28:33] array_search(): Argument #2 ($haystack) must be of type array, null given [15:28:38] @raidarr for your error ^ [15:28:49] @cosmicalpha fyi https://logging.wikitide.net/messages/graylog_35/73367c21-7b52-11ef-9906-bc2411bdcdc1 [15:31:34] ok this looks like more of a bug @raidarr [15:31:43] i see entries for sysop in the table [15:32:06] [1/5] > | metawiki | sysop | ["urlshortener-create-url","urlshortener-manage-url","autopatrol","block","blockemail","browsearchive","createaccount","createpage","createtalk","delete","deletechangetags","deletedhistory","deletedtext","deletelogentry","deleterevision","edit","editcontentmodel","editinterface","editprotected","editsemiprotected","editusercss","edituserjson [15:32:07] [2/5] 
","edituserjs","import","importupload","minoredit","move","noratelimit","override-export-depth","pagelang","patrol","patrolmarks","protect","purge","rollback","suppressredirect","unblockself","undelete","upload_by_url","abusefilter-modify","abusefilter-log-detail","abusefilter-view","abusefilter-log","abusefilter-modify-restricted","abusefilter-revert","abusefilter-view-private","abu [15:32:07] [3/5] sefilter-log-private","override-antispoof","globalblock-whitelist","nuke","oathauth-enable","mwoauthproposeconsumer","mwoauthupdateownconsumer","tboverride","tboverride-account"massmessage","replacetext","translate","translate-import","translate-manage","translate-messagereview","translate-groupreview","view-dump","generate-dump","delete-dump","pagetranslation","mergehistory","unwatc [15:32:07] [4/5] hedpages"] | {"0":"confirmed","1":"flood","2":"translator","3":"translationadmin","4":"autopatrolled","5":"rollbacker","6":"patroller","7":"ipblock-exempt","12":"directors"} | {"0":"confirmed","1":"flood","2":"translator","3":"translationadmin","4":"autopatrolled","5":"rollbacker","6":"interface-admin","7":"patroller","8":"ipblock-exempt","13":"directors"} | [] | [] [15:32:08] [5/5] | NULL | [15:34:53] [1/7] > > $mwPermissions = new Miraheze\ManageWiki\Helpers\ManageWikiPermissions( 'metaw [15:34:53] [2/7] > iki'); [15:34:54] [3/7] > [15:34:54] [4/7] > > $permList = $mwPermissions->list( 'sysop' ); [15:34:54] [5/7] > [15:34:55] [6/7] > > var_dump($permList); [15:34:55] [7/7] > array(6) { [15:34:56] works for me tho [15:36:42] you'd know far better than me, most code looks like gobblygook to me :p [15:36:49] unrelated but also there's https://logging.wikitide.net/messages/graylog_35/52699d51-7b53-11ef-9906-bc2411bdcdc1 [15:38:54] [1/5] I have the settings backed up as JSON at [15:38:55] [2/5] https://ia802309.us.archive.org/3/items/wiki-sdiy.info_w-20240911/sdiywikiwiki_managewiki_backup_9c679065655f8a406d10.json [15:38:55] [3/5] and [15:38:55] [4/5] 
https://ia800103.us.archive.org/8/items/wiki-scruff.miraheze.org_w-20240911/scruffwiki_managewiki_backup_139e45f38be097c393a3.json [15:38:55] [5/5] can these be imported? [15:46:52] once the underlying bug is resolved, that does seem like an intuitive solution [15:48:08] https://cdn.discordapp.com/attachments/1006789349498699827/1288527469779091557/message.txt?ex=66f58238&is=66f430b8&hm=3a733326c6e496e31f538347fbfbb96200548f929ac9ae1091657713f6b8dc67& [15:51:11] Thanks, there's probably a few tweaks to the permissions that I can't remember. [15:54:18] I think I know what the issue is [15:54:28] Give me a moment [15:59:05] only around 123 wikis are affected it seems [16:03:29] https://tenor.com/view/waiting-tom-and-jerry-impatient-walk-around-circles-gif-20355711 [16:06:52] oh what is the issue? [16:11:15] bad json [16:11:41] some entries read as, for example: `"right1,"right2","right3"` [16:11:46] should be fixed [16:11:53] Hm [16:13:48] @agentisai oh thanks, https://scruff.miraheze.org/wiki/Special:ManageWiki/permissions/sysop is still broken tho [16:15:18] I see the cause [16:15:20] will fix [16:15:37] thanks!! [16:17:36] cc @robkam [16:18:07] should be good now [16:18:17] worth updating #announcements as resolved? 
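[Editor's note] The "bad json" above — entries with a missing quote/comma, like the `"tboverride-account"massmessage"` run visible in the earlier table dump — is exactly the kind of corruption `json.loads` rejects. A small sketch of how such rows could be detected before they break ManageWiki (the row data here is made up to mimic the pattern; this is not ManageWiki code):

```python
import json

# Example rows mimicking cw_wikis-style permission blobs; the corrupt one
# mirrors the missing-quote/comma pattern seen in the dump above.
rows = {
    "goodwiki": '["edit","block","delete"]',
    "badwiki": '["edit","block"delete"]',  # missing comma and quote -> invalid JSON
}

def find_corrupt_permission_rows(rows: dict[str, str]) -> list[str]:
    """Return the wikis whose permission JSON fails to parse."""
    bad = []
    for wiki, blob in rows.items():
        try:
            json.loads(blob)
        except json.JSONDecodeError:
            bad.append(wiki)
    return bad
```

Running a check like this across cw_wikis would have surfaced the ~123 affected wikis without waiting for ManageWiki to throw a TypeError at page view time.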
[16:18:30] yes [16:18:42] was just a small mishap affecting a small number of wikis [16:31:40] Fixed now, thank you 🙂 [20:47:19] Forgot to mention this but apparently the loathsome characters wiki admins and top admin permissions aren't working [20:47:22] https://loathsomecharacters.miraheze.org/wiki/Special:Block [20:48:12] that would relate to an issue earlier, but I thought it was fixed [20:48:58] yea, try it again - the major incident on this doesn't seem to apply here [20:49:05] if it doesn't work I recommend posting the error message that comes out [20:49:55] Got this [20:49:59] https://cdn.discordapp.com/attachments/1006789349498699827/1288603430704906343/IMG_0266.png?ex=66f5c8f6&is=66f47776&hm=5f8a58e360a087113c599a9011d9e614b920de8d697e485c22fea272504f0039& [20:51:01] [1/2] if it was earlier's problem this would be empty [20:51:02] [2/2] https://loathsomecharacters.miraheze.org/wiki/Special:ManageWiki/permissions/sysop [20:53:22] [1/2] however upon checking this I do not see any administrator group, so something else apparently went wrong... [20:53:23] [2/2] [20:55:07] administrators appear in ManageWiki and on UserRights but not in ListUsers [20:55:22] @paladox and/or @agentisai, thoughts? [20:56:38] it shows for me [20:56:53] sysop group seems to be missing for MediaWiki [20:56:54] second one, not [20:56:57] maybe bad cache? [20:57:05] now I see it in the list [20:57:12] might well be cache?... [20:57:41] @yoshi459292927370272, try again? [20:57:58] It was [20:58:04] Should be working now [20:58:07] I reset the cache [20:58:24] @raidarr if you see an odd error like this, reset the ManageWiki cache [20:58:44] aha [20:59:07] It may be due to a poisoned cache from the previous incident [20:59:28] wait you can clear a wiki's cache via the UI? 
[20:59:59] Yes [21:00:05] Special:ManageWikiDefaultPermissions [21:00:09] beat me [21:00:12] You can reset groups, cache, and something else [21:00:15] I forgot [21:00:27] groups, all settings, cache [21:00:32] ah, nice [21:00:38] I forgot what that third thing was [21:00:57] reset buttons are hardcore, click and it goes all the way [21:00:57] oh [21:03:33] confirmation dialogue's gonna cost you extra [21:03:53] 2 office chairs and a random hat [21:11:06] It works [21:13:02] nice [21:13:14] Thanks [21:43:38] Hi. We are still experiencing the issue of missing permissions. phabricator states that this issue has been resolved. Will it take some time for this fix to be reflected to us? [21:51:59] It shouldn't [21:52:02] @agentisai [21:52:05] What's your wiki [21:52:19] probably a poisoned cache as well [21:53:10] [1/2] It's my wiki [21:53:10] [2/2] minecraftjapan.miraheze.org [21:54:26] Did it fail its CON save? [21:58:03] What exactly is your issue? [21:58:32] seems the cache refreshed at a bad time [21:59:39] We, the administrators, are not authorized to edit MediaWiki namespace pages. I have tried to fix this but I get an internal error. [22:01:58] Can you try now? [22:03:21] It worked now! [22:03:35] thank you! [22:03:44] No problem [22:09:04] @rhinosf1 maybe you can find a replacement for the `jobqueue_jobs` redis? [22:09:18] need an alternative in changeprop/kafka [22:09:47] @agentisai you may just want to run resetWikiCache on all wikis [22:09:58] Good idea, will do [22:54:19] What [22:59:24] @rhinosf1 the alert is broken on grafana because the Prometheus key jobqueue_job doesn't work anymore as we switched to the new jobrunner [22:59:29] Which uses a different system [22:59:49] @paladox no idea, I didn't write the actual Prometheus code [23:00:44] I guess now would be a good time to point out them stats aren't the best anywhere [23:00:49] We could do with rate of change [23:01:28] My brain mush tbh
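[Editor's note] On the open question above of replacing the `jobqueue_jobs` redis metric: with a Kafka-backed queue, one candidate backlog signal is consumer-group lag — the log end offset minus the committed offset, summed over partitions. A minimal sketch of that arithmetic, using made-up offset snapshots rather than a real Kafka client or any specific exporter metric:

```python
def consumer_lag(end_offsets: dict[int, int], committed: dict[int, int]) -> int:
    """Total backlog across partitions: sum of (log end offset - committed offset).

    Partitions with no committed offset count their full log as lag;
    negative deltas (stale end-offset snapshot) are clamped to zero.
    """
    return sum(
        max(end_offsets[p] - committed.get(p, 0), 0)
        for p in end_offsets
    )

# Hypothetical snapshot for a changeprop job topic with three partitions.
end = {0: 1200, 1: 950, 2: 700}
done = {0: 1200, 1: 900, 2: 650}
```

Exporting a number like this per topic would give Grafana a drop-in stand-in for the old unclaimed-jobs gauge, and its rate of change covers the "we could do with rate of change" point as well.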