[00:39:12] so I'm... pondering how to deal with a SERIOUS case of wiki vandalism right now. over 600,000 user accounts have been registered to my wiki (I could've sworn I had a captcha in place...) and created dozens of gigabytes of junk when I wasn't looking. There are only two legitimate users on this wiki -
[00:39:13] I would like to nuke all the rest, and undo every
[00:39:13] thing that they did, then secure my wiki against a repeat.
[00:39:26] Am reading various relevant pages but... a bit overwhelmed. Could use some advice.
[00:44:10] Volund: :(( how much legitimate content do you have that you'd like to keep?
[00:45:06] I think it's probably going to be easier to extract what you want to keep, import it into a new db/wiki and then discard the vandalized one
[00:45:08] about a dozen pages... yeah, yikes
[00:46:08] if it's a dozen, I'd suggest copying the good wikitext versions out of the history, saving them to text files, reinstalling the wiki and then copying the pages in. If you want to keep page history, you can use Special:Export and Special:Import
[00:47:43] as for how to stop this in the future, I'd suggest restricting account creation and telling people to contact you for a new account.
[00:48:04] yeah... THAT part... is obvious. @_@ yikes
[00:48:17] (I'm assuming this is a personal-type wiki, not something you're trying to build an open community around)
[00:49:21] very small community that will be at most a few dozen people. registration-by-invite is fine with me
[00:50:21] * legoktm nods
[00:50:28] let me know if you need help with any of those steps
[00:51:28] Normally I'd recommend the UserMerge and Nuke extensions for cleaning this kind of stuff up, but it sounds much easier to salvage what you want
[00:59:33] oooooh. SmiteSpam is doing a decent job... if I could just have it do it all at once. @_@
[01:05:35] but nope....
[01:27:16] So I took a different path altogether.
I'm using my knowledge of SQL to purge this crap manually
[01:27:45] holy crap though there are 600,000+ user accounts and around 900,000 revisions... deleting them in batches
[01:30:00] reinstalling and salvaging the 12 pages and 2 users you want to keep would've probably been easier
[01:52:48] I'd like to create a xml file using dumpBackup.php, after I run the command puts this: PHP Notice: Undefined index: storeDirectory in /var/www/wiki/w/includes/cache/localisation/LocalisationCache.php on line 205
[01:53:05] And the fil is in maintenance directory. Is secure the xml file or not?
[01:53:12] file*
[01:54:45] Depends on your webserver config
[01:54:50] and/or whether it respects .htaccess
[06:06:20] how can I distinguish usernames and IPs with the API? Regex?
[06:14:45] enterprisey: which API query are you using?
[06:14:56] most should optionally return userid IIRC
[06:15:03] any of them that return a username
[06:15:09] I should probably be using userids internally, haha
[06:15:16] but I'm also working with IPs
[06:15:27] oh, I see what you're saying, thanks
[06:16:11] enterprisey: if userid == 0 it's an IP
[06:16:15] yea
[06:16:24] also most languages should have an isIPAddress() function
[17:31:10] :D managed to cleanup my wiki by simply editing the SQL database directly.
[17:48:14] * sql edits Volund directly
[19:04:05] Hi there--I'm trying to upgrade a broken 1.33 upgrade to 1.35.1 so that the new actor schema can be used. When I try to run update.php, I get the following error: https://pastebin.com/3FbT6h2y I've looked at the ipb_address field, but I see a lot of broken user names in row fields and not IP addresses?
[22:32:32] Is there a way to query when a page fails during a 1.33+ migration? For example, I have the page https://segaretro.org/Template:Stub on the older database schema, but it's disappeared for https://staging.segaretro.org/Template:Stub.
If I can find these, I can just mass export/import, but not sure how to do that with the new DB structure.
[22:38:38] Blackwire: the page is there, but MediaWiki doesn't seem to find it just by page title https://staging.segaretro.org/index.php?title=Template:Stub&action=history
[22:39:53] Vulpix: Exactly--I ran into this exact same problem back when 1.33 was new. I just need to figure out how to get the page restored and used.
[22:40:18] So my thought now that I had a copy was to at least export/import, but if there's some saner way I'll do that, too. I just want to get onto the modern standard for MW.
[22:46:59] Poking the api, I see everything in place: edit summary, users, content... https://staging.segaretro.org/Special:ApiSandbox#action=query&format=json&prop=revisions%7Cinfo&titles=Template%3AStub&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser%7Ccontentmodel%7Ccontent%7Croles%7Cuserid&rvslots=*&rvlimit=1
[22:47:32] Try to disable all extensions (except ParserFunctions, that shouldn't cause a problem here) and see if it makes any difference
[22:57:51] So, I disabled all non-PF extensions and re-ran update.php with the force flag. No dice. https://staging.segaretro.org/Template:Stub / https://staging.segaretro.org/Special:Version
[23:03:45] Blackwire: Is the database of the staging server a recent backup of the production server? Looks like history of that page is from 2014
[23:05:00] Well, I see other pages with recent edits, but this one is missing the most recent edits
[23:05:57] Oh, it's recent--that just happens to be an older page. I can give a couple more examples if you'd like: https://staging.segaretro.org/Press_release:_1997-03-15:_BECK_AND_COOLIO_TO_PERFORM_AT_MTV%27S_GameWorks_PREMIERE_PARTY_SPECIAL_LIVE_FROM_THE_GRAND_OPENING_OF_GameWorks_SATURDAY,_MARCH_15_IN_SEATTLE,_WA or https://staging.segaretro.org/Template:NavboxLeft would also be examples.
[23:06:09] It's just tricky because the ones we find that are broken are... somewhat random.
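Editorial note: a rough sketch of the `userid == 0` hint from earlier in the log, applied to the `user` and `userid` fields that the ApiSandbox query above asks for via `rvprop`. The sample names and ids are made up, the helper names `is_ip` and `classify_author` are hypothetical, and treating a non-IP name with `userid` 0 as an author without a local account (for example an imported one) is an assumption, not something confirmed in the log. Python's standard `ipaddress` module stands in for the `isIPAddress()` function mentioned there.

```python
import ipaddress
from typing import Optional


def is_ip(name: str) -> bool:
    """Return True if the name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def classify_author(user: str, userid: Optional[int] = None) -> str:
    """Classify a revision author from the API's `user` / `userid` fields.

    A non-zero userid is taken to mean a local account. With userid 0
    (or no userid at all) the name itself is checked: an IP address is
    reported as "ip", anything else as "name-without-id" (assumption:
    such names may come from imports or other account-less edits).
    """
    if userid:
        return "registered"
    if is_ip(user):
        return "ip"
    return "name-without-id"


if __name__ == "__main__":
    # Made-up sample records, shaped like the user/userid pair of a revision.
    samples = [
        {"user": "ExampleUser", "userid": 42},
        {"user": "192.0.2.10", "userid": 0},
        {"user": "old>ExampleUser", "userid": 0},
    ]
    for rev in samples:
        print(rev["user"], "->", classify_author(rev["user"], rev["userid"]))
```

Checking the name as well as the id keeps an IP from being confused with an account-less name when both report `userid` 0.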
[23:07:18] Apparently, that's the problem. That Template:Stub has some missing revisions. Since the "page" table has a direct link to the latest revision, if the latest revision is not found on database, the page is not displayed
[23:09:48] I guess the question, then, is why would those revisions be missing from the conversion and if there's some way to find that. Like I said, I'm not opposed to export/import patching in data (in this case, prior revisions).
[23:13:57] I guess those revisions exist (because MediaWiki doesn't destroy revisions, not at least during upgrade), but it may not find them if there's some inconsistency with other related tables (like user/actor or comments)
[23:14:16] In fact, I get this error if I force one of the missing revisions to display https://staging.segaretro.org/index.php?title=Template:Stub&oldid=261047
[23:15:39] You may be able to pinpoint where's the issue, if you look at the database and start following the IDs it relates to other tables https://www.mediawiki.org/wiki/Manual:Revision_table
[23:16:51] Query the database for the revision with rev_id = 261047, and see if rev_comment_id is populated and exists on the comment table. The same with the rev_actor field and the actor table
[23:17:50] Take a look also to the slots table, looking for slot_revision_id = 261047
[23:19:19] Well, looked at revisions and that doesn't seem to be the issue: http://ss.sonicretro.org/261047.png I'll check slots next.
[23:19:57] yeah, but check also with the related tables (comment, actor)
[23:32:05] Also you should check if the upgrade stopped with any error. Even if you run the update.php again and now it doesn't error out, the first error may have aborted a script but also log it as completed, and the next run skips it, leaving things in an inconsistent state
[23:32:32] If that's the case, I'd also recommend filing a bug
[23:40:09] Blackwire: is page_latest filled ?
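Editorial note: the table walk suggested above (page_latest -> revision -> actor / comment / slots) could be sketched as a chain of LEFT JOINs. This is only an illustration. It runs against a throwaway in-memory SQLite database with made-up rows, the column names are the ones quoted in the messages above, and a real wiki's schema may keep some of these links in other tables depending on the MediaWiki version. `build_demo_db` and `find_broken_pages` are hypothetical helper names.

```python
import sqlite3

# Heavily simplified stand-ins for the tables named in the log (assumption:
# only the linking columns are modelled, everything else is left out).
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                       rev_actor INTEGER, rev_comment_id INTEGER);
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_name TEXT);
CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
CREATE TABLE slots (slot_revision_id INTEGER, slot_role_id INTEGER, slot_content_id INTEGER);
"""

# Follow page_latest to the revision, then to the rows that revision needs.
CHECK = """
SELECT p.page_title, r.rev_id, a.actor_id, c.comment_id, s.slot_revision_id
FROM page AS p
LEFT JOIN revision AS r ON r.rev_id = p.page_latest
LEFT JOIN actor AS a ON a.actor_id = r.rev_actor
LEFT JOIN comment AS c ON c.comment_id = r.rev_comment_id
LEFT JOIN slots AS s ON s.slot_revision_id = r.rev_id
ORDER BY p.page_id
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up rows, two of them broken."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)",
                     [(1, "Example_A", 101), (2, "Example_B", 102), (3, "Example_C", 103)])
    # Revision 102 points at actor 99, which does not exist; revision 103 is absent.
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?, ?)",
                     [(101, 1, 1, 1), (102, 2, 99, 1)])
    conn.execute("INSERT INTO actor VALUES (1, 'ExampleUser')")
    conn.execute("INSERT INTO comment VALUES (1, 'demo edit summary')")
    conn.executemany("INSERT INTO slots VALUES (?, ?, ?)", [(101, 1, 1), (102, 1, 2)])
    return conn


def find_broken_pages(conn: sqlite3.Connection) -> dict:
    """Map each page title to the list of links missing for its latest revision."""
    broken = {}
    for title, rev_id, actor_id, comment_id, slot_rev in conn.execute(CHECK):
        if rev_id is None:
            missing = ["revision"]
        else:
            checks = (("actor", actor_id), ("comment", comment_id), ("slots", slot_rev))
            missing = [name for name, value in checks if value is None]
        if missing:
            broken[title] = missing
    return broken


if __name__ == "__main__":
    print(find_broken_pages(build_demo_db()))
```

Running the same kind of join over every page, rather than one rev_id at a time, is one way to list the "somewhat random" broken pages in a single pass.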
[23:41:01] that username of "old>Shobiz" is weird
[23:41:07] 10 is the namespace for templates, right?
[23:41:30] yes
[23:41:39] Platonides: That's actually to be expected, as a lot of very early articles we had were from a fork of another older wiki (https://info.sonicretro.org). Not all users migrated.
[23:42:23] (Also, page_latest is set to the 261047 revision)
[23:42:38] the > would not be a legal character on a new username to create
[23:42:53] I was wondering if it could be related to the breaking
[23:42:55] ...interesting.
[23:43:16] But the Press Release article is from a current user who would be valid still.
[23:49:06] Full upgrade log is at http://ss.sonicretro.org/mw_1351_update_log.txt (455kB)