[00:05:26] [1/7] Alternative solution: https://github.com/miraheze/MirahezeMagic/pull/649
[00:05:26] [2/7] Works for special pages as well as `action=edit`, `action=history`, etc.
[00:05:27] [3/7] Does not work for:
[00:05:27] [4/7] 1. Special:UserLogin and Special:CreateAccount (require a core patch)
[00:05:27] [5/7] 2. Links added by extensions (e.g. `view buckets`, `veaction=edit`)
[00:05:28] [6/7] 3. Talk page red link (didn't figure this one out)
[00:05:28] [7/7] https://cdn.discordapp.com/attachments/1006789349498699827/1493039374466678958/image.png?ex=69dd84c6&is=69dc3346&hm=d4b8aec34d7667769c5e289883f2bd9e00c2a19aa7630b492079dfcfc47124e7&
[00:05:48] Currently deployed on test151: https://exttest.mirabeta.org/wiki/Main_Page?debug=2
[00:07:43] What about in-content special page links?
[00:07:57] Or Special:Diff or Special:EditPage etc.
[00:08:26] Uh, stupid question, I guess those are special pages too, but
[00:11:43] Could you have a sanity check script that reads all pages from the sitemap and checks how many links that are supposed to be nofollowed aren't?
[00:11:57] hi tech, any reason my JSON page would stop showing in the table format?
[00:12:09] non-Miraheze wiki
[00:12:54] Maybe your content model is wrong
[00:13:52] it's JSON
[00:15:45] talk pages may use rel="ugc" instead
[00:16:01] Hmm, how well supported is that?
[00:16:22] I was mostly thinking about talk page red links since they should not be crawled or indexed in the first place.
[00:16:36] google introduced it 7 years ago
[00:17:44] I guess it could depend on other extensions you have installed? I haven't seen any that turn your JSON table into JSON text though
[00:17:48] Can you link?
[00:18:32] https://deadlock.wiki/Data:ItemData.json
[00:19:15] oh hey, Deadlock
[00:19:20] I played that for a few minutes
[00:19:35] glory to valve
[00:19:42] we're hoping they will host the wiki
[00:19:48] best of luck
[00:21:53] Oh, maybe it's because the page is huge, so MediaWiki gives up on rendering it as a table
[00:22:35] is there a limit? I definitely feel like fischipedia has larger JSON pages
[00:22:36] I think Kocka is right lol.
[00:22:44] Compare https://strinova.org/wiki/Module%3ATranslate/data.json and https://strinova.org/wiki/Module:CharacterSkins/data1.json
[00:22:54] That used to confuse me. Now I know.
[00:23:00] i see
[00:23:22] thanks for the help guys!!
[00:23:42]
[00:23:46] 200KB
[00:26:35] (because of )
[00:27:52] [1/3] I like the bug reports that are like
[00:27:53] [2/3] What happens: crash
[00:27:53] [3/3] What should have happened: not crash
[00:28:22] The bug report template can be a bit annoying sometimes
[00:32:00] Works for "view buckets" now because I did what @pixldev was instructed to do and [moved everything to ExtensionFunctions](https://github.com/miraheze/MirahezeMagic/pull/649/changes/33539730dffd890bd46cff19de195ba16d0cd3ea). Somehow I still don't see nofollow for purge though.
[00:49:13] Oo?
[00:51:31] It's basically what Bawolff said, I think.
[00:54:58] `ExtensionFunctions` fires last, and once it fires, use `$hookContainer->register` to register extension hooks. This way there is a high chance that the hook is added last.
[17:40:09] @posix_memalign just saw your PR, what cases aren't covered?
[17:41:32] I think he mentioned the cases here
[17:44:36] The discussion page red link, Special:UserLogin, Special:CreateAccount, `action=purge`, and `veaction=edit` are the ones I'm aware of.
It's currently deployed on test151, so you can look around https://exttest.mirabeta.org/wiki/Main_Page for `nofollow` (and the lack of it).
[17:45:54] Nothing in actions has nofollow, though for anons the only button is purge.
[17:47:28] https://github.com/miraheze/mw-config/pull/6371 might help too and at least stop it following links on special pages that just link to other special pages (like UserLogin when it ends up getting in loops of tokens)
[18:01:15] Oh wait, I think I know why action=purge and veaction=edit are not marked as nofollow. I'll fix them later today.
[18:02:11] As for login and create account, if the current MirahezeMagic patch proves to work, we can try to get an upstream core patch, because it might be a lot of work to justify why we want to whitelist the `rel` attribute.
[18:02:56] We never set it to nofollow the whole time 😭
[18:03:27] Nope, the namespace is just noindex at the moment
[18:03:37] Please do feel free to deploy that
[18:04:38] It's already noindex,nofollow actually
[18:04:58] Not according to the config
[18:04:59] https://cdn.discordapp.com/attachments/1006789349498699827/1493311052384047204/image.png?ex=69de81cb&is=69dd304b&hm=19836566f10515c250393c71f90c462d61ca19eec0a3b7bd6df65a55efaee19b&
[18:05:09] Special pages are not affected by that config as far as I know
[18:05:17] They are just hardcoded to noindex,nofollow
[18:06:00]
[18:06:54] yes it is
[18:07:29] the config should match reality anyway or be removed
[18:08:56] tbh I'm surprised your robot policy only had NS_SPECIAL, which doesn't do anything
[18:09:13] Maybe it's a saner default to not index talk pages and user pages
[18:09:51] We generally stick to defaults
[18:10:06] Yeah, but the defaults suck
[18:10:39] Yes, in this case they do and we should probably change them
[18:10:56] But that's why it is the way it is
[18:11:48] [1/2] Semi-related, but I was just looking today through defaults for ParserMigration and Wikimedia really just said this is a sane default (changed from false in 1.44)
[18:11:49] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1493312768047321188/image.png?ex=69de8364&is=69dd31e4&hm=8ba7fbf0df34711d0592067c3db5e4119a843fadb86194b91e4196898a18d1c7&
[18:12:46] There isn't even a task for this, cscott just did that
[18:14:37] nvm, it's a 1.46 change
[18:14:43] I guess it's good to know in advance
[18:15:27] this feels a bit early because Parsoid shouldn't be the default parser before 1.47
[18:15:56] but I guess it matches WMF's setup since they use Parsoid read views on most(?) wikis
[18:18:10] Yeah, it just feels very WMF-centric
[18:18:39] Like, did they think WMF is the only wiki farm that needs to migrate to Parsoid at some point?
[18:19:30] They were actually decent last time I talked to them about Parsoid
[18:23:13] I just realized like 5 minutes ago that Parsoid doesn't support data attributes
[18:45:37] should we remove the Cloudflare rules blocking Google from special pages now, or should they stay in place?
[19:18:01] [1/2] Among all namespaces I think File is the worst offender. In the User NS there could be valuable user page essays, and sometimes people want their user pages to be indexed when they act like a personal home page. The File NS has way too many pages, which are all included in the sitemap and marked as index,follow, so they waste an enormous amount of search engine
[19:18:01] [2/2] time.
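A minimal sketch of the late-registration trick described at [00:54:58], assuming the goal is to tag certain links with `rel="nofollow"` from a link-rendering hook. The hook name (`HtmlPageLinkRendererEnd`) and the red-link condition are illustrative only, not necessarily what the MirahezeMagic PR actually does:

```php
<?php
// LocalSettings.php / extension setup sketch.
// Defer hook registration to an extension function so the handler is
// registered after most others and therefore has a high chance of
// running last in the hook order.

$wgExtensionFunctions[] = static function () {
	$hookContainer = \MediaWiki\MediaWikiServices::getInstance()->getHookContainer();

	// 'HtmlPageLinkRendererEnd' is just an example hook here.
	$hookContainer->register(
		'HtmlPageLinkRendererEnd',
		static function ( $linkRenderer, $target, $isKnown, &$text, &$attribs, &$ret ) {
			// Hypothetical rule: mark red links so crawlers skip them.
			if ( !$isKnown ) {
				$attribs['rel'] = trim( ( $attribs['rel'] ?? '' ) . ' nofollow' );
			}
			return true;
		}
	);
};
```

Whether the handler really ends up last still depends on when other extensions register theirs, so this is a probabilistic ordering trick rather than a guarantee, which fits the "high chance" wording above.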
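On the sanity-check idea from [00:11:43], a rough standalone sketch could walk the sitemap and count links that look like they should be nofollowed but aren't. The sitemap URL and the URL patterns treated as "should be nofollowed" are assumptions for illustration, and a real sitemap may be an index of sub-sitemaps rather than a flat list:

```php
<?php
// Sketch: read page URLs from a sitemap, fetch each page, and report
// links matching "suspect" patterns that lack rel="nofollow".

$sitemapUrl = 'https://exttest.mirabeta.org/sitemap.xml'; // hypothetical location
$suspectPatterns = [ '/Special:/', '/action=/', '/veaction=/' ]; // illustrative

$sitemap = new DOMDocument();
$sitemap->load( $sitemapUrl );

$checked = 0;
$missing = 0;

foreach ( $sitemap->getElementsByTagName( 'loc' ) as $loc ) {
	$pageUrl = trim( $loc->textContent );
	$html = @file_get_contents( $pageUrl );
	if ( $html === false ) {
		continue;
	}
	$checked++;

	$doc = new DOMDocument();
	@$doc->loadHTML( $html );

	foreach ( $doc->getElementsByTagName( 'a' ) as $a ) {
		$href = $a->getAttribute( 'href' );
		$rel = $a->getAttribute( 'rel' );

		foreach ( $suspectPatterns as $pattern ) {
			if ( preg_match( $pattern, $href ) && strpos( $rel, 'nofollow' ) === false ) {
				$missing++;
				echo "$pageUrl -> $href (rel=\"$rel\")\n";
				break;
			}
		}
	}
}

echo "Checked $checked pages, found $missing link(s) missing nofollow\n";
```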
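For the robot-policy side discussed from [18:04:58] onwards, a LocalSettings-style sketch of the "saner defaults" suggestion might look like the following. The namespace choices are illustrative, not what Miraheze currently ships:

```php
<?php
// Sketch: keep content pages indexable, but stop search engines from
// indexing talk, user and file pages.

$wgDefaultRobotPolicy = 'index,follow';

$wgNamespaceRobotPolicies = [
	NS_TALK      => 'noindex,nofollow',
	NS_USER      => 'noindex,follow',
	NS_USER_TALK => 'noindex,nofollow',
	NS_FILE      => 'noindex,follow',
	// NS_SPECIAL has no effect here: special pages are hardcoded to
	// noindex,nofollow by core, as noted at [18:05:17].
];
```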
[19:20:21] On some of my wikis we actually have meaningful file pages
[19:20:46] But yes, maybe it's good as a default
[20:09:36] are there some sort of rate limits on downloads of large dumps? because it keeps losing connection roughly every minute
[20:14:28] Hmmm
[20:16:19] Lemme see about getting it downloaded and hosted on our gdrive for you
[20:16:30] I'm halfway there now
[20:16:32] so it's moot
[20:16:35] I was just curious
[20:16:39] Right on
[20:16:42] Shouldn't be an issue
[20:17:04] it's just annoying that it cuts out every minute, so it goes like 30MB/s for a minute then cuts out for a minute
[20:17:18] uploading to gdrive would probably take longer
[20:19:53] There shouldn't be
[20:20:38] it'll download like 250MB then keep retrying for like 2 minutes, then another 250MB
[20:20:49] but it's specifically multiples of 250MB, which is weird
[20:20:53] nvm, now it stopped at 8.63G
[20:31:29] and then it gave up after 20 tries
[21:41:08] This might actually be helpful; it keeps getting stuck between 9-10GB and I don't quite understand why. It's been at 9.38G for a while and just now went to 9.43G over about 5 seconds before getting stuck in the retries again
[21:42:05] Yeah, I'll give it a shot, otherwise I'll work with Skye to get that moved over
[21:42:50] Cool, thanks
[21:46:12] Thank goodness for fiber optic internet.
[21:47:19] I'm on the other side of the world to SLC, as are WO servers, so it's not particularly great
[21:47:44] OVH had a better supply of servers at the time, so yeah
[21:54:53] Anyone else read OVH as OVecHkin?
[21:55:18] are you doing ok
[21:55:21] I don't get the reference
[21:55:29] dOVaHkin?
[21:55:48] OVH unfortunately have a unique position where they are a shit host to deal with but equally are so cheap that it's worth the crap sometimes
[21:57:53] I’m looking for a place that’ll rent me a server with 384-512 GB of RAM and charge me a reasonable amount
[21:59:21] good luck
[21:59:52] as am I
[21:59:59] Right now it's a bad time for all things hardware rental
[22:00:02] or at least if it's a reasonable amount I will be buying from them
[22:00:28] The mess is EXPENSIVE
[22:00:35] I'm sort of glad WO got our servers when it did; it's not a great deal, but given the RAM shortage OVH are beginning to increase pricing, so
[22:00:41] better now than later
[22:01:06] Yeah, until AI collapses or more RAM deals for AI centers fall through, it's gonna be expensive for a while on all fronts.
[22:01:14] building myself a device with 128GB has gone on the back burner indefinitely
[22:01:31] It does seem like it's getting better though, from what I can tell
[22:01:47] Yeah, OpenAI reneged on a big chip deal, that let out some of the supply pressure
[22:01:52] the biggest issue WO have right now is the NAND supply
[22:02:01] Often the cure to high prices is high prices
[22:02:01] we have one server on SSDs and the other two are hard drives
[22:02:26] so it's been a bit of fun figuring out what I can defer to hard drives and what actually needs to run on SSDs
[22:02:33] The problem right now is it's a circular firing squad of money.
[22:02:56] They can keep paying infinite prices for a bit because the money just goes to the next in line and very little leaves the circle
[22:03:26] But it will eventually run up against hard limits and external, material costs
[22:04:52] Iirc our good friend Sam has had to scale back his expansion
[22:05:05] now it's like $600M (or maybe it was billion) over the next 4 years
[22:05:09] Yep, that was the chip deal I mentioned
[22:05:24] oh yeah, I didn't see
[22:05:33] I don't see how they need all this capacity
[22:05:46] Well, partly they're trying to get out from under MS rentals
[22:05:48] the number of people using ChatGPT isn't really going to increase significantly enough to justify that level of expansion
[22:06:00] As there's no way to be profitable if you're also renting your compute from MS
[22:06:13] true
[22:06:18] but equally, on-prem is a pain in the ass
[22:06:37] GPU compute is probably better to run yourself
[22:06:46] but general purpose stuff like databases doesn't need to be on prem
[22:07:00] Problem is that the private equity they were banking on didn't come through, so with no financing, no deal