[13:56:02] Not sure if I asked this before, but if I were to send API queries to every wiki on the Miraheze network, what kind of rate limiting should I use to avoid getting blocked?
[14:01:30] Idk the answer but I'm curious why you would need to do that
[14:01:39] ^
[14:03:01] I was thinking about making a tool to search for outdated Fandom links when a wiki forks
[14:05:45] Not necessarily update them because that sort of thing is dependent on context
[15:58:48] [1/2] Tried to port Fandom's [Table Progress Tracking](https://community.fandom.com/wiki/Help:Table_Progress_Tracking) and it works
[15:58:48] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1498715112813629480/image.png?ex=69f22ab7&is=69f0d937&hm=44fe5f9e9d56f29ea5abb3e8ade17f6f2f340f45acadc22a3ec390ca28a2c5af&
[15:59:25] But I'm a bit unsure about storing data and what would be the best way (without an extension). So far I'm using a gadget and storing into preferences with userjs- keys
[16:34:50] Pretty sure userjs preferences are the best you have?
[16:35:22] Saving to a page would be kind of spammy
[16:35:33] But also I think OA built an extension for this
[16:35:46] Yeah and I'm not going to roll an external service because users could abuse it unless I do oauth2 or something
[16:35:58] Oh?
[16:36:04] I was looking for one before writing but couldn't find any
[16:36:08] [[mw:Extension:TableProgressTracking]]
[16:36:09]
[16:36:45] Apparently it's approved for Miraheze?
[16:37:04] I'm using it on my wiki, works great
[16:38:18] Oh man, I need better googling skills :70_b_dogkek:
[19:16:28] I usually do 1 request per second. If it's too slow you can get away with higher frequencies. The ratelimit is pretty generous and you will likely be fine as long as you send requests synchronously and not multithread.
[19:17:54] I mean idk, how many public wikis are there anyways
[19:18:22] Sending requests synchronously is the secret to never bothering anyone cause you only ever take up 1 slot on their server
[19:19:29] If I did 1 request per second that would take like 6 hours for 1 external link across the farm
[19:19:37] But yeah I can do synchronous
[19:22:03] A little more than 10k. My other bot takes several hours to run because it needs to scan all wikis. Maybe I should set it to 0 to see how it goes.
[19:47:45] You could just wait for a 429 and then back off
[19:48:05] If no one has changed the rate limits, a 429 will only block requests above the limit at first
[19:48:26] If you carry on being annoying after getting a 429 for a bit, it'll eventually massively block you.
[20:12:08] Does it have a Retry-After header
[20:15:17] Not sure
[20:16:05] No
[20:16:35] Waiting a second or two should be fine though
[20:35:11] Do you want me to report any strange wiki behaviors found this way
[21:05:42] [1/5] The strange ones I've found so far:
[21:05:42] [2/5]
[21:05:42] [3/5]
[21:05:43] [4/5]
[21:05:43] [5/5]
[21:12:34] last 2 r domains that werent renewed im pretty sure
[21:13:56] [1/2] o yea
[21:13:56] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1498794421557399552/image.png?ex=69f27494&is=69f12314&hm=9386ef7873401334b073cc853248673eb188ce37980a9fcb71d0311291da0c35&
[21:19:27] yeah, would especially be good for custom domains
[21:19:50] do we bother removing them is the question
[21:25:55] Adding more to that list as I find them
[21:39:55] [1/3] I wrote a bot to scan these a while back, though it seems that Cloudflare gives the full list.
[21:39:55] [2/3] A few on Kocka's list are pending validation.
[21:39:55] [3/3] https://cdn.discordapp.com/attachments/1006789349498699827/1498800958249893898/image.png?ex=69f27aaa&is=69f1292a&hm=e06689a31d1db437df69038f857b1186c4a386660598ac3494430eaaba79e8fa&
[21:42:06] [1/2] Lots of pending validation, which doesn't seem normal.
[21:42:07] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1498801510664896593/image.png?ex=69f27b2e&is=69f129ae&hm=c21f306a6f018ffa06ba4b094dfe88aa92d662c2ad8c96dc4f0ede201cb9be9d&
[21:45:38] Hmm exodus.miraheze.org gave me a "Wiki deleted" notice but then when I actually visit it it seems okay
[21:46:33] Might be caching again
[21:46:53] Happens to a lot of our infra where things propagate very slowly
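
For reference, the throttling advice from earlier in the log (one request at a time, roughly one per second, and back off for a second or two on a 429) could be sketched roughly like this. This is only a sketch, not anyone's actual bot: `crawl`, `http_get`, the User-Agent string, and the default delays are all made-up names and values, and since the 429 reportedly carries no Retry-After header, the wait is a fixed, growing delay.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable, Iterator, Tuple

Fetch = Callable[[str], Tuple[int, bytes]]


def http_get(url: str) -> Tuple[int, bytes]:
    """Fetch one URL with the standard library and return (status, body)."""
    # Hypothetical User-Agent; put your own tool name and contact details here.
    req = urllib.request.Request(url, headers={"User-Agent": "link-audit-sketch/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on non-2xx responses (including 429), so convert back.
        return err.code, err.read()


def crawl(
    urls: Iterable[str],
    fetch: Fetch = http_get,
    min_interval: float = 1.0,   # assumed pause between requests, in seconds
    backoff: float = 2.0,        # assumed base wait after a 429, in seconds
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, int, bytes]]:
    """Yield (url, status, body) for each URL, strictly one request at a time."""
    for url in urls:
        attempt = 0
        status, body = fetch(url)
        while status == 429 and attempt < max_retries:
            attempt += 1
            # No Retry-After is assumed, so wait longer on each repeated 429.
            sleep(backoff * attempt)
            status, body = fetch(url)
        yield url, status, body
        # Pause before the next request so only one slot is ever in use.
        sleep(min_interval)
```

Usage would be something like `for url, status, body in crawl(api_urls): ...`, where `api_urls` is your own list of per-wiki API URLs (for example queries against MediaWiki's `list=exturlusage` module). Lowering `min_interval` trades politeness for speed; the 429 handling is what keeps that from escalating into a block.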