[18:02:15] https://www.youtube.com/watch?v=zfjY9JU0NR0
[18:06:17] cool!
[18:12:34] no note concerning logging?
[18:12:49] Sagan: good point, one moment
[18:13:02] dr0ptp4kt: I think you can edit it yourself too, since you are opped
[18:13:12] just edit it or type /topic #channel newone
[18:13:38] Sagan: yeah, just a moment. thanks for the reminder!
[18:13:46] you're welcome :)
[18:13:52] dr0ptp4kt: the text is Wikimedia Foundation meetings channel | Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: https://wm-bot.wmflabs.org/logs/%23wikimedia-office/
[18:14:01] ... everything after "is"
[18:14:29] Success!
[18:14:45] thanks tzatziki and Sagan. have a nice day!
[18:21:03] is there a bike pump in the office?
[21:05:05] #startmeeting RFC meeting
[21:05:05] Meeting started Wed Aug 2 21:05:05 2017 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:05:06] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:05:06] The meeting name has been set to 'rfc_meeting'
[21:05:23] #link https://phabricator.wikimedia.org/T171382
[21:05:43] #topic RFC: IPv6 contributions and talk pages
[21:07:27] anyone want to talk about this today?
[21:08:20] (I'll read along :)
[21:08:59] the proposer is apparently not attending
[21:10:10] why is it getting discussed so quickly?
[21:10:27] don't remember it announced on wikitech
[21:10:31] I suppose it's worth mentioning this may (or could and perhaps should) relate to the proposal to have rev_user abstracted with an 'actor' concept, meaning at that level we perhaps also should consider them as the same (if we're doing it, we should probably do it consistently)
[21:11:11] I'm wondering whether this is actually a problem in practice
[21:11:30] trying to figure out some SQL that I can use to answer that, at least for enwiki.recentchanges
[21:13:36] my main concern here is that some people's subnets might not be /64.
I know this shouldn't happen but there are definitely ISPs and corporate IT out there who don't care about standards
[21:14:37] The RFC has a fairly well-defined problem statement and non-code proposed solution to consider that is no longer in draft state from the author. It was indeed quite quick to get a meeting (not sure it was announced or not?). One thing missing though, is resourcing/commitment – given it seems to have come from a volunteer without given interest in implementing it themselves. We should involve a product team as well (collaboration?
[21:14:37] platform?) for someone to take ownership of maintenance and/or implementation.
[21:15:12] s/product/engineering/ (product or tech team)
[21:15:35] It sounds pretty hairy from an engineering perspective, as well as quite complex for product side.
[21:15:56] If it's a really major issue, we could kill off stuff we're doing to resource it, but…
[21:16:17] Yeah, although if this was addressed at the rev_user/actor level, the queries would be even simpler than they are today (they would be as efficient as a rev query for user_id)
[21:17:42] I'm more concerned about the impact of not doing it: 1) communication (do they get the talk page message if they are still online from the same connection within a reasonable time range) and 2) can patrollers/reviewers easily see other edits they made from that connection (both for discovery purposes, and for blocking if needed)
[21:17:44] just stripping the last 64 bits of the address in User::newFromSession() or whatever would be relatively easy, but I don't think this task will be as easy as that
[21:18:00] I did this: select substring_index(rc_user_text,':',4) as net, count(*) from recentchanges where rc_user_text like '200_:%' group by net;
[21:18:35] I should probably do statistics on it, there are 4417 results
[21:18:41] my inclination is to aim towards serial-pseudonymy per browser with an automatic cookie (that folks can clear if they are paranoid) to supplant the
traditional "IP edits"
[21:18:45] but that's a much bigger task :D
[21:18:49] brion: Indeed.
[21:18:53] (On both parts.)
[21:19:23] brion: Hm.. interesting. remove use of IP as basis entirely and use sessions instead.
[21:19:44] (presumably still kept for checkuser, which would start to make more sense after that for anons)
[21:19:49] in the shorter term though making sure people see their responses at least is nice......
[21:19:55] Krinkle: Yeah. Major implications for current tools (IP blocks, range blocks, etc.), of course.
[21:20:15] i do though think it's the sort of thing we should start putting on a MediaWiki roadmap for the future
[21:20:25] so it can eventually get resourced :)
[21:20:30] ok, better still, counting IP addresses within those ranges: select substring_index(rc_user_text,':',4) as net, count(distinct substring_index(rc_user_text,':',-4)) from recentchanges where rc_user_text like '200_:%' group by net;
[21:21:39] James_F: Right. could integrate from checkuser perspective (checkboxes with IPs used by that session, and block some or all of them, and conversely, the opposite: seeing the different sessions that share the same IP, and block those instead - esp. if they clear cookies to evade detection but keep the IP, kind of the opposite problem we deal with with autoblock)
[21:22:22] But we'd probably want part of that future checkuser to be visible to sysops, which brings into question why we'd expose 'guest' IPs, but not 'logged-in' IPs to admins - which of course is already true today
[21:22:27] Krinkle: Lots of communities have no CUs at all, but can defeat some simple IP-based vandalism with regular sysops. Going IP-less would mean either… yes.
[21:22:52] And before anyone asks, no, Legal would not remotely be OK with us giving out CU to all sysops.
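The two SQL queries quoted above group logged-out edits by the first four groups of the IPv6 address (the /64 prefix) and count distinct hosts in each. A minimal Python sketch of the same counting, assuming the addresses are already available as plain strings; the function name `hosts_per_slash64` and the sample addresses are hypothetical and not part of MediaWiki:

```python
import ipaddress
from collections import defaultdict


def hosts_per_slash64(addresses):
    """Count distinct IPv6 hosts seen in each /64 network.

    `addresses` is any iterable of IPv6 address strings
    (hypothetical input, e.g. values read from rc_user_text).
    """
    hosts = defaultdict(set)
    for text in addresses:
        ip = ipaddress.IPv6Address(text)
        # strict=False lets us pass a host address and get its enclosing /64
        net = ipaddress.ip_network(f"{ip}/64", strict=False)
        hosts[net].add(ip)
    return {str(net): len(ips) for net, ips in hosts.items()}


sample = [
    "2001:db8:1:2::1",
    "2001:db8:1:2:aaaa::5",
    "2001:DB8:1:2:0:0:0:1",  # same host as the first entry, written differently
    "2001:db8:9:9::1",
]
counts = hosts_per_slash64(sample)
```

Parsing the address instead of splitting the string on ':' avoids depending on how the address happens to be written, at the cost of doing the work outside the database.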
[21:23:03] a compromise solution might be to use the first IP address of a session as the username for that session
[21:23:09] yeah we need to think about the tooling
[21:23:19] that would answer this problem of SLAAC automigration
[21:23:39] especially if we use a long-lived cookie instead of a normal session
[21:23:45] Right, makes sense.
[21:23:51] Also applies to mobile
[21:24:03] When editing as IP on mobile, your IP will change even more than normally with IPv6
[21:24:06] assuming you move around
[21:24:17] This would be a great use case for that as well.
[21:24:36] And of course, migrating your session to a user account :)
[21:24:41] and preserving your edits if you want.
[21:24:51] (preserving assignment I mean)
[21:24:59] that's the holy grail :D
[21:26:33] Okay, so back to the RFC proposal. I hear agreement that this "might" be a real problem
[21:26:38] But, research should be done on how common it is for edits from different IPv6 addresses within the same /64 range to happen within the same, say, 24 hour period.
[21:26:48] Although for the mobile case, I think we can confidently say it is definitely a problem.
[21:27:08] But with regards to the solution, would we even consider /64 normalisation as a first step?
[21:27:45] out of 4417 /64 networks, in enwiki RC, 619 have edits from more than one host
[21:27:56] (e.g. always record the lowest address in-range, for MyTalk, block matching and user_text)
[21:28:09] 42 have edits from more than 10 hosts
[21:29:53] TimStarling: These are edits by logged out users only, or does that include logged in users too?
[21:30:06] logged out only
[21:30:28] * James_F nods. 42 is a fair number.
[21:31:40] the most is 31 hosts from a single network, and looking at those edits suggests that they are one person
[21:32:24] OTOH, major changes to MW to benefit an average of 1.3 people a day isn't a great cost/benefit.
[21:35:52] interesting...
there is also someone who appears to be one user, jumping around a /32
[21:36:04] Deterministically mapping any IPs to the .0 equiv of IPv6/64 wouldn't be that major a change I think. There'll be some fall-out in terms of random code not going through WebRequest or User classes, but most do.
[21:36:41] there are probably not many very active anons, not surprising that the edits from a given mobile ISP would be dominated by one such user
[21:36:56] Equivalent to the old UseMod 1.2.3.xxx IP fuzzing.
[21:37:03] But I'm also curious if it could cause problems, which might be a reason to decline this exact proposal in favour of solving it with brion's session IDs instead (which would use the first real IP in the session again as user text)
[21:37:16] James_F: Right, but without the result being an invalid IP.
[21:37:19] Yeah, I'd lean to putting efforts towards that.
[21:37:23] ah right, that /32 is Telstra
[21:38:21] is there an old RFP or just various email threads about replacing ip with some opaque token for anon edit attribution generally?
[21:38:23] 2001:8000::/20 = Telstra, and there is a user on Telstra who anonymously makes many edits to movie articles
[21:38:48] bd808: i don't think we ever put together a full RFP
[21:39:02] should be some old threads here and there, but don't know offhand where they'd be hiding
[21:39:33] I know I've seen the chatter, probably comes up around the same time as TOR discussions in my head
[21:39:45] the idea I floated earlier about using the first IP address of the session, that is new today
[21:40:09] RFP?
[21:40:15] RFC I meant
[21:40:24] Yeah, that's what I thought the first time :)
[21:40:24] * brion throws random letters around :D
[21:40:32] Then brion copied it
[21:40:52] "request for proposal" vs "request for comment"
[21:41:16] brion: Let's finish MCS before working on the auto-account plans?
:-)
[21:41:21] it allows editors to continue doing questionably ethical things like whitehouse wikipedia edits twitter stream
[21:41:47] bd808: That does indeed seem more likely than 'Reversed field pinch' (enwiki disambig)
[21:41:54] I was thinking there would be some noise from that camp about hiding ips
[21:42:13] ethically questionable things that reveal more ethically questionable things yes :D
[21:42:15] We'd want a pretty wide discussion phase, yes.
[21:42:19] making the first ip seen sticky makes some sense for session continuity
[21:42:55] yeah, kinda like how autoblocks can retro-block a user ip... i like it
[21:43:57] I imagine though it will mean a lot of mobile users will get assigned IPs from wherever they frequently go or use hotspots. E.g. you start editing an article in Starbucks using Google's hotspot, then you go to work or home, but that's your session IP.
[21:44:00] Anyway :)
[21:44:18] hmmmm, good point
[21:44:36] as opposed to getting your random IP from your mobile carrier.
[21:44:54] seems like an improvement
[21:45:07] better to be the starbucks guy than a random frequently-changing collection of IP addresses
[21:45:26] maybe people will set up places where you can claim nice-looking IP addresses to start your session from, and then once you have it, you can go anywhere.
[21:45:39] heh
[21:45:43] vanity anons?
[21:45:43] like buying a hat, or a t-shirt.
[21:45:46] Yup
[21:45:59] leetspeak IP addresses for sale, 1$ to join this hotspot.
[21:45:59] whitehouse kids science day wifi
[21:46:27] no money back if you clear your cookies by accident
[21:46:51] I'm pretty sure we're in a fictional universe now
[21:46:56] yeah
[21:47:15] * anomie happens to peek in IRC from vacation, and notes that requiring a CU to be able to do any rangeblocks of persistent vandals should probably be discussed with the people who do anti-vandal stuff on the big wikis.
[21:47:26] or at least a well constrained set of end users :)
[21:48:20] Aye, good point.
Right now, if you somehow find out which IPs a user is using (e.g. because they frequent the same article or other pattern), you can block them one by one.
[21:48:45] If we assign the first IP as the session name, presumably the admin can only block that one IP (and their session, which they can reset by clearing cookies)
[21:49:08] we may have to introduce an anonymous way to also block other IPs used, though that makes logs difficult.
[21:49:08] any 'sticky ip' cookie should be opaque on the client side or it's a new vector for range hopping for sure
[21:49:39] or at least the most recently used IP.
[21:49:51] we have autoblocks for logged-in users, they only apply to the most recently used IP
[21:49:52] without revealing it to the admin
[21:50:14] TimStarling: Ah, interesting. And that doesn't rely on the cookie solely?
[21:50:45] there is the newish autoblock cookie that carries the block like a virus to new ips too
[21:50:47] I didn't know we put the logged-in user's latest IP in the ipblocks table when blocking them with autoblock
[21:51:07] I thought that'd require CheckUser to be installed to get the IP after the fact
[21:51:40] recent IP addresses are stored in rc_ip, it uses that
[21:51:50] this predates CU
[21:52:11] Block::defaultRetroactiveAutoblock()
[21:52:12] Interesting
[21:52:20] TIL :)
[21:52:43] Yeah, I was thinking about rc_ip, but I thought that was deprecated/null for most stuff
[21:53:09] nope, I am looking at it right now on en.wp, it is populated
[21:53:26] wgPutIPinRC=true by default, and not disabled in wmf-config
[21:53:28] cool
[21:53:47] I always thought it was false in prod because CU is installed "instead".
[21:53:48] Anyway
[21:54:00] we probably didn't want to break autoblocks
[21:54:07] when we installed CU
[21:55:03] (Thanks, TimStarling and Krinkle and All)
[21:55:47] what are the action items? should I write up this idea of a sticky IP address?
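The "sticky IP" idea discussed above (attribute a logged-out session to the first IP address it was seen from, and keep the cookie opaque on the client side) could be sketched roughly as follows. This is an illustration under stated assumptions, not MediaWiki code: the class `StickySessions`, its in-memory store, and the method name `attribute` are all hypothetical.

```python
import secrets


class StickySessions:
    """Map an opaque cookie token to the first IP seen for that session."""

    def __init__(self):
        # token -> first IP address seen; kept server-side only, so the
        # client cannot pick or edit the address it is attributed to
        self._first_ip = {}

    def attribute(self, token, current_ip):
        """Return (token, display_name) for an edit arriving from current_ip.

        A missing or unknown token starts a new session whose display
        name is the current IP; a known token keeps its original name
        even if the client has since moved to another address.
        """
        if token is None or token not in self._first_ip:
            token = secrets.token_urlsafe(16)  # random, carries no IP information
            self._first_ip[token] = current_ip
        return token, self._first_ip[token]


sessions = StickySessions()
token, name_first = sessions.attribute(None, "2001:db8:1:2::1")
_, name_later = sessions.attribute(token, "2001:db8:1:2:aaaa::5")
```

Because the token is random and the mapping lives on the server, clearing the cookie simply starts a new session under whatever address the client has at that moment, which matches the behaviour described in the discussion.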
[21:56:16] prolly
[21:56:48] #action TimStarling to write up the idea of a persistent IP address associated with a cookie
[21:58:45] ok, I guess that is all
[21:58:52] #endmeeting
[21:58:52] Meeting ended Wed Aug 2 21:58:52 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[21:58:52] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-08-02-21.05.html
[21:58:52] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-08-02-21.05.txt
[21:58:52] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-08-02-21.05.wiki
[21:58:53] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-08-02-21.05.log.html
[21:59:12] Yeah, we should also make sure there is a team that is interested in working on it, perhaps they'd like to work on the draft as well.
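For comparison with the sticky-IP write-up, the /64 normalisation floated during the meeting (always record the lowest address in the range, so the result is still a valid IP) might look roughly like this. It is a sketch only; the function name `normalize_user_text` is hypothetical, and leaving IPv4 addresses untouched is an assumption rather than something decided in the log.

```python
import ipaddress


def normalize_user_text(ip_text):
    """Map an IPv6 address to the lowest address of its /64.

    IPv4 addresses are returned unchanged (assumption: only IPv6
    contributions are folded together).
    """
    ip = ipaddress.ip_address(ip_text)
    if ip.version == 4:
        return str(ip)
    # strict=False accepts a host address and yields its enclosing network
    net = ipaddress.ip_network(f"{ip}/64", strict=False)
    return str(net.network_address)


same_a = normalize_user_text("2001:db8:1:2:aaaa:bbbb:cccc:dddd")
same_b = normalize_user_text("2001:db8:1:2::1")
```

Both calls yield the same user text, so talk pages, contribution lists and blocks keyed on that text would line up for every host in the /64. As noted in the discussion, this does not help users whose addresses move across a wider range, such as the /32 case.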