[00:01:47] Awesome thanks andrewbogott
[00:03:36] Knowledge in action :)
[00:31:06] RECOVERY - Puppet run on tools-exec-1215 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:34:30] bd808: i wasn't around, just got back. i see it was literally a "rabbit hole"
[00:34:41] memorizes "restart rabbit"
[00:39:17] RECOVERY - Puppet run on tools-exec-1207 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:56:09] Tool-Labs-tools-Global-user-contributions: GUC: Russian wikis have broken url (http:/// instead of http://) - https://phabricator.wikimedia.org/T94351#2245699 (Krinkle)
[00:56:11] Tool-Labs-tools-Global-user-contributions: Global user contributions: Support wildcard in username - https://phabricator.wikimedia.org/T66499#2245700 (Krinkle)
[01:29:45] Labs, Labs-Infrastructure, labs-sprint-117, labs-sprint-119, and 3 others: Allocate labs subnet in dallas - https://phabricator.wikimedia.org/T115491#2245799 (chasemp)
[01:29:49] Labs, Labs-Infrastructure, labs-sprint-117, labs-sprint-119, and 3 others: Allocate labs subnet in dallas - https://phabricator.wikimedia.org/T115491#1725570 (chasemp)
[01:29:53] Labs, Labs-Infrastructure, labs-sprint-117, labs-sprint-119, and 3 others: Allocate subnet for labs test cluster instances - https://phabricator.wikimedia.org/T115492#2245801 (chasemp)
[01:29:57] Labs, Labs-Infrastructure, labs-sprint-117, labs-sprint-119, and 3 others: Allocate subnet for labs test cluster instances - https://phabricator.wikimedia.org/T115492#1725583 (chasemp)
[02:25:00] RECOVERY - Puppet run on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:09:45] Labs, DBA, Patch-For-Review: Move labs pdns database off of m5-master - https://phabricator.wikimedia.org/T128737#2246254 (Andrew) I've been looking for a while and can't figure out why pdns would be hitting localhost mysql. There is a localhost mysql config in /etc/powerdns/pdns.d but I don't know...
[03:15:01] Labs, Operations, wikitech.wikimedia.org: intermittent nutcracker failures - https://phabricator.wikimedia.org/T105131#2246288 (chasemp)
[03:22:51] Labs, Labs-Infrastructure, Patch-For-Review, ToolLabs-Goals-Q4: Move LabsDB aliases to DNS - https://phabricator.wikimedia.org/T63897#2246336 (scfc)
[04:00:13] !log tools deleted all precise webservice jobs, waiting for webservicemonitor to bring them back up
[04:00:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[04:15:11] !log tools delete half of the trusty webservice jobs
[04:15:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[05:55:08] (PS10) BryanDavis: Rewrite jsub in python [labs/toollabs] - https://gerrit.wikimedia.org/r/285435 (https://phabricator.wikimedia.org/T132475)
[06:08:22] (CR) BryanDavis: Rewrite jsub in python (5 comments) [labs/toollabs] - https://gerrit.wikimedia.org/r/285435 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[06:48:33] PROBLEM - Puppet run on tools-k8s-etcd-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:59:46] PROBLEM - Puppet run on tools-proxy-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[07:14:49] morning :)
[07:15:05] I have a problem with a webservice that is now flapping
[07:15:19] because of some invalid configuration in /var/run
[07:15:43] error message is
[07:15:44] 2016-04-28 07:15:07: (configfile.c.957) source: /var/run/lighttpd/wikidata-primary-sources line: 96 pos: 1 parser failed somehow near here: (EOL) Duplicate config variable in conditional 0 global: fastcgi.server
[07:16:30] can someone clean this file up? I guess it is auto generated
[07:23:38] RECOVERY - Puppet run on tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:34:53] RECOVERY - Puppet run on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:34:53] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228)
[09:22:25] Labs, DBA, Patch-For-Review: Move labs pdns database off of m5-master - https://phabricator.wikimedia.org/T128737#2247455 (jcrespo) FYI, labtestservices2001: ``` Notice: /Stage[main]/Dnsrecursor::Labsaliaser/Exec[/usr/local/bin/labs-ip-alias-dump.py]/returns: Traceback (most recent call last): Notic...
[10:11:46] Labs, Tool-Labs: Restore replica.my.cnf for toolsbeta.admin - https://phabricator.wikimedia.org/T109807#2247637 (scfc) The process for (re-)generating `replica.my.cnf` is described at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Admin#Regenerate_replica.my.cnf.
[10:11:49] Labs, Tool-Labs: Create replica.my.cnf in my home directory - https://phabricator.wikimedia.org/T131546#2170313 (scfc) The process for (re-)generating `replica.my.cnf` is described at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Admin#Regenerate_replica.my.cnf.
[10:11:52] Labs, Tool-Labs: [Tool Labs] Database credential file replica.my.cnf missing in my home directory on Tool Labs (/home/wiki13). - https://phabricator.wikimedia.org/T122657#1909976 (scfc) The process for (re-)generating `replica.my.cnf` is described at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tool...
[10:23:27] Labs, Labs-Sprint-100, Tool-Labs, Patch-For-Review: Deploy new unified webservice code - https://phabricator.wikimedia.org/T98440#2247685 (yuvipanda)
[10:23:29] Labs, Tool-Labs, Patch-For-Review: Deprecate #no-default-php in .lighttpd.conf - https://phabricator.wikimedia.org/T98818#2247683 (yuvipanda) Open>Resolved This is done.
[10:29:33] Labs, Labs-Sprint-100, Tool-Labs, Patch-For-Review: Deploy new unified webservice code - https://phabricator.wikimedia.org/T98440#2247689 (yuvipanda) I've restarted all webgrid jobs and they're all running new code now! \o/ I need to now get rid of all the old webservice related code and we can...
[11:36:21] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2247930 (hashar) I wrote a stupid resolver for the A records: {P2969} Running it right now from deployment-tin and range `70000-85000`.
[11:37:15] Labs, Labs-Kubernetes, Tool-Labs: Setup monitoring for kubernetes core components. - https://phabricator.wikimedia.org/T131929#2247931 (yuvipanda)
[11:37:59] (PS1) Youni Verciti: Initial check-in [labs/tools/fr-wikiversity-ns] - https://gerrit.wikimedia.org/r/285931
[11:48:38] Labs, Tool-Labs: Goal: Allow using k8s instead of GridEngine as a backend for webservices (Tracking) - https://phabricator.wikimedia.org/T129309#2247934 (yuvipanda)
[11:48:40] Labs, Labs-Kubernetes, Tool-Labs: Write a k8s admission controller to enforce that all containers running come from our private repository - https://phabricator.wikimedia.org/T133515#2247935 (yuvipanda)
[11:48:46] Labs, Tool-Labs: Setup DNS for kubernetes services - https://phabricator.wikimedia.org/T111914#2247937 (yuvipanda)
[11:48:48] Labs, Tool-Labs: Goal: Allow using k8s instead of GridEngine as a backend for webservices (Tracking) - https://phabricator.wikimedia.org/T129309#2101957 (yuvipanda)
[11:50:52] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2247941 (hashar) Out of 15000 A entries, only one leaked: ``` $ python blam.py --delay 0.1 70000-85000 Start: ci-jessie-wikimedia-70000.contintcloud.eqiad.wmflabs...
[11:53:58] Labs, Labs-Kubernetes, Tool-Labs: Use a 'pause' container from our private repo, not gcr.io - https://phabricator.wikimedia.org/T133873#2247942 (yuvipanda)
[12:05:19] Labs, Labs-Kubernetes, Tool-Labs: Use a 'pause' container from our private repo, not gcr.io - https://phabricator.wikimedia.org/T133873#2248003 (yuvipanda) I've emailed the google-containers list to ask.
[12:48:30] Labs, Labs-Kubernetes, Tool-Labs: Use a 'pause' container from our private repo, not gcr.io - https://phabricator.wikimedia.org/T133873#2248261 (yuvipanda) https://github.com/kubernetes/kubernetes/issues/4896 is the answer!
[13:07:21] Labs, Labs-Infrastructure, Operations: Estimate hardware requirements for relevance lab elasticsearch servers - https://phabricator.wikimedia.org/T128433#2248394 (Gehel) Open>Resolved
[13:07:55] Labs, Labs-Infrastructure, Operations: Estimate hardware requirements for relevance lab elasticsearch servers - https://phabricator.wikimedia.org/T128433#2074588 (Gehel) Closing this as resolved, decision on hardware sizing has been taken on T131184, with input from this task.
[13:58:59] PROBLEM - Puppet run on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[14:00:23] Tool-Labs-tools-Other: Wikiviz tool is loading assets from 3rd party sites - https://phabricator.wikimedia.org/T133910#2248583 (yuvipanda)
[14:04:07] Afk for a few
[14:21:53] Labs, DBA: Data missing from June 11/12 on s3.labsdb - https://phabricator.wikimedia.org/T115517#1726338 (Anomie) >>! In T115517#2234279, @jcrespo wrote: > Now, I cannot guarantee the accuracy of other shards, that is why I am reimporting all, starting from enwiki/s1. I can confirm there was some sort o...
[14:29:38] Labs, Labs-Infrastructure, Continuous-Integration-Scaling: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2248624 (hashar)
[14:33:17] Labs, Labs-Infrastructure, Continuous-Integration-Scaling, Patch-For-Review: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2248644 (hashar) Note, as we migrate jobs to run on `contintcloud` we will delete some instances from `integration`tenant...
[14:36:46] Tool-Labs-tools-Other: Wikiviz tool is loading assets from 3rd party sites - https://phabricator.wikimedia.org/T133910#2248665 (scfc) a:dhvanil
[14:58:03] (PS1) Youni Verciti: Rev 0.1 [labs/tools/fr-wikiversity-ns] - https://gerrit.wikimedia.org/r/285966
[15:06:33] (CR) Youni Verciti: "This tool would help on fr-Wikiversité to manage namespaces Faculté and Département." [labs/tools/fr-wikiversity-ns] - https://gerrit.wikimedia.org/r/285966 (owner: Youni Verciti)
[15:32:48] Labs, Epic: Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2248842 (bd808)
[15:33:42] Labs, Epic: Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2248858 (bd808)
[15:33:44] Labs: Add Content-Security-Policy header enforcing 3rd party web interaction restrictions to proxy responses - https://phabricator.wikimedia.org/T130748#2248860 (bd808)
[15:33:46] Labs, WM-Bot, Privacy: http://wm-bot.wmflabs.org/browser/ is loading assets from multiple 3rd party domains - https://phabricator.wikimedia.org/T133644#2248859 (bd808)
[15:34:11] Labs, Epic: [EPIC] Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2248842 (bd808)
[15:34:35] Labs, Epic: [EPIC] Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2248866 (bd808) p:Triage>Normal
[15:48:41] Tool-Labs-tools-Other, Community-Tech, I18n: [[Wikimedia:Pageviews-num-languages/en]] needs PLURAL - https://phabricator.wikimedia.org/T133766#2248881 (Aklapper) I guess that string comes from http://tools.wmflabs.org/pageviews-test/
[16:25:56] Tool-Labs-tools-Other, Community-Tech, Category, Community-Wishlist-Survey: Pageview Stats tool - https://phabricator.wikimedia.org/T120497#2248965 (kaldari)
[16:41:41] Tool-Labs-tools-Other, Community-Tech, Category, Community-Wishlist-Survey: Pageview Stats tool - https://phabricator.wikimedia.org/T120497#2248994 (kaldari)
[16:42:39] Tool-Labs-tools-Other, Community-Tech, Category, Community-Wishlist-Survey: Pageview Stats tool - https://phabricator.wikimedia.org/T120497#2248998 (Johan)
[17:19:51] Labs, Labs-Infrastructure, Patch-For-Review: Get a real (letsencrypt) cert for labtestwikitech.wikimedia.org - https://phabricator.wikimedia.org/T133167#2224113 (fgiunchedi) this is completed for now, though I had to uncomment the stanza in /etc/acme/challenge-apache.conf to make apache2....
[17:22:40] Labs, Labs-Infrastructure, Patch-For-Review: Get a real (letsencrypt) cert for labtestwikitech.wikimedia.org - https://phabricator.wikimedia.org/T133167#2249107 (Krenair) Open>Resolved
[17:33:57] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure, Operations, and 2 others: Clean up labs graphite datapoints - https://phabricator.wikimedia.org/T111540#2249119 (Krenair)
[17:45:47] Labs, WM-Bot, Privacy: http://wm-bot.wmflabs.org/browser/ is loading assets from multiple 3rd party domains - https://phabricator.wikimedia.org/T133644#2249157 (bd808) >>! In T133644#2249126, @Technical13 wrote: > I'm confused, if the page is only getting assets, why would any data (IP or otherwise)...
[18:03:31] Labs, WMF-Legal, Epic: [EPIC] Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2249232 (ZhouZ)
[18:03:42] Labs, WMF-Legal, Epic: [EPIC] Protect end-user privacy by restricting non-consentual third-party browser interactions - https://phabricator.wikimedia.org/T133919#2249234 (ZhouZ)
[18:09:10] Labs: Add Content-Security-Policy header enforcing 3rd party web interaction restrictions to proxy responses - https://phabricator.wikimedia.org/T130748#2249253 (Tgr) Sentry has some custom code for interpreting CSP reports ([[https://github.com/getsentry/sentry/issues/729|ticket]], [[https://github.com/gets...
[18:34:18] bd808, why can't we view IPs in the access.log or error.log files?
[18:36:32] the X-Forwarded-For header isn't processed and logged by the default lighttpd config
[18:36:47] so the ip you will see is the ip of the proxy server
[18:37:03] I think this is a feature rather than a bug
[18:38:24] I'm pretty sure the XFF header is passed by the proxy so you could get the client IP if you needed it
[18:39:10] but there are really very few legitimate needs for client IPs in a tool
[18:39:37] I need them for my tool because it does ban tracking for IRC.
[18:40:37] And it directs them to KiwiIRC, and it's good to have the IPs before they access IRC, because they're easier to deter before they get on.
[18:40:57] you have a tool that is running kiwi?
[18:41:07] No, it just directs to Kiwi.
[18:41:13] the one on kiwiirc.com.
[18:41:17] *nod*
[18:41:47] bd808, how would I go about getting the XFF header?
[18:42:14] Do you need it in the logs or in some programming language?
[18:42:35] Getting it in python would be nice.
[18:42:52] The logs would be good too, so I can see where the trolls are mainly coming from.
[18:43:24] It would be accessible like any other http header. The raw header name is X-Forwarded-For
[18:45:14] * bd808 tests getting the XFF header in logs
[18:45:17] tom29739: Us not seeing the ip's is by design afaik, part of the labs privacy measures
[18:45:31] Oh.
[18:45:48] but any other website I go on will have my IP.
[18:45:59] yeah it looks like the XFF may be stripped. I thought it was passed
[18:46:01] Wonder where it was documented again. Somewhere on wikitech afaik. YuviPanda probably knows best
[18:47:07] Labs, DBA, Patch-For-Review: Move labs pdns database off of m5-master - https://phabricator.wikimedia.org/T128737#2249419 (Krenair) Thanks for pointing that error out @jcrespo, I have left a comment on https://gerrit.wikimedia.org/r/#/c/280768/
[18:48:29] I tired logging with this lighttpd config change and got missing XFF values -- https://fak3r.com/2008/01/09/howto-log-the-users-ip-not-the-proxys-in-lighttpd-access-log/
[18:48:36] *tried
[18:50:10] bd808: According to https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_can_and_can.E2.80.99t_be_done_with_user_information.3F you can't just collect ip addresses
[18:51:01] oh, tom29739 ^
[18:51:02] bd808: XFF isn't set for tools. Is set for novaproxy
[18:51:13] multichill: ^
[18:51:36] Yeah, the header is stripped by the looks of it.
[18:51:43] That's a pain.
[18:51:45] YuviPanda: ah. I knew that we had talked about how it worked. I forgot that difference
[18:52:32] I'd want us to move to whitelisted XFF too at some point...
[18:52:41] We do pass geo IP cookie :/
[18:52:52] novaproxy also didn't do it, but the UTRS project wanted it and they had very good reasons...
[18:52:59] bd808: oh? from where? who sets it?
[18:53:09] can you make exceptions with whitelisted XFF?
[18:53:22] What's the geo IP cookie?
[18:53:23] that's a good question. I thought we set it in varnish in prod
[18:53:34] bd808: yup, we do.
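For the "getting it in python" question above, a minimal sketch, assuming a WSGI application sitting behind a proxy that actually sets X-Forwarded-For (per the discussion, the tools proxy strips it, so on tools this would only ever return the proxy's address). The function name `client_ip` is hypothetical; `HTTP_X_FORWARDED_FOR` and `REMOTE_ADDR` are the standard WSGI/CGI environ keys. The header is client-controllable unless the proxy overwrites it, so the value should not be treated as trustworthy on its own.

```python
def client_ip(environ):
    """Best-effort client address from a WSGI environ dict.

    Assumption: a fronting proxy appends to X-Forwarded-For. If the header
    is absent, this falls back to REMOTE_ADDR, which behind a proxy is the
    proxy's own address rather than the end user's.
    """
    xff = environ.get("HTTP_X_FORWARDED_FOR", "")
    if xff:
        # The header is a comma-separated chain; the left-most entry is the
        # address reported for the original client.
        return xff.split(",")[0].strip()
    return environ.get("REMOTE_ADDR")
```

Anything read this way is Private Information under the Labs terms of use linked above, so the storage and disclosure rules discussed in this channel apply.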
[18:54:05] I have one set to .wmflabs.org which wouldn't be behind any varnish
[18:54:13] *set on
[18:54:46] What is the geoIP cookie though?
[18:54:52] isn't geo IP cookie the same leak of data for dynamic ip?
[18:55:15] tom29739: the MaxMind database lookup based on the ip address
[18:55:36] So wouldn't that be a privacy violation too then?
[18:55:45] phe: slightly less leakage, but very similar
[18:56:40] If you visit https://tools.wmflabs.org/bd808-test/si.php and go down to the "PHP Variables" section you can see everything that your browser and the intervening servers are passing to PHP
[18:56:41] Not really, many ip's match to the same location tom29739
[18:57:05] My IP maps to my town, it's got more accurate in recent times.
[18:57:11] hmm, so if I just curl tools.wmflabs.org I don't see any cookies being sent, so it isn't us. it's some tool that's setting it for far too much
[18:57:19] (PS98) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[18:57:32] For some people it geolocates to their front door.
[18:58:17] YuviPanda: I bet it is the beta cluster varnish
[18:59:44] YuviPanda: confirmed that if I drop the cookie and stay in tools.wmflabs.org it doesn't come back
[19:00:11] And then hitting beta brings it back
[19:01:05] If I use/collect IPs in IRC, then does that count as 'collecting Private Information'?
[19:01:13] bd808: ah, right. and they do get your IP
[19:01:19] Like, to analyse for bans and blocks and the like?
[19:01:25] tom29739: yes
[19:01:55] IP addresses are considered PII (personally identifiable information)
[19:02:40] But I'd have thought that nearly all IRC bots do that.
[19:02:51] If they log, then I should imagine they do.
[19:03:14] you only see the ip of an irc user if they don't have a cloak
[19:03:31] unless you are doing DCC with them
[19:04:45] That's what I mean though.
[19:05:25] If you're logging the channel, and anyone without a cloak comes in, and you are logging, then you are collecting the IPs.
[19:05:36] true
[19:06:19] So by that definition, all IRC logging bots should have a great big disclaimer on them.
[19:06:20] that gets tricky to interpret via the published policy
[19:06:50] well except that disclaimer would need to be IRC facing
[19:07:01] and that's where it all gets weird
[19:07:21] because there is no way to "opt-in" to that data exposer
[19:07:31] If I get a VPS or anything, then I can see IPs. I can see them if I run a server of any kind.
[19:07:31] *exposure
[19:07:44] yes, that is how tcp/ip works
[19:07:57] it's not the capacity to get IP which is disallowed, irc bots are fine if they don't keep IP imho
[19:08:30] Why was that policy introduced in the first place? If I visit any website on the internet, my IP is logged. It's a fact of the internet.
[19:08:53] s/is logged/can be logged/
[19:09:30] It will be logged, in 99.999999999999% of cases.
[19:09:57] If Google Analytics is used, for instance, Google will log the IP.
[19:10:09] Are you an active editor on Wikipedia tom29739? You know how sensitive ip info of logged in users is there
[19:10:15] tom29739: think about it from this perspective, the WMF could log and correlate all of the interactions you have with the wikis. We choose to limit this purposefully
[19:10:36] It can still be accessed.
[19:10:39] we ask that those using our shared resources do the same
[19:10:50] (tool)labs is integrated quite a lot with production sites so being careful with this kind of info is also a thing here
[19:11:35] What if there is a legitimate reason for needing the IPs?
[19:11:46] (CR) Ricordisamoa: "PS98 some refactoring in formatters.py" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[19:12:15] tom29739: Follow the guidelines at https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_can_and_can.E2.80.99t_be_done_with_user_information.3F
[19:13:16] EarwigBot uses the IPs to check users coming into #wikipedia-en-help for blocks on Wikipedia. Would that be construed as logging IPs, if it just checks them for blocks?
[19:13:29] no
[19:13:45] tom29739: on a website, your IP is logged for sure but it is not disclosed publicly
[19:14:00] Anyone can run a website though.
[19:14:05] tom29739: on IRC you are joining a worldwide public channel and the protocol exposes your IP to everyone
[19:14:16] so that is an opt-in disclosure of your IP made "voluntary"
[19:14:19] That's the point.
[19:14:48] but if I go to tools.wmflabs.org/ my IP is not supposed to be disclosed and should be kept on the labs infra
[19:14:56] if someone / some tool leaks my IP, that is a problem
[19:15:14] What if it's not leaking the IP?
[19:15:45] It limits the uses of IRC ban tracking stuff for instance. That would be extremely useful in -help.
[19:15:48] information theory says if it is storing it then it will eventually leak it
[19:15:54] tom29739: what is it you are trying to do? following along but not sure on the specific question
[19:17:20] so if your tool has some IP logged you are dealing with private information that should be taken care of with extreme precaution. The safest way is to drop them (i.e. not log the IP)
[19:19:04] in France the law states that hosting services have to keep a log of a bunch of private info for up to 1 year
[19:19:27] so if you host a public website, you got to keep a year worth of log around which is really no fun to handle :(
[19:19:34] (need backup / offsite backup etc)
[19:19:38] some of the (if not a lot of the) predilection towards not exposing IP's into tools is that conforming to '...Purge, anonymize, or aggregate any Private Information you store no more than 30 days after storing it;' is quite difficult and auditing it more-so
[19:20:12] chasemp, my tool, that I am developing, is the replacement for the ircredirect tool that is currently used for the #wikipedia-en-help channel. It will get the user's username from their Wikimedia SUL account using OAuth, if they have an account, and if not, then generate a random IP. The tool would get the ban list for the -help channel, put it in a database, and compare it with the people being redirected, and if that IP was banned in say,
[19:20:13] the last week, then it would notify the helpers in -helpers, and put in extra precautions to stop trolls, like captchas and the like.
[19:21:09] I need to store the IPs in the ban list for at least a week.
[19:23:02] ban list from help is a list of banned IP's?
[19:23:07] tom29739: that is the IRC banlist isn't it?
[19:24:04] Yes.
[19:24:29] well it should be publicly available already
[19:24:35] just: /mode #wikimedia-labs b
[19:24:40] and the preliminary issue is that the tools proxy by nature does not expose XFF for the tool to use to compare to the ban list IP
[19:24:43] should yield the list of nick!finger@host
[19:25:54] the ip they are connected to $tool on is not necessarily the same one they are connecting to IRC w/ also yeah
[19:25:56] The thing about that list though, by virtue of storing it, my tool is storing private information.
[19:26:18] Why wouldn't it be the same IP?
[19:26:46] usually proxies are by protocol so http/https etc for caching purposes
[19:27:12] it's not uncommon for say a university to push students through a proxy that is ip 1 but irc is handled differently and would appear as ip 2
[19:27:13] KiwiIRC works by proxy.
[19:27:56] You access the web interface, and it gets your IP, and puts it in the realname and username fields.
[19:28:16] that's how KiwiIRC works.
[19:28:21] iiuc the policy essentially, don't store this info unless you absolutely have to, and if you have to then don't store it longer than 30 days and generally you are subject to https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_can_and_can.E2.80.99t_be_done_with_user_information.3F
[19:28:40] If you do collect any Private Information, you must:
[19:28:40] Clearly communicate to End Users a) that Private Information is being collected, b) how you will use it, and c) when you will delete it;
[19:29:09] that basically means I have to notify every single user that comes on IRC, that their IP is being logged.
[19:29:11] sure, and deletion period being <=30 days
[19:29:36] well, you said they hit the web interface first why not a notice there?
[19:29:52] tom29739: does the channel have a "this channel is logged" disclaimer in the topic?
[19:29:55] No.
[19:30:04] No logging is allowed.
[19:30:16] o_O except by you?
[19:30:26] In fact, there isn't anything in the channel notice about it.
[19:30:33] Nothing at all.
[19:30:57] In -helpers there is, but even that says just 'No public logging'.
[19:31:16] If the tool did log, then it wouldn't do it publicly.
[19:31:24] I'm still not getting which IPs you want to log then
[19:31:34] The banlist.
[19:31:51] that's IRC server configuration
[19:32:02] And the IPs coming in, or the web interface ones.
[19:32:17] So the tool can compare the 2.
[19:32:31] why store the web traffic users if the idea is to only compare in real time to the stored ban list?
[19:32:39] ^ that
[19:32:54] but then the next problem is that as a tool you can't have this data
[19:33:00] If the IP is on the ban list, then they are banned. Not let into the channel.
[19:33:08] because we strip it at the http proxy
[19:33:25] It's a ban tracker, because bans are auto removed by the eir bot after a default of 24 hours.
[19:33:32] so you would need to move outside of the tools project to get the XFF header
[19:33:43] How do I do that?
[19:33:52] Would I need my own labs project?
[19:34:02] tom29739: ok my understanding is, user hits your web interface, they put in their info, you compare their source IP to that of the IP's on the banlist (you store the banlist)
[19:34:09] yes or another hosting solution
[19:34:12] so storing their source IP gets you nothing really for that real time compare
[19:34:31] Yes, but the banlist for the past week.
[19:35:02] the ban list is a red herring here
[19:35:17] you can get it from freenode
[19:35:26] IP's banned by a random IRC channel AFAIK are not under the same purview as IP's that hit properties under wmflabs.org and even so you can't keep for more than 30 days, but either way in tools you can't do this now
[19:35:58] So I need my own labs project, or my own hosting solution?
[19:36:30] to get access to the IPs of the users hitting your website, yes
[19:36:31] to see the end user's IP address over http/https, yes
[19:47:56] for labs-ops, I filed a task to bump the quota of the contintcloud project from 20 to 40 instances (max). Seems labs can handle it, I provided a bunch of metrics on https://phabricator.wikimedia.org/T133911
[19:48:15] that is part of migrating moaaaar jobs to that labs project :}
[19:56:31] Labs: Create new labs project for ircredirector - https://phabricator.wikimedia.org/T133941#2249641 (tom29739)
[19:56:50] ^ chasemp, bd808, did I do that right?
[20:15:43] Labs: Create new labs project for ircredirector - https://phabricator.wikimedia.org/T133941#2249691 (bd808) It seems that the only component of the planned project that needs the client IP address via HTTP/HTTPS is "All users using the tool web interface would have their IPs compared against this database, a...
[21:05:07] (CR) BryanDavis: "check experimental" [labs/toollabs] - https://gerrit.wikimedia.org/r/285435 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[21:14:15] (PS11) BryanDavis: Rewrite jsub in python [labs/toollabs] - https://gerrit.wikimedia.org/r/285435 (https://phabricator.wikimedia.org/T132475)
[21:16:29] (CR) BryanDavis: "The desired tox test runner from Ie2ac8ff is active, but it is only run via "check experimental" for now. It can be made default either ju" [labs/toollabs] - https://gerrit.wikimedia.org/r/285435 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis)
[21:27:07] Labs, Tool-Labs, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Rewrite jsub in python - https://phabricator.wikimedia.org/T132475#2199735 (bd808)
[21:27:57] Labs, Tool-Labs, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Rewrite jsub in python - https://phabricator.wikimedia.org/T132475#2199735 (bd808)
[22:13:26] Labs, Tool-Labs, Community-Tech-Tool-Labs, Diffusion, User-bd808: Create application to manage Diffusion repositories for a Tool Labs project - https://phabricator.wikimedia.org/T133252#2250141 (kaldari)
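The ban-tracker idea discussed above (keep the channel ban list for about a week, compare incoming web users against it in real time, and do not store the visitors' own addresses) could be sketched roughly as follows. This is an illustration under stated assumptions, not the ircredirector tool's actual design: the `BanList` class, `host_from_mask` helper, and `RETENTION_SECONDS` constant are hypothetical names, bans are assumed to arrive as `nick!user@host` masks whose host part is an IP or CIDR range, and hostname or wildcard hosts are simply skipped.

```python
import ipaddress
import time

RETENTION_SECONDS = 7 * 24 * 3600  # one week, as proposed in the chat (assumption)


def host_from_mask(mask):
    """Return the host part of a nick!user@host ban mask."""
    return mask.rsplit("@", 1)[-1]


class BanList:
    """In-memory ban list that forgets entries older than the retention period."""

    def __init__(self):
        self._seen = {}  # host string -> time the ban was recorded

    def record(self, mask, now=None):
        self._seen[host_from_mask(mask)] = time.time() if now is None else now

    def purge(self, now=None):
        now = time.time() if now is None else now
        cutoff = now - RETENTION_SECONDS
        self._seen = {h: t for h, t in self._seen.items() if t >= cutoff}

    def is_banned(self, ip, now=None):
        """Check a visitor's address in real time; the address itself is not stored."""
        self.purge(now)
        addr = ipaddress.ip_address(ip)
        for host in self._seen:
            try:
                if addr in ipaddress.ip_network(host, strict=False):
                    return True
            except ValueError:
                continue  # hostname or wildcard host: not handled in this sketch
        return False
```

Purging on every lookup keeps the retention rule from depending on a separate cleanup job; a database-backed version would need an equivalent scheduled delete to stay within whatever retention period the tool announces to its users.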