[11:03:56] qchris: ping
[11:05:05] Hi yurik.
[11:05:25] hey, i just read your comment - this change was the result of discussion with dee
[11:05:47] Oh ... ?
[11:05:51] Ok.
[11:06:07] Let me clarify on gerrit then. Thanks.
[11:06:27] yep - he said that you guys need to replicate all the logic of IP recognition and also do the ENABLED/DISABLED for zero
[11:06:40] I will write some python code that shows how we currently handle that based on the configuration
[11:07:25] but in order for you to avoid doing all the complex IP matching for zero requests, I can easily add that info to the x-analytics
[11:07:29] qchris: ^
[11:07:57] Wait ... drdee said we'll consider en-/disabled?
[11:08:18] (Or want to consider it?)
[11:08:45] Mhmm ... if drdee said that ... he's our product owner.
[11:09:09] qchris: well, you need to figure out when ZERO was on or off for the client
[11:09:33] Up to now, we didn't do that. We considered when it was free and when not.
[11:09:46] (Which typically is a long time after it got enabled)
[11:09:51] but how do you know when it's free?
[11:10:00] https://wikimediafoundation.org/w/index.php?title=Mobile_partnerships#Where_is_Wikipedia_free_to_access.3F
[11:10:07] Column "Free as of"
[11:10:24] Analytics/Wikipedia-Zero agreed on that some weeks (months?) back.
[11:10:33] X-CS id in header is an ok indicator at the moment because we duplicate some of that logic in varnish
[11:10:38] but we will remove it soonish
[11:11:04] at which point we will mark ALL requests from partner IPs with X-CS
[11:11:19] for all wiki languages & sites
[11:11:27] (mobile only)
[11:11:28] Yes, I know about the X-CS/X-Analytics change
[11:11:40] and that we'll have to adapt kraken to reflect that.
[11:11:52] But it seems that you and drdee are (have been) discussing that
[11:11:57] So I just follow your lead.
[11:12:36] so that's when you will have to do the actual analysis - some magic code that i will write to return true/false based on HOST and historical configurations
[11:12:56] and possibly other stuff
[11:13:01] sounds good :)
[11:13:23] Yes, it will mean some changes on our end. But if drdee knows about them, he will schedule them in time :-)
[11:14:23] your trust in him is ... so rare these days :D
[11:15:31] Well ... it's his job to do these kinds of things.
[11:15:38] Why shouldn't I trust him?
[11:15:39] The third party API is not responding
[11:16:09] Meh CommanderData bot. You're lucky that I lack privileges to kick you.
[13:15:37] milimetric: hi
[13:15:40] qchris_away: hi
[13:15:45] hi average
[13:15:59] milimetric: can't log in to the wikimetrics machine
[13:16:10] oh logged in now
[13:16:12] it has a really long lag a lot of times for me
[13:16:14] took like ~1m
[13:16:16] like really long
[13:16:16] yea
[13:16:22] I'm not sure why
[13:16:23] I was told to stop working on that when something important came up
[13:16:44] when I log in I always look for memory / cpu usage issues
[13:16:47] and everything is low
[13:16:56] but some monitoring would be nice
[14:43:32] hey average/qchris: are you guys done with https://gerrit.wikimedia.org/r/#/c/92501/3/wikimetrics/database.py
[14:43:35] can I delete it?
[14:43:53] ?? Looking...
[14:44:17] I think it's good to be abandoned.
[14:44:18] Yes.
[14:44:20] k
[14:44:36] (Abandoned) Milimetric: Test [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/92501 (owner: Stefan.petrea)
[14:51:05] wha whaaaaa
[14:51:15] who's got a loaner laptop with a brain transplant?
[14:51:16] this guy!
[14:53:46] !
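The check yurik describes above -- return true/false for a request based on HOST and the historical Wikipedia Zero configuration -- might look roughly like the sketch below. This is only an illustration: yurik planned to write it in Python, the class and method names here are invented rather than Kraken's real API, and a real implementation would also match the request HOST against each partner's zero-rated languages/sites.

    import java.time.Instant;
    import java.util.HashMap;
    import java.util.Map;

    // Sketch only: hypothetical names. It mirrors the "Free as of" column of
    // the Mobile_partnerships table: a request counts as free Wikipedia Zero
    // traffic only if its timestamp falls on or after the date the partner's
    // access became free.
    public class ZeroRatingCheck {
        // carrier id (the X-CS value) -> instant the partnership became free
        private final Map<String, Instant> freeAsOf = new HashMap<>();

        public void addPartner(String carrierId, Instant freeSince) {
            freeAsOf.put(carrierId, freeSince);
        }

        // true/false from the carrier id plus the historical configuration;
        // a fuller version would also take the request HOST into account.
        public boolean isFree(String carrierId, Instant requestTime) {
            Instant since = freeAsOf.get(carrierId);
            return since != null && !requestTime.isBefore(since);
        }
    }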
[14:53:49] yay, welcome back ottomata
[14:55:38] qchris: I'm sorry to bother you, but I'm confused again
[14:55:53] (gerrit of course)
[14:55:54] Hi ottomata! \o/
[14:55:58] milimetric: sure.
[14:56:03] Booting google machine
[14:56:06] ok, so https://gerrit.wikimedia.org/r/#/c/92581/
[14:56:18] I pushed patchset 3, then rebased on average's changes
[14:56:34] locally, I'm on a branch dan.card.818
[14:56:47] and I want to do more work
[14:56:56] you're saying the right thing to do is to make another changeset?
[14:57:04] based on this one?
[14:57:33] Is the local rebased one the same as PS4 on gerrit?
[14:57:39] no
[14:57:44] local is the same as PS3
[14:57:53] Ok.
[14:58:04] So we're ignoring PS4. That's ok.
[14:58:17] But you will not be able to submit PS3
[14:58:20] well, no, I don't want to ignore it
[14:58:38] Now I am also confused.
[14:58:56] I didn't say to ignore PS4
[14:59:01] I just said now I want to do more work
[14:59:30] so I'm not sure where to put the "more work"
[14:59:31] I am in the batcave
[14:59:34] k
[15:11:09] k, putting screws back in compy, be back on in juuuuuust a bit
[15:39:57] helloooooooo Snaps!
[15:40:11] yotto!
[15:43:00] yoyooy
[15:43:10] i am back! on a loaner compy with a brain transplate
[15:43:15] transplant
[15:44:24] so, i want to get the librdkafka and varnishkafka packages all ready for deployment
[15:44:27] rock'n'roll!
[15:44:33] i have a buncha work to do with monitoring on the other side of things
[15:44:41] are you doing more work on varnishkafka right now?
[15:45:08] can I get push rights to librdkafka?
[15:45:32] nopes nothing ongoing in varnishkafka, only remaining thing is config namespace renaming. But it's purely cosmetic.
[15:45:35] yeah, sure
[15:46:14] there, you're in
[15:48:39] you're gonna do stuff on debian packaging, right? Sync with paravoid since he might be in motion on that
[15:50:38] yeah just debian packaging, was just going to up the changelog number
[15:51:06] hm ok, i'll wait to hear from him before I push anything
[15:51:20] but ok, let's figure out the config namespace stuff now then
[15:51:32] we should get it to stableish before we build the official -1 version
[15:51:36] once we build those i'll put them in apt
[15:52:23] sounds good
[15:53:18] namespace: we've got the varnishlog related stuff on one side, and the varnishkafka own syslog/debug/tracing stuff on the other side (and rdkafka with its own "kafka." prefixed namespace)
[15:55:33] any reference to "log" will be ambiguous
[15:55:47] so 3 namespaces
[15:55:47] varnish
[15:55:47] varnishkafka
[15:55:47] kafka
[15:55:58] yeah
[15:57:28] so
[15:57:37] maybe we should just try to prefix everything with a namespace?
[15:57:48] log.data.copy is a varnishlog related config, right?
[15:57:49] so maybe
[15:57:56] varnishlog.data.copy? varnish.log.data.copy?
[15:58:19] q: for the kafka related configs
[15:58:26] they are 1:1 mapped with librdkafka configs, right?
[15:58:35] so i assume it'd be annoying to try to change them?
[15:59:39] yeah, we don't want to change the rdkafka ones, but we can prefix them with "kafka." without a problem.
[16:00:44] oh that's ok?
[16:00:45] hm
[16:01:02] varnish.kafka.sequence.number
[16:01:03] varnishkafka.sequence.number
[16:01:03] varnishkafka.log.level
[16:01:03] what prefix should we use for varnishkafka's stuff (syslog, et al.)? we don't want "varnishkafka." since it is likely to change name soon and it's a bit redundant to self-reference
[16:01:03] ?
[16:01:56] any interest in doing standup now?
[16:05:40] now!
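The three-namespace split discussed above boils down to a routing rule: "kafka."-prefixed keys map 1:1 onto librdkafka properties, "varnish."-prefixed keys cover the varnishlog side, and everything else belongs to varnishkafka itself. A sketch of that rule follows; varnishkafka itself is written in C, so this Java version is purely illustrative, and the prefixes reflect the proposal in the chat, not a final decision.

    // Sketch of the proposed key routing, not varnishkafka's actual parser.
    public enum ConfigNamespace {
        KAFKA,        // passed through 1:1 to librdkafka
        VARNISH,      // varnishlog-related settings, e.g. varnish.log.data.copy
        VARNISHKAFKA; // varnishkafka's own syslog/debug/tracing settings

        public static ConfigNamespace of(String key) {
            if (key.startsWith("kafka.")) {
                return KAFKA;
            }
            if (key.startsWith("varnish.")) {
                return VARNISH;
            }
            return VARNISHKAFKA;
        }
    }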
i could do now
[16:06:28] let's do it -- I'll jump on the hangout
[16:06:51] just me and you! ok
[16:08:16] we are in the hangout sir
[16:08:48] headphones
[16:13:43] Snaps: maybe we should figure out what the name is going to be?
[16:14:51] good start :)
[16:23:46] I need to friday-dinner my family. we must do this some other time :|
[16:25:42] oohhhhh ok
[16:25:52] ok maybe i'll work on the monitoring stuff today after all :)
[17:18:34] yoo qchris, i'm working on building kafka-ganglia now with 2.2.0 or whatever
[17:18:37] getting a different error now
[17:18:44] Failed to execute goal on project kafka-ganglia: Could not resolve dependencies for project com.criteo.kafka:kafka-ganglia:jar:1.0.0: The following artifacts could not be resolved: com.sun.jdmk:jmxtools:jar:1.2.1, com.sun.jmx:jmxri:jar:1.2.1: Could not find artifact com.sun.jdmk:jmxtools:jar:1.2.1 in nexus (http://nexus.wmflabs.org/nexus/content/groups/public)
[17:19:35] ottomata: That sounds like you need to update the log4j part.
[17:19:45] In the email I sent you, there was a patch attached
[17:19:58] it bumped log4j from 1.2.15 to 1.2.16 IIRC
[17:20:08] That should make the error disappear
[17:20:42] ahaha ok
[17:20:42] see it
[17:20:42] what about maven-compiler-plugin
[17:20:42] should I update that too?
[17:20:46] ja see that
[17:20:55] ok cool
[17:21:43] You can set the maven-compiler-plugin version, but it's just a warning :-)
[17:21:53] Now back to eating *nom nom* :-)
[17:22:04] AGH
[17:22:12] com.yammer.metrics.reporting.GangliaMessageBuilder is not public in com.yammer.metrics.reporting; cannot be accessed from outside package
[17:22:22] i extended this to make multicast work
[17:22:23] gaahhhh
[17:22:26] guess i can't
[17:45:06] * milimetric hates sealed / final classes
[18:14:56] average: please start checking flake8 before you submit patchsets
[18:15:31] because of the way we're using git now, I end up seeing your flake8 messages mixed with mine
[18:15:47] so it just makes it harder for me to clean as I go
[18:15:59] (especially because I can't just easily fix them and commit)
[18:18:25] depending on what you are trying to achieve ... for a hack it typically suffices to inherit a public class from the non-public class in the package the non-public class lives in.
[18:19:02] But the clean way is to get upstream to expose a sane API
[18:19:04] ottomata: ^
[18:19:17] (PS1) Milimetric: Bringing the UI up to speed with async validation [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/93081
[18:19:39] qchris, I don't want to try to get anything into upstream Metrics code
[18:19:44] because they are already on v 3.x
[18:19:50] and this is v 2.20 or something
[18:20:13] So if you're fine with hacks, the above should help to get around the visibility problem.
[18:20:26] uhh, but then i'd have to package my own Metrics code, right?
[18:20:34] in order to add the public class?
[18:20:45] i mean, if I go that far I could put the multicast class into Metrics
[18:20:51] but I'd really rather not bother i think
[18:21:00] :) this is the rabbit hole
[18:21:07] I agree, don't go down rabbit holes
[18:21:25] The public class in the package of the non-public class can live in our jar.
[18:21:31] It need not live in the upstream jar.
[18:21:43] oh i just give it a package name of the same thing
[18:21:44] milimetric: If I were you, I would completely stop reviewing everything on change 91207 until I finish the stuff I need to do. In fact I'll write that in the commit message so it will be more obvious.
[18:21:45] hm
[18:21:52] ottomata: exactly.
[18:21:56] It's not reviewing average
[18:21:57] does it have to be in the same dir hierarchy?
[18:22:06] as the package
[18:22:06] ?
[18:22:09] since my change is based on your change, according to the model you agreed to,
[18:22:12] I get all your changes
[18:22:13] com.yammer, bla bla
[18:22:16] ?
[18:22:23] ottomata: Yes. That's the package name.
[18:22:33] so your flake8 problems are trickled throughout unfortunately
[18:22:34] right, but then the dirs have to be com/yammer/blabla
[18:22:44] ottomata: Yes.
[18:22:47] hm ok
[18:22:53] Then don't use my change. There is a reason it's not merged yet, it's because it's not ready yet.
[18:22:54] I'm just setting a small, semi-hard requirement for when you commit
[18:22:57] ottomata: It's not nice. But that's what hacks are :-/
[18:23:11] If you do use a patchset from an unmerged change then you agree to cope with the problems of not-yet-ready changes.
[18:23:29] average - I get what you're saying. And I'm sort of sorry to do this, but I'm making this a rule
[18:23:39] please only commit once you have flake8 issues resolved
[18:23:47] if you forget once or twice it's fine
[18:23:54] Yet another rule in a long chain of rules. Thanks
[18:24:14] ??
[18:24:15] Put your rule in .travis.yml
[18:24:18] milimetric: Why do you impose on what average can upload to gerrit? This does not look sane.
[18:24:19] I'm surprised it works as well as it does
[18:24:20] I'm getting Super pissed here
[18:24:52] s/impose/impose restrictions/
[18:25:01] qchris: I am happy to just work on this alone if I have to
[18:25:10] alone != team.
[18:25:14] but before average ever joined the project, this was a rule
[18:25:23] and I explained this to him loosely at first
[18:25:29] and now I have to be stricter
[18:25:31] But average is not pushing to master.
[18:25:34] ...
[18:25:54] I'm sorry, I'm very confused here
[18:26:10] I was under the impression that I got to say what happens with wikimetrics since I wrote the damn thing
[18:26:40] and I'm being basically strong-armed into using a stupid gerrit workflow that kills my productivity
[18:26:56] on top of that, you guys are now arguing that I've just lost control over the quality of code that gets submitted for it
[18:26:59] milimetric: then don't use my change and push to master
[18:27:15] it's really *that* easy
[18:27:38] I agree with average. He said more than once that his change is not yet ready to be consumed.
[18:27:44] mark this moment - it is a break between me and the team
[18:27:49] nice talking to you guys
[18:27:49] later
[18:27:54] ?
[18:29:01] ok, so working in parallel on a card doesn't work
[18:29:03] nothing new here
[18:29:11] I encountered that in previous projects as well
[18:32:39] 20:26 < milimetric> on top of that, you guys are now arguing that I've just lost control over the quality of code that gets submitted for it
[18:32:56] that's completely the opposite of what I said. I said if a change on gerrit is not ready, by all means, do not use it.
[18:33:36] If you do use it, then you agree to cope with the problems it creates as a work-in-progress change.
[18:34:03] agh
[18:34:05] qchris
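For the GangliaMessageBuilder problem, the hack qchris describes works like this: declare a public subclass in the same com.yammer.metrics.reporting package, but ship it in our own jar, under the matching dirs com/yammer/metrics/reporting/. Package-private classes are visible within their own package, so the subclass compiles there, and its public type can then be used from anywhere. A sketch follows -- the constructor signature is a guess and must be made to match whatever the package-private superclass actually declares:

    // Ships in OUR jar, but is declared in the upstream package so that it
    // can see the package-private GangliaMessageBuilder. Hack only -- the
    // clean fix would be a public API upstream, but Metrics is already on
    // 3.x while this depends on 2.x, so that route is closed.
    package com.yammer.metrics.reporting;

    public class PublicGangliaMessageBuilder extends GangliaMessageBuilder {
        // Hypothetical constructor: mirror the real superclass constructor
        // arguments here, whatever they turn out to be.
        public PublicGangliaMessageBuilder(String hostname, int port) {
            super(hostname, port);
        }
    }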
Believe me those interactions had their challenges [18:34:23] but at the end of the day - I've never seen such crap [18:34:34] I've seen it before [18:34:44] interesting coincidence [18:34:55] I would not make that correlation so fast [18:34:59] these are common problems [18:35:15] which I solve with basic human civility [18:35:28] indeed [18:35:36] and some people apparently solve by pretending that subjective things are actually objective [18:40:57] ottomata: can I use udp2log to track requests to bits.wm.o/geoiplookup? [18:41:14] or do we only get log streams from the text caches? [18:42:26] we get everything [18:42:40] can you use sampled logs on stat1001? [18:42:44] or do you need unsampled? [18:42:51] no; sampled is good [18:42:58] and you have history :) [18:43:11] where are those files stored? [18:43:44] stat1001:/a/squid/archive/sampled-1000 i think [18:43:58] that's all webrequests sampled at 1/1000 rate [18:48:14] ottomata: on stat1 in /a I'm not seeing a squid directory; nor anything else that really looks promising [18:49:14] stat1001 [18:49:23] stat1002 :-) [18:49:24] oh; duh [18:49:26] doh [18:49:27] sorry [18:49:28] stat1002 [18:49:31] not stat1 or stat1001! [18:49:33] thanks qchris :[p [18:50:28] it appears I've never connected to either of the stat100[1-2] hosts before; can I tunnel through stat1; or do you have a different bastion? [18:50:56] sorry guys for missing scrum, broken wifi at airport and now i am about to board my plan, have an awesome weekend! [18:51:44] mwalker: I use bast1001.wikimedia.org as bastion. [18:52:04] drdee: Enjoy your weekend! [18:52:24] thanks :) [18:54:00] awesome; found the files! [18:54:05] * mwalker puts on splunking hat [18:54:07] mwalker: \o/ [18:54:41] 16 cores! this is a beastly box [18:58:38] ah great you are in :) [19:04:14] hmm; it looks like the signal I need is too small or something -- I'm not getting any hits [19:04:59] I am wanting to compare the ratio of IPv6 hits between geoiplookup.wm.o and bits.wm.o/geoiplookup in order to determine if the IPv6 lookup is actually useful [19:05:09] or if we're falling back to IPv4 for the majority of hits [19:06:24] hm. [19:06:50] mwalker, do you just need a short scoop of data, or do you need to collect this regularly? [19:07:01] just need to do this once [19:07:28] paravoid was asking; and we're considering allowing IPv6 to the fundraising cluster; so we need to know if we can trust the IPv6 lookup [19:07:40] or if we need to take other steps, like asking the user, what country they're in [19:08:01] I can capture some for you [19:08:53] alternatively; I believe y'all geocode some other log streams? [19:09:20] if I can isolate out the IPv6 requests from that; I can get the information I need from the countries [19:09:54] hm [19:10:18] there isn't anything else that specificlally filters for those domains [19:10:24] so theyd' all be limited [19:10:48] if you can get me a udp-filter or grep or whatever command to use to filter for what you want to see [19:11:02] i can capture some unsampled logs for you for a few minutes [19:11:10] kk [19:11:23] let me see if I can extract the data I need from the current banner logs [19:11:36] if I see IPv6 addresses from clients in them [19:14:05] ok [20:00:51] hey ottomata [20:01:00] hiya [20:01:02] do we have any sample logs from the mobile varnishes? 
[20:01:09] sure ja
[20:01:14] I'd like to have mobile start speccing their reports
[20:01:18] all of the mobile logs we have are from mobile varnishes
[20:01:21] and they are sampled :)
[20:01:44] ah -- will the data be the same?
[20:01:49] stat1002:/a/squid/archive/mobile
[20:02:01] the same?
[20:02:16] the fields, etc
[20:02:32] can you hangout?
[20:03:15] ja
[20:03:17] one sec
[20:03:33] kk
[20:04:15] ah poo i think i left my headphones at my friend's place
[20:04:20] gonna run outside and talk to you via regular mic
[20:04:23] but then am running home for interview
[20:04:32] ok -- I'll be quick
[21:30:24] ottomata: you still around?
[21:31:10] interview.
[21:32:35] *nods* if you would give me a ping when you're done I'd appreciate it
[22:34:22] On the analytics db slaves, the most recent recentchanges entries for {ja,fr,ru}wiki (whole s6) come with a 2013-10-28 timestamp (i.e.: 4 days old). (s1, s2, ... seem fine)
[22:35:01] Do we know about this problem or is it just replication lagging?
[22:35:19] ottomata: ^
[22:37:48] qchris, not sure
[22:37:51] if you are logged into them
[22:37:54] can you see any lag?
[22:37:56] run:
[22:37:57] show slave status;
[22:38:06] I tried that, but lack permission
[22:38:11] ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER,REPLICATION CLIENT privilege(s) for this operation
[22:39:56] hm
[22:39:57] k
[22:40:20] oof, i've never even logged into these :p
[22:40:23] what's a hostname?
[22:41:23] s6-analytics-slave.eqiad.wmnet
[22:45:12] qchris
[22:45:13] yup
[22:45:13] Seconds_Behind_Master: 22292
[22:45:14] lag
[22:45:19] Ha!
[22:45:28] 22268 | Updating | UPDATE /* HTMLCacheUpdateJob::invalidateTitles 127.0.0.1 */
[22:45:30] Go slave! Get in shape.
[22:45:31] has been running for a long time
[22:45:52] hmm
[22:45:58] wait no that's the slave process :p
[22:46:04] ah
[22:46:04] ha
[22:46:08] halfak:
[22:46:14] Time: 698432
[22:46:20] Hey. What's up?
[22:46:22] SELECT
[22:46:22] COUNT(DISTINCT linked_page.page_id) AS regular_inlinks ...
[22:46:43] yeah. That one's taking forever. Most pages take less than 1 second.
[22:46:46] oh, qchris that was s1
[22:46:47] Is it blocking something?
[22:46:52] you're saying s6 is bad?
[22:46:58] dunno, i'm just looking :)
[22:46:58] ottomata: Yes.
[22:47:29] ah, slave not running on s6
[22:49:02] dunno qchris, might have to ping someone in ops or create an rt ticket
[22:49:15] Ok. Thanks for having a look.
[22:49:16] i could try to start the slave, but I'd rather not, as I don't know if someone else might have done this on purpose
[22:49:33] mwalker
[22:49:34] back.
[22:49:34] No problem.
[22:49:39] gonna head out soon, but what's up?
[22:51:47] I was going to ask you to run udp-filter -p geoiplookup.wikimedia.org for a couple of minutes
[22:52:15] yup yup
[22:52:15] can do
[22:52:34] *thumbsup*
[22:53:26] wait what is geoiplookup.wikimedia.org?
[22:53:40] do you know what kind of http server this is?
[22:54:05] ok this is varnish?
[22:55:58] mwalker: i don't really see any logs coming through for that domain
[22:56:05] i'm grepping for just plain ol' 'geoiplookup' now
[22:56:15] and I see plenty of https://bits.wikimedia.org/geoiplookup hits
[22:56:21] right; that's sort of expected
[22:56:22] hadoop/hive trivia quiz; what would I do to sum values?
[22:56:31] sum(pageviews), say?
[22:56:42] we forward people to geoiplookup.wm.o if bits returns a null result
[22:56:58] hmmm ok,
[22:57:11] not sure why udp-filter isn't working for this
[22:57:12] Well, that's a first
[22:57:22] but i do see reqs when i do grep geoiplookup.wikimedia.org
[22:57:25] want me to just grab those?
[22:57:34] sure
[22:58:06] (and you might have seen me hitting that host just to make sure I sent you the right one :p)
[22:58:09] k, i'll let this run for 2 mins?
[22:58:14] sounds good
[22:58:49] how badly is it pegging the box you're on?
[23:17:21] haha, oops, ran downstairs to grab food
[23:17:28] it's been running for 20 minutes now!
[23:17:29] amazing!
[23:18:47] mwalker, stat1002:/home/mwalker/geoiplookup.wikimedia.org.udp2log.tsv
[23:19:33] tasty tasty data
[23:19:35] thanks :)
[23:20:50] ottomata: this wasn't sampled at all?
[23:21:46] there's only 344 entries in it -- which is awesome if it's unsampled data
[23:22:02] because it means that we can, in fact, trust ipv6 lookups!
[23:22:15] *in general
[23:22:56] DarTar: I'm a little confused by your email as I haven't used user tagging before. Can you generate a table/datafile of user_ids for me to work with?
[23:23:30] halfak: sure
[23:23:42] unsampled
[23:23:50] mwalker
[23:23:59] yeah there's not much traffic with that domain
[23:24:17] awesome; thanks
[23:24:22] ook, i'm outty, laters all!
[23:25:28] DarTar: I've got to run now, but I'll plan to have my stats run before you're in the office on Monday.
[23:25:33] have a good weekend all. :)
[23:28:39] sweet
[23:28:50] mail for you with the tables you requested
[23:29:01] have a great weekend halfak
[23:29:12] <3
[23:40:43] ottomata, ping
[23:40:47] darnit
[23:40:51] milimetric?
[23:40:58] hey :)
[23:41:25] where are you Oliver?
[23:41:28] UK or SF?
[23:41:29] I always forget
[23:41:40] Ironholds: ^
[23:41:54] milimetric, US :)
[23:42:10] question; what format do page titles in the hadoop pagecount table take?
[23:43:12] they *should* be like normal titles
[23:43:17] so, foo_bar?
[23:43:18] but there is a *lot* of junk in there
[23:43:19] yea
[23:43:23] foo_bar
[23:43:30] kk
[23:43:33] hmn, and LIMIT isn't working. odd.
[23:43:46] how do you mean about limit?
[23:44:30] so, in the absence of documentation on format I used SELECT * FROM pagecounts LIMIT 1; as a way of getting the same information, which generated FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "pagecounts" Table "pagecounts"
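Error 10041 here means Hive (in strict mode) refuses to scan a partitioned table unless the WHERE clause pins down specific partitions; LIMIT 1 does not exempt the query from that requirement. Adding a partition predicate fixes it, and the same shape answers the earlier sum(pageviews) question. In the sketch below the partition columns (year/month/day) and the page_title column are assumptions, not confirmed against the real table; DESCRIBE pagecounts would show the actual names.

    -- Hedged sketch: partition and title column names are assumed; adjust
    -- to match what DESCRIBE pagecounts reports.
    SELECT page_title, SUM(pageviews) AS total_views
    FROM pagecounts
    WHERE year = 2013 AND month = 11 AND day = 1
    GROUP BY page_title
    LIMIT 10;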