[00:06:31] nuria, can I help you with something?
[00:09:36] mforns: I think wikimetrics is doing good
[00:10:52] mforns: so no need to do anything, i am doing tasks that i had on my queue here and there.
[00:11:01] nuria, ok, I'm heading to the airport then
[00:11:21] mforns: good, see you in the internets in a couple days
[00:11:36] ottomata: ping?
[00:11:38] ok... bye nuria! see you
[00:11:38] i'm around now
[00:14:00] ori yoo hooo
[00:14:07] you on the busy side?
[00:14:12] i am coming searching :)
[00:15:16] (PS1) QChris: Set a proper timeout for refining jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/187293
[00:15:42] ottomata ... before you go searching ... mind a quick review ^
[00:19:03] mhmmm ... nuria ... you're doing oozie work too :-)
[00:19:14] Wanna help me with a quick review? ^
[00:19:29] qchris: suree
[00:19:39] thanks
[00:20:24] qchris: gerrit or hangout?
[00:20:44] gerrit :-)
[00:20:46] https://gerrit.wikimedia.org/r/187293
[00:25:11] Quarry: Quarry does not respect ORDER BY sort order in result set - https://phabricator.wikimedia.org/T87829#1000128 (MahmoudHashemi) NEW a:yuvipanda
[00:28:03] qchris: looking
[00:28:34] (CR) Ottomata: [C: 2 V: 2] Set a proper timeout for refining jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/187293 (owner: QChris)
[00:28:48] oh i just merged it :/
[00:28:50] :D
[00:28:59] ottomata, nuria: Thanks :-D
[00:29:28] ottomata is a merging-ninjaaaaa
[00:30:05] nuria, i'll merge this if you fix qchris' comments :)
[00:30:05] https://gerrit.wikimedia.org/r/#/c/187011/1/modules/eventlogging/manifests/monitoring.pp
[00:30:55] ottomata: will do right now
[00:36:31] ottomata: done, comments corrected.
[01:27:47] Analytics-Engineering, Analytics-Cluster: Researchers have page_id in X-Analytics field of webrequest logs - https://phabricator.wikimedia.org/T77416#828237 (Ottomata) This is going to happen! I chatted with Ori about this today. The extension will not work in all cases due to unfortunate output buffering r...
[12:15:28] (PS8) Ananthrk: Added Geocode core class that uses Maxmind-V2 Added GeocodedCountryUDF to get country code from IP address Added GeocodedDataUDF to get Geo data from IP address Added ClientIpUDF to determine client IP address given source IP and XFF header Added new depe [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551
[14:50:59] no kevinator, no milimetric, no nuria, no mforns ... I guess that means another day without daily scrum.
[15:00:52] oh..i was about to ask whether we are meeting today
[15:01:18] ananthrk: I thought that today we'd meet again ... but it's only the two of us.
[15:01:39] I guess we'll just skip it. What do you think?
[15:03:19] okay...i guess we have no other choice today
[15:04:01] ananthrk: It seems nuria just joined.
[15:04:11] And she told me that milimetric is around too.
[15:04:17] So I guess we meet.
[15:05:22] nuria: In which hangout? (The batcave is showing as empty for me)
[15:05:29] oh..great
[15:05:36] what?
[15:05:43] we are both here...
[15:27:06] nuria: The problem seems to be the "Webrequests" class being referred to in org.wikimedia.analytics.refinery.hive.IsCrawlerUDF. In refinery-core, the class is called "Webrequest", but the UDF class refers to it as "Webrequests" (with an 's' at the end)
[15:30:01] nuria_: The problem seems to be the "Webrequests" class being referred to in org.wikimedia.analytics.refinery.hive.IsCrawlerUDF. In refinery-core, the class is called "Webrequest", but the UDF class refers to it as "Webrequests" (with an 's' at the end)
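(For readers skimming the log: the mismatch described above is a plain class-name typo in the Hive layer of analytics/refinery/source. A minimal sketch of what the corrected reference looks like follows; only the package and class names come from the discussion, while the evaluate() signature and the isCrawler() helper are assumptions for illustration. The actual fix shows up later in the log as Gerrit change 187424.)

```java
// Minimal sketch of the corrected reference -- not the actual refinery code.
// Only the package and class names (refinery-core's Webrequest, the hive
// package's IsCrawlerUDF) come from the discussion; the evaluate() signature
// and the isCrawler() helper are assumptions for illustration.
package org.wikimedia.analytics.refinery.hive;

import org.apache.hadoop.hive.ql.exec.UDF;
// The broken version referred to "Webrequests" (with an 's'), a class that
// does not exist in refinery-core, so the reference cannot be resolved.
import org.wikimedia.analytics.refinery.core.Webrequest;

public class IsCrawlerUDF extends UDF {
    public boolean evaluate(String userAgent) {
        // Delegate to refinery-core's Webrequest class (singular).
        return Webrequest.isCrawler(userAgent);
    }
}
```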
[15:35:44] kevinator: hi, we're in the batcave
[15:35:48] operations, ops-core, Analytics: Deprecate HTTPS udp2log stream? - https://phabricator.wikimedia.org/T86656#1000896 (QChris) >>! In T86656#999773, @Ottomata wrote: > The data is still being backfilled in hadoop. Done.
[17:02:56] qchris: mornin!
[17:03:11] Gooooood morning sir ottomata!
[17:03:33] * qchris feels the smell of decommissioned webstatscollector in the air today!
[17:03:36] i came downstairs at the hotel this morning, and an excited brandon told me the backfilling was done!
[17:03:37] :)
[17:03:43] data looks good?
[17:03:50] i can tell him that turning off nginx udp2log is ok?
[17:04:13] only if he turns on SPDY
[17:04:14] also, should I remove those *0001 files?
[17:04:16] haha
[17:04:29] tit for tat!
[17:05:02] bblack says that is the eventual plan!
[17:05:09] Oh. nginx has priority :-D Let me think things through once again.
[17:06:27] ottomata: Green light from my side.
[17:08:45] awesoooOMe
[17:09:17] operations, ops-core, Analytics: Deprecate HTTPS udp2log stream? - https://phabricator.wikimedia.org/T86656#1000993 (Ottomata) Looking GOOD! Brandon, you may turn off nginx udp2log! :)
[17:09:26] qchris: i should remove the *0001 files from dumps, ja?
[17:09:35] i have them backed up
[17:09:44] Yes, please remove them :-)
[17:10:00] done
[17:10:21] ottomata: btw. the front-page of the pagecounts-raw still shows the old page.
[17:11:19] Could you run the script by hand?
[17:11:27] really?
[17:11:29] hm
[17:11:31] checking
[17:11:39] http://dumps.wikimedia.org/other/pagecounts-raw/
[17:11:58] It's just the front page at that url.
[17:12:04] The month subpages are ok.
[17:12:04] it shouldn't, because puppet runs all of the scripts at once
[17:12:07] hm ok
[17:12:47] I guess it'll be run automatically on 2016-01-01, with the change of the year.
[17:13:59] ah, yeah, i think if index.html exists and that, it doesn't run
[17:14:03] i'll remove index and rerun
[17:14:14] Cool. Thanks.
[17:14:27] there we go
[17:15:07] \o/
[17:15:09] Thanks.
[17:15:18] ok, heading to the office, back in a bit
[17:15:30] k. Thanks again.
[17:28:35] Analytics-Engineering: Calculate per-wiki Edit Completion Rate for Visual Editor edits - https://phabricator.wikimedia.org/T87865#1001019 (Milimetric) NEW a:Milimetric
[17:35:30] Analytics-EventLogging: Multiple user_ids per username in account creation events from ServerSideAccountCreation log - https://phabricator.wikimedia.org/T68101#1001041 (Milimetric) Thanks Chris. No rush, we were just wondering what the timeline looked like.
[17:43:14] operations, ops-core, Analytics: Deprecate HTTPS udp2log stream? - https://phabricator.wikimedia.org/T86656#1001069 (BBlack) nginx udp2log is off and nginx configs have been reloaded: https://gerrit.wikimedia.org/r/#/c/186257/
[17:44:05] (PS1) QChris: Prepare webrequest dump for addition of refined data [analytics/refinery] - https://gerrit.wikimedia.org/r/187416
[17:44:07] (PS1) QChris: Add refined tables to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187417
[17:44:09] (PS1) QChris: Add pagecounts-all-sites to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187418
[17:44:11] (PS1) QChris: Add pagecounts-raw to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187419
[17:44:13] (PS1) QChris: Separate different datasets better in webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187420
[17:54:20] (PS1) Ottomata: Fix bad symbol 'Webrequests' in IsCrawlerUDF.java [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187424
[17:54:41] (CR) Ottomata: [C: 2 V: 2] Fix bad symbol 'Webrequests' in IsCrawlerUDF.java [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187424 (owner: Ottomata)
[17:59:52] operations, ops-core, Analytics: Deprecate HTTPS udp2log stream? - https://phabricator.wikimedia.org/T86656#1001093 (faidon) Open>Resolved \o/
[18:03:42] kevinator: are we blocking 30 mins between 10.30 and 11 and inviting toby?
[18:03:55] he’s in other meetings
[18:04:08] ok… Toby is at the Collab QR so he’s up there right now
[18:04:37] I’ll text him, we need to wrap up his section
[18:19:41] Analytics-Engineering, operations: Decommission webstatscollector - https://phabricator.wikimedia.org/T87868#1001126 (Ottomata) NEW a:Ottomata
[18:42:15] ottomata: I like those webstatscollector changes that fly by :-)
[18:42:42] If gerrit had a "Thank you" button ... ;-)
[18:43:23] that's not a bad idea
[18:51:36] Analytics-Engineering, operations: Decommission webstatscollector - https://phabricator.wikimedia.org/T87868#1001260 (Ottomata) Open>Resolved
[18:51:53] * qchris dances
[18:52:21] Awesome ottomata!
[19:04:54] :D
[19:32:21] Analytics-EventLogging: Convert EventLogging to use extension registration - https://phabricator.wikimedia.org/T87912#1001539 (Legoktm)
[19:41:01] Analytics-EventLogging: Convert EventLogging to use extension registration - https://phabricator.wikimedia.org/T87912#1002033 (Legoktm)
[19:55:35] Analytics-Engineering, Analytics-EventLogging: Convert EventLogging to use extension registration - https://phabricator.wikimedia.org/T87912#1002116 (kevinator)
[20:00:54] for realz -- congrats folks
[20:01:04] (on the WSC deco)
[20:01:07] decom
[20:45:05] qchris: the X-Analytics initial commit patch should be ready to go, from my perspective
[20:45:41] I did not check since back when you said you addressed the MobileFrontend comment.
[20:46:11] yeah, it's resolved
[20:46:21] Argh ... gerrit is extra slow for me today :-(
[20:47:37] Finally ... there it is: https://gerrit.wikimedia.org/r/157841
[20:48:17] No CR-1 left from me. Feel free to merge at will.
[20:48:39] qchris: could you?
[20:49:26] (I tested it with ottomata yesterday, he can vouch :P)
[20:49:30] I didn't review/test the code.
[20:49:36] ottomata|meeting: Wanna merge some code?
[20:49:59] Ah ... I see "meeting"
[20:51:33] ori: how did the concerns about ob_flush-ing work out? Did those places get fixed in core?
[20:51:38] (I did not follow core recently)
[20:52:41] qchris: ottomata explained that the thing he's most keenly interested in is page_id for page views. The headers for page views are handled in one place, so ottomata is OK with other requests potentially going "unheadered", like 404s and ResourceLoader module requests.
[20:53:33] ori: I am sorry. It really seems ottomata should CR+2 then.
[20:53:54] OK, no problem.
[20:53:54] I basically do not understand all the interplay with core here.
[20:54:08] Sorry :-(
[20:54:42] it's ok, you will answer for it in the afterlife
[20:55:18] (joking)
[20:56:01] I AM NOT AFRAID OF SPAGHETTI MONSTERS CHASING ME!
[20:56:14] * qchris arms himself with a fork and a starving belly.
[20:56:38] (So much for afterlife)
[20:56:39] heh
[20:57:06] all hail the noodly appendage
[20:57:48] * qchris wonders who has "spaghetti monster" set as stalk word in his IRC client.
[21:44:49] jsahleen: fyi I looked into the query and there's a clear problem with generate.py where it just overwrites all your old data
[21:45:26] milimetric: Your end or ours?
[21:46:07] The numbers are correct now, but the graph is completely flat.
[21:46:34] jsahleen: our end
[21:46:39] it's just ... broken
[21:46:59] Ah. Well, that makes me feel a bit better. I couldn't see anything wrong with the query. Sorry it's broken.
[21:47:23] I have a couple of ideas why, but basically it just doesn't handle this type of query well - where we don't pass in the date but still want daily data
[21:47:30] sadly, can't work on it for a bit
[21:47:45] Not a big issue. Quarterly review is over. :)
[21:48:29] So, whenever you get time is fine.
[21:49:04] jsahleen: meantime I'm just running the query manually in a cron
[21:49:18] you can, btw, feel free to do the same thing and copy the results into the datasets directory
[21:49:31] then you can feed the graph from that instead of from the broken generate.py output
[21:49:32] milimetric: which repo?
[21:49:38] no repos here, wild west
[21:49:42] crap
[21:50:07] jsahleen: I put it in /home/milimetric/daily-content-translation-beta
[21:50:07] Ok. We are still "discussing" our metrics, so that needs to be settled first.
[21:50:12] yeah, good
[21:50:32] but just fyi, you don't have to chain yourself to this broken data collector
[21:50:42] since Limn is pretty easy to make data for
[21:51:16] The team is talking about updating our own stats page that is part of the Content Translation extension. Bypassing the whole limn thing.
[21:51:40] But again, still "discussing."
[21:52:00] that's ok with me, and we can ping you when we fix the pipeline
[21:52:05] Sounds good.
[21:52:20] You have other fish to fry
[21:52:49] Big fish
[21:55:39] warning: All fish in retrospect are bigger than they appear
[21:55:53] :)
[22:36:38] (PS9) Ottomata: Added Geocode core class that uses Maxmind-V2 Added GeocodedCountryUDF to get country code from IP address Added GeocodedDataUDF to get Geo data from IP address Added ClientIpUDF to determine client IP address given source IP and XFF header Added new depe [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[22:40:12] (CR) Ottomata: [C: 2 V: 2] Prepare webrequest dump for addition of refined data [analytics/refinery] - https://gerrit.wikimedia.org/r/187416 (owner: QChris)
[22:41:06] (CR) Ottomata: [C: 2 V: 2] Add refined tables to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187417 (owner: QChris)
[22:42:09] (CR) Ottomata: [C: 2 V: 2] Add pagecounts-all-sites to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187418 (owner: QChris)
[22:42:29] (CR) Ottomata: [C: 2 V: 2] Add pagecounts-raw to webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187419 (owner: QChris)
[22:42:48] (CR) Ottomata: [C: 2 V: 2] Separate different datasets better in webrequest dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/187420 (owner: QChris)
[22:43:07] ottomata: Thanks for the merges.
[22:43:16] yessuhhh!
[22:43:24] I am not sure ... should I update the cron that sends out emails to you and jg-age too?
[22:43:32] sure!
[22:43:42] well, hm.
[22:43:46] Do you want "--datasets all" ?
[22:43:48] for refined, yes
[22:43:49] hm
[22:43:53] (That's very long lines)
[22:43:53] not sure i want pagecounts* ones
[22:44:02] i want refined and raw
[22:44:07] that'll be good enough for report i think
[22:44:10] k
[22:45:46] (CR) Ottomata: "Hm, still has IpUtil, etc." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[23:28:06] Analytics, operations: Fix Varnishkafka delivery error icinga warning - https://phabricator.wikimedia.org/T76342#1002702 (Ottomata) Open>Resolved
[23:41:10] Analytics-Cluster: Create another done-flag level for webrequest_*_raw oozie datasets that indicates the partition is mostly good - https://phabricator.wikimedia.org/T87060#1002734 (Ottomata) Open>Resolved
[23:42:06] Analytics-Cluster: Raw webrequest partitions that were not marked successful due to only esams caches causing unknown problems - https://phabricator.wikimedia.org/T74809#1002737 (Ottomata)
[23:42:07] Analytics-Cluster: Raw webrequest partitions for 2014-12-07T20/2H not marked successful - https://phabricator.wikimedia.org/T77024#1002736 (Ottomata) Open>declined
[23:53:06] Analytics-Wikimetrics: Story: WikimetricsUser searches for cohort (filters) using tag name - https://phabricator.wikimedia.org/T75071#1002803 (Capt_Swing) Potential issue: what if I have a tag and a cohort with the same name? What will be returned by my search?
[23:58:55] holaaaaa
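(For readers not following Gerrit change 183551, the Maxmind-V2 geocoding patch reviewed above: the lookup that a GeocodedCountryUDF-style class wraps is roughly the sketch below. Only the GeoIP2 library calls are the real API; the class name, database path, and the "--" fallback are assumptions for illustration, not the actual refinery code.)

```java
// Minimal sketch of a Maxmind GeoIP2 ("Maxmind-V2") country lookup, the kind of
// call the GeocodedCountryUDF in Gerrit change 183551 wraps. Not the actual
// refinery code; everything except the GeoIP2 API itself is an assumption.
import com.maxmind.geoip2.DatabaseReader;
import com.maxmind.geoip2.exception.GeoIp2Exception;
import com.maxmind.geoip2.model.CountryResponse;

import java.io.File;
import java.io.IOException;
import java.net.InetAddress;

public class CountryLookupSketch {
    private final DatabaseReader reader;

    public CountryLookupSketch(File countryDatabase) throws IOException {
        // countryDatabase is a GeoIP2/GeoLite2 Country .mmdb file (path assumed).
        this.reader = new DatabaseReader.Builder(countryDatabase).build();
    }

    /** Returns the ISO country code for an IP, or "--" if it cannot be resolved. */
    public String countryCode(String ip) {
        try {
            CountryResponse response = reader.country(InetAddress.getByName(ip));
            return response.getCountry().getIsoCode();
        } catch (IOException | GeoIp2Exception e) {
            // Unknown host, address not in the database, etc.
            return "--";
        }
    }
}
```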