[00:00:09] mforns: no, ummm the 2015 beirut bombings was only created on Nov 12 [00:00:48] i probably would expect the garissa university page to plot all the data, and only 4 days for the beirut bombings [00:00:59] i dont know why that's not happening though [00:01:08] do you think since 2105-11-15, no-one visited it? may be... [00:01:25] ummm what? [00:01:40] hehe [00:01:50] i have to assume at least the person who created it looked at it once :P [00:02:36] mforns: nov 12th was when the Beirut page was created, since then for 3 days it seems to have views [00:02:53] the other page was created earlier though, and should have data for all of october [00:03:08] madhuvishy, yes, it's weird though that after 3 days it goes from 30000 view to 0 in one day [00:03:10] what i'm saying is i cannot reproduce the bug exactly as it was reported [00:03:30] aha [00:03:30] mforns: oh i don't see that [00:03:56] mforns: i was able to reproduce it 5 minutes back [00:04:02] but now it behaves differently [00:04:04] madhuvishy, if you add the garissa university college attack page [00:04:05] dont know [00:04:25] and select the full month of october, you'll see it I think [00:05:40] i did [00:05:50] madhuvishy, mmmm that's a nasty bug [00:06:08] because the pageviews that were originally for 12 november [00:06:24] after adding the second article, are plotted at 1st october [00:06:29] yeah [00:06:33] that's the reported bug [00:06:41] all i can see now is http://i.imgur.com/wfD5vLK.png [00:06:45] dont know why [00:06:52] this is not correct either [00:07:05] mforns: ^ [00:07:42] madhuvishy, I think this is a data computation bug, because the article has edits for the 16th, so it should have views as well [00:08:10] but there is another bug, a visualization one, originated because of the caveat in the API [00:08:15] yeah [00:08:24] that 0 values are returned as 'empty' [00:08:44] pfffff [00:08:48] something [00:09:24] i'm very confused :P i do know that the bug that's reported is something on the viz side though - the november data points are plotted in october [00:09:33] why i cant reproduce it i dont know [00:10:13] madhuvishy, try to rechoose the date range [00:10:20] just open the selector and click ok [00:11:22] mforns: oof [00:11:26] ya [00:11:28] i see it now [00:11:31] ok [00:11:36] that behaves differently than [00:11:46] deliberately choosing the date range [00:11:49] huh [00:11:55] yea... weird... bad [00:13:05] madhuvishy, I think there are 3 bugs: 1) data computation (missing pageviews) 2) data holes crashing linechart 3) chart.js bug in chart update [00:13:19] yup [00:13:44] mforns: you should sleep though! this can wait [00:14:02] i'll add some notes to the ticket [00:14:05] I guess solving 2 can solve 3 [00:15:00] yes, I didn't want to add that code to the sample, but it'll be necessary [00:15:22] hmm [00:15:28] guys it's still the 16th, so the data's just not ready yet :) [00:15:38] it'll only be available on the 17th [00:15:56] and then it has to wait like 2-3 hours after the 16th is over to make it through the pipeline [00:16:04] milimetric, of course! 
so, no bug in computing data [00:16:07] milimetric: true [00:16:22] the viz bug looks to me like the dates are not aligning when adding two pages that don't both have data for the full range [00:16:24] filling in gaps should do the trick [00:16:28] right [00:16:37] mforns: go to sleep, I'll do that later tonight [00:16:39] I'll do that [00:16:46] you guys are too quick to blame yourselves [00:17:15] I'll do that in a sec, and you review it milimetric :] [00:20:54] :) it's a demo, hopefully nobody's feeling too bad about it not being perfect. As a matter of fact, I'd argue it should be imperfect to motivate others to make a production version [00:21:07] it's fun to say - hey! maybe I can do better than these WMF people :) [00:21:34] yea :] for sure [00:22:14] milimetric: yup! but the bug was reported like - we have missing data :) people who aren't checking the actual api think that the demo is the reflection of absolute truth [00:22:33] Analytics-Backlog: Weirdnesses in top_articles - https://phabricator.wikimedia.org/T117343#1809615 (Tbayer) See also the top 200 list at T117945 : These three pages are in the list of most viewed pages in `pageview_hourly` even when restricting it the list to `agent_type = "user"`. [00:23:30] ah! I just saw the pageview api announcement. congratulations, mforns, nuria, madhuvishy, joal, milimetric, and ottomata. :-) [00:23:34] This is sooo awesome. [00:23:37] leila: :D [00:23:40] :] [00:23:57] no, seriously. :D [00:23:59] madhuvishy, mforns , milimetric : a TON of bugs filed from people regarding pageview APi will be awesome, that is a sign of success [00:24:21] did nuria just say I'm a bug? :D [00:24:24] nuria, yes sure! [00:24:25] let's sit back and not try to fix everything too soon until we have gotten some traction [00:24:27] leila: he he [00:24:52] this really calls for an in-person celebration. :-) [00:24:57] enjoy it, y'all. [00:25:02] mforns, milimetric , madhuvishy : let's bake things for a few days [00:25:16] leila: just fyi, we have this awesome thing where you can use a-team to ping us all here! just in case it's tiring to type all names :) [00:25:20] leila: old fashion food & drinks event i would say [00:25:30] yeah! [00:25:41] nuria: totally. I'm getting ready. [00:25:43] woohoo [00:25:50] thx leila :D [00:25:55] madhuvishy: good that you said it. I was looking on the staff page to make sure I'm not missing anyone. :D [00:25:59] nuria: I agree, let's bake cupcakes!! [00:26:03] leila: :D [00:38:13] I can bring cupcakes at the API talk at the mediawiki dev summit :-) [00:38:36] ::nom nom:: CUPCAKES :) [00:38:49] :P [00:40:25] https://upload.wikimedia.org/wikipedia/commons/0/09/Bridal_Boudoir_Affair_%284280075406%29.jpg [00:43:31] * milimetric turns into a cartoon wolf and drops jaw [00:47:46] :] [01:08:35] milimetric, I finished a patch for the fill-in of zero values [01:08:47] do you think I put it in the gist? [01:09:15] you can put it anywhere, but I have changes to the title and text and comments and stuff on the deployed one [01:09:21] also I switched the cdnjs etc. [01:09:33] so if you put it in your gist I'll wget it and do the diff and apply that way [01:10:51] sorry, mforns ^ [01:11:17] milimetric, I'd like to have your comments and changes in the gist [01:11:37] but the cdnjs should be kept in it [01:11:46] right... [01:11:50] I'll update both [01:12:03] with your and my latest changes [01:12:07] ok. 
you can download the deployed file and update that, then push it to both places [01:12:16] ok [01:12:57] oh then the cdnjs stuff will be overwritten so I guess copy that over [01:13:12] aha [01:19:14] milimetric, I added nulls instead of 0s [01:19:37] so that the chart won't infere false 0 [01:19:56] do you want to review it before deploying? [01:20:17] eh... easier to just deploy and fix :) [01:20:26] ok [01:23:01] milimetric, it's on air [01:25:28] sweet, looking [01:27:09] looks good mforns. Also, there's a disturbing spike in Cat pageviews on October 2012. Weiird :) [01:27:20] *October 12th [01:27:27] milimetric, I know xD [01:27:28] I need sleep... geez [01:27:39] ok, see you tomorrow, I'm going too [01:27:42] bye! [01:27:42] nite@! [01:27:44] thx [01:28:00] Analytics-Backlog: Missing Pageview API data for one article - https://phabricator.wikimedia.org/T118785#1809867 (mforns) Thanks @Ainali and @madhuvishy for spotting this bug. The problem was not missing data, but formatting of the timeseries: a problem caused by one of the caveats of the API. It should be fi... [01:28:23] Analytics-Kanban: Missing Pageview API data for one article - https://phabricator.wikimedia.org/T118785#1809870 (mforns) p:Triage>High a:mforns [01:29:03] Analytics-Kanban: Missing Pageview API data for one article - https://phabricator.wikimedia.org/T118785#1809878 (Milimetric) (the caveat is that the API does not return data if there are no pageviews for a particular timestamp or date. That means you'll have to fill in with nulls depending on how your charti... [01:35:09] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1809895 (Milimetric) Ocaasi, whereas what you're proposing would be tricky with the *current* API endpoints, I could see a new endpoint that would look like: p... [01:36:32] Analytics-Kanban, RESTBase, Services, RESTBase-API: configure RESTBase pageview proxy to Analytics' cluster {slug} [34 pts] - https://phabricator.wikimedia.org/T114830#1809906 (Milimetric) I think we should open up a new task for that, in the sake of marking that some progress was made. But I'll le... [02:07:50] PROBLEM - Check status of defined EventLogging jobs on eventlog2001 is CRITICAL: CRITICAL: Stopped EventLogging jobs: consumer/server-side-events-log consumer/mysql-m4-master consumer/client-side-events-log consumer/all-events-log processor/server-side-0 processor/client-side-0 forwarder/server-side-raw forwarder/legacy-zmq [04:25:03] Analytics, Services: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810033 (MZMcBride) NEW [04:59:45] *sigh* again: "ERROR 2013 (HY000): Lost connection to MySQL server during query" while querying via research@analytics-store.eqiad.wmnet from stat1003 [05:19:49] Analytics, Services: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810052 (GWicke) [06:29:29] Analytics-Kanban: Missing Pageview API data for one article - https://phabricator.wikimedia.org/T118785#1810104 (Ainali) >>! In T118785#1809549, @madhuvishy wrote: > @Ainali can you check if you are still seeing this bug? No, it has been fixed. 
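The caveat milimetric notes on T118785 above (the API simply omits timestamps that have no pageviews) is why the chart needs gap-filling before plotting. mforns's actual patch was to the JavaScript demo; the idea itself is language-agnostic, so here is only a rough sketch in Python against the per-article response shape (an `items` list with `timestamp` and `views`). Missing days become None/null rather than 0, so a line chart shows a hole instead of a false zero.

```python
from datetime import date, timedelta

def fill_gaps(items, start, end):
    """Return one (day, views) pair per day in [start, end]; None where the
    API returned nothing, since it omits days with zero pageviews entirely."""
    # Index the sparse API response by day: timestamps look like "2015111200".
    by_day = {item['timestamp'][:8]: item['views'] for item in items}
    series = []
    day = start
    while day <= end:
        key = day.strftime('%Y%m%d')
        # Use None (null) rather than 0 so the chart shows a gap
        # instead of inventing a false zero.
        series.append((key, by_day.get(key)))
        day += timedelta(days=1)
    return series

# Example: an article that only has data from Nov 12 onwards.
sparse = [
    {'timestamp': '2015111200', 'views': 30000},
    {'timestamp': '2015111300', 'views': 21000},
]
print(fill_gaps(sparse, date(2015, 11, 10), date(2015, 11, 15)))
```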
[08:36:02] (PS1) Addshore: Remove total_views to example cron [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253565 [08:36:16] (CR) Addshore: [C: 2 V: 2] Remove total_views to example cron [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253565 (owner: Addshore) [08:36:52] Analytics, Services: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810244 (mobrovac) p:Triage>High This is really strange. I can reproduce this in the exact way as outlined in the desc when I issue requests from my machine, but cannot reproduce... [08:37:21] joal: fyi ^^ [08:48:12] Analytics, Services: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810259 (mobrovac) Switching to a new terminal window gives me 200 OK for the first time and then 503's. The same cannot be reproduced for other domains, only `wikimedia.org`. [08:48:38] Analytics, Services, operations: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810260 (mobrovac) [08:49:38] (PS1) Addshore: Add dispatch tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253566 [08:49:54] (CR) Addshore: [C: 2 V: 2] Add dispatch tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253566 (owner: Addshore) [09:32:43] Analytics-Kanban: Missing Pageview API data for one article - https://phabricator.wikimedia.org/T118785#1810312 (JAllemandou) Not fixed for me :( [09:36:14] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1810321 (Symac) Thanks fot the API, that is great! I have been playing with views data to build this small tool http://www.wikifamo.us and at the moment I am pr... [09:41:40] hey mobrovac [09:41:50] Thanks for yesterday's deploy and for pointing the bug [09:42:11] I guess having assigned to ops is the right thing do, isn't it mobrovac ? [09:45:45] joal: i'm working with ops to resolve this as we speak [09:45:55] ops channel I guess [09:46:07] thanks mobrovac [09:46:14] HAve any idea from where it comes from ? [09:46:51] joal: it seems that in our infra the wikimedia.org domain is dealt with differently than others, so we are trying to find out what exactly is causing this [09:47:01] right [09:47:10] Thanks a milion mobrovac for taking care of this [09:47:25] np joal :) [09:51:32] And also mobrovac: Thanks a lot for the help on setting up the API :) IT'S ALIVE !!!!! [09:51:56] hehe [09:52:02] it was my pleasure joal! 
[09:52:08] i'm so happy it's up there :) [09:54:24] (PS1) Addshore: Social metrics to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253571 (https://phabricator.wikimedia.org/T117735) [09:54:27] (PS1) Addshore: Convert site_stats to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253572 (https://phabricator.wikimedia.org/T117735) [09:54:30] (PS1) Addshore: Convert getclaims stats to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253573 (https://phabricator.wikimedia.org/T117735) [10:02:16] (PS1) Addshore: Fix paths to config for social stats [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253576 [10:07:58] (PS1) Addshore: +x .sh files [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253577 [10:16:29] (CR) Addshore: [C: 2 V: 2] Social metrics to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253571 (https://phabricator.wikimedia.org/T117735) (owner: Addshore) [10:16:32] (CR) Addshore: [C: 2 V: 2] Convert site_stats to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253572 (https://phabricator.wikimedia.org/T117735) (owner: Addshore) [10:16:35] (CR) Addshore: [C: 2 V: 2] Convert getclaims stats to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253573 (https://phabricator.wikimedia.org/T117735) (owner: Addshore) [10:16:38] (CR) Addshore: [C: 2 V: 2] Fix paths to config for social stats [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253576 (owner: Addshore) [10:16:41] (CR) Addshore: [C: 2 V: 2] +x .sh files [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253577 (owner: Addshore) [10:16:57] (Merged) jenkins-bot: Social metrics to graphite [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253571 (https://phabricator.wikimedia.org/T117735) (owner: Addshore) [10:17:26] (CR) jenkins-bot: [V: -1] Fix paths to config for social stats [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253576 (owner: Addshore) [10:17:35] (CR) jenkins-bot: [V: -1] Fix paths to config for social stats [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253576 (owner: Addshore) [10:18:12] (CR) Addshore: "recheck" [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253576 (owner: Addshore) [10:20:17] Analytics-Tech-community-metrics, Developer-Relations, DevRel-November-2015: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1810431 (Aklapper) Daniel says there is some more work to do here to update these lists. 
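The limn-wikidata-data scripts being merged above push Wikidata metrics into Graphite; the scripts themselves are not quoted in the log, so the following is only a generic sketch of the standard Graphite plaintext protocol (one `path value timestamp` line over TCP, port 2003 by default), with a placeholder host and metric name.

```python
import socket
import time

def send_to_graphite(path, value, host='graphite-in.example.org', port=2003):
    """Push a single datapoint using Graphite's plaintext protocol:
    '<metric.path> <value> <unix-timestamp>\n' over TCP."""
    line = '%s %s %d\n' % (path, value, int(time.time()))
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(line.encode('utf-8'))
    finally:
        sock.close()

# e.g. a daily social-followers count (metric path invented for illustration)
send_to_graphite('wikidata.social.twitter.followers', 12345)
```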
[10:27:36] Analytics-Backlog: Write hive code doing pageview data anonimisation with two tables [13 pts] {hawk} - https://phabricator.wikimedia.org/T118838#1810450 (JAllemandou) NEW [10:28:42] Analytics-Backlog: Productionize hive code with Oozie job and refinery inclusion - https://phabricator.wikimedia.org/T118839#1810457 (JAllemandou) NEW [10:28:58] Analytics-Backlog: Productionize hive code with Oozie job and refinery inclusion [8 pts] {hawk} - https://phabricator.wikimedia.org/T118839#1810457 (JAllemandou) [10:29:46] Analytics-Backlog: Productionize hive code with Oozie job and refinery inclusion [8 pts] {hawk} - https://phabricator.wikimedia.org/T118839#1810463 (JAllemandou) [10:29:47] Analytics-Backlog: Write hive code doing pageview data anonimisation with two tables [13 pts] {hawk} - https://phabricator.wikimedia.org/T118838#1810473 (JAllemandou) [10:32:34] Analytics-Backlog: Deploy pageview sanitization and start ongoing process [5 pts] {hawk} - https://phabricator.wikimedia.org/T118841#1810476 (JAllemandou) NEW [10:32:52] Analytics-Backlog: Deploy pageview sanitization and start ongoing process [5 pts] {hawk} - https://phabricator.wikimedia.org/T118841#1810483 (JAllemandou) [10:32:53] Analytics-Backlog: Productionize hive code with Oozie job and refinery inclusion [8 pts] {hawk} - https://phabricator.wikimedia.org/T118839#1810484 (JAllemandou) [10:36:53] (PS3) DCausse: Test do not merge [analytics/refinery/source] - https://gerrit.wikimedia.org/r/253393 [10:37:45] (PS4) DCausse: Test do not merge [analytics/refinery/source] - https://gerrit.wikimedia.org/r/253393 [10:38:38] Analytics-Backlog: Backfill pageview_hourly sanitization - 1 month - [8 pts] {hawk} - DUPLICATE THIS TASK FOR EACH MONTH TO BACKFILL - https://phabricator.wikimedia.org/T118842#1810488 (JAllemandou) NEW [10:40:39] Analytics-Backlog: Sanitize pageview_hourly - subtasked [0 pts] {hawk} - https://phabricator.wikimedia.org/T114675#1810499 (JAllemandou) a:JAllemandou>None [10:41:14] Analytics-Backlog: Deploy pageview sanitization and start ongoing process [5 pts] {hawk} - https://phabricator.wikimedia.org/T118841#1810476 (JAllemandou) [10:41:15] Analytics-Backlog: Productionize hive code with Oozie job and refinery inclusion [8 pts] {hawk} - https://phabricator.wikimedia.org/T118839#1810463 (JAllemandou) [10:41:17] Analytics-Backlog: Backfill pageview_hourly sanitization - 1 month - [8 pts] {hawk} - DUPLICATE THIS TASK FOR EACH MONTH TO BACKFILL - https://phabricator.wikimedia.org/T118842#1810488 (JAllemandou) [10:41:19] Analytics-Backlog: Write hive code doing pageview data anonimisation with two tables [13 pts] {hawk} - https://phabricator.wikimedia.org/T118838#1810450 (JAllemandou) [10:41:46] Analytics-Kanban: Sanitize pageview_hourly - subtasked [0 pts] {hawk} - https://phabricator.wikimedia.org/T114675#1702699 (JAllemandou) [10:58:50] hein dis dis dis ? 
[10:58:53] oops: ) [11:02:00] Analytics-Tech-community-metrics, Gerrit-Migration: Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#1810527 (Aklapper) [11:05:33] Analytics-General-or-Unknown, WMDE-Analytics-Engineering, Wikidata, Story: [Story] Statistics for Wikidata exports - https://phabricator.wikimedia.org/T64874#1810535 (Addshore) [11:06:51] Analytics-General-or-Unknown, WMDE-Analytics-Engineering, Wikidata, Story: [Story] Statistics for Wikidata exports - https://phabricator.wikimedia.org/T64874#676394 (Addshore) > Can you get us an internal overview from hadoop? Yes [11:23:20] Analytics-Backlog: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#1810556 (mforns) [11:23:21] Analytics-EventLogging, Analytics-Kanban: {tick} Schema Audit - https://phabricator.wikimedia.org/T102224#1810555 (mforns) [11:23:56] Analytics, Services, operations: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810559 (akosiaris) curl with `--compressed` is succeeding every single time. curl with `--compressed` will set **Accept-Encoding: deflate, gzip ** whereas this does... [11:37:15] hi a-team ] [11:37:20] hey mforns :) [11:37:22] *:] [11:37:33] howdy ? [11:37:36] hey joal, I just read your comment on the api sample bug-fix [11:37:41] late work yesterday I have seen [11:37:43] good, thx, you? [11:37:48] good :) [11:38:07] does the bug really persist in your browser? [11:38:37] nope, bug is fixed, I was just looking at the wrong URL (the old one, in labs) [11:38:46] ah! ok :] [11:38:50] That's why I think we should remove that one ! [11:39:05] yes, you're totally right, will do that now [11:39:09] Just to prevent eedjits like me to fill in fake bugs :) [11:39:21] sure [11:40:19] joal, done [11:40:33] awesome :) [11:42:20] joal: update on the ticket, it seems there's a problem in the interaction between rb and aqs [11:42:36] will keep you updated as the situation progresses [11:42:50] mobrovac: you mean an issue between rb and varnish, right ? [11:44:24] no joal, for some unknown reason, content gets gzipped either in rb or in aqs when it shouldn't [11:44:41] weirdo mobrovac [11:44:50] yup [11:44:53] I thought it would have been expected to send gzip [11:45:41] joal: that should happen only iff the client sends accept-encoding: deflate, gzip [11:46:14] mobrovac: hum, maybe rb sends aqs a request with those headers set ? [11:46:46] according to the code, it shouldn't [11:47:01] mwaaarf :( [11:47:05] joal: i'm exploring some options, will keep you up to date [11:47:23] np mobrovac, maybe some tcpdump could also help clearing what happend [11:48:55] joal: weird new fact: the first req succeeds because the content-encoding: gzip header is not set [11:49:00] grrrrrrrrrrrrrrrr [11:49:24] well, at least that looks normal, doesn't it ? [11:50:02] the first one, yes [11:50:07] the others don't [11:50:10] wth??????????? 
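akosiaris's finding on T118817, that `curl --compressed` (which sends Accept-Encoding) always succeeds while plain curl intermittently gets 503s, suggests a quick way to reproduce the difference from any client. Below is a rough sketch with `requests`, using one of the per-article URLs from the discussion as an example; note that `requests` adds Accept-Encoding by default, so the failing case has to drop the header explicitly.

```python
import requests

URL = ('https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/'
       'en.wikipedia/all-access/all-agents/Cat/daily/20151101/20151115')

# requests sends "Accept-Encoding: gzip, deflate" by default, matching the
# working "curl --compressed" case; passing None removes the header, which
# approximates the failing plain-curl case described in T118817.
for label, headers in (('with Accept-Encoding', {}),
                       ('without Accept-Encoding', {'Accept-Encoding': None})):
    for attempt in range(3):
        status = requests.get(URL, headers=headers).status_code
        print(label, 'attempt', attempt + 1, '->', status)
```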
[11:50:48] ¯\_(ツ)_/¯ [11:50:56] hehehe [12:05:56] Analytics-Kanban: Backfill cassandra pageview data - August [5 pts] {slug} - https://phabricator.wikimedia.org/T118845#1810595 (JAllemandou) NEW a:JAllemandou [12:26:47] Analytics-Backlog, Analytics-Dashiki, Google-Code-In-2015, Need-volunteer: Vital-signs layout is broken - https://phabricator.wikimedia.org/T118846#1810619 (mforns) NEW [12:33:55] Analytics, Services, operations: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810628 (mobrovac) >>! In T118817#1810559, @akosiaris wrote: > curl with `--compressed` is succeeding every single time. curl with `--compressed` will set > > **Accept... [12:37:26] a-team I'm away for an hour roughly [13:30:36] Analytics, Services, operations: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810701 (mobrovac) a:mobrovac We have debugged this further and hopefully found the root cause: `preq` (the lib used by RESTBase to issue external requests) forces gz... [13:30:56] joal: found the problem ^^ [13:31:04] will fix it and deploy today [13:31:32] joal: it is likely you will not need to do a deploy of AQS for this, this should be a RB-problem only [13:34:25] Analytics, CirrusSearch, Discovery, operations, audits-data-retention: Delete logs on stat1002 in /a/mw-log/archive that are more than 90 days old. - https://phabricator.wikimedia.org/T118527#1810713 (ArielGlenn) Adding @Ottomata and a link to T84618 which is still pending with a number of open... [13:55:37] Analytics-General-or-Unknown, WMDE-Analytics-Engineering, Wikidata, Story: [Story] Statistics for Wikidata exports - https://phabricator.wikimedia.org/T64874#1810734 (Addshore) Just as a sample here is for a single hour > > 6 text/html; charset=UTF-8 > 2460 application/rdf+xml; charset=... [13:58:18] awesome mobrovac :) [13:58:21] Thanks again ! [13:58:35] np! [14:03:31] o/ joal & milimetric. [14:03:34] Sorry I'm late [14:06:31] Analytics, CirrusSearch, Discovery, operations, audits-data-retention: Delete logs on stat1002 in /a/mw-log/archive that are more than 90 days old. - https://phabricator.wikimedia.org/T118527#1810745 (Addshore) [14:21:04] (PS1) Addshore: Fix metric name for getclaims property use [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253604 [14:21:12] (CR) Addshore: [C: 2 V: 2] Fix metric name for getclaims property use [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253604 (owner: Addshore) [14:25:12] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 26.67% of data above the critical threshold [30.0] [14:27:03] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0] [14:55:35] ottomata: hi, I've read some tickets about the EventBus, do you plan to use avro binary with it? [14:57:55] dcausse: no, not at first anyway. we may do some avro in the future, it wouldnt' be too hard to support i think. [14:58:07] ok [14:58:09] but, the event-schemas repo that we just created for event bus is where the avro schemas you are working with should live [14:58:27] is it ready? 
[14:58:36] there's nothing there afaik yet [14:58:36] https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/event-schemas [14:58:40] but, we can iterate on it [14:58:44] submit a patch :) [14:58:46] :) [14:58:48] make an avro/ dir [14:58:57] is there some guidelines to handle schema evolution ? [14:58:59] and i dunno what the hierarchy shoudl be yet, i guess you should make that be java style [14:59:08] for avro or for jsonschema? [14:59:11] avro [14:59:35] avro should be the usual rules, and we will likely use jenkins to reject changes that don't do it right [14:59:38] I'm running into an issue where the consumer needs to know both schema [14:59:43] buuuuut, not anytime soon [15:00:03] right, you were saying that yesterday, did the union + default thing not work? [15:00:12] no [15:00:14] rats. [15:00:17] http://mail-archives.apache.org/mod_mbox/avro-user/201502.mbox/%3CCAGHyZ6KhssPCq%3DjCoDRHKbCzaAzvMVY2UMWLu%3DHCXVPprkS-0g%40mail.gmail.com%3E [15:00:52] they suggest a solution where you encode the schema rev id in the message [15:01:22] this implies that the consumer must have access to all schema revisions [15:01:24] interesting, dcausse, and this is a problem always? or just for when you are trying to use a union? [15:01:34] ottomata: whenever you update the schema [15:01:57] hm, [15:02:06] rats, that is strange. hm [15:02:11] lemme look around at some code [15:02:18] i mean, kinda makes sense, but is totally annoying [15:02:33] see comments here: https://phabricator.wikimedia.org/T118570 [15:02:59] I was wondering why hive works, but in fact the schema is stored with each records [15:03:13] that's not the case with our kafka topic [15:03:33] right [15:08:41] well, crapo. dcausse [15:08:42] https://issues.apache.org/jira/browse/AVRO-1661 [15:08:46] you are right, and this won't work. [15:09:00] our hopes of not needing to encode a schema id in the kafka message it hink won't work. [15:09:19] so. what to do... [15:09:38] dcausse: i wonder if we should just use confluent stuff for avro.... [15:09:47] stand up their schema registry and use their camus. [15:09:59] nuria: ^^ [15:10:07] for mediawiki it can be annoying [15:10:14] confluent? or what? [15:10:29] having a schema an external schema repo [15:10:29] writing the int id in the message? [15:10:34] no [15:10:44] who will deploy the new schema? [15:10:46] no, i think you don't need that, all you need mw to do is be sure that it writes the message with the proper schema id [15:10:52] yeah [15:10:56] i think that is also not ideal [15:10:57] hm. [15:11:12] wel, i mean, the stuff we are buidling for eventbus makes eventlogging act as a schema registry... [15:11:17] but, we don't have good unique ids [15:11:19] atm [15:11:25] and no avro support ety [15:11:25] yet [15:11:56] We can store all schema revs in the jar for now [15:12:12] yeah, but how will you ID a given kafka message to a schema? [15:12:22] manually :/ [15:12:49] I mean I change the format, add small int in the body [15:12:49] how? camus needs to know the two schemas when writing, right? [15:12:53] oh [15:13:04] ok so add the int id in the message? [15:13:24] but then camus needs to know how to map that to the schemas [15:13:31] or does just hive? [15:13:36] i guess camus doesn't at all [15:13:39] it only needs the writer schema [15:13:51] then on camus side I have CirrusSearchRequestSet.1.avsc CirrusSearchRequestSet.2.avsc ... 
[15:13:58] in src/resources [15:14:01] yeah, but that is not a unique id [15:14:07] topic + rev [15:14:14] hmm, MMmMmMMMMmmM [15:14:37] this is hackish I know :) [15:14:44] yeah, we do have a topic config that specifies the schema [15:15:09] SchemaRegistry is meant to do that [15:15:13] but i'm experiencing what all noobs experience when they try to do things a different way...realization that smart people have already done this and they did it their way for a reason [15:15:46] dcausse: we could make something generatea unique id for you when you evolve the schema maybe [15:15:59] CirrusSearchRequestSet.4451324355322334.avsc [15:16:03] hm, woudl be nice if it was incrementing though [15:16:09] could do timestamp + iteration [15:16:15] but I need to know what is the latest schema :) [15:16:42] Analytics-Backlog, WMDE-Analytics-Engineering, Wikidata, Graphite, Patch-For-Review: Enable retention of daily metrics for longer periods of time in Graphite - https://phabricator.wikimedia.org/T117402#1810894 (Addshore) Open>Resolved [15:16:48] Schema.REVTS.avsc [15:16:58] sounds good [15:16:58] TS=12345646 [15:16:58] REV=2 [15:17:03] dunno [15:17:55] ottomata: I deployed your CI configuration change for /eventlogging.git [15:18:04] cool thank you! [15:18:08] though the service branch fails flake8 in some commented out code [15:18:18] yeah... [15:18:25] how do I make this recheck again? [15:18:25] https://gerrit.wikimedia.org/r/#/c/253547/ [15:18:27] it'll def fail [15:18:32] recheck? [15:18:33] in comment [15:19:02] dcausse: i don't have time to work on avro support in this now...........hm. [15:19:29] ottomata: ok, will check with erik what we should do then [15:19:33] but, what do we need? we need: shared unique ids for avro schemas, [15:19:45] those IDs embedded in kafka messages [15:19:52] and camus needs to know how to map that Id to a schema [15:19:58] that's it, right? [15:20:04] yes sort of, and write a custom schemaregistry in refinery-camus [15:20:06] yes [15:20:22] yeah, i think it will have the schema repo checked out and just load it or something, but yeah.... [15:20:39] not sure what other analytics devs are up to, maybe madhu can help you with that. [15:20:53] but, i see no reason why you can't submit patches to mediawiki/event-schemas now [15:21:04] maybe an avro/ dir with the java style path and your schemas [15:21:11] ok [15:21:14] and...a hacky bash script to auto evolve a new file with id? [15:21:27] like, given a schema file, it will create a copy with the new name for you? [15:21:28] why not [15:22:14] if you do write that, see if you can make it generic and not avro specific [15:22:22] :) [15:22:26] the jsonschema files can be revisioned the same way [15:22:45] ok [15:22:48] sorry this isn't smoother! [15:23:12] np [15:27:29] joal: how do I know whether or not I'm killing the cluster? Oh... I think Christian posted a how-to-know that on wiki somewhere... [15:27:56] Analytics, CirrusSearch, Discovery, operations, audits-data-retention: Delete logs on stat1002 in /a/mw-log/archive that are more than 90 days old. - https://phabricator.wikimedia.org/T118527#1810931 (Ottomata) I don't know much about the mw-logs. Maybe @bd808 knows more, or who to ask? [15:28:03] Analytics-General-or-Unknown, WMDE-Analytics-Engineering, Wikidata, Story: [Story] Statistics for Wikidata exports - https://phabricator.wikimedia.org/T64874#1810930 (JAllemandou) Hi, quick questions on that: Is the need regular, or would one shots make it ? Also, what level of aggregation ? Daily... 
[15:28:09] aha: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load [15:28:11] milimetric: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load [15:28:19] Ah ... you beat me :-) [15:28:20] I found it! [15:28:26] :) [15:28:49] But I think some (all?) parts of it went stale. [15:28:54] thanks qchris__ and hey, btw, not sure if you saw -we released the pageview API [15:29:00] yeah, I'll update the doc as needed [15:29:20] Congrats! [15:29:24] Analytics, Services, operations: Wikimedia pageview API intermittently throwing HTTP 503s - https://phabricator.wikimedia.org/T118817#1810938 (mobrovac) Open>Resolved [preq PR #9](https://github.com/wikimedia/preq/pull/9) fixed this issue entirely. IT has been deployed and now everything works as... [15:29:45] I noticed yesterday late evening, but I did not find time to go through my emails. [15:29:49] joal: milimetric: ^^ [15:30:06] you rock mobrovac :) [15:31:43] Analytics-Cluster, Database: Replicate Echo tables to analytics-store - https://phabricator.wikimedia.org/T115275#1810941 (jcrespo) I do not think this is possible right now. I would like you to request a different approach- if you want echo on analytics-store, something else has to go (like a core produc... [15:31:56] joal: yw :) [15:32:41] woah, tricky issue, nice fix, thx [15:43:08] ottomata: do we have a good graph / dashboard on CPU usage for the Hadoop cluster? The load monitoring article covers memory and Bytes Out: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load [15:43:35] milimetric: ganglia: http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&c=Analytics+cluster+eqiad&h=&tab=m&vn=&hide-hf=false&m=cpu_report&sh=1&z=small&hc=4&host_regex=&max_graphs=0&s=by+name [15:49:30] Analytics-General-or-Unknown: IPv6 GeoIP databases do not automatically update - https://phabricator.wikimedia.org/T56191#1810983 (faidon) Open>Resolved a:faidon This has been solved a looong time ago. [15:49:56] heya hashar, how do get around this flake8 error? [15:49:57] 15:42:21 ./eventlogging/service.py:17:1: F403 'from _codecs import *' used; unable to detect undefined nam [15:50:14] it won't let me import the actual type I want... [15:50:16] frmo _codes import UnicodeError [15:50:21] naw, it wouldnt do it [15:50:25] hoping that is the only use [15:50:25] but i will try some other things [15:50:29] that is [15:50:36] or you can disable the check [15:50:42] suffix the line with: ' # noqa [15:50:44] err [15:50:46] # noqa [15:50:55] that instruct flake8 to ignore the line [15:51:19] ok cool [15:51:32] well, it doesn't look like I'm killing the cluster. But someone is definitely running a monster on stat1003. It's nice-d, so probably no cause for alarm [15:54:18] yeah i can't import it in any other way, don't know why [15:56:09] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: More solid Eventlogging alarms for raw/validated {oryx} [8 pts] - https://phabricator.wikimedia.org/T116035#1810999 (mforns) a:mforns [15:59:14] Analytics, CirrusSearch, Discovery, operations, audits-data-retention: Delete logs on stat1002 in /a/mw-log/archive that are more than 90 days old. - https://phabricator.wikimedia.org/T118527#1811003 (EBernhardson) I believe the medawiki logs were rsync'd over at @ironholds request. They are not... [16:01:59] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811006 (Milimetric) > 1. 
I might have been missing something but at the moment I don't think it is possible to get already computed monthly statistics for an a... [16:03:15] kevinator: argh, on batcave [16:03:21] kevinator: lemme switch [16:03:40] milimetric, re "monster on stat1003": "top" shows a lot of processes by halfak at near 100% ... [16:03:56] ...and as mentioned here yesterday, i've been getting a lot of "ERROR 2006 (HY000): MySQL server has gone away" when accessing analytics-store.eqiad.wmnet from stat1003. [16:03:57] Hey HaeB [16:04:05] ...they might all be niced though [16:04:06] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1650918 (mforns) @bmansurov made this interesting point in the API release email thread: "I just have a question. Is this an evolving thing in a sense that mor... [16:04:10] I don't think my processes are affecting you. [16:04:14] ,,,and not related [16:04:19] I run jobs like this every day :) [16:04:53] halfak: sure, i mainly wanted to connect that to what dan said ;) [16:08:11] Ahh. gotcha [16:09:24] halfak: is it possible to find out the load of the database server itself (analytics-store.eqiad.wmnet)? [16:10:21] (i also had some queries taking a - to me- surprisingly long time; but that might just have been because of the particular table involved)' [16:10:37] HaeB, can you share the query [16:10:39] ? [16:12:05] HaeB, can you run SHOW PROCESSLIST on the mysql server (you might not have perms) [16:12:36] if so, that is an easy way to see long running sql processes [16:12:36] and how many there are [16:12:36] ottomata, anyone can, but that doesn't really show you load. [16:12:36] ottomata, do you know the real name of analytics-store.eqiad.wmnet? [16:12:36] dbstore1001? [16:12:43] i don't, i never remember that [16:12:53] got it! dbstore1002 [16:13:01] analytics-store.eqiad.wmnet. 300 IN CNAME dbstore1002.eqiad.wmnet. [16:13:02] :) [16:13:31] HaeB: http://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&h=dbstore1002.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2 [16:13:32] HaeB, you can use http://graphite.wikimedia.org/ to check on the system stats. [16:13:35] or that! [16:13:42] I like ottomata's better [16:14:03] Analytics-EventLogging, Analytics-Kanban, EventBus, Patch-For-Review: Move EventLogging/server to its own repo and set up CI - https://phabricator.wikimedia.org/T118761#1811042 (Ottomata) Open>Resolved [16:14:06] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1811043 (Ottomata) [16:14:16] ottomata, is there a good way to check disk activity? [16:14:20] Maybe IO wait [16:15:11] ja that is a good way :) [16:15:12] ganglia is usually useful for the standard system stats, graphite for everything else [16:15:19] halfak: this one took 1 hour and 6 min: [16:15:20] mysql:research@analytics-store.eqiad.wmnet [log]> SELECT SUM(IF(event_hasServiceWorkerSupport =1,1,0))/SUM(1) FROM MobileWebSectionUsage_14321266 WHERE timestamp LIKE '20151115%' AND wiki = 'enwiki'; [16:15:55] on the other hand, this one only took 3 seconds: [16:15:59] mysql:research@analytics-store.eqiad.wmnet [log]> SELECT COUNT(*) FROM MobileWebSectionUsage_14321266 WHERE timestamp LIKE '20151115%'; [16:16:33] Yeah. Those are very different queries [16:16:40] The latter just uses the index. 
[16:16:45] (and another one with the same table restricted to the same day took over 2 hours, would need to find that in the log) [16:16:45] It doesn't even need to read the table. [16:17:33] the whole table is about 12GB right now [16:17:47] Looks like the first query does a scan over 1.7 million rows. [16:17:52] halfak: so there is an index for the timestamp? [16:17:56] Yes [16:18:12] With those restrictions, it looks like you will still need to scan 1.7 million rows. [16:18:27] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811065 (Symac) >> 2. Other thing (for my specific use case) is that it would be great to be able to request statistics for different pages with a unique call t... [16:18:58] but including the "LIKE" for the timestamp should still serve to restrict the number of rows scanned, right? [16:19:07] Yes [16:19:25] The query optimizer thinks that it will restrict the data to 1.7 million rows. [16:20:08] (i know, i should probably dig into this with "explain" .. i've still been more concerned about the connection errors) [16:20:19] thanks, good to know [16:22:13] (need to go offline now, will check out the system stats link later) [16:22:30] o/ [16:23:49] (PS1) Addshore: Add entity usage tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253625 [16:24:33] (PS2) Addshore: Add entity usage tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253625 [16:40:59] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811161 (Magnus) Update: My treeviews tool ( https://tools.wmflabs.org/glamtools/treeviews/ ), and potentially some others, not use the new API instead of stats... [16:51:25] Analytics-EventLogging, Analytics-Kanban, EventBus: Deploy eventlogging from new repository. - https://phabricator.wikimedia.org/T118863#1811200 (Ottomata) NEW a:Ottomata [16:58:10] Hi! quick update anyone on the percentage of our users with access to LocalStorage? (nuria?) :) thanks! [17:01:25] a-team: typing e-scrum as i have a conflicting meeting on tuesdays [17:01:48] ok nuria [17:01:56] (CR) BryanDavis: "I developed and tested the patch on stat1002. Here's an example run -- https://phabricator.wikimedia.org/P2322" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/253046 (https://phabricator.wikimedia.org/T118592) (owner: BryanDavis) [17:04:14] Analytics-General-or-Unknown, The-Wikipedia-Library: Category based-pageview collection for non-Article space, via Treeviews or similar - https://phabricator.wikimedia.org/T112157#1811254 (Sadads) @Magnus Discovered I could use PagePile to hack the tool: https://tools.wmflabs.org/glamtools/treeviews/?q=%... 
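Several people in this log (and on T112956) ask for monthly numbers; until a monthly granularity exists, a client can sum the daily per-article endpoint itself. This is a sketch assuming the endpoint shape from the announcement, `per-article/{project}/{access}/{agent}/{article}/daily/{start}/{end}`, and single URI-encoding of article titles as settled in T118403.

```python
from collections import defaultdict
from urllib.parse import quote

import requests

API = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article'

def monthly_views(project, title, start, end):
    """Sum the daily per-article series into per-month totals.
    Titles are URI-encoded exactly once; spaces become underscores."""
    article = quote(title.replace(' ', '_'), safe='')
    url = '%s/%s/all-access/all-agents/%s/daily/%s/%s' % (
        API, project, article, start, end)
    items = requests.get(url).json().get('items', [])
    totals = defaultdict(int)
    for item in items:
        totals[item['timestamp'][:6]] += item['views']  # key by YYYYMM
    return dict(totals)

print(monthly_views('en.wikipedia', 'Garissa University College attack',
                    '20151001', '20151130'))
```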
[17:09:29] Analytics-Backlog: Blogpost Pageview API - https://phabricator.wikimedia.org/T118866#1811265 (Nuria) NEW [17:16:04] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: Send HTTP stats about eventlogging-service to statsd - https://phabricator.wikimedia.org/T118869#1811301 (Ottomata) NEW a:Ottomata [17:16:20] Analytics-EventLogging, Analytics-Kanban, EventBus: Send HTTP stats about eventlogging-service to statsd - https://phabricator.wikimedia.org/T118869#1811311 (Ottomata) [17:22:34] Analytics-Kanban: Missing Pageview API data for one article {slug} [3 pts] - https://phabricator.wikimedia.org/T118785#1811335 (mforns) [17:23:14] Analytics-Kanban: Pageview API documentation for end users {slug} [8 pts] - https://phabricator.wikimedia.org/T117226#1811339 (mforns) [17:23:16] Analytics-Kanban: Pageview API Press release {slug} [2 pts] - https://phabricator.wikimedia.org/T117225#1811340 (Milimetric) [17:23:56] Analytics-Kanban: Troubleshoot Hebrew characters in Wikimetrics {dove} [2 pts] - https://phabricator.wikimedia.org/T118574#1811342 (mforns) [17:28:34] Analytics-Kanban: AQS should expect article names uriencoded just once {slug} - https://phabricator.wikimedia.org/T118403#1811346 (Ragesoss) Open>Resolved [17:40:52] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: More solid Eventlogging alarms for raw/validated {oryx} [8 pts] - https://phabricator.wikimedia.org/T116035#1811404 (mforns) a:mforns>None [17:41:12] joal, I put the EL alarms task for grabs [17:47:59] (PS1) Addshore: entityUsage - retry and report failed db calls [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253642 [17:49:48] mforns: as you want, there is plenty to be done in sanitization so you tell me if you prefer to keep it :) [17:50:09] joal, no it's ok, grab it :] [17:53:27] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: More solid Eventlogging alarms for raw/validated {oryx} [8 pts] - https://phabricator.wikimedia.org/T116035#1811473 (JAllemandou) a:JAllemandou [17:58:50] wikimedia/mediawiki-extensions-EventLogging#513 (wmf/1.27.0-wmf.7 - 596a0a8 : Mukunda Modell): The build has errored. [17:58:50] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/596a0a863ff5 [17:58:50] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/91646346 [18:15:18] (CR) Addshore: [C: 2 V: 2] entityUsage - retry and report failed db calls [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253642 (owner: Addshore) [18:15:27] (CR) Addshore: [C: 2 V: 2] Add entity usage tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253625 (owner: Addshore) [18:15:45] (Merged) jenkins-bot: Add entity usage tracking script [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253625 (owner: Addshore) [18:15:48] (Merged) jenkins-bot: entityUsage - retry and report failed db calls [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253642 (owner: Addshore) [18:16:15] Analytics-Cluster, Database: Replicate Echo tables to analytics-store - https://phabricator.wikimedia.org/T115275#1811557 (jcrespo) As an alternative, it would be easier to provide you a subset of the tables, or CONNECT tables that are virtual tables (so joins would be slow). 
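T118869 above is about sending HTTP stats from eventlogging-service to statsd. The ticket does not include the implementation, so this is only a minimal sketch of the statsd wire format (UDP datagrams such as `name:1|c` for counters and `name:12|ms` for timers); the address and metric names are placeholders, not the production configuration.

```python
import socket

STATSD = ('statsd.eqiad.wmnet', 8125)  # assumed address, verify before use
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def incr(name):
    """Count one occurrence (statsd counter)."""
    sock.sendto(('%s:1|c' % name).encode('ascii'), STATSD)

def timing(name, ms):
    """Record a request duration in milliseconds (statsd timer)."""
    sock.sendto(('%s:%d|ms' % (name, ms)).encode('ascii'), STATSD)

# e.g. one request handled in 12 ms (hypothetical metric names)
incr('eventlogging.service.requests.201')
timing('eventlogging.service.request_time', 12)
```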
[18:17:40] (PS1) Addshore: Fix entity usage example cron [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253648 [18:17:48] (CR) Addshore: [C: 2 V: 2] Fix entity usage example cron [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253648 (owner: Addshore) [18:18:48] dcausse: yt? [18:18:56] nuria: yes [18:19:02] wanted to catch up on avro [18:19:36] dcausse: i still do not see why you need more schemas beyond latests if changes are backwards compatible [18:20:36] dcausse: consumer does not need both schemas if schema only adds new fields, i edit your unit test to *try* to explain this [18:20:53] nuria: if you can that would be awesome :) [18:21:01] dcausse: i did already [18:21:04] dcausse: yesterday [18:21:08] in fact no :/ [18:21:13] what? [18:21:16] you used the same schema to write and read [18:21:17] sorry [18:21:36] it's because my test was a bit messy [18:21:36] right, cause the new schema can be used for both in our use case [18:21:44] you do not need both [18:22:01] provided that schema specoifies new topics with union defaults [18:22:20] not new topics, sorry dcausse , new "fields" [18:23:20] nuria: in the I uploaded there's a binary file. This file was generated without the "newField", I can't find a way to load this data without the exact same schema used to write it [18:23:38] s/in the/in the patch / [18:24:19] and eveything I read about avro seems to confirm this problem [18:26:31] mforns: you here ? [18:26:37] nuria: http://mail-archives.apache.org/mod_mbox/avro-user/201502.mbox/%3CCAGHyZ6KhssPCq%3DjCoDRHKbCzaAzvMVY2UMWLu%3DHCXVPprkS-0g%40mail.gmail.com%3E [18:33:34] dcausse: reading [18:34:48] dcausse: given that we control decoders I am not sure if what says there is true, you do not have that problem on hive data do you? [18:35:42] nuria: it's because hive store the schema with each records [18:36:05] http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.1-cdh3u5/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java [18:36:45] ya nuria, writer schema is stored with each avro file [18:37:14] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811657 (Milimetric) >>! In T112956#1811161, @Magnus wrote: > Update: My treeviews tool ( https://tools.wmflabs.org/glamtools/treeviews/ ), and potentially some... [18:37:30] dcausse: we ahve done no real research into this, but for eventbus, we are only thinking about using jsonschema in kafka atm [18:37:45] later, we will likely figure out the best way to import json data in kafka into ahdoo [18:37:52] which may include avro conversion at that time [18:38:11] if we do, then we don't have to deal with the streaming avro schema problem [18:38:12] dcausse: ok, let me work on this for a bit, need to recharge laptop so i will be offline for a bit. will work on your example, plus do a bit more reserach cc ottomata [18:38:30] nuria: thanks! [18:38:31] ottomata: here ? [18:39:16] joal, sorry I was eating something [18:39:22] np mforns :) [18:39:28] yes [18:39:38] what's up? [18:39:45] I can't recall where to find the scripts for graphite check - any of mforns / ottomata :) [18:39:52] :) [18:39:53] in puppet? [18:39:54] ... 
[18:39:57] right [18:40:51] joal, as puppet is small, you'll find them quickly ;P [18:41:01] pff mforns [18:41:06] hehehe [18:41:11] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811690 (Milimetric) > Don't know if it is something that is suitable with the way API as it is constructed but the idea would be to get data for different arti... [18:41:18] I've been searching for eventlo1001, but no luck [18:41:50] https://github.com/wikimedia/operations-puppet/blob/production/modules/eventlogging/manifests/monitoring/graphite.pp [18:42:18] was looking for it also, thanks ottomata :] [18:42:39] man ... I'd have take hours before finding that [18:42:44] Thanks a lot both of you :) [18:42:49] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811693 (Ragesoss) >>! In T112956#1811657, @Milimetric wrote: >>>! In T112956#1811161, @Magnus wrote: >> Update: My treeviews tool ( https://tools.wmflabs.org/g... [18:43:00] joal: a way to find would be to get some text from the icinga check and grep for it in puppet repo [18:43:01] like [18:43:13] eventlogging_difference_raw_validated [18:43:13] or something [18:43:18] aha [18:43:22] makes sense ottomata [18:43:26] I'll remember that :) [18:44:04] :) sorry ragesoss, I edited my comment to triple-important [18:45:00] a-team I'm off for tonight (setup ready for tomorrow :) [18:45:05] :) [18:45:07] Have a good end of day ! [18:45:19] laters! [18:45:34] nuria: since you all are takinga slightly different approach with eventbus, i think what we are going to do is turn off mediawiki side logging, delete the old data, update schemas on both sides and deploy this new schema. then turn it back on. having generic map fields basically allows us to work around not being able to change the schema and will be "good enough" until eventbus is fully ready for us to switch over [18:45:52] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1811714 (ezachte) Also Magnus and I both pleaded for monthly stats earlier, each for different use cases Erik Zachte. [18:46:15] ebernhardson: ok, but schema evolution will happen with either system [18:46:22] ebernhardson: what do you think about just using jsonschema for now? [18:46:23] milimetric: I rebased my branch that adds Pageviews API support, and all of a sudden a bunch of tests were breaking, and I was like AARRRRGGGGHHHHH. But then I figured out that itwas because the fix for the double-encoding issue had gone live. [18:46:25] ebernhardson: that is a fact, and we have to have a story for it [18:46:28] just an idea [18:46:39] And then I was happy again. [18:46:56] ragesoss: right, we might have some instability in these early days, but hopefully not [18:47:24] nuria: sure, but basically we wanted to add new fields for tests we want to put into production this week (thats why we looked into adding the map). i'd rather unblock things in the easiest way possible for the "right now" and upgrade the rest as things become ready [18:47:30] yeah. I'm just pleased that my tests worked. 
[18:47:31] :) [18:47:55] ebernhardson: if you do not care about old data deleting +starting is easy [18:48:17] ebernhardson: i would still like to work with dcausse on getting details of avro schema evolution straight [18:48:55] ebernhardson: you can do that right now, I can merge patch but we will not deploy it right now [18:49:55] nuria: for the schema migration, i think json schema will end up having a completely different solution than avro. the sugested method for avro seems to be to pack a binary value before the message, whereas json schema will be something different (not sure what) [18:49:56] ebernhardson: we can deploy it as soon as next deployment happens [18:50:37] nuria: sure, i'll turn off the cirrus logging in this afternoon's swat then [18:52:11] i suppose we don't even have to delete the data in hdfs, since it has the schema prepended, mostly we just need to turn off mediawiki logging before updating the camus schema so it doesn't spew errors on 3k messages/sec [18:53:32] ebernhardson: right, we just need to delete the data from kafka [18:54:13] failure will be on avro [18:54:21] on camus sorry [18:54:31] ebernhardson: right, for historical data, json is very bad for evolution [18:54:51] we can code some stuff in front to 'resolve' it. if our rules are only ever add new fields with default values [18:54:54] never make any other changes [18:55:08] however, in the future, we may use json/jsonschema as canonical schemas and in kafka [18:55:18] but maintain corresponding avro schemas [18:55:29] and convert to avro when we import to hadoop [18:55:43] and never use avro binary in kafka? [18:56:15] yea, after putting this pipeline together and seeing the issues i realized writing binary avro to kafka was not the easiest solution. If i had known more two months ago wouldn't have suggested it [18:56:53] avro is certainly the best codec to limit the message size [18:57:20] but it comes at a cost with this writerSchema and readerSchema :/ [18:57:26] indeed, the uncompressed avro files are 1/3 the size of our compressed text logs [18:58:05] yeah, ebernhardson, dcausse, our experience with avro so far is: wow its great! but woah, it is really hard to use [18:58:21] yes :( [18:58:39] i think i warned a little bit when yall started, but I am happy to see us pushing this and learning these things [18:59:07] but, i mean, if we do things the way confluent et. al. do, we should do the int schema id in message [18:59:10] and just figure that out [18:59:16] i think it would be nice to be able to do avro in kafka [18:59:19] but. maybe just hard right now [18:59:35] dunno [19:04:24] i don't know enough about java/camus, but at least in php it would be super easy to define a map that points to the schemas and unpack an integer from the begining [19:05:45] the php side would be something like: pack( 'n', $version ) . $message; splitting it back apart would be $version = unpack( 'n', substr( $envelope, 0, 2 ) ); $message = substr( $envelope, 2 ); [19:05:59] yes [19:06:13] ebernhardson: camus does this already too [19:06:28] but, the harder part is associating that id with a schema on the camus side. [19:06:29] madhuvishy, . yt? [19:06:35] we don't have a remote schema registry like confluent does (although i guess we could) [19:06:38] mforns: yup! [19:06:56] we could build support somehow into our event-schemas repo to get schema from unique id [19:07:06] and make camus know how to load that [19:07:23] madhuvishy, I was going to take a wikimetrics task [19:07:37] hmm, we could use an md5. 
Then when camus boots up it just has to hash the schemas it knows about and hold a map in memory [19:07:38] do you think https://phabricator.wikimedia.org/T117287 can be done in parallel with yours? [19:07:51] packing and md5 of the schema that we produce with would be easy too [19:07:54] s/and/an/ [19:08:20] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 26.67% of data above the critical threshold [30.0] [19:08:33] mforns: ah! yeah I think so - it's a standalone function right? [19:08:58] ebernhardson: we need an incrementing id [19:09:01] madhuvishy, ah OK [19:09:08] cool, I'll take that one [19:09:10] in order to know which schema is latest [19:09:17] mforns: cool, let me know if you need something [19:09:17] thx madhuvishy [19:09:19] ottomata: ahh, that makes sense [19:09:20] ok [19:09:23] ;] [19:09:30] dcausse talked this morning about a revid concat with a timestamp on new schema creation [19:09:32] like [19:09:43] ts=12345678 [19:09:47] rev=5 [19:09:47] filename is [19:09:53] Schema.512345678.avsc [19:10:05] that way you are pretty sure it is unique, and you know its incremental [19:10:17] ottomata: we also need a away to flag the latest schema rev [19:10:29] just sort, no? [19:10:46] the EventBus stuff I'm doing can get you that [19:10:46] ottomata: right sorry [19:11:51] but, i'd like to not have to tie it to the http service [19:12:03] you shoudl be able to get what you need using local scripts or something, i think... [19:12:03] :/ [19:12:09] really though, the more and more we talk about this [19:12:19] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0] [19:12:31] it seems like anything simple still ends up being a good amount of work that will eventually be done in eventbus anyways :) [19:12:59] the more it seems we should be using tools that are already built [19:12:59] madhuvishy: linked me to this earlier today [19:12:59] https://github.com/schema-repo/schema-repo [19:12:59] it has rudimentary file based repo support [19:13:02] ebernhardson: yeah..>.>... well, eventually [19:13:17] keep in mind, eventlogging service != event-schema repo [19:13:31] event-schema repo should be useable independent of the service [19:13:37] and, i was telling dcausse, we can work on avro stuff in event-schemas repo now [19:13:39] i dont' have much time for it [19:13:51] ottomata: i can work on it next week [19:13:55] but yall are the ones really pushing it at the moment, so, up to you [19:14:06] that would becool [19:14:14] ebernhardson: we made the schema repo yesterday: mediawiki/event-schemas [19:14:16] ottomata: what's the plan to expose json schema from the repo (REST service)? [19:14:20] but i am not sure what timeline ebernhardson is on [19:14:27] and the team [19:14:48] dcausse: i'm trying to find a source link to send you........buuuuuut i think i did not set up gerrit -> github replication properly or something [19:14:54] well, we can just continue not really using the data in hive :) we still have all this same stuff logging to text files that are just more annoying to process. So we don't need to force a speedy timeline [19:15:23] but dcausse that is still to be worked out, but currently you can grab schemas by scid (name, revision), or just schema_name to get the latest one [19:16:13] ebernhardson: okay! 
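ebernhardson's PHP sketch above (`pack( 'n', $version ) . $message`) translates directly; below is the same framing idea in Python: a 2-byte big-endian schema revision prepended to the Avro payload, plus the consumer-side revision-to-schema map that Camus would need. All names are illustrative, not an actual refinery-camus implementation.

```python
import struct

# Consumer-side registry: revision -> parsed schema. Placeholder values here;
# in the chat this was imagined as CirrusSearchRequestSet.<rev>.avsc files
# shipped in src/resources or looked up from a schema repo.
SCHEMAS = {1: 'schema rev 1 (placeholder)', 2: 'schema rev 2 (placeholder)'}

def frame(rev, payload):
    """Prepend a 2-byte big-endian schema revision, like PHP's pack('n', ...)."""
    return struct.pack('>H', rev) + payload

def unframe(message):
    """Split a framed Kafka message back into (writer schema, payload)."""
    (rev,) = struct.unpack('>H', message[:2])
    return SCHEMAS[rev], message[2:]

msg = frame(2, b'<avro-encoded record bytes>')
schema, payload = unframe(msg)
print(schema, payload)
```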
ottomata lets talk about it tomorrow and may be make some tasks [19:16:26] i will have a working mic :P [19:16:57] ok, madhuvishy, i'd like to involve ebernhardson and/or dcausse. yall wanna set up a meeting to flesh this out? [19:17:05] yeah of course [19:17:09] sure [19:17:09] cool [19:17:11] nuria too if she can join [19:17:38] aye [19:17:48] madhuvishy: i'm not working on friday, so before then, or next week [19:18:07] ottomata: sure, anytime. https://issues.apache.org/jira/browse/AVRO-1124 could be suggested reading [19:18:18] ja [19:19:32] ebernhardson: an option dan suggested: just use meta.wm.org like evnetlogging does for jsonschema. that'll give us unique rev ids automatically, and an http service automatically [19:20:57] ottomata: seems fine to me, the only annoyance would be that i think the Schema namespace enforces EL type schemas? but we can put them somewhere else (but schema makes most sense) [19:21:41] yeah, had talked with ori before about adapting it to be more flexible, but since the avscs are just json, it should be pretty easy [19:21:58] we lose the ability to source control and review and CI them in our usual ways by doing it there [19:22:42] it's perhaps not the best location, but we still have to CR actually updating the code to produce with a new schema version [19:22:48] yea [19:22:49] it's a bit of a proxy, but might be "good enough" [19:25:07] i think its kind of a cool idea. [19:25:29] we also had talked about making the Schema: namespace know how to show scheams from a cloned file repo [19:25:40] and make them read only on the wiki [19:26:20] that sounds much harder :) [19:27:37] Analytics-Backlog: Allow metrics to roll up results by user across projects {kudu} [5 pts] - https://phabricator.wikimedia.org/T117287#1811873 (mforns) a:mforns [19:27:48] Analytics-Kanban: Allow metrics to roll up results by user across projects {kudu} [5 pts] - https://phabricator.wikimedia.org/T117287#1770440 (mforns) [19:28:20] madhuvishy: back [19:31:12] ottomata: let's not invent a schema registry rightttt? [19:31:27] :) [19:31:34] agree, but we are kinda already doing it... :/ [19:31:47] we already have one [19:31:51] that wmf is using [19:43:53] another thing we will want to consider, anomie's code to do api logging (which is going to end up in the kafka -> camus -> hadoop pipeline) was just merged today. This doesn't yet turn it on, but means the code will be deployed and ready to turn on [19:46:12] aye [19:46:30] (PS1) Addshore: Remove empty string from EU db list [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253659 [19:46:37] well, hm, i guess as long as they never change it it will work as is? :/ [19:46:43] (CR) Addshore: [C: 2 V: 2] Remove empty string from EU db list [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/253659 (owner: Addshore) [19:46:47] :( [19:47:27] ottomata: yup, if the schema doesn't change [19:47:30] it's fine [19:47:38] which is kinda lame [19:47:43] but yeah [19:49:39] madhuvishy: the schema will always change though [19:49:48] yup yup [19:49:54] madhuvishy: even if changes are only due to typos [19:50:06] no dispute there [19:50:22] just saying, in the current setup it would cause trouble [19:51:12] Hi all! Thanks for all the bikeshedding and help with graphite etc over the past weeks ;) [19:51:57] Here is some of the stuff that has come out of it so far in grafana! 
https://grafana.wikimedia.org/dashboard/db/wikidata-api-wbgetclaims https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch https://grafana.wikimedia.org/dashboard/db/wikidata-social-followers https://grafana.wikimedia.org/dashboard/db/wikidata-entity-usage [19:52:02] Many thanks! :D [19:53:16] COOL! [19:56:37] ;0 [19:57:41] dcausse: still there? [19:57:53] nuria: yes [19:58:26] dcause: let me understand your test [19:59:00] dcausse: you write a record with CirrusSearchRequestSet schema [19:59:15] the code that writes is @Ignore [19:59:25] it was just used to generate the binary file [19:59:34] then I updated the schema [19:59:38] dcausse: ah hhhh [19:59:43] and I tried to read the binary file again [19:59:47] dcausse: ok, now i get it sorry [19:59:55] dcausse: and you added teh "newField" [20:00:01] yes [20:00:43] I tried different variation ("null", "string" with null as default and "string", "null" with "" as default) [20:02:21] if it is going to work, it has to be with your current version, [20:02:24] dcausse: that is [20:02:39] dcausse: "null", "string" with null as default [20:03:28] what do you mean? [20:05:10] dcausse: that for a default to be null it has to be defined ["null", "string" ] [20:05:35] yes you're right, it's what I've done I think [20:13:45] dcausse: right, right [20:21:02] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1812149 (Halfak) Quick +1 for monthly datasets. I'd even take yearly! For a lot of my work, I just need to compare the relative rate of views for article over... [20:28:59] halfak: this explains far better what I tried to explain you about the repesentation for words and how it's generated: http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/ [20:35:33] ottomata (& halfak): thanks again for the info about monitoring load, i added it here https://wikitech.wikimedia.org/w/index.php?title=Analytics/Data_access&diff=204584&oldid=195253 [20:36:03] HaeB, \o/ [20:37:04] also, if either of you has immediate thought about this : https://wikitech.wikimedia.org/wiki/Talk:Analytics/Data_access - accessing analytics-store from stat1002 (instead of stat1003) doesn't work as advertised [20:37:41] ...not really urgent for me personally, but i thought it shoudl be corrected [20:42:11] HaeB, what you do mean "doesn't work"? [20:42:26] Oh I see. you linked to a talk post [20:42:42] Why is the file not in the same place on stat1002!? [20:43:02] it is there, but has different permissions [20:43:23] I see. [20:43:57] halfak: HaeB I added two lines to the page [20:44:16] HaeB: I din't remove your comment, feel free to remove it if you see fit [20:44:34] on stat1002, it's analytics-research-client.cnf [20:44:43] madhuvishy: can you change it at https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Analytics_slaves instead? [20:44:49] (that's where i quoted from) [20:45:01] on the talk page people won't see it [20:45:01] madhuvishy, seems like it should be the same. Is there a good reason it is different? [20:45:12] I imagine that there are duplicate roles in puppet [20:45:27] halfak: yeah i'm not sure what the reason is [20:45:39] (also it's a wiki taboo to change other people's comments, but don't worry, i won't report you to the admins ;) [20:45:55] WHAT! CALL JIMMY! 
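For reference, the two added-field variants dcausse describes in the test above would look roughly like this in the .avsc ("newField" is the name used in the chat; the enclosing record is the CirrusSearchRequestSet schema). Per the Avro spec, a field's default must match the first branch of the union, which is why a null default needs "null" listed first:

    {"name": "newField", "type": ["null", "string"], "default": null}

    {"name": "newField", "type": ["string", "null"], "default": ""}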
[20:45:56] HaeB: sorry I din't even realize it was a talk page [20:46:05] ;) [20:46:17] thought that was the actual page, I'll change it in the original [20:46:23] ;) [20:46:40] also, ottomata do you know why we have different conf files to access mysql-store on stat1002 and 1003? [20:47:04] on 1003 is research-client.cnf, and 1002 it's analytics-research-client.cnf [20:47:19] yes, different pws [20:47:23] right? [20:47:32] ottomata, same DB/user [20:47:36] So probably not? [20:47:38] i think different groups? yeah, because, we need research pw on both boxes [20:47:44] Yeah. Different groups. [20:47:46] and control access to the files via posix groups [20:47:53] but, posix groups also control access to boxes [20:48:02] Why not use the same groups? [20:48:10] so, if we wanted to chgrp to researchers group on stat1002 [20:48:11] madhuvishy: cool, that works [20:48:22] that would give access to stat1002 to everyone in the researchers group [20:48:44] HaeB: great, also fixed the wiki page [20:48:50] ottomata, should have a group for access to the DB and a group for access to stat3-like boxes and another for access to stat2-like boxes. [20:49:00] But I imagine that's a refactoring job that is low priority. [20:49:01] we don' thave a way to make a group exist on a box, without also making those users that puppet sees in that group have access on thta box [20:49:15] Oh.... Weird. [20:49:15] yes [20:49:20] we do have that, ssorta [20:49:32] 'researchers' is access to db via research-client.cnf, but also gives access to stat1002 [20:49:43] sorry* [20:49:45] to stat1003* [20:49:51] it is very confusing [20:49:53] i know. [20:49:58] but yes, refactoring is low priority [20:50:58] yo milimetric et al, how do I use pageviews awesomeness with mediawiki? Based on API's " For projects like commons without language codes, use commons.wikimedia" it sounds like "mediawiki.wikimedia" would work, but no. [20:51:22] I mean www.mediawiki.org pages [20:51:34] ah! [20:51:36] right [20:51:38] hm.... [20:52:24] no worries. thanks for the clarifications ottomata [20:53:08] AIUI the db is mediawikiwiki, so www.mediawikiwiki.wikiwikimediapedia ? :-) [20:54:47] :) [20:54:56] I'm confuzzled too, /me looking [20:55:11] (gotta run a quick hive query to figure it out) [20:57:11] spagewmf: just mediawiki? [20:58:14] madhuvishy: yay, that worked! The API help string is misleading. [20:58:26] spagewmf: cool [20:58:34] if you wanna find out you can run something like [20:58:37] select normalized_host from wmf.webrequest where uri_host='www.mediawiki.org' and year=2015 and month=11 and day=13 and hour=14 limit 1; [20:59:15] madhuvishy: where do I file the bug about Pageviews [20:59:35] spagewmf: for the API help? Analytics-Backlog [21:00:02] spagewmf: https://gist.github.com/milimetric/0c4746306419f225921d [21:00:18] ottomata: is https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop something you would recommend trying for executing large eventlogging queries faster? [21:00:23] that's a list of all distinct wikis. mediawiki indeed appears there, but I had tried that and it didn't work in the per-project aggregate [21:00:26] weirdddd. 
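A small sketch of using the per-host defaults file mentioned above to reach analytics-store. The full paths, the hostname and the pymysql client are assumptions for illustration (the chat only gives the file base names), so check the Data_access page for the canonical values.

    import pymysql

    # stat1003: /etc/mysql/conf.d/research-client.cnf
    # stat1002: /etc/mysql/conf.d/analytics-research-client.cnf  (same DB user, different group/password)
    conn = pymysql.connect(
        read_default_file='/etc/mysql/conf.d/analytics-research-client.cnf',  # assumed path
        host='analytics-store.eqiad.wmnet',                                    # assumed host
        database='log',
        charset='utf8',
    )
    with conn.cursor() as cur:
        cur.execute('SELECT 1')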
[21:00:41] (distinct projects taken from the data that feeds the API I mean) [21:00:48] milimetric: spagewmf just said it worked [21:00:54] i saw yea [21:01:54] oh I see, it works https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/mediawiki/all-access/all-agents/daily/2015110100/2015110200 [21:02:03] we just don't have monthly granularity for that endpoint yet [21:02:08] cool! [21:06:12] dcausse: still there? [21:06:22] dcausse: still there? [21:08:07] HaeB: i do not think you will find all your data there, persistance is not teh same than db [21:08:50] *the [21:10:06] nuria: the linked documentation already says "NOTE: EventLogging data in HDFS is auto purged after 90 days.", is that what you mean? [21:10:21] HaeB: yeah, we do not have a lot of data there - it will be fairly recent [21:10:26] (already saw that, it wouldn't be a concern in my case) [21:10:41] the purging after 90 days happens in both mysql-store and hadoop [21:11:49] HaeB: data in Hadoop starts from Aug 28th [21:13:13] HaeB: as of the speed i would not assume is faster, i was just caution about amount of data. [21:13:42] dcausse, ottomata , madhuvishy , looked at avro binary encoding a bunch [21:14:10] https://www.irccloud.com/pastebin/8YuPGZ1B/ [21:14:33] second record is same data with an additional field (last one) that can be null, see \0 [21:14:38] (PS1) Madhuvishy: [WIP] Setup celery task workflow to handle running reports for the Global API [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/253750 (https://phabricator.wikimedia.org/T118308) [21:14:46] no worries regarding the amount of data (the table i would be interested in was started after aug 28 and all the partitions seem present in the corresponding folder at /mnt/hdfs/wmf/data/raw/eventlogging/... ) [21:15:11] HaeB: cool then! [21:15:16] nuria: ok, so no speed benefits from using hadoop/mapreduce instead of mariadb? [21:15:25] (CR) jenkins-bot: [V: -1] [WIP] Setup celery task workflow to handle running reports for the Global API [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/253750 (https://phabricator.wikimedia.org/T118308) (owner: Madhuvishy) [21:16:07] (+ ottomata ^ ) [21:16:44] ? [21:17:07] HaeB: hadoop will only be faster than mysql for large datasets [21:17:12] for small ones, mysql is probably faster [21:17:21] there's a lot of overhead in doing hadoop stuff, and it is low latency, but distributed [21:17:27] nuria: ok... [21:17:30] what's that mean? 
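The working per-project aggregate call from above, as a short sketch; it uses requests for illustration and assumes the usual items/views shape of the response. Note the project string is plain "mediawiki", and at the time only daily granularity was available on this endpoint.

    import requests

    url = ('https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/'
           'mediawiki/all-access/all-agents/daily/2015110100/2015110200')
    resp = requests.get(url)
    resp.raise_for_status()
    for item in resp.json().get('items', []):
        print(item['timestamp'], item['views'])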
[21:17:34] HaeB: I guess you would need to try it, i wouldn't assume so, hadoop is not fast , it can just tackle a lot of data at once [21:18:07] ottomata: it means that avro ALWAYS expects a byte to be present for a field [21:18:15] ottomata: that can be null [21:18:39] ottomata: let me explain better, something needs to be there, the field cannot be not present altogether [21:18:50] milimetric: do you have 5 minutes to talk real quick, I have some questions on getting these wikimetrics reports results in one place [21:18:58] i'm looking at the MobileWebSectionUsage schema/table which already is 12GB or so [21:19:18] milimetric: I have a mic now :) [21:19:32] one query over one day of data took over 2h, so that's why i thought it might benefit from partitions/parallelization [21:19:47] HaeB: you are welcome to just compare and try :) [21:19:49] https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop [21:19:58] i find spark a little easier to work with json data right away [21:20:04] HaeB: ya I'd expect a query over 1 day to be a lot faster than 2 hours [21:20:08] in hadoop [21:20:12] but dunno [21:20:38] ottomata: by total coincidence, that's where i got the idea from in the first place ;) [21:20:39] https://www.irccloud.com/pastebin/iImQ4l1v/ [21:20:42] ("[13:00] ottomata: is https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop something you would recommend trying for executing large eventlogging queries faster?") [21:20:51] thanks all, might try that later [21:21:05] :) [21:21:13] ottomata, madhuvishy : if we add a field to schema like: [21:21:21] https://www.irccloud.com/pastebin/ebwRxB2e/ [21:21:49] the newer schema will not be able to load old data because that field being null means [21:22:30] ottomata, madhuvishy that avro expects a "null" placeholder on the data [21:22:36] cc dcausse [21:22:39] that's lame [21:23:07] what is the point of all the union and default values then [21:23:13] madhuvishy: and not obvious at all [21:23:16] madhuvishy: right [21:23:17] yeah [21:23:44] madhuvishy: i am going to spend a bit more looking at this but yes, it no work [21:24:25] okay [21:25:15] this is what makes avro hard to work with :/ i don't know what good reason exists behind this restriction [21:26:12] yeah [21:28:07] nuria: yes :( [21:28:48] Analytics-EventLogging, Analytics-Kanban, EventBus, Patch-For-Review: Make eventlogging logs configurable via python config file - https://phabricator.wikimedia.org/T118903#1812283 (Ottomata) NEW a:Ottomata [21:28:59] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [30.0] [21:28:59] madhuvishy: here's another quick and easy task that would be very helpful [21:28:59] https://phabricator.wikimedia.org/T118903?workflow=118780 [21:29:16] nuria: ottomata we should just change the avro libraries to decode this sorta thing :P [21:29:23] heh uh [21:31:08] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0] [21:34:17] ottomata: cool i'll see if i can do that later today [21:34:59] mforns: btw my super initial patch is here - https://gerrit.wikimedia.org/r/#/c/253750/ [21:35:10] madhuvishy, ok :] [21:35:19] it's a bit hacky in parts but if you go to /reports/global/create [21:35:30] ok [21:35:40] and put in stuff, and watch the queue logs - it will upload and validate the cohort and launch four 
reports [21:35:44] all of which fail [21:35:46] but whatever [21:35:54] cool, I'll have a look, probably tomorrow [21:36:15] anytime, just wanted to let you know if you if it'd help with your patch [21:36:23] sure! thanks for sharing :] [21:36:40] I found a place for my code, I hope I'm in the right path [21:37:06] mforns: okay great, we can always refactor if needed :) [21:37:20] ok! [21:37:43] dcausse: cause I never give up i am going to spend a bit more looking at this but I think you are right, it is build in that to add a field is NOT backwards compatible if you encode in binary [21:38:30] nuria: it's fine with json encoding? [21:46:12] Analytics-EventLogging, Analytics-Kanban, EventBus, Patch-For-Review: Deploy eventlogging from new repository. - https://phabricator.wikimedia.org/T118863#1812318 (Ottomata) Deployment works. Next up: sudo python setup.py install from master on deployment-eventlogging03 and restart eventlogging an... [22:01:21] kevinator: I'm not at office so joined hangout for our 1-1 [22:01:23] madhuvishy: sorry, I'm not getting pings for some reason. I can talk, you still wanna talk? [22:01:35] oh - after your 1/1 [22:01:37] Analytics-Backlog, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1812342 (Legoktm) [22:02:54] milimetric: 1-1 now, yup! I'll ping you [22:06:55] a-team, see you tomorrow! [22:07:07] nite mforns [22:11:33] Analytics-Backlog, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1812375 (Legoktm) This shouldn't be too difficult to do. Does EventLogging have a max limit on the "string"... [22:16:33] Analytics-Backlog, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1812384 (Sadads) If it does, I can't think of a reason why we would need supper long urls: there is a point... [22:25:25] milimetric: now? [22:25:35] omw [22:32:18] Analytics-Backlog, Discovery, Reading-Infrastructure-Team: Determine proper encoding for structured log data sent to Kafka by MediaWiki - https://phabricator.wikimedia.org/T114733#1812430 (bd808) @EBernhardson you ended up using binary Avro for T103505 right? Is that working well enough to say that it i... [22:44:19] Analytics-Backlog, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1812446 (Legoktm) a:Legoktm [23:08:49] Analytics-Tech-community-metrics: What is contributors.html for in korma? - https://phabricator.wikimedia.org/T118522#1812459 (Aklapper) > the last quarter covered is 2015 Q1. It is 2015 Q4 now. To be fixed / updated in https://github.com/Bitergia/mediawiki-dashboard/pull/72 [23:09:36] Analytics-Backlog, Discovery, Reading-Infrastructure-Team: Determine proper encoding for structured log data sent to Kafka by MediaWiki - https://phabricator.wikimedia.org/T114733#1812461 (EBernhardson) Avro is pretty awesome for some things (especially data size), but we are still working out issues re... 
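A sketch of the resolution behaviour nuria and dcausse are hitting, using fastavro and a deliberately simplified two-field record for illustration (the real CirrusSearchRequestSet has many more fields, and the chat is about the Java tooling, but the binary-encoding rules are the same): old bytes decode fine when the writer schema is supplied alongside the new reader schema, but decoding them as if they had been written with the new schema makes the reader look for the union byte of newField that was never written.

    import io
    import fastavro

    OLD = fastavro.parse_schema({
        'type': 'record', 'name': 'CirrusSearchRequestSet',
        'fields': [{'name': 'query', 'type': 'string'}],
    })
    NEW = fastavro.parse_schema({
        'type': 'record', 'name': 'CirrusSearchRequestSet',
        'fields': [
            {'name': 'query', 'type': 'string'},
            {'name': 'newField', 'type': ['null', 'string'], 'default': None},
        ],
    })

    buf = io.BytesIO()
    fastavro.schemaless_writer(buf, OLD, {'query': 'foo'})
    old_bytes = buf.getvalue()

    # Works: decode with the writer schema, resolve against the reader schema;
    # newField comes back filled with its default.
    rec = fastavro.schemaless_reader(io.BytesIO(old_bytes), OLD, NEW)
    print(rec)  # {'query': 'foo', 'newField': None}

    # Breaks: pretending the old bytes were written with the new schema, which is
    # effectively what reading historical data with only the updated .avsc does.
    # fastavro.schemaless_reader(io.BytesIO(old_bytes), NEW)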
[23:26:41] madhuvishy: with json encoding we will not have this issue, correct [23:27:03] hmmm [23:27:15] wonder if the search team can switch to json [23:33:02] (CR) Nuria: [C: 2 V: 2] Add UDF for network origin [analytics/refinery/source] - https://gerrit.wikimedia.org/r/253046 (https://phabricator.wikimedia.org/T118592) (owner: BryanDavis) [23:33:16] (CR) Nuria: [C: 2 V: 2] Rename ipAddressMatcherCache -> trustedProxiesCache [analytics/refinery/source] - https://gerrit.wikimedia.org/r/253045 (https://phabricator.wikimedia.org/T118592) (owner: BryanDavis) [23:51:21] btw, hexdump -c is our friend [23:55:23] nuria: I missed updating a test for the enum change. Working on a patch to fix it now
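On the "with json encoding we will not have this issue" point above: a trivial sketch (not the EventLogging or Avro JSON encoder itself) of why JSON-style records sidestep the problem. Field names travel with the data, so a consumer can fall back to a default when an older event simply lacks the new field, instead of mis-reading bytes.

    import json

    old_event = json.loads('{"query": "foo"}')                      # produced before newField existed
    new_event = json.loads('{"query": "bar", "newField": null}')

    for event in (old_event, new_event):
        print(event.get('newField'))                                # None for the old event, by default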