[00:00:58] Quarry: Add 'download in HTML format' option (Quarry) - https://phabricator.wikimedia.org/T117644#1787266 (XXN) yes ...and this will be very useful. HTML table can be used on external personal websites, on WMFLabs as well as on all MediaWiki wikis.
[00:02:17] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1787267 (ezachte) So I checked with hive query on pageviews_hourly {F2920445} Close but no cigar. To do: -Need to check at le...
[00:52:56] Analytics-Tech-community-metrics, DevRel-November-2015: Backlogs of open changesets by affiliation - https://phabricator.wikimedia.org/T113719#1787348 (Aklapper)
[05:27:46] Analytics, Language-Engineering: Investigate anomalous views to pages with replacement characters - https://phabricator.wikimedia.org/T117945#1787598 (Tbayer) NEW
[06:04:16] Analytics-Kanban, Patch-For-Review: Analytics support for echo dashboard task {frog} [8 pts] - https://phabricator.wikimedia.org/T117220#1787620 (matthiasmullie) It looks like all 3 charts are using only 1 data source (http://datasets.wikimedia.org/limn-public-data/metrics/echo/monthly_production_and_consu...
[09:54:26] Analytics-Kanban, Patch-For-Review: Add avro schema to refinery-camus jar [3] {hawk} - https://phabricator.wikimedia.org/T117885#1787827 (JAllemandou)
[09:54:47] Analytics-Kanban, Patch-For-Review: Add avro schema to refinery-camus jar [3 pts] {hawk} - https://phabricator.wikimedia.org/T117885#1785987 (JAllemandou)
[10:00:11] (CR) Joal: [C: -1] "As a reminder: For this job to work, camus cron needs to be updated in puppet --> add '--check' option (as in webrequest cron command) for" [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (https://phabricator.wikimedia.org/T117575) (owner: DCausse)
[10:06:02] joal: hi! should I do something for the --check flag?
[10:32:51] having troubles with oozie, oozie jobs -len 2 is stuck and doesn't display anything
[10:34:30] oops sorry, it just worked, just took a bit long to run
[10:49:07] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1787865 (Qgil) Today is November 6, and this proposal is basically not on track. Unless the situation suddenly changes and/or @robla-wmf and the Architecture Co...
[10:55:09] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1787899 (akosiaris)
[11:12:37] (PS6) DCausse: Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (https://phabricator.wikimedia.org/T117575)
[11:15:24] (CR) DCausse: "I tested the job on stat1002 and fixed a few issues in coordinator.xml." [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (https://phabricator.wikimedia.org/T117575) (owner: DCausse)
[11:30:28] Analytics-Kanban: Understand the Perl code for "Visiting Country per Wiki" report {lama} - https://phabricator.wikimedia.org/T117247#1787980 (JAllemandou) Didn't read the code, but did a detailed review of the report. My view of the thing summarized in "Pageviews Per Country" section( first) of [[ https://et...
[11:32:10] Analytics-Kanban: Understand the Perl code for this report {lama} - Visiting Country per Wikipedia Language {lama} - https://phabricator.wikimedia.org/T117244#1787987 (JAllemandou)
[11:33:35] Analytics-Kanban: Understand the Perl code for this report {lama} - Visiting Country per Wikipedia Language {lama} - https://phabricator.wikimedia.org/T117244#1769498 (JAllemandou) Collected data on this report when going over https://phabricator.wikimedia.org/T117247. See last report of the first section (Pa...
[11:33:48] Analytics-Kanban: Understand the Perl code for this report {lama} - Visiting Country per Wikipedia Language {lama} - https://phabricator.wikimedia.org/T117244#1787992 (JAllemandou) a:JAllemandou
[11:37:25] Analytics-Kanban: Understand the Perl code for this report - Visiting Country per Wikipedia Language {lama} - https://phabricator.wikimedia.org/T117244#1787997 (JAllemandou)
[11:38:07] Analytics-Kanban: Understand the Perl code for this report - visiting country {lama} - https://phabricator.wikimedia.org/T117243#1787998 (JAllemandou) a:JAllemandou
[11:39:00] Analytics-Kanban: Understand the Perl code for this report - visiting country {lama} - https://phabricator.wikimedia.org/T117243#1769490 (JAllemandou) Collected data on this report when going over https://phabricator.wikimedia.org/T117247. See first report of the first section (Pageviews per country) of [[ ht...
[12:32:27] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1788081 (JanZerebecki) Or phrased differently: For any solution for archiving of metrics (those we can not recreate) we need to provide an answer to th...
[13:23:59] ohhh ! Hi dcausse
[13:24:04] joal: morning
[13:24:05] Sorry I missed your ping
[13:24:10] hey mili|gone
[13:24:13] I see really weird things going on with restbase
[13:24:21] you're not gone very far ;)
[13:24:23] :)
[13:24:31] ok, what do you mean ?
[13:24:56] from my browser, https://<> doesn't work
[13:25:10] from my browser, http://<> works, redirects to https
[13:25:12] I have not tried that milimetric
[13:25:28] I only use : https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Godzilla/daily/2015101400/2015101400
[13:25:32] for instance
[13:25:50] from curl, sometimes https://<> works, sometimes it gives a 503 "service unavailable"
[13:25:50] this works straight away for me
[13:25:55] :(
[13:26:07] ok, now that https link works for me too
[13:26:13] hm
[13:26:30] but this isn't the first time, I thought it was a fluke before
[13:26:39] it seems to be a persistent fluke
[13:26:59] hm... something is wrong with the internets somehow
[13:27:17] The internets are weird, you know that better than anyone I guess :)
[13:27:21] anyway, checking for the new column now
[13:27:32] milimetric: let's investigate in restbase logs maybe?
[13:28:06] milimetric: still not here in cassandra
[13:28:07] i think aqs1001 is down because of the recent db table changes
[13:28:25] i'm exploring a bug now which might be related
[13:28:35] right, i just realized this
[13:28:50] it says the status of restbase and cassandra are both running
[13:29:00] but it doesn't respond to curl localhost:7231
[13:29:00] mobrovac: other issue: the compaction I changed manually is back to leveled now that restbase has restarted
[13:29:04] yes, because the processes are up
[13:29:13] uncool :/
[13:29:33] ah joal
[13:29:34] damn
[13:29:47] might be because of some recent changes we did in the driver
[13:29:54] but, one thing at a time
[13:32:06] one thing at a time is ok. Let us know if we can help mobrovac
[13:32:23] kk
[13:36:31] mobrovac: shall I change the compaction back or will there be a new deploy soon ?
[13:39:05] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1788179 (Milimetric) @ezachte, I checked yesterday a little bit, by looking at vital signs data [1] which is what this chart [2]...
[13:46:46] Analytics-Kanban, Patch-For-Review: Analytics support for echo dashboard task {frog} [8 pts] - https://phabricator.wikimedia.org/T117220#1788189 (Milimetric) of course it is, 'cause limn's crazy :) I think I know what's going on, I'll try to fix now.
[14:00:01] develop/9a27a44 (#181 by milimetric): The build has errored. https://travis-ci.org/wikimedia/limn/builds/89646007
[14:00:29] Analytics-Kanban, Patch-For-Review: Analytics support for echo dashboard task {frog} [8 pts] - https://phabricator.wikimedia.org/T117220#1788221 (Milimetric) Ok, fixed. If you're curious: https://github.com/wikimedia/limn/commit/9a27a4418e025edf3af32208178e581f0661066f
[14:01:09] master/9a27a44 (#182 by milimetric): The build has errored. https://travis-ci.org/wikimedia/limn/builds/89646031
[14:01:26] thanks travis :P
[14:10:37] joal: you can change it back, we'll then put the strategy in the tables' defs
[14:10:51] mobrovac: awesome, thank you :)
[14:11:19] mobrovac: something else: the replication factor :)
[14:11:32] what about it?
[14:12:35] we changed it to 2, now back to 3 :)
[14:12:43] mobrovac: --^
[14:12:57] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1788236 (Milimetric) @Qgil: I wanted to wait until the Pageview API was announced publicly by the communications team through a blog post. However, I'll update...
[14:14:06] joal: actually, don't change the strategy just yet
[14:14:14] k, waiting then
[14:14:33] joal: i need to start/stop/start/stop restbase on aqs1001 to make sure i know what the problem is
[14:14:49] mobrovac: please go on, nothing critical running
[14:14:53] kk
[14:14:58] * mobrovac doing it
[14:16:14] thanks milimetric for the answer to Qgil
[14:16:25] I sent nuria an email asking her for guidelines
[14:17:05] joal: I'll handle this. I thought Nuria had an understanding with Qgil, but it's clear that he expected something else from us
[14:17:15] yup milimetric
[14:17:26] I'll take over the wikistats report thing
[14:17:27] I will be updating the task now and cc-ing as many people as possible
[14:17:30] joal: milimetric: who did the update of the aqs deploy repo?
[14:17:38] and then I'll beat Qgil at chess so he stops being mean to us
[14:17:42] :P
[14:17:48] hehehe milimetric :)
[14:17:49] mobrovac: me
[14:17:59] I could try with go, but I'm bad at chess :)
[14:18:03] I had problems with the git commands
[14:18:03] did you supply --force to the build script?
[14:18:11] i did not
[14:18:14] kk
[14:18:28] milimetric: it's better to do so always
[14:18:41] i follow this guide: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/AQS#Deploying
[14:18:47] I'll update with --force now
[14:18:51] kk gr8
[14:18:56] btw, the --review plain never works mobrovac
[14:19:18] milimetric: probably because you haven't set your git user in the deploy repo
[14:19:19] it works for me
[14:19:54] milimetric: will you run the build script or should i?
[14:19:57] RAHHHH mobrovac : git push --force -- http://i.imgur.com/R7tEQPA.gif
[14:20:16] loool joal
[14:20:50] mobrovac: I can run it and try to fix the git user thing...
[14:22:22] mobrovac: git config user.name and user.email are set, is there anything else it needs?
[14:22:41] hm strange
[14:22:42] re
[14:23:18] it's not like it fails to do anything, it just hangs if I run it with --review
[14:23:24] and exits normally otherwise
[14:23:32] h
[14:23:34] hm
[14:23:34] anyway, I'll redo the server.js --build-repo --force
[14:23:40] we'll need to inspect that later
[14:23:52] k, one thing at a time :)
[14:23:59] milimetric: server.js build --deploy-repo --force
[14:24:13] that's what i mean, sorry
[14:24:33] joal: you want SizeTieredCompactionStrategy ?
[14:24:38] on which tables?
[14:24:50] Yessir here are the details: ALTER TABLE "local_group_default_T_pageviews_per_article_flat"."data" WITH compaction = {'max_sstable_age_days': '365', 'base_time_seconds': '60', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'};
[14:25:02] Only this table needs it
[14:25:11] The rest is small enough not to be any issue
[14:25:40] k, mobrovac that looks like what went wrong: https://gerrit.wikimedia.org/r/#/c/251500/
[14:25:48] oops mobrovac, actually not size tiered, but date tiered
[14:25:53] and ottomata needs to deploy again now
[14:26:10] yeah sorry joal, that's what i meant
[14:26:15] np mobrovac :)
[14:27:35] joal: hm, we've got { 'class': 'DateTieredCompactionStrategy', 'base_time_seconds': '45', 'tombstone_threshold': '0.02', 'unchecked_tombstone_compaction': 'true' }
[14:27:40] would that work for you?
[14:28:31] mobrovac: we don't delete (almost never), so the tombstone stuff I don't really know
[14:28:41] base time seconds 45 is fine, yes
[14:28:44] k
[14:29:06] mobrovac: whatever, short enough for compaction to happen very regularly on small amounts
[14:29:14] kk
[14:29:18] good :)
[14:41:41] joal: trying to test email notification, I set timeout to 1 min and run the job on a folder without the _IMPORTED flag. The job fails with DONEWITHERROR but no mails are sent: https://phabricator.wikimedia.org/P2287
[14:41:54] are mails sent when jobs fail with a timeout?
[14:42:24] very good question dcausse, I think we have not tested that case
[14:42:28] joal: milimetric: https://github.com/wikimedia/restbase/pull/400
[14:43:13] mobrovac: what's that dtcs fanciness?
[14:43:30] milimetric: joal can explain it :)
[14:43:39] joal: ok I'll try to simulate another error (I'll drop the table)
[14:43:40] * mobrovac is back to hunting bugs
[14:43:50] milimetric: DateTieredCompaction
[14:43:54] dcausse: ok
[14:44:52] joal: please review it, and if it's ok we can merge it so that it converts the CS automatically on the next deploy
[14:44:58] which is bound to happen today anyway
[14:45:33] joal: can merge that cirrussearch camus cron change, should I go ahead?
[14:45:44] please ottomata :)
[14:46:10] mobrovac: Shall I trust you about the pattern: 'timeseries', or do you want me to go deeper ?
[14:47:34] joal: here's the def of "timeseries": https://github.com/d00rman/restbase-mod-table-cassandra/blob/7e717b89060809558c346955f8e2507dd7d3f948/lib/dbutils.js#L945-L959
[14:48:18] ok mobrovac, good to go for me :)
[14:48:49] joal: fwiw, i checked it works locally
[14:48:52] kk, merging
[14:48:59] awesome, thanks mobrovac
[14:49:17] joal: actually, could you just state on the PR that it's LGTM ?
[14:49:27] mobrovac: just a reminder after compaction, replication factor ?
[14:49:39] ah right
[14:49:41] you want 2?
[14:49:48] yes please :)
[14:49:53] but 1 commit at a time :)
[14:49:59] joal: for all tables?
[14:50:06] Just commented on the thread mobrovac
[14:50:32] dcausse: puppet change about camus --check merged
[14:50:42] thanks! :)
[14:50:46] I'll monitor to see if _IMPORTED is correctly created
[14:50:46] joal: all tables or just flat?
[14:51:02] only flat keyspace, the rest is peanuts
[14:52:58] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1788326 (Milimetric) a:Milimetric
[14:53:20] joal: merged and applied on an27
[14:53:56] awesome ottomata, will monitor next camus run
[14:57:07] joal: damn, only 1 and 3 replicas are supported implicitly in the storage module
[14:57:21] i propose we deal with it next week
[14:57:32] mobrovac: no prob, I'll change it manually for now
[14:57:45] kk joal
[14:57:47] mobrovac: I'll wait for next deploy and change it after
[14:57:58] yup, makes sense
[14:58:10] joal, i'm merging the DTCS PR as soon as travis declares all good
[15:41:27] joal: milimetric: ottomata: https://gerrit.wikimedia.org/r/#/c/251514/
[15:41:36] when you deploy that aqs1001 should come up
[15:41:46] and joal, it contains the switch to DTCS
[15:42:26] I'll let milimetric and ottomata deploy (they are my master deployers), and will check keyspace config (and change replication factor)
[15:42:35] mobrovac: --^
[15:42:38] Thanks mobrovac :)
[15:42:48] kk
[15:42:50] np joal
[15:43:09] i should deploy?
[15:43:20] ottomata: if milimetric says so
[15:43:33] Chain of command guys :)
[15:43:43] haha
[15:43:52] i wish i wasn't a deploy proxy
[15:43:58] right
[15:43:58] :(
[15:44:06] this is *no* chain of command
[15:44:09] this is a comedy of errors
[15:44:13] :D
[15:44:33] Maybe a chainsaw of command :)
[15:44:35] argh: Encountered AvroSerdeException determining schema.
Returning signal schema to indicate problem: Unable to read schema from given path: jar:file:/home/dcausse/refinery-camus-0.0.23-SNAPSHOT.jar!/CirrusSearchRequestSet.avsc
[15:44:36] ok, so that last change from marko's merged
[15:44:48] so ottomata you can do the ansible-deploy thing anytime
[15:44:59] and let's hope it comes up this time
[15:45:01] arf dcausse :(
[15:45:04] at runtime (executing a query) the jar:file trick does not work
[15:45:17] unless this jar has to be present on all nodes
[15:45:21] Told you to test with data dcausse :-P
[15:45:24] huhu
[15:45:37] refinery camus is deployed on all nodes?
[15:45:42] first question: have you added the jar to the session ?
[15:45:46] running --check
[15:46:26] dcausse: --^
[15:46:33] joal: no?
[15:46:41] not sure how to do that
[15:47:01] ok ! first thing first: ADD JAR /local/path/to/jar.jar; in hive
[15:47:05] dcausse: --^
[15:47:09] !log deploying aqs
[15:47:10] Might help :)
[15:47:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[15:47:13] ok
[15:49:41] dcausse: hdfs dfs -ls /wmf/data/raw/mediawiki/mediawiki_CirrusSearchRequestSet/hourly/2015/11/06/14
[15:49:44] milimetric: , joal, mobrovac, aqs deploy successful
[15:49:50] \o/
[15:49:57] dcausse: --> _IMPORTED created :)
[15:50:01] thx all
[15:50:06] ottomata: camus for media with check is fine
[15:50:10] joal: \o/
[15:50:12] Thanks for the merge / deploy
[15:50:16] :D
[15:50:20] Happy friday !
[15:50:29] yey
[15:51:13] yeehaw!
[15:51:51] !log Change replication factor to 2 in cassandra per_article_flat keyspace
[15:51:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[15:52:20] mobrovac: compaction config ok :)
[15:52:20] I think I'll try the STORED AS AVRO keyword to avoid this schema dependency :/
[15:52:43] joal: \o/
[15:52:45] hmmm dcausse
[15:53:01] I don't know enough about avro, but whatever works is good !
[15:53:13] thanks a lot mobrovac for the Friday fix !
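[editor's note] The ALTER TABLE statement joal pastes above (switching the flat per-article table to DateTieredCompactionStrategy) is mechanical enough to generate from an options map. A minimal Python sketch of that idea; the helper is hypothetical, not the tooling actually used:

```python
# Hypothetical helper: render the CQL ALTER TABLE statement that switches a
# Cassandra table's compaction strategy, as discussed in the log above.
def alter_compaction(keyspace, table, options):
    # CQL writes the compaction options as a map of single-quoted strings.
    opts = ", ".join("'%s': '%s'" % (k, v) for k, v in sorted(options.items()))
    return 'ALTER TABLE "%s"."%s" WITH compaction = {%s};' % (keyspace, table, opts)

stmt = alter_compaction(
    "local_group_default_T_pageviews_per_article_flat",
    "data",
    {
        "class": "org.apache.cassandra.db.compaction.DateTieredCompactionStrategy",
        "base_time_seconds": "60",
        "max_sstable_age_days": "365",
    },
)
print(stmt)
```

The restbase change discussed here did the equivalent declaratively, via the `timeseries` table pattern, so the strategy survives redeploys instead of being reset as it was at 13:29.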
[15:53:39] also milimetric mobrovac, new v field in per-project table, typed as bigint :)
[15:53:49] Yay \o/
[15:54:00] :)
[15:55:29] querying seems to work too
[15:55:33] joal: mind testing if https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Breakfast/daily/2015100100/2015101000 works?
[15:55:51] Works for me milimetric
[15:55:56] what... the hell...
[15:55:59] this works for me: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/What/daily/2015100100/2015101000
[15:56:09] so same exact url with "What" instead of "Breakfast"
[15:56:12] cache turned off, everything
[15:56:33] and the Breakfast one doesn't milimetric ?
[15:56:48] Man, if no breakfast, stay in bed !
[15:56:52] yes, Breakfast one does a google search!
[15:57:00] WHAT IS GOING ON
[15:57:05] :)
[15:57:07] This is chrome shiiiite I guess
[15:57:18] good idea, let's try ff
[15:57:40] works in FF
[15:57:44] ;)
[15:57:50] f-ing Chrome
[15:57:54] ok, well, w-ever
[15:58:10] joal: what did you think about my updates for the summit proposal?
[15:58:34] milimetric: same as last week: I'll keep my weekend quiet and will start backfilling new V values on monday, ok ?
[15:59:07] * joal reads
[16:00:10] yeah, no worries, no rush on the monthly numbers
[16:00:22] milimetric: Task description is great !
[16:00:27] Thanks mate
[16:01:21] but q above ^
[16:02:08] ? milimetric
[16:04:13] :) what?
[16:04:16] good?
[16:04:48] man, I'm lost milimetric :) What do you mean by 17:01:21 < milimetric> but q above ^
[16:05:09] just this: joal: what did you think about my updates for the summit proposal?
[16:05:52] Right milimetric, answer got lost in the middle: 17:00:22 < joal> milimetric: Task description is great !
[16:10:13] so, joal, cool! you are using the --check to make an _IMPORTED flag on the Cirrus data?
[16:12:30] STORED AS AVRO seems to work
[16:13:48] ottomata: correct
[16:14:03] ottomata: it works, but timestamp data is still wrong
[16:14:42] milimetric: Thanks a lot for the update on the summit task :)
[16:15:53] ah, joal how is it wrong?
[16:16:24] ottomata: timestamp is "job" timestamp, not data timestamp (remember yesterday, we talked about that)
[16:16:43] ottomata: https://gerrit.wikimedia.org/r/#/c/251267/
[16:17:53] HM! interesting.
[16:46:29] hey a-team
[16:47:24] hallo!
[16:52:25] so does anyone have good examples of UDFs that operate over arrays?
[16:52:34] the Cirrus data is stored in a semi-nested way and I want to be able to process it :)
[16:53:02] Ironholds: use LATERAL VIEW ?
[16:53:11] joal, does that work within Java UDFs?
[16:53:26] Ironholds: Hive way to explode arrays as rows
[16:53:27] this is for ongoing reporting and ETL processes
[16:53:32] aha
[16:54:35] Analytics-General-or-Unknown, Database, Patch-For-Review: Create a table in labs with replication lag data - https://phabricator.wikimedia.org/T71463#1788695 (jcrespo) p:Triage>Low The table exists already, but for some reason it does not show all shards. I will have to investigate later.
[16:55:44] Analytics-EventLogging, Database: db1046 innodb signal 6 abort and restart - https://phabricator.wikimedia.org/T104748#1788699 (jcrespo) p:Triage>Low
[16:57:36] ideally I'd like to be able to write it as UDFs to minimise faffing around on the user end. Hrn.
[16:59:09] Ironholds: my view on UDF vs hive - If it's easy in hive, why use a UDF ?
[16:59:26] Ironholds: So, as I don't know how complex the things you want are, I can't really say :)
[17:00:32] joal, outputs based around multiple regex-based nested conditionals operating over multiple elements ;p
[17:00:38] a-team: anyone in standup?
[17:00:46] am i in a parallel world?
[17:00:58] sorry milimetric
[17:00:58] coming!
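[editor's note] joal's LATERAL VIEW suggestion is Hive's way of turning array elements into rows, e.g. `SELECT query_id, hit FROM t LATERAL VIEW explode(hits) h AS hit`. A rough illustration of those semantics, with Python standing in for Hive and wholly hypothetical rows and column names:

```python
# Illustrative sketch of what Hive's LATERAL VIEW explode() does to an array
# column: each array element becomes its own row, with the other columns
# repeated. Rows and column names are made up for the example.
rows = [
    {"query_id": "q1", "hits": ["Godzilla", "Godzilla_(1998_film)"]},
    {"query_id": "q2", "hits": []},
    {"query_id": "q3", "hits": ["Breakfast"]},
]

def explode(rows, array_col, out_col):
    # A plain LATERAL VIEW drops rows whose array is empty (q2 here);
    # LATERAL VIEW OUTER would keep them with a NULL instead.
    for row in rows:
        for element in row[array_col]:
            out = {k: v for k, v in row.items() if k != array_col}
            out[out_col] = element
            yield out

exploded = list(explode(rows, "hits", "hit"))
```

Once exploded, ordinary per-row UDFs (or plain Hive expressions) can run over the elements, which is why joal suggests it before reaching for a custom Java UDF over arrays.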
[17:01:00] joining now
[17:01:12] Ironholds: Ok, ok, go for your Java stuff :-P
[17:01:17] *thumbs up*
[17:01:31] so any idea how I'd go about that? Is there just an Array type in hive that maps to a Java Array?
[17:02:06] Ironholds: Take example on how to produce a map in mediacount for instance
[17:02:12] Ironholds: nothing better in our code base
[17:02:49] joal, perfect! I didn't know it existed; thankee :)
[17:02:54] np Ironholds
[17:03:41] Analytics-Kanban: Improve loading Analytics Query Service with data {slug} [subtasked] - https://phabricator.wikimedia.org/T115351#1788724 (Milimetric) a:JAllemandou
[17:04:46] Analytics-Kanban: Improve loading Analytics Query Service with data {slug} [5 pts] - https://phabricator.wikimedia.org/T115351#1788736 (Milimetric)
[17:04:52] Analytics-Kanban: Pageview API showcase App {slug} - https://phabricator.wikimedia.org/T117224#1788737 (mforns) I've put the code in here for now: https://gist.github.com/marcelrf/49738d14116fd547fe6d#file-article-comparison-html And here is the demo accessible for review: https://metrics-staging.wmflabs.or...
[17:20:51] Analytics-Kanban, Patch-For-Review: Add avro schema to refinery-camus jar [3 pts] {hawk} - https://phabricator.wikimedia.org/T117885#1788790 (Milimetric) a:Nuria
[17:34:40] we will have to copy the avro schema to hdfs :(
[17:35:41] is there a script that deploys the refinery artifact to hdfs?
[17:36:37] dcausse: every jar we have refinery is deployed to hdfs yes
[17:37:08] would it be possible to hook a small script to extract the schema from the jar and deploy it to hdfs ?
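[editor's note] The "small script" dcausse floats at 17:37 is straightforward in principle, since a jar is just a zip archive with the .avsc schema inside it. A hedged Python sketch of the extraction step (file and schema names are illustrative; the demo builds a fake jar in memory so it is self-contained):

```python
# Sketch of extracting an embedded Avro schema from a jar before copying it
# to HDFS. A jar is a zip archive, so the stdlib zipfile module suffices.
import io
import zipfile

def extract_schema(jar_path_or_file, schema_name):
    # Read one resource (e.g. 'CirrusSearchRequestSet.avsc') out of the jar.
    with zipfile.ZipFile(jar_path_or_file) as jar:
        return jar.read(schema_name).decode("utf-8")

# Self-contained demo: build a fake jar in memory with a dummy schema inside.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("CirrusSearchRequestSet.avsc", '{"type": "record"}')

schema = extract_schema(buf, "CirrusSearchRequestSet.avsc")
```

The extracted text could then be written to HDFS (e.g. via `hdfs dfs -put`), which sidesteps the `jar:file:` path problem hit at 15:44; in the end the channel went with `STORED AS AVRO` instead.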
[17:37:55] dcausse: not nice :(
[17:37:59] yes :(
[17:38:21] other option would be to use an url like http://
[17:39:11] dcausse: it can also be embedded directly into the create table statement i suppose
[17:39:15] dcausse: I think we should park this and discuss next week (I'm currently in meeting, and leave just after)
[17:39:47] ebernhardson: I tried but it does not work :(
[17:39:56] oh :(
[17:40:30] dcausse: the docs suggest not using http, "as the schema will be accessed at least once from each task in the job, this can quickly turn the job into a DDOS attack against the URL provider"
[17:40:38] maybe I've done something wrong in my shell, I tried to pass it as a hive param, I'll try to copy it directly in the hql file
[17:40:39] maybe not a big deal in our use case, but they seem to not suggest it :)
[17:44:14] lol, marcel, there's such an article as CatDog :)
[17:44:21] sorry, mforns ^
[17:44:25] milimetric, xD
[17:44:50] marcel pings me, too
[18:00:06] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1788962 (mforns) Hey, I've been using the API a bit for a sample App we (Analytics) are developing, and I have a couple practical observations: 1) It may be...
[18:02:46] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, and 2 others: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1788998 (Jgreen) >>! In T97676#1785592, @Pcoombe wrote: > @awight Sounds like it would be safest to just take ca...
[18:03:39] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, and 2 others: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1789013 (Jgreen)
[18:03:42] Analytics-Cluster, operations, Patch-For-Review: Turn off webrequest udp2log instances.
- https://phabricator.wikimedia.org/T97294#1789012 (Jgreen)
[18:07:01] mforns: compared linux distro pv :) so niiiiice
[18:07:13] a-team, I'm off for today !
[18:07:18] :]
[18:07:24] Have a good weekend y'all!
[18:07:25] have a nice weekend joal!
[18:07:29] have a great weekend
[18:50:51] Analytics-Kanban: Pageview API showcase App {slug} - https://phabricator.wikimedia.org/T117224#1789377 (kevinator) I showed it to Dario and he had a suggestion: underneath the chart, add links to the RESTful API call being made. That way users can see how the API is being used and can follow the lin...
[20:09:23] Can someone provide me a math angle on calculating sample ratio with a random token? I see this implementation https://gerrit.wikimedia.org/r/#/c/238306/6/modules/ext.wikimediaEvents.searchSuggest.js which uses division remainder
[20:09:31] e.g. rand % popSize === 0
[20:09:56] Over here though, a division with max_int is used. https://gerrit.wikimedia.org/r/#/c/250057/8/modules/ext.eventLogging.Schema.js
[20:10:08] granted that uses a sample ratio between 0 and 1 instead of a population size, but that can be converted.
[20:10:18] I'm just curious whether one is more attractive than the other for some reason
[20:10:28] the remainder method seemed stable in initial testing
[20:10:34] nuria:
[20:52:36] Analytics-Backlog, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1789825 (Sadads)
[21:08:29] Analytics-EventLogging: Many duplicate events collected from client side javascript - https://phabricator.wikimedia.org/T101867#1789856 (Krinkle)
[21:21:23] Analytics-EventLogging: Many duplicate events collected from client side javascript - https://phabricator.wikimedia.org/T101867#1789920 (Krinkle) I'm not sure whether that query is an accurate way to determine whether events were sent to the server multiple times. There will inevitably be many duplicates (w...
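[editor's note] On the 20:09 sampling question: assuming the token is uniformly random, both constructions select about 1-in-N. A Python sketch comparing the two; names and constants are illustrative, not the actual wikimediaEvents code:

```python
# Two ways (equivalent in expectation, for uniformly distributed tokens) to
# sample 1-in-N, as compared in the question above. Names are illustrative.
import random

MAX_INT = 2 ** 32
N = 16  # population size -> sampling ratio 1/16

def in_sample_mod(token, n=N):
    # remainder method: rand % popSize === 0
    return token % n == 0

def in_sample_div(token, n=N):
    # normalized method: rand / max_int < ratio, with ratio = 1/popSize
    return token / MAX_INT < 1.0 / n

random.seed(42)
tokens = [random.randrange(MAX_INT) for _ in range(200000)]
rate_mod = sum(map(in_sample_mod, tokens)) / len(tokens)
rate_div = sum(map(in_sample_div, tokens)) / len(tokens)
# Both rates should land near 1/16 = 0.0625.
```

The practical difference is which bits decide membership: modulo keys off the low bits of the token while the max_int division keys off the high bits, so they react differently to a biased token source even though the expected rate is identical for uniform tokens.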
[22:08:48] Analytics-EventLogging: Many duplicate events collected from client side javascript - https://phabricator.wikimedia.org/T101867#1790038 (Krinkle) Here's a quick query. I'm not super confident but it's a first shot. ```lang=mysql select event_method, count(*) as client_count, AVG(unique_percent) from (...
[22:29:45] Analytics-Backlog, The-Wikipedia-Library, Wikimedia-General-or-Unknown: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1715388 (Sadads)
[23:09:45] Analytics-Backlog, WMDE-Analytics-Engineering, Wikidata, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1790200 (Lydia_Pintscher)
[23:51:01] Analytics: Update reportcard.wmflabs.org with July-October data - https://phabricator.wikimedia.org/T116244#1790321 (Tbayer) a:Milimetric
[23:52:05] Analytics: Update reportcard.wmflabs.org with July-October data - https://phabricator.wikimedia.org/T116244#1790325 (Tbayer) Updated the task description (said charts are still stuck in June) and assigned it to Dan, who IIRC has been taking care of this previously, based on data provided by Erik.