[01:28:58] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3226802 (Tbayer) @jcrespo I totally hear you on 10.2 not being ready. But there has to be some way to record this analytics need on Phabricat...
[04:06:26] Analytics: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3226937 (Nuria)
[04:07:01] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3226949 (Nuria) I think the code can benefit from many improvements on the frontend before you need backend improvements, given that there is no parallelization, the throt...
[05:24:57] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3226250 (Marostegui) >>! In T164224#3226802, @Tbayer wrote: > @jcrespo I totally hear you on 10.2 not being ready. But there has to be some w...
[05:42:11] Analytics, DBA: Json_extract available on analytics-store.eqiad.wmnet - https://phabricator.wikimedia.org/T156681#3227009 (Marostegui) Sorry I missed T156681#3195441 but as I said in the previous comment, we were (and are) only evaluating 10.1 for production nowadays. 10.2 doesn't even have a GA yet.
[06:00:15] Analytics-EventLogging, Analytics-Kanban, Performance-Team, Performance, Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#3227029 (Krinkle) They no longer happen for NavigationTiming or SaveTiming, but they are still visible for...
[06:13:00] Analytics-EventLogging, Analytics-Kanban, Performance-Team, Performance, Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2780126 (Krinkle) Open>Resolved a:Krinkle
[07:14:38] o/
[07:14:55] joal: I am pretty sure that the data loss is related to the varnish upload mailbox issue
[07:15:44] afaik sometimes the thread responsible for the TTLs is overloaded and Varnish all of a sudden starts acting weird
[07:16:04] and Varnishkafka might well read null records or time out while reading the shm log
[07:16:15] so "expected" but of course not really great :(
[07:20:30] I am now reading #traffic where Andrew and Brandon had a conversation
[07:20:41] seems that they didn't find a good explanation
[07:20:54] but I am pretty sure it is a Varnish issue
[07:21:56] even if, from what I can read, the mailbox issue is a varnish backend problem, and varnishkafka is (rightfully) reading from the varnish frontends only
[08:08:54] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227167 (jcrespo) @Tbayer Let me show you how to use a programming language to obtain the desired results: in PHP: ```lang=php $query = "SE...
[09:22:44] Thanks elukey for the nice explanation :)
[09:26:52] joal: o/
[09:27:30] elukey: Was the weekend good?
[09:28:07] joal: yep! Not great weather, but it was nice and relaxing :)
[09:28:26] (except on Sunday, when a restbase host exploded :P)
[09:28:40] elukey: wow /o\
[09:28:46] elukey: usual issue or something else?
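As a side note on the T164178 comment above about the popular pages bot lacking parallelization (and kaldari's later observation that each pageview API request takes about 2 seconds from his machine): below is a minimal sketch of what concurrent requests could look like. This is not the bot's actual code; it assumes the public per-article pageview REST API, and the titles, project and date range are placeholders.

```python
# Hypothetical sketch for T164178: fetch per-article pageview counts with a small
# thread pool instead of one request at a time. Not the bot's real code.
import concurrent.futures
import requests

API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "{project}/all-access/user/{title}/daily/{start}/{end}")

def fetch_views(title, project="en.wikipedia", start="20170401", end="20170430"):
    url = API.format(project=project, title=title, start=start, end=end)
    resp = requests.get(url, headers={"User-Agent": "popular-pages-bot-sketch"})
    resp.raise_for_status()
    # Sum the daily counts returned by the API for this article.
    return title, sum(item["views"] for item in resp.json()["items"])

titles = ["Albert_Einstein", "Marie_Curie", "Ada_Lovelace"]  # placeholder list
# A modest pool keeps the request rate polite while removing the serial wait.
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
    for title, views in pool.map(fetch_views, titles):
        print(title, views)
```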
[09:29:27] joal: it was the "usual" OOM due to tombstones filling up the heap in eqiad (inactive DC used by ChangeProp only atm)
[09:29:31] elukey: having the new nodes in the cluster really makes a difference :)
[09:30:08] elukey: it computed 2 months of daily uniques in half a day :)
[09:30:12] elukey: mwarf
[09:30:18] \o/
[09:30:42] joal: I'd like to ask your opinion about local_one vs local_quorum for AQS
[09:31:00] while reading things about cassandra I stumbled again upon read repairs
[09:31:37] and even if we don't need repairs most of the time, now it would be good not to rely only on local_one on read
[09:32:01] since the new hw and compaction scheme should easily tolerate local_quorum
[09:33:43] (restbase uses local_quorum)
[09:41:39] elukey: Thinking about quorum - I think it'll bump our response latency if we go from local_one to local_quorum
[09:41:53] elukey: But it would be interesting to check :)
[09:42:51] joal: probably a little bit, but it is easy to revert with Scap if we see that it is not feasible for us
[09:43:03] the good thing would be that we'd get (again) read repairs if needed
[09:43:32] elukey: I understand :)
[09:44:02] elukey: DC consistency is way stronger than local_machine consistency both for read and write - it has a cost :)
[09:44:14] The question, elukey, is whether we can pay the price :)
[09:46:35] only testing will let us know :)
[09:47:38] elukey: Ahh, those products where the price is not shown ;)
[09:53:52] !log Restart mediawiki history jobs to pick up new snapshot format
[09:53:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:24:50] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227445 (Tbayer) >>! In T164224#3227167, @jcrespo wrote: > @Tbayer Let me show you how to use a programming language to obtain the desired re...
[10:46:53] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227476 (jcrespo) > This task is about the ability to do this in SQL and get direct results without adding steps That was declined at T16422...
[10:58:19] Analytics, Operations, Traffic: Add VSL error counters to Varnishkafka stats - https://phabricator.wikimedia.org/T164259#3227497 (elukey)
[10:58:36] joal: --^
[10:58:57] let me know (when you have time) what you think about it
[11:10:27] Analytics-Kanban, DBA, Operations, User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3227526 (elukey)
[11:10:38] Analytics-Kanban, User-Elukey: Metrics and Dashboards for Piwik - https://phabricator.wikimedia.org/T163204#3227527 (elukey)
[11:16:19] another thing I am wondering is whether we want to apply a retention time to Piwik's data
[11:16:33] because it would surely be easier to back up
[11:37:50] elukey: about piwik we should ask Dan, he knows the users - But I support the idea :)
[11:38:53] Analytics, Operations, Traffic: Add VSL error counters to Varnishkafka stats - https://phabricator.wikimedia.org/T164259#3227497 (JAllemandou) +1 for that! Thanks @elukey for raising this.
[11:39:34] joal: --^ which one? 1) or 2)? :)
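Going back to the local_one vs local_quorum thread above (09:30 to 09:47): AQS itself is a Node service, so the sketch below is purely illustrative Python using the DataStax Cassandra driver, just to make the trade-off concrete. The contact point, keyspace and table names are made up.

```python
# Illustrative only: how the two read consistency levels discussed above look at
# query time with the DataStax Cassandra driver. Hostname/keyspace are placeholders.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["aqs-test.example.org"])   # hypothetical contact point
session = cluster.connect("aqs_example")      # hypothetical keyspace

query = "SELECT * FROM pageviews WHERE project = %s LIMIT 1"

# LOCAL_ONE: answers from a single replica in the local DC; fastest, but a stale
# replica is never noticed, so no read repair gets triggered.
fast = SimpleStatement(query, consistency_level=ConsistencyLevel.LOCAL_ONE)

# LOCAL_QUORUM: waits for a majority of local-DC replicas; divergent replicas can
# be detected and repaired, at the cost of the latency bump joal mentions.
safer = SimpleStatement(query, consistency_level=ConsistencyLevel.LOCAL_QUORUM)

for stmt in (fast, safer):
    rows = session.execute(stmt, ("en.wikipedia",))
    print(stmt.consistency_level, rows.one())
```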
[11:40:25] elukey: let's re-articulate - I just sent a comment on varnishkafka: +1
[11:40:37] elukey: About Piwik, +1 as well, but we need Dan :)
[11:41:26] hahahaha sure, sorry I didn't ask very clearly
[11:41:35] in the Varnishkafka task there are two options
[11:41:41] which one do you prefer?
[11:41:53] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227550 (Marostegui) >>! In T164224#3227445, @Tbayer wrote: > > This task is about the ability to do this in SQL and get direct results wit...
[11:44:25] elukey: Both would work for me
[11:44:35] elukey: I have no real preference
[11:45:08] elukey: The first would mean us managing our alarms better; the second one allows us to check, when an alarm occurs, whether it comes from those counters
[11:45:17] elukey: no strong opinion :)
[11:48:03] or both!
[11:48:07] :)
[11:48:08] :)
[11:49:49] Analytics-Kanban: Finalize list of metrics, breakdowns, and filters for Wikistats 2.0 backend - https://phabricator.wikimedia.org/T163356#3227558 (JAllemandou) a:JAllemandou
[11:51:42] Analytics-Kanban: Finalize list of metrics, breakdowns, and filters for Wikistats 2.0 backend - https://phabricator.wikimedia.org/T163356#3194569 (JAllemandou) See - https://docs.google.com/document/d/10cTkWcxOE89kx_HejlAbRyiRjlhXL13Cii0hfPOki4c (last 2 sections) - https://docs.google.com/spreadsheets/d/...
[11:53:45] Analytics, Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#3227566 (JAllemandou)
[11:53:47] Analytics-Kanban: Finalize list of metrics, breakdowns, and filters for Wikistats 2.0 backend - https://phabricator.wikimedia.org/T163356#3227565 (JAllemandou)
[11:56:31] currently reading https://piwik.org/faq/troubleshooting/#faq_42
[11:56:37] Analytics, Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#3227572 (JAllemandou)
[11:56:39] Analytics-Kanban: Design document for wikistats prototype backend - https://phabricator.wikimedia.org/T162817#3227571 (JAllemandou)
[11:57:09] Analytics-Kanban: Initial Launch of new Wikistats 2.0 website - https://phabricator.wikimedia.org/T160370#3227574 (JAllemandou)
[11:57:13] Analytics, Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#2973128 (JAllemandou)
[11:57:34] Analytics-Kanban: Initial Launch of new Wikistats 2.0 website - https://phabricator.wikimedia.org/T160370#3096561 (JAllemandou)
[11:57:37] Analytics-Kanban: Design document for wikistats prototype backend - https://phabricator.wikimedia.org/T162817#3175684 (JAllemandou)
[11:58:43] Analytics-Kanban: Productionize Edit History Reconstruction and Extraction - https://phabricator.wikimedia.org/T152035#2836132 (JAllemandou) a:JAllemandou
[11:59:25] elukey: doing some cleanup in board - Can we move T157807 and T152713 to done?
[11:59:25] T157807: Reinstall Analytics Hadoop Cluster with Debian Jessie - https://phabricator.wikimedia.org/T157807
[11:59:25] T152713: Hadoop cluster expansion. Add Nodes - https://phabricator.wikimedia.org/T152713
[12:01:14] Analytics-Kanban, Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#3227596 (JAllemandou)
[12:01:16] so --^ might need a bit of time since we are still waiting on one host
[12:01:32] elukey: ah right, forgot about an1030
[12:01:42] nope an1069!
[12:01:44] elukey: What about cluster expansion?
[12:01:49] Ohhh
[12:02:02] elukey: I completely misunderstood :)
[12:02:12] elukey: We're still missing 1 new host?
[12:02:44] yep!
[12:02:55] Arf
[12:03:02] ok nevermind then :)
[12:03:14] in the meantime I discovered that we already purge data older than 30 days and keep only reports
[12:03:56] elukey: Arf, okay then
[12:05:08] piwik says though: Current database size: 18.3 G
[12:05:12] but I can see 66GB
[12:05:19] elukey: Weird !!
[12:05:26] elukey: indices?
[12:07:25] not sure! checking now
[12:07:33] with show table status from piwik;
[12:13:39] or maybe this is normal for INNODB
[12:13:41] totally ignorant
[12:14:26] but maybe indexes are playing a role here
[12:16:13] (CR) Gergő Tisza: [C: 2] Fix CI errors [analytics/multimedia] - https://gerrit.wikimedia.org/r/324380 (owner: Gergő Tisza)
[12:17:26] (CR) Gergő Tisza: [C: 2] "Self-merge, not much interest in this repo and cannot get more broken than it is already." [analytics/multimedia] - https://gerrit.wikimedia.org/r/324379 (https://phabricator.wikimedia.org/T98449) (owner: Gergő Tisza)
[12:17:31] (Merged) jenkins-bot: Fix CI errors [analytics/multimedia] - https://gerrit.wikimedia.org/r/324380 (owner: Gergő Tisza)
[12:18:04] so I used SELECT table_schema as `Database`, table_name AS `Table`, round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB` FROM
[12:18:07] information_schema.TABLES ORDER BY (data_length + index_length) DESC
[12:18:19] and the 18.3G are index + data len
[12:18:20] weird
[12:18:30] I checked all the other databases and nothing came up
[12:18:46] (Merged) jenkins-bot: Fix SQL queries [analytics/multimedia] - https://gerrit.wikimedia.org/r/324379 (https://phabricator.wikimedia.org/T98449) (owner: Gergő Tisza)
[12:18:59] weird indeed elukey
[12:21:16] taking a break a-team, later!
[12:23:42] seems it's the ibdata1 file that is 66GB in size :/
[12:38:40] Analytics-Kanban, DBA, Operations, User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3227650 (elukey) About those 66GB: Piwik uses ~18.5 GB of data, but /var/lib/mysql/ibdata1 is 66GB (no innodb_file_per_table set).
[12:45:20] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227655 (faidon) First off, let's tone this down a little bit and try to unpack the conversation to all of the separate contentious points :)...
[12:54:01] !log enabled mysql slow query log on bohrium (/var/log/mysql/slow-query.log)
[12:54:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:57:49] !log set long_query_time=5 to mysql on bohrium
[12:57:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:58:23] now let's see if anything good comes up
[13:17:37] running errand for ~1 hour, ttl team1
[13:17:40] !
[13:17:42] :)
[13:17:57] (reachable via hangouts in case you need me)
[13:22:10] Analytics, Operations, Traffic: Add VSL error counters to Varnishkafka stats - https://phabricator.wikimedia.org/T164259#3227754 (Ottomata) Why not both!? :)
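On the size discrepancy investigated above (Piwik reporting 18.3G while the datadir shows 66GB): below is a sketch of the same checks run from a script, combining the information_schema query elukey pasted with a look at innodb_file_per_table. Host, credentials and database name are placeholders, not the real bohrium setup.

```python
# Sketch: per-table data+index sizes vs. the innodb_file_per_table setting.
# Connection details below are placeholders.
import pymysql

conn = pymysql.connect(host="localhost", user="root", password="...", db="piwik")
with conn.cursor() as cur:
    cur.execute("SHOW VARIABLES LIKE 'innodb_file_per_table'")
    print(cur.fetchone())  # ('innodb_file_per_table', 'OFF') -> all tables share ibdata1

    cur.execute("""
        SELECT table_schema, table_name,
               ROUND((data_length + index_length) / 1024 / 1024, 2) AS size_mb
        FROM information_schema.TABLES
        ORDER BY (data_length + index_length) DESC
        LIMIT 20
    """)
    for schema, table, size_mb in cur.fetchall():
        print(schema, table, size_mb, "MB")

# With innodb_file_per_table OFF, purged rows never shrink ibdata1, which is why
# the file can sit at 66GB while the live data is ~18GB.
```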
[13:30:42] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227779 (Ottomata) > it is a bit surprising that this is now a drawback for the solution that was implemented on: T153207 We didn't lose any...
[13:47:15] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227811 (faidon) >>! In T164224#3227779, @Ottomata wrote: >> and hoping to "[get] people off of EventLogging MySQL in Q1 next FY". > I now thi...
[13:50:34] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3227830 (Ottomata) The way we are talking about it, it is the former. We're sure we want T162610, so we are doing it before we think about h...
[14:05:37] hi team :]
[14:06:08] hiii
[14:12:46] joal: yt?
[14:21:24] Analytics, MediaWiki-General-or-Unknown, User-Tgr: Make aggregated MediaWiki Pingback data publicly available - https://phabricator.wikimedia.org/T152222#3227871 (Tgr)
[14:30:36] mforns: was it you who attended that anti-harassment task force with fran?
[14:30:58] milimetric, no... I think it was joal?
[14:31:13] cool, thx. jo, if you have links, I'd be interested
[14:36:33] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 4 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#3227944 (Tgr)
[14:37:23] Hi ottomata
[14:38:17] hi joal, was going to ask you some questions about loading druid data in labs, but i think i'm onto something
[14:38:26] k ottomata
[14:38:32] here now, you can ask :)
[14:38:50] heh, k will when i get unstuck, currently not enough memory for middlemanager indexing :/
[14:38:53] in labs
[14:39:37] weird, middle-manager is only a peon manager for us, right?
[14:40:03] ottomata: o/
[14:40:20] milimetric: looking for notes link
[14:42:15] Analytics, Operations, Traffic: Add VSL error counters to Varnishkafka stats - https://phabricator.wikimedia.org/T164259#3227966 (elukey) >>! In T164259#3227754, @Ottomata wrote: > Why not both!? :) Yes! I was concerned that the new field would have been a bit too much, but if we are ok with the new...
[14:47:14] milimetric: notes from meetings 1 & 2 - https://docs.google.com/document/d/1aUXisjTtED0EycSntxFOc_3GJOAE-bDavo1IQ-kL-Ts/edit#heading=h.9vxson5murib - https://docs.google.com/a/wikimedia.org/document/d/1mjtLWntyzj-kYiBCJfsyPBgsxt8Am3QhFpSI8gyr9q4/edit?usp=sharing
[14:47:20] milimetric: next meeting is tomorrow
[14:48:47] thanks joal
[14:48:59] np milimetric
[15:02:16] Analytics-Kanban: Update druid to latest release - https://phabricator.wikimedia.org/T164008#3228070 (Ottomata) a:Ottomata
[15:02:23] Analytics-Kanban: Update pivot to latest source - https://phabricator.wikimedia.org/T164007#3218097 (Ottomata) a:Ottomata
[15:02:39] Analytics-Kanban: upgrade druid and pivot - https://phabricator.wikimedia.org/T157977#3228075 (Ottomata) a:Ottomata
[16:01:02] Analytics, Analytics-EventLogging, DBA: Update MariaDB on analytics-store to a version that supports JSON functions - https://phabricator.wikimedia.org/T164224#3228211 (Nuria) >We didn't lose any functionality by converting this field to JSON. Previously it was a free form user agent string, difficul...
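On the T164224 thread above (analytics-store's MariaDB has no JSON functions yet): below is a hedged Python analogue of the "use a programming language" approach jcrespo points at, selecting the raw JSON userAgent column and filtering client-side. The table name is a placeholder and the browser_family field follows ua-parser's output convention, so treat both as assumptions.

```python
# Sketch: client-side equivalent of JSON_EXTRACT(userAgent, '$.browser_family')
# for an EventLogging table on analytics-store. Table name is a placeholder.
import json
from collections import Counter

import pymysql

conn = pymysql.connect(host="analytics-store.eqiad.wmnet", user="research",
                       password="...", db="log")
browsers = Counter()
with conn.cursor() as cur:
    cur.execute("SELECT userAgent FROM SomeSchema_12345678 LIMIT 100000")
    for (ua_json,) in cur:
        ua = json.loads(ua_json)  # the column stores a JSON string
        browsers[ua.get("browser_family", "unknown")] += 1

print(browsers.most_common(10))
```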
[16:16:45] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228250 (kaldari) Regarding solution #1, it looks like the script is currently waiting 0.01 seconds after every 99 requests, which basically has no effect. So removi...
[16:23:37] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228279 (kaldari) For reference, each request to the pageview API takes about 2 seconds to complete via curl.
[16:44:21] milimetric, what is it you need from me on github then? my username?
[16:49:27] mforns: nono, just that you forked the repo and I can add you as a remote
[16:49:35] milimetric, oh ok!
[16:49:48] the way we can work then is like:
[16:49:53] git flow feature start ...
[16:50:00] commit commit ...
[16:50:06] git flow feature finish
[16:50:07] git push
[16:50:16] and we can pull from each other
[16:50:59] we don't need to go through pull requests at the start, we'll be too separate for that
[16:51:12] I see
[16:51:20] I forked it already
[16:53:02] * elukey afk!
[17:01:01] Question about pageview querying: https://www.mediawiki.org/wiki/Topic:Tpk6jz1sw6lmp128
[17:03:27] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228455 (Nuria) >For reference, each request to the pageview API takes about 2 seconds to complete via curl. mmm.. that seems quite high, maybe worth troubleshootin...
[17:06:59] Analytics, Research-and-Data-Backlog: Host API for edit productivity dataset - https://phabricator.wikimedia.org/T164280#3228458 (Halfak)
[17:14:27] milimetric, joal: do you guys have a sec to talk in batcave about a request that is kind of long to explain on irc?
[17:14:47] omw
[17:15:11] joining
[17:16:47] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228508 (Niharika) >>! In T164178#3228279, @kaldari wrote: > For reference, each request to the pageview API takes about 2 seconds to complete via curl. I wonder wh...
[17:29:08] ottomata: bc?
[17:32:10] sho
[17:33:27] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228581 (kaldari) My test was from my local machine. Wow, that's a huge difference!
[17:36:48] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3228594 (Nuria) FYI that our "usual" latencies (when not in switchover mode) are around 50ms at percentile 99
[17:43:56] ottomata: o/
[17:44:20] are you going to add Victoria's user to the wmf ldap group?
[17:45:01] did!
[17:45:03] i think
[17:45:06] nuria is confirming
[17:45:41] oh yes just seen it
[17:45:42] nice :)
[17:46:04] was it done directly with the puppet merge or with ldapvi etc..?
[17:46:50] hmm, i did puppet merge, then i ran /usr/local/bin/cross-validate-accounts from ldap-codfw.wikimedia.org
[17:46:53] then i did ldapvi
[17:47:02] ahh okok
[17:47:09] super
[17:47:13] hmm
[17:47:15] not from ldap-codfw
[17:47:16] uhh
[17:47:29] from wasat
[17:47:33] .codfw
[17:47:36] (codfw terbium)
[17:47:52] yep yep
[17:48:28] thanks! Going offline again, saw the pings and checked, didn't see that you were already done :)
[17:48:31] gooooood
[17:48:37] ok cool, laters!
[17:48:46] o/
[18:00:33] Analytics, Analytics-EventLogging, Collaboration-Team-Triage, MediaWiki-ContentHandler, and 5 others: Multiple MediaWiki hooks are not documented on mediawiki.org - https://phabricator.wikimedia.org/T157757#3015493 (Mainframe98) Another list of missing hooks can be found on https://www.mediawiki....
[18:15:42] ottomata for your review (spark job's not ready yet): https://gist.github.com/milimetric/35bd2d79850e4a39b7e1dee837c38eac
[18:20:49] Analytics, Analytics-EventLogging, Collaboration-Team-Triage, MediaWiki-ContentHandler, and 5 others: Multiple MediaWiki hooks are not documented on mediawiki.org - https://phabricator.wikimedia.org/T157757#3228717 (Smalyshev) Good task for T126500.
[18:22:01] milimetric: sounds good, will need to change topic and group id though, ja?
[18:22:16] also
[18:22:16] "dataSource" : "pageviews-hourly",
[18:22:17] ?
[18:22:21] i thought it was projectview
[18:22:23] I already did, the banner one was like test_banner_..._joal
[18:22:27] OHhH
[18:22:28] sorry
[18:22:36] that's why i'm confused, are we doing project or pageview?
[18:22:39] it's pageviews-hourly not pageview_hourly, lol
[18:22:48] it's technically more project-view-ish
[18:22:54] but called pageviews-hourly in druid
[18:23:11] (it's the hourly projectview data that goes back 3 months)
[18:23:22] hmm ok
[18:37:12] nuria_: almost up-to-date hourly
[18:37:25] joal: did you relaunch the jobs?
[18:37:30] nuria_: still missing 2 hours and then, it's missing data
[18:37:34] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0010661-170424154741156-oozie-oozi-C/
[18:37:39] nuria_: it's a new job
[18:37:45] joal: k
[18:37:48] nuria_: written ad-hoc
[18:37:58] joal: that seems perfect for the time
[18:38:03] nuria_: milimetric has taken on the spark base I used
[18:38:13] nuria_: will come back after dinner
[18:39:45] nuria_: https://goo.gl/K5sQDP
[18:39:54] (it's up to 3pm, the job he submitted)
[18:48:49] milimetric: wow!
[18:49:21] yeah, I know, joseph's pretty awesome :)
[19:05:00] jo, I left the new version in a comment on https://gist.github.com/jobar/80a89380ec657948ce365a68a82b7828/
[19:57:45] Heya milimetric
[19:57:51] hey joal
[19:57:51] will test job in minutes
[19:58:10] ok, great, lemme know if I can help, I left two TODOs where I wasn't sure, I think everything else is ok
[19:58:23] milimetric: can you double check the kafka topic we had in your tranquility conf?
[19:58:34] I wonder about singular or plural for pageviews
[19:58:57] (PS1) Joal: [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370
[20:00:34] milimetric: I've changed small things, but your patch is good :) Many thanks for having moved it forward :)
[20:01:06] joal: it was test_pageviews_joal: https://gist.github.com/milimetric/35bd2d79850e4a39b7e1dee837c38eac
[20:01:19] awesome great
[20:01:20] :)
[20:01:27] milimetric: good news! the 'binary' dist is way easier to use than the source code
[20:01:34] and it is mostly nodejs, so uhhh, the source is there anyway
[20:01:57] HMM
[20:01:59] wait
[20:02:03] we don't have to host this privately at all
[20:02:19] i just downloaded the tarball from imply's website
[20:02:25] and then stuck in the license file they gave us
[20:03:06] we should check with them first ottomata, right?
[20:06:31] hmm, don't think so, because we would only be hosting a version of exactly what anyone can download from their site
[20:06:37] https://imply.io/download
[20:06:48] download the imply dist
[20:06:56] pivot has a 30 day eval without a license file
[20:07:04] we copy the dist/pivot stuff into our git repo
[20:07:08] and use scap to deploy it
[20:07:18] and use puppet private stuff to put the license file in place
[20:07:46] we'd still need to figure out where to keep the original source they did give us
[20:07:52] but we don't need to deploy it
[20:12:07] (CR) jerkins-bot: [V: -1] [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370 (owner: Joal)
[20:19:29] ottomata: hello?
[20:20:19] ottomata: would you mind starting a tranquility using milimetric's config?
[20:20:58] HIiIi
[20:21:00] oh yYYyaaaaa
[20:21:16] joal: should work as is with that topic?
[20:21:21] test_pageviews_joal
[20:21:21] ?
[20:21:24] yup
[20:23:28] oook it's running, is there data?
[20:23:44] starting now
[20:25:00] oh by the way ottomata: https://github.com/criteo/lolhttp
[20:25:06] ottomata: for scala lovers ;)
[20:29:35] ottomata: data flows on my side (AFAIK) - Anything on yours?
[20:30:40] i see data, tranquility died, looking
[20:30:46] Arf
[20:30:54] joal: java.lang.IllegalArgumentException: Instantiation of [simple type, class io.druid.granularity.QueryGranularity] value failed: No enum constant io.druid.granularity.QueryGranularity.MillisIn.HOURLY
[20:33:43] milimetric: nuria_, is there a phab task or wikitech page outlining the choice of pivot?
[20:33:46] i know we've explained this before
[20:33:53] but my email to ops is generating the expected questions
[20:34:24] hmm, ottomata - should be hour, no?
[20:34:30] ottomata: line 16
[20:34:36] ottomata: sorry for not spotting
[20:35:23] oh k
[20:35:24] both hour?
[20:35:46] yessir
[20:35:55] joal: how many msg/sec in this topic?
[20:36:04] hmm
[20:36:04] - Flushed {test_pageviews_joal={receivedCount=361846, sentCount=0, droppedCount=0, unparseableCount=361846}} pending messages in 3ms and committed offsets in 43ms.
[20:36:11] unparseableCount ?
[20:36:22] ottomata: ???
[20:37:08] joal: this is a lot of messages, i hope tranquility can keep up!
[20:37:16] ottomata: I think the rate is less than 10k messages per second
[20:37:21] looks like a burst of 50Kish(?) messages once every few seconds
[20:37:40] ottomata: correct, about 60k messages every 10 secs
[20:37:41] hmm, anyway, still getting unparseableCount=467169
[20:37:43] yea
[20:37:44] hehe, ok
[20:38:20] ottomata: with 24 workers (1 per webrequest partition), I barely keep up filtering and refining in time
[20:38:56] joal: can you couple it with the banner job for performance? Like also check isPageview there and output twice?
[20:39:27] ottomata: and by the way, I wonder if this is useful -- Will druid even consider serving the segment before the hour end (since hourly granularity)?
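A hedged reconstruction of the granularity fix worked out above: tranquility rejected every event because "HOURLY" is not a valid Druid granularity name (hence the QueryGranularity enum error), and joal's suggestion was "hour" for both the segment and query granularity. Only this fragment is shown, written as a Python dict for illustration; the rest of the ingestion spec lives in the gist and is not reproduced in the log.

```python
# Assumed shape of the relevant granularitySpec fragment; the full tranquility
# spec is in the gist referenced above and is not shown here.
import json

granularity_spec = {
    "type": "uniform",
    "segmentGranularity": "hour",  # the "both hour?" question in the log
    "queryGranularity": "hour",    # "HOURLY" is what triggered the enum error
}

print(json.dumps({"granularitySpec": granularity_spec}, indent=2))
```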
[20:40:01] milimetric: it's not really about reading - the banners one reads the same amount of messages and works its stuff in less than 5s
[20:40:34] ottomata, milimetric: I still don't see data in pivot (while I see it for banners)
[20:41:07] ottomata: we only did this thing: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Druid/Load_test
[20:41:30] and there was that old page where we listed out all the options we considered, but I don't think we explained why we chose druid
[20:41:50] ottomata: pivot comes as part of the value proposition when we decided to choose druid as a columnar data store; our phab tickets and work are mostly an evaluation of whether druid capabilities met our use cases (they do), we can link to docs like http://static.druid.io/docs/druid.pdf and our wikistats 2.0 design doc but i am afraid they would not be of help
[20:41:50] here
[20:42:06] oh, joal, oops, it might be the Geocode instance
[20:42:16] it makes an instance every time instead of getInstance() like the others
[20:42:26] something is wrong with the tranquility conf / json data
[20:42:31] it says it's all unparseable
[20:42:32] milimetric: nope, I added the singleton pattern to the geocode object ;)
[20:42:35] k
[20:42:41] guys i gotta run real soon, joal you can run tranquility as well as i can i think
[20:42:45] you have ssh to druid1003 ya?
[20:42:47] ottomata: no other columnar data store (cassandra, clickhouse or elastic search) fits our use cases that well and pivot is deluxe as a basically free ui to the datastore
[20:43:08] yeah, nuria_ i understand, but i wanted to link to something to justify it easily
[20:43:19] ottomata / joal: I propose we shut it down for tonight and try tomorrow, the hourly batch job is already really great and getting good data
[20:43:21] ottomata: any CLI example?
[20:43:33] milimetric: works for me :)
[20:43:39] and it's only behind by like 2 hours, which is totally fine
[20:44:03] joal: ya,
[20:44:06] cat /home/otto/test_pageviews_joal.tranquility.sh
[20:44:07] that's it
[20:44:21] ottomata: ah ok in that case the load test is the best we have
[20:44:22] Thanks ottomata
[20:44:46] joal: i'm going to stop my tranq process
[20:44:53] ottomata: okay, will try
[20:44:54] ottomata: given that everyone is super looking at pivot now with the turkish blockage let's be real mindful with updates
[20:44:57] nuria_: we might want to consider writing something up
[20:45:08] ottomata: now?
[20:45:09] i betcha we will be asked this over and over again
[20:45:10] nono
[20:45:19] in general
[20:45:21] make a task maybe
[20:45:50] ottomata: will do
[20:46:36] Analytics, Analytics-Kanban: Document rationale of choosing druid - https://phabricator.wikimedia.org/T164302#3229272 (Nuria)
[20:53:26] (PS2) Joal: [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370
[21:00:38] (PS1) Ottomata: Updates to pivot 0.11.33 from Imply distribution 2.1.1 [analytics/pivot/deploy] - https://gerrit.wikimedia.org/r/351450 (https://phabricator.wikimedia.org/T164007)
[21:02:50] Analytics-Kanban, Patch-For-Review: Update pivot to latest source - https://phabricator.wikimedia.org/T164007#3229329 (Nuria) Let's be super mindful of the update as pivot has quite a lot of usage; could we try the update on a parallel domain (pivot-candidate.wikimedia.org) before we do the switchover?
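On the Geocode exchange above ("it makes an instance every time instead of getInstance() like the others"): below is a generic illustration of the lazy getInstance pattern being described, so the expensive geocoder is built once per worker instead of once per record. The real refinery code is Scala; the names here are made up.

```python
# Generic lazy-singleton sketch; stand-in names, not the refinery-source classes.
class Geocode:
    _instance = None

    def __init__(self):
        self.db = object()  # stand-in for loading the costly MaxMind database

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

# Every caller shares the same instance instead of paying the construction cost.
assert Geocode.get_instance() is Geocode.get_instance()
```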
[21:02:56] (PS2) Ottomata: Updates to pivot 0.11.33 from Imply distribution 2.1.1 [analytics/pivot/deploy] - https://gerrit.wikimedia.org/r/351450 (https://phabricator.wikimedia.org/T164007)
[21:09:21] (CR) jerkins-bot: [V: -1] [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370 (owner: Joal)
[21:36:48] (PS3) Joal: [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370
[21:38:41] (CR) jerkins-bot: [V: -1] [WIP] Add streaming for druid pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/351370 (owner: Joal)
[22:59:31] Analytics, Research-and-Data-Backlog: Host API for token persistence dataset - https://phabricator.wikimedia.org/T164280#3229879 (DarTar)
[23:23:22] Analytics, Community-Tech: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3229990 (DannyH) p:Triage>Normal
[23:26:45] (PS2) Nuria: Changes datasets api to accept a project or array of same [analytics/dashiki] - https://gerrit.wikimedia.org/r/351210