[00:36:11] Hey, now that https://gerrit.wikimedia.org/r/#/c/273549/ is merged into analytics/limn-edit-data, how do we get it pushed into production so it runs? [01:27:00] is this edit-analysis.wmflabs.org ? [01:31:14] seems to be in the 'dashiki' project in labs [01:31:48] I would ask Yuvi, madhuvishy, or milimetric [08:15:25] hello!! [08:15:55] I just tried to run "make" in my varnish-kafka dir with libvarnish 4.0 [08:16:29] and it wasn't good [08:16:35] everything has changed :D [09:05:09] Analytics-Tech-community-metrics, DevRel-February-2016: Backlogs of open changesets by affiliation - https://phabricator.wikimedia.org/T113719#2071554 (Qgil) How can it be that " Oldest open Gerrit changesets without code review" is from 2013 but "Oldest open Gerrit changesets by organization without code... [10:08:27] !log Deploying refinery to see if previous deploy was causing https://phabricator.wikimedia.org/T128295 [10:10:36] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2071617 (JAllemandou) First investigations: The problem comes from page_title computation on the cluster. It seems the cluster... [10:23:33] Analytics, RESTBase: Wrong analytics for ru.wikinews.org through RESTBase - https://phabricator.wikimedia.org/T128360#2071641 (Krassotkin) [10:25:20] Analytics, RESTBase: Wrong analytics for ru.wikinews.org through RESTBase - https://phabricator.wikimedia.org/T128360#2071655 (JAllemandou) [10:25:22] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2071656 (JAllemandou) [10:27:41] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2071660 (JAllemandou) Tried to deploy refinery in hdfs to see if the problem could be coming from a bad deployed version. There... [11:10:04] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, developer-notice: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2071748 (Aklapper) [offtopic] >>! In T103292#2069654, @Aklapper... [11:18:05] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, developer-notice: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2071755 (Krenair) >>! In T103292#2071748, @Aklapper wrote: >List... [11:38:04] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, developer-notice: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2071786 (Aklapper) From a quick glance, candidate repositories to... [11:45:16] Analytics, Analytics-Kanban: Back-fill pageviews data for dumps.wikimedia.org to May 2015 - https://phabricator.wikimedia.org/T126464#2071792 (elukey) Managed to archive one hour of May data with: https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/PageViewDumps [11:55:26] joal: hellooooo [11:55:56] I updated https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/PageViewDumps with my latest changes.. 
I am running an oozie job atm and it is filling data nicely in my tmp dir [11:56:32] and I tried to -get them and inspect with zless, looks good to me [12:00:26] (if you want to double check hdfs dfs -ls /tmp/elukey/pageviewdumps/archive/pageview/legacy/hourly/2015/2015-05) [12:02:01] * elukey commutes to the office, brb in ~15mins [12:11:04] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, developer-notice: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2071834 (Qgil) Looks good to me and, in fact, I thought we got ri... [12:21:29] back! [12:40:11] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2069811 (Shizhao) Since 2016-02-23 many wiki pageviews API encoding error, see https://wikimedia.org/api/rest_v1/metrics/pagev... [12:59:33] Analytics, Analytics-Wikistats, DevRel-February-2016: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#2071930 (Aklapper) [13:03:59] Hi elukey [13:04:31] We have an issue currently on the cluster, I'll double check your code/reuslts after having solved the other one :) [13:16:03] elukey: Have restarted something on the cluster in the last hour ? [13:17:54] (PS1) Joal: Change Hadoop and hive dependencies to latest CDH [analytics/refinery/source] - https://gerrit.wikimedia.org/r/273889 [13:44:11] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072057 (elukey) Summary of my findings so far: Varnish logs various kind of data (statistics, requests handled, etc..) in a shared memory file rather than in a file to a... [13:45:28] joal: sorry just seen the message! 
nope I haven't done anything, just kicking off an oozie job [13:47:51] joal: brb in 10 minutes for a coffee, but please let me know if I can help with the cluster :) [14:00:23] hey a-team [14:01:34] mforns: o/ [14:02:01] hey :] [14:09:24] Analytics, Operations, Traffic: Sort out analytics service dependency issues for cp* cache hosts - https://phabricator.wikimedia.org/T128374#2072088 (BBlack) [14:10:25] Analytics, Operations, Traffic: Sort out analytics service dependency issues for cp* cache hosts - https://phabricator.wikimedia.org/T128374#2072101 (BBlack) [14:15:35] ottomata: https://phabricator.wikimedia.org/T124278#2072057 (work in progress) [14:27:36] (PS3) Mforns: Add hive queries for the traffic breakdown reports [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) [14:31:34] (CR) Mforns: "I set the start date for the reports on June 2015, that's the first date that we have data in the intermediate browser_general table since" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) (owner: Mforns) [14:43:06] Analytics: Migrate limn-mobile-data/reportupdater reports to use standalone reportupdater - https://phabricator.wikimedia.org/T128375#2072136 (mforns) [14:45:55] (PS2) Mforns: Add all reportupdater files to this dedicated repo [analytics/reportupdater] - https://gerrit.wikimedia.org/r/272712 (https://phabricator.wikimedia.org/T127327) [14:49:04] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072177 (Ottomata) OH awesome! Does the Varnish Utils lib have a similar ABI enough to just work with existing varishkafka code? [14:51:20] Hi ottomata [14:52:10] HIIII [14:52:13] i see your email! [14:52:18] am getting through others [14:53:07] np ottomata, I subscribe you to the phab task I'm using [14:54:01] k vool [14:54:02] cool [15:07:35] woah! I didn't know phab picked up ticket IDs from SAL, cool! [15:08:25] ok joal am looking now [15:08:40] that upgrade in progress thing is very very strange, and actually has been with us for a long while. [15:08:56] since some previous upgrade. i looked into it once, but everything i could find says that the upgrade was finalized [15:09:01] its only in the gui that it says it isn't [15:09:09] but, bad connect ack... [15:09:09] hm [15:12:06] joal: do you always get that badConnectAck? [15:12:10] is it just while deploying to hdfs? [15:19:26] hey ottomata & joal: how can I help with this weird encoding issue? [15:24:01] not sure yet [15:24:12] gotta figure out where its coming from [15:24:20] i'm running joal's query on just the raw table [15:24:44] pre refined using just the UDFs [15:26:58] ok, so investigative stage. Cool, I can do that [15:27:52] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072286 (elukey) @Ottomata not sure yet, but it would be nice to have a single source code branch. The next step is for me and @ema to figure out if and how we could use... [15:28:34] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2072291 (Ottomata) FYI, I just ran simliar query on just wmf_raw.webrequest table. ``` ADD JAR /usr/lib/hive-hcatalog/share/hc... 
[15:29:34] milimetric: so, it seems the GetPageviewInfoUDF [15:29:34] UDF is returning two different things for the same input [15:29:48] ha, ok, looking at that [15:29:56] could be because different things on different nodes [15:29:58] not sure [15:34:55] ottomata: can we execute `select get_pageview_info('en.wikipedia.org', '/wiki/Lasse_%C3%85berg', '')` on each of the nodes? [15:35:08] like, how would I do that, I can do it [15:35:31] hmmmm [15:35:46] hmmMmmM [15:36:13] we'd need to dl the refinery jar with that func, compile some java code that does that, and then run it with java --classpath=refinery-core.jar [15:36:17] but that can be done! [15:36:43] milimetric: if you write a simple java class with amain func that does that [15:36:45] actually [15:36:46] have it call [15:36:47] pageviewDefinition.getPageTitleFromUri [15:37:00] org.wikimedia.analytics.refinery.core.PageviewDefinition.getPageTitleFromUri [15:37:04] i can see about doing the rest [15:37:19] k [15:37:30] good idea [15:38:06] i was going to see if there was a way to do that in hive and print out the nodename [15:38:13] that the code is executing on [15:38:16] in the results [15:38:19] but i'm not sure if that is possible [15:38:20] hmMmMMM [15:38:30] we'd have to make a UDF that got the current hostname [15:41:56] ottomata: this work? [15:42:02] https://www.irccloud.com/pastebin/KZbrOrrd/ [15:42:26] (not sure how to package / import) [15:42:42] Analytics, Analytics-Kanban, Pageviews-API: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2072347 (Ottomata) I ran this a second time, and got slightly different results, indicating to me (as @JAllemandou already note... [15:43:03] milimetric: that'll doi t [15:43:05] print it out [15:43:07] oops, I guess like print [15:43:08] right [15:43:11] i'll take care of linking [15:43:27] actually, [15:43:33] do you have refinery-core.jar somewhere? [15:43:36] don't worry if not [15:43:42] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072350 (ema) Although it would be great for libvarnishtools to be a proper shared library, that is apparently not going to happen anytime soon in varnish itself, maybe fo... [15:43:46] actually nm, i got it [15:43:52] https://www.irccloud.com/pastebin/xuvmQPNY/ [15:44:01] ok trying.. [15:44:27] uh... I mean I can build refinery-core but aren't we trying to see if that's different on different nodes too? [15:44:33] i got it [15:44:36] hm, but hm [15:44:38] its from hdfs [15:44:40] so it won't be [15:44:46] i think we are trying to see if some java lib is [15:44:52] ah, gotcha [15:45:12] I'll keep reading the code of the UDF and PageviewDefinition, to see if it's possible something is ambiguous [15:45:14] k [15:45:26] I gotta say, there are a lot of moving parts for simple string processing. [15:45:34] milimetric, ottomata: sorry, Need to care Lino [15:45:35] Like, a dailywtf-worth of moving parts :) [15:45:46] joal: psh, get outa here, we got this [15:46:11] ottomata: did that script work? 
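A minimal sketch of the test class being discussed here (the irccloud paste itself is not preserved in this log): a main() that runs the problematic path through the refinery title extraction and tags the output with the host it ran on. The getInstance() accessor and the one-argument getPageTitleFromUri signature are assumptions about refinery-core at the time; adjust to whatever the deployed jar actually exposes.

```java
// Hypothetical stand-in for the pastebin above, run per node with
//   java -cp refinery-core.jar:. PageTitleCheck
// Matching output across hosts would rule the library code in or out.
import java.net.InetAddress;
import org.wikimedia.analytics.refinery.core.PageviewDefinition;

public class PageTitleCheck {
    public static void main(String[] args) throws Exception {
        // Default to the title that showed the encoding problem.
        String path = args.length > 0 ? args[0] : "/wiki/Lasse_%C3%85berg";
        // Assumption: singleton accessor and a (String path) signature.
        String title = PageviewDefinition.getInstance().getPageTitleFromUri(path);
        String host = InetAddress.getLocalHost().getHostName();
        System.out.println(host + ": " + title);
    }
}
```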
seems like it wouldn't [15:46:14] milimetric: the decoding happens at the end of the getTitle stuff: using our own PercentDecoder [15:46:30] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072374 (elukey) So the plan would be to create two separate branches: 1) 3.X containing the current codebase 2) 4.X containing the vut.c/.h files to leverage the tools l... [15:46:48] ottomata: cause that is not executed through hdfs java environment [15:47:15] also ottomata, hive is unusually slow (taking very long time to even launch map-reduce job --> could be related to metastore config ...) [15:47:23] nuria: sure it totally works [15:47:37] just like tests work outside of hadoop [15:47:37] ottomata: executing the libs you are using on hdfs? [15:47:45] it doesn't use any hdfs libs [15:47:45] ottomata: Got the hdfs error only when trying to deploy (but didn't try anything else) [15:48:07] * joal is back to Lino's food [15:48:07] hmm ok joal [15:48:08] ottomata: sorry, executing all java libs that hdfs has on its classpath [15:48:15] joal: ja i noticed that too, takes a long time to start [15:48:21] nuria: ? [15:48:57] i think there is an issue with analytics1031 [15:49:12] joal, did you notice if hte IPs for that error cahnged [15:49:13] ? [15:49:15] or was it always .131? [15:49:20] ottomata: was wondering how do we ensure that java classpath is the same than teh one you have when running inside hadoop when executing snippet [15:49:48] nuria: not sure why it would matter, since it doesn't use any code from hadoop jars [15:50:07] although, milimetric i do wonder, where is PercentDecoder from? [15:50:11] i don't see it in import [15:50:15] is it buildin or somethign? [15:50:24] uh [15:51:16] I remember jo telling me about this [15:52:22] that doesn't seem to make sense... [15:52:25] ottomata: not hadoop libs per se but any libs in the path. It is very possible to have two versions of any package deployed and teh version you use depends on the one on classpath [15:53:35] ottomata: does it just not need to import because it's in the same package? package org.wikimedia.analytics.refinery.core; [15:53:46] oh [15:53:47] then yes [15:54:01] sorry, it just sounded so generic [15:54:05] forgot it was custom [15:54:05] ottomata: also.. (and please discard if it seems naive) did we execute `locale` in all nodes? to make sure that is set correctly [15:54:54] right, same package, no import, verified [15:56:22] now I'm wondering why we made our own decoder... https://github.com/wikimedia/analytics-refinery-source/blob/405488a2d3678b5d3743b4164daa2acd2d3663e4/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PercentDecoder.java [15:56:40] * Decoder for percent encoded strings [15:56:41] *
[15:56:41] * In contrast to URLDecoder, this decoder keeps percent signs that are not [15:56:41] * followed by hexadecimal digits, and does not convert plus-signs to spaces. [15:57:09] but the normal decoder can have exceptions I think [15:57:57] ah, no that's python, I'm getting confused [16:01:07] hmm, there is def a problem with analytics1031 [16:01:26] you mean it's producing the weird characters? [16:01:27] not sure if it is related to this [16:01:55] OH metting! [16:13:29] Analytics, ArchCom-RfC, Discovery, EventBus, and 7 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#2072426 (Ottomata) We will resolve this after T120212 is closed, and after we have the first consumer (change propagation) in production. [16:16:21] milimetric: Hey, now that https://gerrit.wikimedia.org/r/#/c/273549/ is merged into analytics/limn-edit-data, how do we get it pushed into production so it runs? [16:17:03] James_F: those run on puppet with ensure: latest [16:17:11] so it should be running at the next daily run [16:17:22] I think that's 19:00 UTC or something like that... not sure where it gets those times [16:18:01] oh! you mean it hasn't run for 2 days... ! I'll look at it [16:20:57] milimetric: Yeah. Thanks [16:22:05] James_F: I see, the job started on Feb. 27 but since it has a million days to catch up is taking a long time [16:22:31] James_F: while I have you, do you really need the compare dashboard to load up by default with all the data from April? [16:22:57] milimetric: Aha. [16:23:00] It has to transfer all the text regardless since this is based on flat files, but it's much faster if it doesn't have to render all that data, especially the first graph [16:23:14] milimetric: No, defaulting to the last three months but being able to access older data would be fine. [16:23:19] And faster is always nice. :-) [16:23:21] k, i'll do that [16:23:29] * milimetric presses the "faster" button [16:23:34] Awesome. [16:24:08] (Is it re-calculating its data from scratch, or 'just' from 1 December when it was switched off?) [16:28:43] James_F: december unless something's wrong [16:28:58] it just uses the files as the source of truth - whatever's missing it runs [16:30:09] Cool. [16:30:09] So in future it won't take forever. :-) [16:30:47] a-team: standdupp [16:33:03] Analytics-EventLogging, Analytics-Kanban, DBA, Patch-For-Review: Add autoincrement id to EventLogging MySQL tables. {oryx} - https://phabricator.wikimedia.org/T125135#2072462 (Nuria) Ping @jcrespo , let us know if now that edit table is trimmed we can proceed with this. [16:36:38] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072492 (Ottomata) Sounds good. Let’s make a new 3.x branch now, and make master be for 4.x support. [16:38:35] Analytics, Operations, Traffic: varnishkafka integration with Varnish 4 for analytics - https://phabricator.wikimedia.org/T124278#2072493 (Ottomata) Would it be helpful to build a varnish utils .deb package instead of directly adding sources? If so, I can help with that. 
[16:39:30] i dunno if it means anything, but i'm trying to deploy the discovery analytics repo to hdfs, and getting data timeouts: https://phabricator.wikimedia.org/P2687 [16:39:53] from 10.64.36.128 and 10.64.36.131 [16:41:37] it eventually completed though [16:57:41] ebernhardson: Hey, we have experienced similar problems, ottomata is onto it :) [16:57:48] yeah [16:57:52] something is wrong with 131 [16:58:02] ebernhardson: as far as I've seen, it should work, right? [16:58:09] its just that .131 has problems [16:58:14] ottomata: ebernhardson notived 128 as well [16:58:14] and prints errors [16:58:21] joal: that's in the from [16:58:24] oh FROM [16:58:24] ? [16:58:32] OH HM [16:58:33] yes [16:58:37] gr [16:58:38] ok [16:58:50] ebernhardson: i have ops meeting, but will be poking at this [16:58:51] thank you [16:59:05] Also milimetric, even if utterlyWTF, page_title extraction is not speculative :) [17:03:16] Analytics, Analytics-Cluster: Improve Hue user management - https://phabricator.wikimedia.org/T127850#2056054 (Milimetric) p:Triage>Normal [17:04:44] Analytics, Analytics-Cluster: Enable purging of old jobs and coordinators in oozie - https://phabricator.wikimedia.org/T127988#2060457 (Milimetric) p:Triage>High [17:07:39] Analytics, Analytics-Cluster: Use MySQL as Hue data backend store - https://phabricator.wikimedia.org/T127990#2072637 (Milimetric) p:Triage>Normal [17:08:21] Analytics: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2072641 (Milimetric) p:Triage>Normal [17:10:17] Analytics-Kanban: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2072647 (Milimetric) p:High>Unbreak! [17:13:02] Analytics, Operations, Traffic: Sort out analytics service dependency issues for cp* cache hosts - https://phabricator.wikimedia.org/T128374#2072088 (Milimetric) p:Triage>Normal [17:14:37] Analytics: Migrate limn-mobile-data/reportupdater reports to use standalone reportupdater - https://phabricator.wikimedia.org/T128375#2072136 (Milimetric) p:Triage>High [17:17:03] Analytics, Analytics-Cluster, Patch-For-Review: Create regular backups of Analytics MySQL Meta instance - https://phabricator.wikimedia.org/T127991#2072682 (Milimetric) p:Triage>High [17:17:46] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Create regular backups of Analytics MySQL Meta instance - https://phabricator.wikimedia.org/T127991#2060511 (Milimetric) [17:18:35] Analytics, Easy: Create table in hive with a continent lookup for countries - https://phabricator.wikimedia.org/T127995#2060605 (Milimetric) [17:19:10] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Set up Webrequest -> kafka flow in beta. - https://phabricator.wikimedia.org/T127369#2072694 (Milimetric) [17:20:22] Analytics: Make geowiki data safely public - https://phabricator.wikimedia.org/T127409#2072697 (Milimetric) p:Triage>Normal [17:21:41] Analytics, Pageviews-API: Pageviews API not updated on 2/18/2016 at 8;34 utc - https://phabricator.wikimedia.org/T127414#2072704 (Milimetric) Open>declined This is expected behavior, and clients should adjust. The pageview API can only update as fast as we get data through a pipeline. So it will... 
[17:22:37] Analytics, Services, Patch-For-Review: wikimedia.org/api and wikimedia.org/api/rest_v1 should redirect to the docs - https://phabricator.wikimedia.org/T118519#2072706 (Milimetric) Open>Resolved [17:24:10] Analytics: Make Pageview API date formats more flexible {slug} - https://phabricator.wikimedia.org/T118543#2072714 (Milimetric) Open>declined Rejecting my own request - we decided the API needs to be friendly to automatic clients, so its primary purpose is to be simple. Those clients in turn can make... [17:25:40] Analytics: Spike: Investigate situation of logstash and hadoop logs - https://phabricator.wikimedia.org/T121418#2072730 (Milimetric) Open>declined [17:27:48] Analytics: update comScore description on report card - https://phabricator.wikimedia.org/T122059#2072741 (Milimetric) Open>Resolved [17:28:12] Analytics: We may be missing some more spiders when tagging pageviews {slug} - https://phabricator.wikimedia.org/T121934#2072743 (Milimetric) [17:28:14] Analytics: Pageview API: Better filtering of bot traffic on top enpoints - https://phabricator.wikimedia.org/T123442#2072744 (Milimetric) [17:28:50] hmmm, have been able to print decoded titles and hostnames with spark [17:28:54] Analytics: Create a central page in wikitech to act as a central hub so users know where to go for different types of data - https://phabricator.wikimedia.org/T122970#2072748 (Milimetric) p:Triage>Normal [17:28:54] see it encoded weird on most hosts! [17:29:00] e..g [17:29:01] $Lasse_��berg: analytics1035.eqiad.wmnet [17:29:01] $Lasse_Åberg: analytics1052.eqiad.wmnet [17:29:13] Analytics: Move IOS team piwiki usage to production instance - https://phabricator.wikimedia.org/T123262#2072753 (Milimetric) Open>Resolved [17:29:20] ottomata: :( [17:29:32] ottomata: How have you managed to get the computing host info ? [17:30:15] Analytics: python-mwviews does not handle unicode in titles - https://phabricator.wikimedia.org/T123200#2072758 (Milimetric) Open>Resolved [17:32:52] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, Patch-For-Review, developer-notice: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2072770 (Aklapper) >>! In T103292#2071786,... [17:33:25] joal: https://gist.github.com/ottomata/028531cd9b1dcf3f7bb2#file-pt-scala [17:34:16] gonna try with just PercentDecoder [17:34:41] ok ottomata [17:34:47] sounds very very weird this thing ! [17:35:37] Analytics-Tech-community-metrics, Developer-Relations, DevRel-February-2016, DevRel-March-2016, and 2 others: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#2072780 (Aklapper) [17:38:10] ok, isolated to PercentDecoder [17:40:00] Also ottomata, camus complains about 131 bad connect as well :( [17:40:36] IIINteresting joal [17:40:41] with URLDecoder.decode(str, "UTF-8") [17:40:45] instead of PercentDecoder [17:40:51] ottomata: But partition dumps looks good [17:40:52] everything it 100% cool [17:41:12] joal: ja, i think that is just a warning, hdfs if functioning, but ja 131 (and 128?) are misbehaving [17:41:21] ottomata: can you try to encode to urf-8 before using percent decoder?n [17:41:28] k [17:41:30] ok ottomata, cool (for hdfs) [17:41:39] with URLDecoder? [17:41:42] or just some string encoding? 
[17:41:49] just some string encoding [17:41:59] Analytics, HyperSwitch, Pageviews-API, Services: Better error messages on pageview API - https://phabricator.wikimedia.org/T126929#2072812 (Pchelolo) The error messages were improved in #hyperswitch [[ https://github.com/wikimedia/hyperswitch/pull/13 | PR ]] We will make a new release soon, after... [17:44:20] ellery, apropos of nothing but I forgot to mention it at the all-hands: love the hair! [17:44:38] (for other people we just got out of a google hangouts meeting, I'm not just wandering around saying whatever comes to mind to random people) [17:46:59] ottomata: some random guess: since hdfs issues with some machines, could be related to loading the refinery jar? [17:47:35] hmm, joal, i doubt it, since i've already loaded that jar into my spark-shell session [17:47:40] and i'm still getting the same results [17:48:13] pretty sure my spark jar is not using hdfs [17:48:19] sorry [17:48:21] my spark session [17:48:23] spark job8 [17:48:46] joal: we are sure this was not happening before the upgrade? [17:48:51] or maybe it was just not noticed? [17:49:36] hmm, seems not [17:49:47] yea, starts on the 23 [17:49:51] hm ... pretty sure is was not happening: data before the upgrade doesn't show both decoded and not_decoded values (double checking again) [17:51:14] ottomata: are the 'failing to decode' machines the ones that, by random chance, would have had the logstash code ? [17:58:47] no [17:58:49] oh [17:58:50] logstash? [17:58:53] uhhh [17:59:07] i don't think so joal, there were more than those [17:59:09] not sure though [17:59:56] ottomata: worse double checking [18:00:03] ja [18:00:12] joal: i'm trying to isolate the encoding probl, will do that too [18:00:53] sure [18:01:00] ottomata: can I help ? [18:01:16] with encoding, I mean :) [18:02:12] hm, not sure, am runing local spark shells now on two of the differeing nodes, seeing if i can reproduce, then will get into PercentDecoder code to find problem [18:02:53] bah! [18:02:55] nope. [18:02:56] HMMMMMm [18:03:14] perhaps silly question ... are event logging uuid's monotonically increasing? [18:03:24] i don't think so ebernhardson [18:03:33] are you looking in mysql? [18:03:40] or hdfs? [18:03:59] ok joal interesting. [18:04:12] local spark shell on nodes that gave different result [18:04:24] give the same (correct) result for PercentDecoder.decode [18:04:30] so, it looks like its only in yarn context that this happens [18:04:31] HMmMMM [18:04:36] ottomata: RIIIIGHT [18:04:40] Getting closer ! [18:05:16] i looked at the expanded values of the full spark dist classpath when launching spark, and it was the same on both [18:05:20] ottomata: Was about to tell you that I triple-miliionuple read thre decoder code: except for java.util.Array.copy, bug, the rest is not broken [18:05:22] hmMm [18:05:32] aye ok [18:05:59] no speculative stuff, blah [18:06:02] this is very strange. [18:06:49] ok hm, i need to eat lunch [18:07:07] np ottomata, will be with Lino a bit, then back [18:07:13] will think about this during, then review marcel's code, then focus on this [18:07:18] be back in a short while [18:07:19] o [18:10:28] nuria, I will skip the unique device meeting unless you think I should go, I'll concentrate on reportupdater [18:10:49] mforns: k [18:10:54] thx! [18:14:43] ottomata: doh i'm slow to respond. 
I'm looking in mysql [18:15:22] since mysql doesn't have windowing functions i can't get a row from a group based on its min timestamp, so was thinking i could self-join against a query that pulls min(uuid) [18:17:31] i suppose i can think of some test queries that will give a general idea of if it's probably increasing. will look. I was thinking it probably is because mysql indexes don't like when you add random ids and has a strong performance preference for monotonically increasing primary keys [18:19:39] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 5 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#2072918 (bd808) [18:21:07] ebernhardson: to have a monotonically PK i think you need an autoincrement [18:21:46] ebernhardson: as the uuid is done for uniqness not to preserve "order" [18:26:31] ottomata: again something obvious [18:26:37] ottomata: are we starting teh jvm on yarn with any -D parameter for encoding/decoding [18:29:00] ottomata: as in -Dfile.encoding=UTF-8 [18:48:53] ottomata: confirmation that the thing was not happening before upgrade [18:50:09] ori: WSDM's best paper went to http://www.yichang-cs.com/yahoo/wsdm16_wholepage.pdf, neat research, indeed. [18:56:31] aye joal ok [18:59:12] ottomata: I'm getting out of ideas for that encoding issue :( [19:01:14] me too...am doing review of marcel's thing [19:01:17] then will get back to it [19:01:25] we need to be able to debug [19:01:28] ottomata, :] [19:01:28] Sure, sorry to bother [19:01:32] joal: , am thinking of building a new PercentDecoder [19:01:38] that has lots of debug prints [19:01:44] Could do [19:08:07] mforns: am looking at reportupdater::job [19:08:15] ottomata, yes [19:08:22] is there a case where the same $repository and different $titles will be declared and used? [19:08:26] like [19:08:32] yes [19:08:51] reportupdater::job { "job1": repository => A, } [19:08:51] reportupdater::job { "job2": repository => A, } [19:08:52] lok [19:09:04] like we can have reportupdater-queries, with different query folders [19:09:32] for example: have reportupdater-queries repository have a browser folder and a geo folder [19:09:37] and then have: [19:10:11] reportupdater::job { 'browser': repository => 'reportupdater-queries' } [19:10:23] reportupdater::job { 'geo': repository => 'reportupdater-queries' } [19:10:47] as is won't that run the same cronjob command though? [19:11:02] nuria: You're idea seams very promissing :) [19:11:18] nuria: what is file.encoding? [19:11:44] ottomata, no, because the default value for the 'query_dir' parameter is $title [19:11:57] ottomata: second answer http://stackoverflow.com/questions/12831138/specify-utf-8-encoding-for-hadoop-jobs [19:12:09] ottomata, and the cron command uses $query_dir [19:12:12] reading, (sorry, shoulda just googled) [19:12:22] ottomata: -D parameter when JVM starts [19:12:55] ottomata: Probably not an error due to upgrade, maybe an error due to puppetisation [19:13:08] hm maybe [19:13:17] I mean, nothin [19:13:20] mforns: the command knows to do a different thing if query_path is different? [19:13:28] sorry, doesn't cost a lot to try I guess [19:13:55] ottomata, yes, reportupdater stored the pid file and history file in that folder, so they are different things and can be executed in parallel [19:13:57] loooks like it will just do update_reports.py $query_path .. [19:14:02] *stores [19:14:17] oh, because $query_path contains configs? 
[19:14:52] ottomata, yes, query_path contains the query files (or script files) plus the config.yaml file that has metadata on the reports [19:15:13] and also is used by reportupdater to store temporary files (pidfile and history) [19:15:13] mforns: will be there an effort to convert the limng generate.py stuff to update_reports.py? [19:15:33] ottomata, https://phabricator.wikimedia.org/T128375 [19:15:50] it's in prioritized [19:15:55] great cool, so the case in init.pp is just temporary [19:16:02] ottomata, exactly [19:16:04] l [19:16:05] k [19:16:46] I hope we can migrate that soon [19:17:00] because we have 2 repos with the same code now :[ [19:17:14] mforns: i think you should decouple the reportupdater clone paths from the data dir clone paths [19:17:31] ottomata, what do you mean? [19:17:38] reportupdater { path => } can just say where to clone reportupdate code [19:17:49] and reportupdater::job can have a path that says where to clone the data/config repo [19:17:56] it doesn't have to be inside of $working_path [19:18:05] ottomata, makes sense [19:18:10] 'working_path' was just a concept from the statistics module to abstract /a from /srv [19:18:22] aha [19:18:32] nuria: YOU WIN :) [19:19:27] mforns: would it make sense to keep the logs inside of the reportupdate::job path? [19:19:31] instead of in /var/log/reportupdater? [19:19:40] then each dir would ahve its own log file local to itself? [19:19:42] not sure [19:19:48] fine by me [19:19:55] the output could be inside also... [19:20:09] yeah [19:20:16] ottomata, mmmm, the output maybe not [19:20:24] nuria, ottomata : https://gist.github.com/jobar/5d09f6de2ad6b40fa782 [19:20:25] how is output_path used? [19:20:27] joal: does the -Dfile fix teh encoding issue? [19:20:40] ottomata, it will help for all the jobs output to the same dir, so that it can be rsync'd [19:20:40] i don't see anything writing to it [19:20:46] * joal bows to nuria superpowa [19:21:07] ottomata, it just ensures that it can be written by the user [19:21:11] * milimetric shakes his fist at the encoding gods [19:21:17] nice! [19:21:23] ottomata, and also, if rsync is specified, it's the path that will be rsync'd [19:21:27] but how does reporupdater know to write data there? [19:21:39] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073192 (madhuvishy) After some discussion, we've decided to drop ClientIPs from the even... [19:22:10] the repos holding the queries and the config.yaml <- it should be in the yaml file [19:22:17] hm [19:22:31] so it is configurable in the puppet module, but it is also configurable in the config.yaml? [19:22:35] but neither know about the other? [19:22:39] ottomata, yes :/ [19:22:43] joal: what i do not understand... [19:22:55] joal: is how it worked in some nodes, that has to be OS issue [19:23:08] joal: it would make sense if "locale" was different [19:23:20] nuria: I must say I have no idea ! 
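For reference, a small self-contained illustration of nuria's file.encoding hypothesis above: percent-decoding %C3%85 yields the two bytes 0xC3 0x85, and turning those bytes back into a String under an ASCII default charset produces replacement characters instead of Å. This is illustrative only; PercentDecoder's actual implementation may assemble its output differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    public static void main(String[] args) {
        // The two bytes behind the percent-encoded "%C3%85" (UTF-8 for 'Å').
        byte[] decoded = new byte[] { (byte) 0xC3, (byte) 0x85 };

        // new String(byte[]) uses the platform default charset, i.e. whatever
        // file.encoding the JVM resolved at startup.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("with default   : " + new String(decoded));
        System.out.println("with UTF-8     : " + new String(decoded, StandardCharsets.UTF_8));
        // On the JDK 7 in use at the time, launching with -Dfile.encoding=US-ASCII
        // makes the "with default" line print replacement characters; with UTF-8
        // both lines print "Å".
    }
}
```

That is the same split observed per node earlier in the log: $Lasse_��berg on one host, $Lasse_Åberg on another.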
[19:23:25] ottomata, the idea in generate.py was to be able to create subdirectories inside the output_folder [19:23:52] mforns: maybe output_path should be a parameter to ::job instad, and that should be passed to command for cron job [19:23:59] ottomata, but of course, you can break things if you configure the yaml to output outside the output_folder [19:24:17] ok a-team, I need to go soon, but I'll take on me to re-run jobs from tomorrow morning to fix the mess [19:24:25] mforns: it seems like your config yaml should,n't really know about stat1002 directory hierarchies, no? [19:24:26] ottomata, generate.py doesn't accept that input parameter [19:24:29] It'll be a duty I think I can handle having Lino at home :) [19:24:35] ottomata, sure [19:24:44] joal: ok, but what do we need to do? [19:24:48] this is happening in hive and spark [19:24:56] we should figure out why this is set differently on different nodes [19:24:57] i'll try and do that [19:25:08] mforns: can you link me a config.yaml again? [19:25:18] joal: have a nice night, I'm happy if Lino launches the jobs, he's gotta start somewhere right? [19:25:36] ottomata, https://github.com/wikimedia/analytics-limn-mobile-data/blob/master/mobile/config.yaml [19:25:36] indeed milimetric, oozie at first, anything else comes easy :) [19:25:56] * milimetric wishes he was taught oozie at birth [19:25:57] hm [19:26:01] i see [19:26:11] ottomata: I wonder if it wouldn't be better to have an explicit java option for jobs [19:26:48] ottomata, also, it would be nice to be able to specify different outputs for different reports, that's a feature we're considering... this would be via config [19:26:56] well, woudln't hurt joal, but i want this to be correct by default too [19:27:25] ottomata: http://stackoverflow.com/questions/1006276/what-is-the-default-encoding-of-the-jvm [19:27:37] ok mforns hm [19:27:48] ottomata, but you're totally right [19:28:08] mforns: ok general thoughs: [19:28:15] i'm ok with this two $output config thing for now [19:28:15] but [19:28:23] hehe [19:28:55] i thikn you should make reportupdater init.p handle cloning the repo, doing file perms like you have, and then setting any variables that are totally indepdent of jobs [19:29:06] then, make ::job do things that are related to job [19:29:08] ottomata: weird though: same locale on two machines having different outcome ... [19:29:11] i would even move the resync job to the ::job class [19:29:18] ottomata, aha [19:29:19] and just have many different crons [19:29:43] ottomata, this would actually solve everything [19:29:44] all init.pp should do is make it possible and easy for the ::job define to work with more specific params [19:30:04] ottomata, except we would need to change generate.py to accept output_folder [19:30:16] aha [19:30:45] mforns: i'm ok (if you put a big comment) saying that all reportupdater::job config.yamls on a given node must write to $output_path set on reportupdater class [19:30:50] ottomata: even better: http://stackoverflow.com/questions/2168350/java-charset-problem-on-linux [19:31:00] maybe you can fix the .py stuff and add a param later [19:31:03] and then remove that [19:31:12] ottomata, aha [19:31:25] OK, will look at it [19:32:01] joal: .getBytes( "UTF-8" ); [19:32:02] maybe? [19:32:06] in PercentDecoder [19:32:23] ottomata: no good to have that settled in code [19:32:34] ? 
[19:32:36] ottomata: Should be handled correctly by java [19:32:45] indeed, hm [19:32:45] ottomata: also i *think* it is not likely to work if teh jvm doesn't have teh charset [19:32:50] set correctly [19:32:51] hmm [19:32:54] ottomata, maybe the output_folder that is passed by puppet as a parameter to reportupdater can be the output root, and all configuration in the yaml file can be relative to that [19:32:54] k maybe [19:33:02] ottomata: but not sure 100% [19:33:10] mforns: that would be cool [19:33:28] well, you'd then have to infer the $output_path root somehow in the code [19:33:30] not sure how you'd do taht [19:33:51] but mforns maybe also change the name from $output_path to $output_basepath or seomthing like that [19:33:51] ottomata, no just pass it as a parameter [19:33:55] oh ok [19:33:58] aha [19:34:00] that'd be fine i suppose [19:34:07] if you are making a param though [19:34:12] it seems better to use absolute path [19:34:15] to the job [19:34:19] then you don't need to worry about it [19:34:35] all the job specific details would be handled by the ::job define then [19:34:42] including making sure the output_path existed, etc. [19:35:01] aha [19:35:24] brb [19:36:18] * joal gets to diner [20:01:12] hmmm [20:01:29] analytics1050.eqiad.wmnet: UTF-8 [20:01:30] analytics1051.eqiad.wmnet: US-ASCII [20:01:32] totally bonkers [20:01:33] why? [20:01:41] that is the result of Charset.defaultCharset [20:02:27] if i launch a local JVM on those two nodes [20:02:30] both report UTF-8 [20:02:35] its only in yarn do we see US-ASCII [20:02:50] milimetric, I've looked at all current limn-x-data repos that run generate.py and the only one that has reports actually run by generate.py (not RU) is limn-edit-data. And those reports are running each hour but not generating any new data since 2015-04 [20:03:31] milimetric, I was thinking of going ahead and completely removing generate.py, the puppet code will be a lot simpler [20:05:40] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073364 (Tbayer) CCing @asherman as this will affect the WikimediaBlogVisit_5308166 schem... [20:14:38] ottomata: that smells like classpath [20:15:00] ottomata: if you do ps -aux -fwww so you print the "whole" jvm [20:15:14] ottomata: start comand, does it match in both machines? [20:15:32] checking [20:17:03] did you guys found the issue? [20:17:24] (I am still working on the lock managers, too many lines in the chat :P) [20:18:08] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073437 (asherman) Curious. Thanks for mentioning my use case :)! [20:19:00] we are close i think elukey, default charset different for different nodemanager jvms [20:19:03] not sure why though [20:20:41] nuria: 100% identical classpaths [20:20:50] i did this [20:20:51] for d in $(ps -aux -fwww | grep nodemanager | tr ' ' '\n' | grep hadoop/conf | tr ':' ' '); do realpath $d; done | sort | uniq > /tmp/classpath.txt [20:20:52] on both nodes [20:20:55] and then diffed the files [20:21:01] ottomata: boy !!!!! [20:21:32] ottomata: and neither classpath has -Dfile. right? [20:23:43] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. 
Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073453 (madhuvishy) Open>Resolved @asherman @Tbayer We can solve the WikimediaBlog... [20:24:31] right [20:24:31] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073456 (madhuvishy) Resolved>Open Uhhh didn't mean to resolve it. [20:25:10] ottomata: arghhhhh [20:25:25] ottomata: and LC_ALL when you do >locale? [20:25:27] nuria: i think more liklely somehow the envs the jvms are running in are different, locale is toally different [20:25:28] for htose envs [20:25:29] exactly [20:25:32] the are different [20:25:33] ahhhhh [20:25:35] but [20:25:38] only in the yarn jvm [20:25:40] ottomata: ok, that's it [20:25:42] not when i just run a local jvm [20:25:48] so i don't know how the yarn jvm is getting that setting [20:26:01] what about the linux prompt [20:26:04] ottomata: >locale [20:26:08] and LC_ALL? [20:26:10] ottomata: when we look at mapred -log doesn't it spit out everything? [20:26:47] nuria: i'm getting that from linux prompt [20:26:49] its always the same [20:26:53] unless i actually launcha yarn job [20:26:58] and then do a shell exec to print that [20:27:07] and then it is different, i see [20:27:08] jajaj [20:27:22] inside of yarn job, locale shell exec prints [20:27:23] this [20:27:23] https://gist.github.com/ottomata/442149ce0fe6b7d7a30d [20:27:36] ok, i have had this problem before but boy, this one is even more twisted [20:27:40] US-ASCII is result of Charset.defaultCharset [20:28:02] madhuvishy: from the job? with the setting? [20:28:04] hm checking maybe [20:28:14] ottomata: ya with the job id [20:30:29] ottomata: k 1 more idea [20:31:01] ottomata: the open jdk version we have in either? [20:32:01] same on both nuria [20:32:06] Version: 7u91-2.6.3-0ubuntu0.14.04.1 [20:32:08] java-1.7.0-openjdk-amd64 1071 /usr/lib/jvm/java-1.7.0-openjdk-amd64 [20:32:24] ottomata: arghhh [20:32:36] ottomata: and of course both machines are 64bit right? [20:33:08] ja [20:33:26] madhuvishy: no relevant info in either mapper log or yarn application log that i can see [20:34:55] hmmm [20:39:49] am poking around in jmx [20:41:15] HMMMMM [20:41:20] file.encoding [20:41:24] ANSI_X3.4-1968 [20:43:53] wait jmx? [20:45:27] ja just poking around at running jvm settings [20:45:28] but [20:45:29] this works tooo [20:45:35] sudo jinfo 119205 | grep file.encoding [20:45:35] file.encoding = UTF-8 [20:45:37] then on 1051 (bad) [20:45:40] Analytics-Kanban: Remove Client IP from Eventlogging capsule {mole} - https://phabricator.wikimedia.org/T128407#2073499 (madhuvishy) [20:45:41] file.encoding = ANSI_X3.4-1968 [20:45:46] HOWS IT GET THERE?!?!?! [20:45:47] milimetric, how far back does pageviews api show all-site data? [20:46:28] (PS7) Nuria: Fetch Pageview Data from Pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) [20:47:03] ottomata: sooo the file.encoding is on the jvm command? 
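The jinfo and Spark shell-exec checks above amount to reading a running JVM's view of its own encoding. A standalone version that could be run (or shell-exec'd from a container) on each worker to re-audit things, for example after the NodeManager restarts below, might look like the following sketch; it is not the exact check used here.

```java
import java.net.InetAddress;
import java.nio.charset.Charset;

public class EncodingAudit {
    public static void main(String[] args) throws Exception {
        // What this JVM resolved at startup, plus the environment it inherited,
        // for comparison against `locale` output on the same host.
        System.out.println(InetAddress.getLocalHost().getHostName()
                + " file.encoding=" + System.getProperty("file.encoding")
                + " defaultCharset=" + Charset.defaultCharset()
                + " LANG=" + System.getenv("LANG")
                + " LC_ALL=" + System.getenv("LC_ALL"));
    }
}
```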
[20:47:03] Analytics-Kanban: Clean up Client IP related code on Eventlogging {oryx} {mole} - https://phabricator.wikimedia.org/T128408#2073520 (madhuvishy) [20:47:20] no [20:47:28] its just in the system properties of the running jvm [20:47:31] has not been specified on cli [20:47:45] ahahaha [20:47:47] or in the command started [20:48:02] i'm going to restart nodemanager on an51 and see what happens [20:48:40] !!!! [20:48:40] file.encoding = UTF-8 [20:48:42] it changed! [20:48:57] !!!!!!! [20:48:58] :? [20:49:08] maybe some strange chicken/egg problem during cluster install? [20:49:14] i'm going to restart all nodemanagers [20:49:17] 1 by 1 [20:49:36] :/ [20:50:18] (CR) Nuria: "Added some mote error checking plus tests on next patch." (6 comments) [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [20:50:37] ottomata: nice [20:54:12] Back from diner ! [20:54:16] Just backfilled [20:54:35] If restarting nodemanagers solved the issue, it's a reasonnably cheap solution ! [20:55:02] It however completely surrealist that the locale gets changed in yarn [20:55:20] ottomata: any news? Probably not yet) [20:57:56] joal: yes [20:58:07] the jvm process did indeed have file.encoding set poorly [20:58:17] i restarted nodemanager on one offending node sofar [20:58:21] and it went away [20:58:25] :o ! [20:58:31] ottomata: due to locale setting ? [20:58:36] no [20:58:36] man... [20:58:38] That's completely weird :S [20:58:38] locale is fine [20:58:41] k [20:58:45] just the jvm process had file.encoding wrong [20:58:48] the nodemanager process [20:58:49] k [20:58:50] don't know why [20:58:57] maybe some chicken/egg problem when we isntalled? [20:58:58] NOBODY knows why [20:59:00] and last restarted? [20:59:01] dunno [20:59:01] man ... [20:59:10] nuria: good catch ! We owe you a couple beer for that one :) [20:59:20] pufff.. to ottomata you mean [20:59:44] but boy, i even ask the guy in services which is a java guru [20:59:54] nuria: I think I wouldn't have found the file.encoding stuff [21:00:00] pssh, no nuria you found the file.encoding thing, [21:00:12] to see if he could think of anything else besides that ... [21:00:15] i would have been writing code to output individual byte values of the conversion [21:00:23] jajajaja [21:05:14] (CR) Nuria: [C: 2 V: 2] Add ApiAction avro schema [analytics/refinery/source] - https://gerrit.wikimedia.org/r/273556 (https://phabricator.wikimedia.org/T108618) (owner: BryanDavis) [21:07:38] nuria: would you have a moment to help with a couple hive queries? [21:08:08] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2073624 (Nuria) @asherman, @Tbayer : IPs as an only measure to count unique users is rea... [21:11:57] dbrant: what are your questions? 
I'll help if I can [21:12:02] ottomata: not sure if related but one of the changes pushed to the analytics hosts recently was apache-commons [21:12:44] just mentioning but probably not remotely related [21:12:57] elukey: hmmmm [21:13:06] possible, but dunno [21:13:10] I think it was a sec update [21:13:12] ja [21:13:28] it doesn't seem like a lib update would change the default jvm setting though [21:16:41] ottomata, elukey : right, it cannot as encoding is decided upon startup of jvm [21:17:23] nuria: yep yep but the package was installed before my rolling restart of all the JVMs [21:17:28] ottomata: can you send an email to me when you're done with restarting the nodes ? [21:17:34] surely it is not related though [21:18:10] mforns: good idea, much better to remove generate.py [21:18:17] milimetric, cool [21:18:35] Also ottomata, could be a good idea to restart datanodes (maybe same issue ? [21:18:53] ja but elukey we restarted all jvms as part of upgrade [21:18:57] joal: i have restarted all nodemanagers [21:18:59] just finsihed [21:19:01] yurik: it'll go back to May 2015 when we're done backfilling, but now I think august? [21:19:08] and checked, they all say UTF-8 now [21:19:09] great ottomata, double checking consistency ;) [21:19:16] will check datanodes for setting [21:19:52] ja many processes have this! [21:19:53] joal! [21:19:55] bad setting [21:19:58] will restart datanodes one by one too [21:19:59] Riiight [21:20:11] I could feel that :) [21:20:24] That hdfs error we had before might be related, no ? [21:21:38] ottomata: Controlled with Spark - All good [21:21:44] ottomata: Currently doing with Hive [21:22:20] joal: i think not [21:22:22] its on a lot of nodes [21:22:24] and that sounds different [21:22:25] but still [21:22:32] we will see i guess [21:22:39] k [21:25:38] milimetric, awesome, thx. I just updated the template a bit - now you can insert site's totals as well - https://en.wikipedia.org/wiki/Template:PageViews_graph [21:26:19] milimetric, for some reason the site's data shows monthly only for the past few months [21:26:36] but 8billion views does look impressive :) [21:26:41] just saw that, trying to figure out why [21:27:15] a-team: finished working on mc1001 finally, going offline :) [21:27:33] elukey, cool, see you next monday! [21:28:00] ah mforns you are travelling right! [21:28:04] enjoy your trip!! [21:28:05] yep :] [21:28:09] thx! [21:28:19] have a good trip! My brother's in New Delhi right now [21:28:27] mforns: Safe flight, and ENJOY :D [21:28:35] milimetric, cool! thanks [21:28:45] hehe but I'm not leaving yet today :] [21:28:50] huhuh :) [21:28:52] thanks anyway! [21:30:23] nuria: I won't make the meeting tomorrow; I have a conflict. [21:33:28] ottomata: problem solved ! [21:33:33] ottomata: in hive as well :) [21:34:05] yurik: the graph makes sense. You're doing 180 days, so that's September 22nd. There are only 4 complete months since then, Oct, Nov, Dec, Jan [21:34:15] Ok a-team, I'm off for tonight, as said earlier, I'll restart jobs as needed to cope with the encoding issues. [21:34:28] yurik: if you do like 400 days or something, you'll see August is the last month shown [21:34:35] ottomata, yt? [21:34:37] bye joal, have a nice week! 
[21:34:40] (we're back-filling though, so everything will be ready at some point [21:34:42] I'll send an email letting people know data is incorrect and we are currently recomputing, and will ask for people not to use the cluster becaude of backfilling [21:34:43] ) [21:34:55] * milimetric caught jo in his parens - 1 point [21:34:56] Thanks mforns :) [21:35:13] :) [21:35:36] * joal likes to be put between parenthesis :) [21:37:09] Bye a-team ! See you tomorrow [21:37:40] good night joal :) [21:40:34] Ironholds: hiyaa [21:40:34] ja [21:40:37] joal: laters!!! [21:40:40] thanks for your help [21:43:27] ottomata, so we have https://phabricator.wikimedia.org/T128117 - basically we're looking to transfer a cluster of scripts and a cron job that runs it to a non-person-tied account [21:43:29] ori:k [21:43:35] would the stats role account work well for this? how would we go about it? [21:43:49] nuria: would you like my input on anything ahead of that meeting? [21:44:25] ori: nah, we can talk as things come up, if any [21:44:38] Ironholds: we recently created a 'analytics-search' user for other discovery related jobs [21:44:42] ebernhardson: ^ [21:44:54] ottomata, cool! [21:45:09] so does that mean I should bug erik instead of you? :D [21:45:09] it was mainly for hadoop related stuff [21:45:23] yes probably! i'm not sure if you are asking to have it puppetized...or? [21:51:14] Analytics, Pageviews-API: Pageviews API not updated on 2/18/2016 at 8;34 utc - https://phabricator.wikimedia.org/T127414#2073823 (Alexdruk) As on https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI#Updates_and_backfilling: "Updates and backfilling The data is loaded at the end of the timespan in q... [21:51:42] elukey: , hmmMMM yeah actually, i think it can't be the commons upgrade [21:51:51] a 'locale' shell exec from that process [21:51:56] also printed out a bad locale [21:55:03] Analytics-Kanban: Pageviews API reporting inaccurate data for pages titles containing special characters - https://phabricator.wikimedia.org/T128295#2073845 (Ottomata) Phew, somehow many NodeManager (and, unrelatedly Datanode) JVM processes had got stuck with `file.encoding = ANSI_X3.4-1968` the last time the... [21:56:13] Ironholds: hmm, maybe :) [21:56:16] * ebernhardson scrolls p [21:56:47] Ironholds: looks like the right group, file a task in ops-access-requests to have you and bearloga added. or maybe just bearloga because :( [21:57:04] ebernhardson, cool! [21:57:16] ottomata, mostly "an account to own the cron job and code that isn't mine" [21:57:53] Ironholds: how is the code going to get there? [21:58:01] and how are the crons going to be set up? [21:58:09] you going to just put them there manually? [21:58:12] cron's should be done from puppet [21:58:14] and have lots of documentation for others to find them? [21:58:15] ottomata, it's a git repo and crontab -e ;p [21:58:30] ebernhardson: yes and no, it depends on what the crons are [21:58:39] generally we don't make researcher type folks puppetize everything they do [21:58:44] it would be a lot of overhead [21:58:45] oh, i suppose :) [21:58:57] :P [21:58:59] it also depends on how much do and forget you want [21:59:22] if you want to make a 'prod' cron job that is maintained by the discover+analytics+ops, then yea, puppetized [21:59:41] if you want to be fast and just run things yourself, then its ok to just maintain your own crons [22:00:36] milimetric, any plans to add "weekly" granularity/ [22:01:08] milimetric, are you saying that months always show 1st-31st ? 
[22:02:00] ebernhardson: were you able to deploy your hdfs stuff? [22:02:08] i can't reproduce the error anymore since restarting some datanodes [22:02:14] yurik: we have to balance usefulness with storage. Storing weekly granularity per article would probably not be worth it. For aggregate, sure [22:02:16] ottomata: so will Ironholds and I be able to just `su analytics-search`? [22:02:18] this one: https://phabricator.wikimedia.org/P2687 [22:02:26] bearloga: probably not at the moment, but we could make that possible [22:02:42] milimetric, you don't need to store it - you can calculate it on the fly :) [22:02:59] yurik: as for 1st - 31st, not all months have all those days, but yes the month has to be complete for its stats to be added [22:03:11] i realize that :D [22:03:14] would need to be added to the analytics-search-users group [22:03:22] ottomata: and then eventually whoever is Ollie's replacement [22:03:40] welp, the description of that group is apt [22:03:42] ' description: Group of users for managing search related analytics jobs' [22:03:50] yurik: right, so calculating on the fly is the responsibility of the clients as we see it. There are many in js, oython, etc, and things like weekly resolution should be added there [22:03:59] it would also give rights to deploy hadoop related analytics discovery stuff [22:04:12] i guess that's up to you if you guys want to grant that, but i think it would be fine [22:04:15] that keeps the api simple and meeting the general need and allows clients to specialize however they wish [22:04:29] milimetric, wouldn't it be better on the server? this way you can reduce the payload and it gets cached by varnish [22:05:48] requests for weeks of data would be cashed whether there's an abstraction layer in front of it or the client provides the abstraction, either way [22:06:48] the main point is that this api is trying to be friendly to machines and not people so much [22:07:32] milimetric, of course, just trying to make payload smaller [22:08:26] ok, prochtu [22:08:36] nvm [22:09:25] that we' ve talked about, making the payload smaller. We might do a v2 with a different output format. But gz transmission shouldn't care too much about the repetition [22:09:46] gtg, meeting soon [22:25:40] (PS3) Mforns: Add all reportupdater files to this dedicated repo [analytics/reportupdater] - https://gerrit.wikimedia.org/r/272712 (https://phabricator.wikimedia.org/T127327) [22:35:09] (CR) Mforns: "Sorry that I put the changes in the same patch as creating the new files. If this patch is OK, it can be merged directly, it has no deploy" [analytics/reportupdater] - https://gerrit.wikimedia.org/r/272712 (https://phabricator.wikimedia.org/T127327) (owner: Mforns) [22:35:53] Analytics-Kanban, Research-and-Data: Remove Client IP from Eventlogging capsule {mole} - https://phabricator.wikimedia.org/T128407#2073499 (leila) @madhuvishy I added the blocking task T125946. It's marked as resolved but it's not. It will be on March 8 or at the latest on March 14. I'll update this thread. [22:35:59] Analytics-Kanban, Research-and-Data: Remove Client IP from Eventlogging capsule {mole} - https://phabricator.wikimedia.org/T128407#2073970 (leila) [22:36:16] Analytics-Kanban, Research-and-Data: Remove Client IP from Eventlogging capsule {mole} - https://phabricator.wikimedia.org/T128407#2073959 (leila) [22:36:34] Analytics-Kanban, Research-and-Data: Remove Client IP from Eventlogging capsule {mole} - https://phabricator.wikimedia.org/T128407#2073994 (madhuvishy) @leila Thanks! 
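To make the client-side aggregation point above (22:03 onwards) concrete, here is a rough sketch that pulls daily per-article counts from the pageview API and rolls them up into ISO weeks locally. The endpoint path and JSON field names are assumptions based on the public REST API of the time, and the regex stands in for a real JSON parser.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.WeekFields;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WeeklyPageviews {
    public static void main(String[] args) throws Exception {
        // Assumed per-article route: project/access/agent/title/granularity/start/end.
        String url = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
                + "en.wikipedia/all-access/all-agents/Lasse_%C3%85berg/daily/20160101/20160229";
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", "weekly-rollup-sketch");

        String body;
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            body = s.hasNext() ? s.next() : "";
        }

        // Crude extraction: each item is assumed to carry "timestamp":"YYYYMMDD00"
        // followed, within the same object, by "views":N.
        Pattern item = Pattern.compile(
                "\"timestamp\":\"(\\d{8})\\d*\".*?\"views\":(\\d+)", Pattern.DOTALL);
        Matcher m = item.matcher(body);

        Map<String, Long> weekly = new TreeMap<>();
        WeekFields iso = WeekFields.ISO;
        while (m.find()) {
            LocalDate day = LocalDate.parse(m.group(1), DateTimeFormatter.BASIC_ISO_DATE);
            String week = day.get(iso.weekBasedYear()) + "-W"
                    + String.format("%02d", day.get(iso.weekOfWeekBasedYear()));
            weekly.merge(week, Long.parseLong(m.group(2)), Long::sum);
        }
        weekly.forEach((week, views) -> System.out.println(week + "\t" + views));
    }
}
```

The same rollup would apply to daily data from the project-level aggregate endpoint, which is what a site-totals graph would consume.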
[22:38:21] (PS4) Mforns: Add hive queries for the traffic breakdown reports [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) [22:43:25] (CR) Mforns: "In the end, I removed the output_folder config." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) (owner: Mforns) [22:44:41] (CR) Mforns: "BTW, if this patch is OK, it can also be merged. It has no deployment dependencies." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) (owner: Mforns) [22:53:21] Analytics, ArchCom-RfC, Discovery, EventBus, and 7 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#2074035 (RobLa-WMF) [23:01:56] is there a way to get the top most frequently requested images? [23:01:59] nuria: do you know what the purpose of this assertion is? https://github.com/wikimedia/eventlogging/blob/master/tests/test_jrm.py#L42 [23:02:10] ori: of all time? [23:02:26] madhuvishy: or the last day / week [23:02:34] we don't publish mediacounts through the api yet, but we can get it through hive [23:02:35] sorry, I should have clarified: a _public_ way [23:02:40] right, that I know [23:02:41] or dumps [23:02:51] but that would mean reading the dumps [23:02:56] right [23:03:07] there is a task to publish media counts with api [23:03:10] its in backlog [23:03:16] no reason not to [23:03:23] cc: milimetric [23:03:48] fully support publishing media counts via AQS (with a similar module to the API) [23:03:55] (is that what we're talking about?) :) [23:04:45] ori http://dumps.wikimedia.org/other/mediacounts/ [23:04:49] right, it is, yes, let's do it. It would be probably a couple weeks of work for a mediacounts/top type of endpoint [23:05:39] I didn't know that there were media-specific dumps [23:05:44] that is probably good enough [23:13:20] ottomata: when we remove ips - we will still get them in the raw event correct? we just won't parse them [23:14:39] yes, until we change varnishkafka settings [23:15:03] ottomata: okay cool [23:23:20] laters! [23:45:47] nuria, I will show up at standup tomorrow, just to give an update on today's work and sync with you on next steps on browser reports