[02:09:37] (PS1) Milimetric: [WIP] Join and denormalize all histories into one [analytics/refinery/source] - https://gerrit.wikimedia.org/r/307903
[02:10:33] (CR) Milimetric: "Didn't get very far, Joseph, got a little bogged down with the tedious mapping code." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/307903 (owner: Milimetric)
[02:11:27] (CR) jenkins-bot: [V: -1] [WIP] Join and denormalize all histories into one [analytics/refinery/source] - https://gerrit.wikimedia.org/r/307903 (owner: Milimetric)
[09:09:57] joal: sorry, joining
[09:10:06] was debugging and lost track of time -.-
[09:13:06] elukey: np, Lino is off schedule today
[09:13:25] elukey: Can we move the meeting to later on (let's say 1pm)?
[09:13:52] suuuure
[09:14:31] done :)
[09:14:49] thx mate
[10:59:18] elukey: meeting in batcave?
[11:00:26] yep, joining
[11:20:49] * elukey lunch!
[11:27:44] elukey: Arf, forgot one thing about conference travelling
[11:27:53] elukey: we'll finalize this after :)
[12:32:11] Analytics-Kanban: AQS Cassandra READ timeouts caused an increase of 503s - https://phabricator.wikimedia.org/T143873#2601271 (elukey) Today I tried to look into all the per-article metrics in Graphite to check for any relevant pattern. The rationale is that $something triggers IOPS on disk that eventually tu...
[12:50:34] Analytics-Kanban: AQS Cassandra READ timeouts caused an increase of 503s - https://phabricator.wikimedia.org/T143873#2601283 (elukey) One possible way to have more insight would be http://techblog.netflix.com/2015/07/java-in-flames.html I checked, kernel+jdk should support this. The only requirement would be...
[13:44:22] (PS2) Milimetric: [WIP] Join and denormalize all histories into one [analytics/refinery/source] - https://gerrit.wikimedia.org/r/307903
[13:45:48] (CR) jenkins-bot: [V: -1] [WIP] Join and denormalize all histories into one [analytics/refinery/source] - https://gerrit.wikimedia.org/r/307903 (owner: Milimetric)
[14:00:21] git up
[14:00:23] oops :)
[14:09:33] milimetric: Currently reading your last scala patch
[14:09:49] milimetric: if you want, we can pair from now to standup
[14:10:04] working with Andrew now
[14:10:11] ok milimetric, np
[14:10:12] that patch is definitely mid-stream
[14:10:19] we can chat about it a bit before standup, like 10 minutes?
[14:10:23] milimetric: will try to move from there
[14:10:29] sure
[14:50:12] ok joal, wanna chat a little?
[14:50:17] sure milimetric
[14:50:23] milimetric: batcave!
[15:01:19] Analytics, Analytics-EventLogging, DBA: Queries on PageContentSaveComplete are starting to pile up - https://phabricator.wikimedia.org/T144278#2601560 (jcrespo) Resolved→Open This is now causing cronspam, 1 email per hour sent to root@. This is not a good solution: ``` rsync: mkdir "/limn-pu...
[15:03:23] Analytics-Kanban: Create clean simplewiki output from edit history reconstruction - https://phabricator.wikimedia.org/T143321#2601574 (Nuria)
[15:03:26] Analytics-Kanban: Edit History: Review scala code functionality and make page and user output uniform - https://phabricator.wikimedia.org/T143322#2601573 (Nuria) Open→Resolved
[15:13:02] Analytics-Kanban: Switch AQS to new cluster - https://phabricator.wikimedia.org/T144497#2601636 (Nuria)
[15:40:50] https://www.irccloud.com/pastebin/eIfNO6ru/
[15:41:01] revert weirdness joal ^
[16:10:08] * elukey afk!
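The java-in-flames technique from the Netflix post elukey links at 12:50 amounts to sampling the Cassandra JVM on-CPU with perf and folding the stacks into a flame graph. A hypothetical sketch, assuming Brendan Gregg's FlameGraph scripts and perf-map-agent are checked out locally and that the JVM runs with -XX:+PreserveFramePointer (the JDK-side requirement the comment alludes to); the pgrep pattern and output filename here are made up for illustration:

```
# Find the Cassandra JVM (assumes the usual CassandraDaemon main class).
CASSANDRA_PID=$(pgrep -f CassandraDaemon | head -1)

# Sample on-CPU stacks at 99 Hz for 30 seconds.
sudo perf record -F 99 -g -p "$CASSANDRA_PID" -- sleep 30

# Write /tmp/perf-<pid>.map so perf can symbolize JIT-compiled Java frames.
perf-map-agent/bin/create-java-perf-map.sh "$CASSANDRA_PID"

# Fold the stacks and render the SVG.
sudo perf script \
  | FlameGraph/stackcollapse-perf.pl \
  | FlameGraph/flamegraph.pl > cassandra-flames.svg
```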
o/
[16:58:13] milimetric: the sha1 thing is more difficult than I expected, I'm gonna leave it for now
[16:59:00] On another aspect milimetric: Seems that clickhouse is faster than druid on aggregations and same-ish on time-series
[16:59:36] but these are just quick first tries, more complete tests should be done
[17:01:04] huh, interesting. I'd have expected the opposite joal
[17:01:21] (druid faster on aggregation and slower on time-series)
[17:01:27] right milimetric
[17:01:46] but as for the sha1, yeah, totally, make it a no-op for now and we'll populate that field later. I think that's the beauty of this approach, it doesn't have to be complete to be done
[17:01:58] milimetric: ok great :)
[17:02:36] milimetric: going through modifying every rev in-between every revert efficiently is the real thing
[17:26:43] hi milimetric. I'm back again. I have a question about the analytics-users group. Can you tell me what kind of access the users in that group have?
[17:27:01] Stas and I are discussing some access options, and I'm not clear about that group.
[17:27:44] leila: i think you have to ask ottomata
[17:28:03] nuria_: thanks. ottomata, ^
[17:36:33] leila: hiiii
[17:36:41] they'd get login access to stat1002 & stat1004
[17:36:48] and they'd have an account created in hadoop
[17:36:50] but that's it
[17:37:01] they wouldn't have access to things that are group readable by people in the analytics-privatedata-users group
[17:37:03] like webrequests
[17:37:13] so, it's really only useful if the user wants to use hadoop with her own data
[17:37:16] not stuff we have
[17:37:19] actually
[17:37:26] there is stuff in there that should be readable by non-privatedata users
[17:37:30] like ummm, pageviews? maybe?
[17:37:35] can't remember if that is private... it might be
[17:37:58] but, other stuff like the edit history that milimetric et al. are working on would probably be non-private
[17:38:03] so people in analytics-users could access that
[17:38:16] it's basically hadoop access without access to webrequests
[17:41:32] ottomata: thanks. for my understanding: are the accesses governed by database names? for example, is it correct to say that analytics-users won't have access to the wmf_raw database?
[17:42:59] and if that's the case, will analytics-users have access to the wmf database? (as you said, there is some data in the wmf database, for example webrequest, or pageview_hourly, that is still sensitive, correct?)
[17:48:30] leila: no, it is not database restricted
[17:49:13] nope, access is not governed by database
[17:49:17] it's just HDFS file permissions
[17:49:23] Hive is just a mapping onto files in HDFS anyway
[17:49:39] leila: it's like a directory tree on linux
[17:49:47] the databases are arbitrary
[17:49:49] leila: in some directories you have read permissions
[17:49:59] it's all about the location of the data for the external table partitions
[17:50:03] so if you do
[17:50:10] show partitions wmf.webrequest
[17:50:13] leila: but not in others that might be restricted just to be readable by root
[17:50:13] (you'll get a LOT)
[17:50:16] you can see the HDFS file paths
[17:50:19] How can I get a list of the directories/data the user will have access to if in analytics-users?
[17:50:20] and then if you do
[17:50:23] hdfs dfs -ls
[17:50:27] you can see the access perms
[17:50:33] ummm
[17:50:35] joal, milimetric: so, ahem, i found 1 inconsistency that might be notable
[17:50:48] joal, milimetric: in cassandra that is
[17:50:49] leila: not really sure, anything that is readable by users in that group.
[17:50:49] hm
[17:51:02] ottomata: that's already helpful. Let me dig into that.
[17:51:16] nuria_: what's the inconsistency?
[17:51:50] milimetric: that "no views" are reported as null in the new cluster and zero in the old cluster
[17:52:07] milimetric: [{"project":"wikidata","article":"Q604141","granularity":"daily","timestamp":"2016060100","access":"all-access","agent":"user","views":0}
[17:52:09] versus
[17:52:20] right
[17:52:32] that was a hot topic when we first launched
[17:52:44] {"project":"wikidata","article":"Q604141","granularity":"daily","timestamp":"2016060100","access":"all-access","agent":"user","views":null}
[17:52:46] leila: it might be easier to just look at perms in hdfs
[17:52:49] in /wmf/data/*
[17:52:57] also if you find anything that isn't as it should be
[17:52:59] let me know
[17:53:34] e.g. if you happen to find webrequests that aren't 750 hdfs:analytics-privatedata-users
[17:53:38] nuria_: yeah, we can go back to the discussions we had in the beginning, I think people preferred 0s. But we could easily change that behavior in javascript and just store the nulls as they are
[17:53:47] I'll do that ottomata. :)
[17:54:54] milimetric: ya, i do not think it is a deal breaker but we should dig into why that is the case, will make a note in the ticket
[17:56:01] milimetric: but yes, we should pass 0s not nulls
[17:56:04] nuria_: my memory is definitely foggy, but I remember joseph doing custom logic on his loader back then. Maybe he just forgot to do the same with the new loader
[17:56:23] either way, I think it's more efficient to store nulls
[17:56:33] I always thought we should store nulls and return 0s
[17:56:35] milimetric: but the api should return 0s, no question
[17:56:40] milimetric: agreed
[17:56:43] right
[17:57:20] and now we have more data about CPU usage and all that for the cluster, so we can make informed decisions
[17:58:24] right, so ottomata, if I understand the output of hdfs dfs -ls /wmf/data/* correctly, /wmf/data/archive/webrequest will be accessible to analytics-users?
[17:59:04] oh no! scratch that. that is available only to analytics-privatedata-users
[17:59:50] leila: the ones in archive are not the unsampled webrequests
[17:59:54] so those are readable
[18:00:00] those are historical copies of udp2log-like data
[18:00:09] see
[18:00:14] /wmf/data/wmf/webrequest
[18:00:16] and/or
[18:00:18] /wmf/data/raw/webrequest
[18:00:27] Analytics-Kanban: Continue New AQS Loading - https://phabricator.wikimedia.org/T140866#2602230 (Nuria) Tested a bit how we are doing consistency-wise and thus far things check out. I found 1 issue. See repro below. Current API: http://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/wikidata.org/all-...
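Since Hive databases are just mappings onto HDFS files, the practical way to answer leila's question is to inspect the file permissions directly. A minimal sketch of the checks ottomata describes above, using the paths and group names from the discussion (listings on the real cluster will of course vary):

```
# The mode/owner/group on each directory is what gates access,
# not the Hive database name.
hdfs dfs -ls /wmf/data

# Spot-check one dataset, e.g. that webrequest data really is mode 750
# and group-owned by analytics-privatedata-users.
hdfs dfs -ls -R /wmf/data/wmf/webrequest | head -20

# List a table's partitions; each partition spec maps onto a
# subdirectory of the table's HDFS location.
hive -e 'show partitions wmf.webrequest' | head -5

# Find the HDFS location backing the external table itself.
hive -e 'describe formatted wmf.webrequest' | grep -i 'Location'
```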
[18:44:50] a-team: public service announcement: you have to escape semicolons in hive strings, otherwise it throws an EOF error:
[18:44:58] select 'what the;'; throws an error
[18:45:05] but select 'what the\;'; is ok
[19:05:51] Analytics, Cassandra, Discovery, Maps, and 2 others: Investigate and implement possible simplification of Cassandra Logstash filtering - https://phabricator.wikimedia.org/T130861#2602486 (Eevans) I updated https://gerrit.wikimedia.org/r/#/c/282466 to include the equivalent 2.2 configuration, and...
[19:06:52] nuria_, milimetric: Diff between nulls and 0 is due to me upgrading storage to use null instead of 0 (less space used)
[19:07:16] ok, that makes sense, we can just coalesce them in the output code
[19:07:18] joal: ok, I will just need to do changes to aqs to make sure those get wrapped at the js layer
[19:07:18] nuria_, milimetric: However I'd have expected restbase to convert null to 0
[19:07:33] joal: haha, if we write the code, sure
[19:07:42] milimetric: I can also do that
[19:07:54] nuria_: My bad, I had in mind that it already did it
[19:07:58] I'll file the task so we don't forget and put it on kanban
[19:08:04] great
[19:08:17] milimetric: ok, was about to do it, just looked at the code again and it should be easy
[19:08:36] milimetric: let me finish deploying the latest changes to dashiki
[19:09:19] Analytics-Kanban: Coalesce nulls to 0s in output - https://phabricator.wikimedia.org/T144521#2602510 (Milimetric)
[19:09:20] https://phabricator.wikimedia.org/T144521
[19:15:49] Analytics, Analytics-EventLogging, DBA: Queries on PageContentSaveComplete are starting to pile up - https://phabricator.wikimedia.org/T144278#2602573 (Milimetric) ping @Ottomata because he was cleaning these directories a bit, may be related
[19:16:45] logging off a-team, bye!
[19:18:59] byyye!
[19:23:14] Analytics, Analytics-EventLogging, DBA: Queries on PageContentSaveComplete are starting to pile up - https://phabricator.wikimedia.org/T144278#2602603 (Ottomata) Thanks, just merged a fix.
[19:23:56] Analytics, Analytics-EventLogging, DBA: Queries on PageContentSaveComplete are starting to pile up - https://phabricator.wikimedia.org/T144278#2602604 (Ottomata) Open→Resolved I'm pretty sure that cronspam problem was unrelated to this ticket. Closing.
[19:50:05] (PS4) Milimetric: Script sqooping mediawiki tables into hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476)
[19:53:26] (PS5) Milimetric: Script sqooping mediawiki tables into hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476)
[20:00:24] (PS7) Nuria: Bookmark for browser dashboard regarding graph and time [analytics/dashiki] - https://gerrit.wikimedia.org/r/306980 (https://phabricator.wikimedia.org/T143689)
[20:06:55] (CR) Nuria: [C: -1] "Need to fix couple comments" [analytics/dashiki] - https://gerrit.wikimedia.org/r/306980 (https://phabricator.wikimedia.org/T143689) (owner: Nuria)
[20:26:20] argh, bug! on bookmarks
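A footnote on milimetric's 18:44 semicolon PSA above: the behaviour is easy to reproduce from a shell. This is a sketch assuming the `hive -e` one-shot form parses statements the same way the interactive CLI does:

```
# Fails: the Hive CLI treats the ';' inside the string literal as a
# statement terminator, so the statement ends mid-quote and the parser
# reports an EOF error.
hive -e "select 'what the;';"

# Works: escaping the semicolon keeps it inside the string literal.
hive -e "select 'what the\;';"
```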
[20:38:07] (PS6) Milimetric: Script sqooping mediawiki tables into hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476)
[20:39:14] (PS7) Milimetric: Script sqooping mediawiki tables into hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476)
[20:39:28] (CR) Milimetric: "ok I think this is ready for review again" [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476) (owner: Milimetric)