[00:00:22] HaeB: no, it just only joins channels when it has a message to send. it always idles in #wikimedia-labs though
[00:03:19] legoktm: ok thanks, i thought i had seen some eligible phabricator updates in the meantime (i.e. which wikibugs usually posts here), but maybe these projects are not covered here
[01:21:28] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2328835 (Jdlrobson)
[05:43:47] Quarry: feature for quarry tool: show quarry time used - https://phabricator.wikimedia.org/T136266#2329010 (Yamaha5)
[07:38:09] Analytics-Kanban, RESTBase, Services, RESTBase-API, User-mobrovac: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2329153 (mobrovac) We plan to start enforcing the limits next week. I think we should start with the currently-set limits (10 reqs/s/endpoint/cli...
[07:47:26] o/
[08:20:36] Hi elukey :)
[08:20:41] cassandra debrief?
[08:26:47] joal: I am trying to figure out a raid configuration for analytics1047 atm, do you mind if we do it later?
[08:26:57] absolutely not :)
[08:27:12] the drives are not coming up in the right order in /dev
[08:27:13] sigh
[08:27:18] and I think I found the problem
[08:27:19] arrf :(
[08:27:28] Well done! Good luck with that elukey
[08:27:38] but I need to figure out the megacli command :(
[08:27:43] that is always a nightmare
[08:27:52] joal: sync in ~30 mins?
[08:27:59] when you have time :)
[08:55:07] ah joal, whenever you have time can you show me how to try the spark dynamic allocation?
[08:55:14] I am curious
[08:55:32] elukey: sure :)
[09:11:36] Analytics-Kanban, RESTBase, Services, RESTBase-API, User-mobrovac: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2329328 (mobrovac) RB logged a total of [30K excesses in the last 24h](https://logstash.wikimedia.org/#dashboard/temp/AVTsUtpi_LTxu7wlBfI-). The...
[09:14:29] joal: I am ready
[09:14:59] figured out the issue but I have no idea about what command I need to execute to fix the disk :P
[09:16:34] it seems that our RAID controller on analytics1028->1058 creates Virtual Drives, each one implementing a single-disk RAID0
[09:18:00] elukey: I thought hadoop was using JBOD?
[09:18:27] elukey: batcave?
[09:18:31] yeah Chris suggested the same to me but it is not what I am seeing
[09:18:42] https://phabricator.wikimedia.org/T134056#2329295
[09:18:45] joal --^
[09:18:50] hmmm ... We should discuss that with andrew as well then
[09:18:52] going to the cave
[09:19:08] well it is fine for our purposes
[09:19:18] it doesn't change a lot
[09:19:22] but it is weird :D
[09:19:31] sure it's weird
[10:57:29] elukey@analytics1047:~$ sudo service hadoop-hdfs-datanode status * Hadoop datanode is running
[10:57:42] aah pasted incorrectly, but works now :)
[11:17:19] /eqiad/B/3RUNNINGanalytics1047.eqiad.wmnet:8041analytics1047.eqiad.wmnet:8042
[11:17:25] again
[11:17:30] :@
[11:17:38] anyhow, analytics1047 running
[11:19:11] Quarry: show time of execution in quarry - https://phabricator.wikimedia.org/T136266#2329589 (Ladsgroup)
[11:27:33] elukey: spark cached timeout confirmed :)
[11:29:48] wowwwww
[11:41:41] * elukey lunch!
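The log never shows the megacli command elukey ended up running; on a MegaRAID controller, recreating a single-disk RAID0 virtual drive for a swapped disk generally looks like the sketch below. The adapter number (-a0) and enclosure:slot ID ([32:3]) are placeholders to look up first, not values from this incident:

```bash
# List physical drives to find the enclosure:slot ID of the replaced disk
sudo megacli -PDList -aALL | grep -E 'Enclosure Device ID|Slot Number|Firmware state'

# Clear any foreign config left over from the old disk (placeholder adapter 0)
sudo megacli -CfgForeign -Clear -a0

# Recreate a single-disk RAID0 virtual drive for it (placeholder enclosure:slot 32:3)
sudo megacli -CfgLdAdd -r0 [32:3] -a0

# Confirm the new virtual drive is listed
sudo megacli -LDInfo -Lall -a0
```

This matches the controller behavior described at 09:16:34: no real JBOD passthrough, just one RAID0 virtual drive per physical disk, which is also why the /dev ordering can shift after a swap.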
[12:28:46] Hallo
[12:29:08] A couple of weeks ago I got some queries from Ellery,
[12:29:25] and I'm trying to run them in beeline now
[12:29:35] they used to work then, but they don't work now
[12:29:52] "ERROR : Failed with exception Unable to move source hdfs://analytics-hadoop/tmp/hive-staging_hive_2016-05-26_12-26-15_685_793302850668519493-4/-ext-10001 to destination hdfs://analytics-hadoop/user/hive/warehouse/ill.db/cross_wiki_navigation"
[12:31:27] madhuvishy: milimetric ^
[12:33:17] ... actually, it probably worked on hive, and trying beeline is the new thing
[12:33:38] no, fails on hive, too.
[12:46:59] hm, aharoni's problem looks like a permissions issue maybe
[12:47:36] Analytics-Kanban, DC-Ops, Operations, ops-eqiad: I/O issues for /dev/sdd on analytics1047.eqiad.wmnet - https://phabricator.wikimedia.org/T134056#2329742 (elukey)
[12:48:36] Analytics-Kanban, DC-Ops, Operations, ops-eqiad: I/O issues for /dev/sdd on analytics1047.eqiad.wmnet - https://phabricator.wikimedia.org/T134056#2253604 (elukey) a:Cmjohnson>elukey
[13:18:56] hi halfak_
[13:19:30] o/ aharoni
[13:19:32] (At hackathon. Might disappear suddenly)
[13:22:05] Wifi is bad too
[13:22:11] Analytics: hive / beeline query for interlanguage links fails - https://phabricator.wikimedia.org/T136295#2329877 (Amire80)
[13:22:28] Analytics: hive / beeline query for interlanguage links fails - https://phabricator.wikimedia.org/T136295#2329890 (Amire80)
[13:23:00] halfak_: would you, by any chance, have an idea how to fix the bug above? ^ https://phabricator.wikimedia.org/T136295
[13:24:06] the query seems to work. The error message makes me suspect that it's some kind of a permission or server configuration problem.
[13:25:38] ottomata: hi. can you maybe take a look at https://phabricator.wikimedia.org/T136295 ?
[13:29:32] hi leila
[13:29:37] [ I'm bugging everybody today ]
[13:29:39] hey aharoni.
[13:29:50] [haha. sure.]
[13:29:53] can you maybe take a look at https://phabricator.wikimedia.org/T136295 ?
[13:30:08] ottomata: o/
[13:30:14] just added https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Administration#Swapping_broken_disk
[13:30:26] far from complete but it might be useful in the future
[13:30:54] Analytics-Kanban, DC-Ops, Operations, ops-eqiad: I/O issues for /dev/sdd on analytics1047.eqiad.wmnet - https://phabricator.wikimedia.org/T134056#2329937 (elukey) Added some documentation in: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Administration#Swapping_broken_disk
[13:33:07] aharoni: I'm not sure what's going on. It's Andrew's territory as far as I can tell.
[13:33:40] ahhh internet....
[13:34:07] Andrew = ottomata1 , right?
[13:34:21] aharoni: Hi
[13:34:42] aharoni: Can you tell me your username on stat1002 please (for checking permissions and so on)
[13:34:42] hi joal
[13:35:08] joal: amire80
[13:35:49] ja hi, with you all shortly... having problems
[13:35:57] np ottomata1
[13:36:07] aharoni: currently checking stuff
[13:37:30] aharoni: my thoughts are that since the table already exists and was created by Ellery, you don't have permission to use it
[13:37:55] joal: maybe, but I thought that I dropped it and created a new one
[13:37:56] aharoni: I suggest creating your own db: CREATE DATABASE amire80
[13:38:10] Am I supposed to create a new one every time? That's what I understood, but I might be wrong.
[13:38:18] OH.
[13:38:20] Let me try.
[13:39:22] joal: running...
[13:39:41] aharoni: forgot the ';' at the end
[13:40:26] joal: I figured that out...
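The CREATE DATABASE suggestion above works because Hive stores a managed table's data under /user/hive/warehouse/<db>.db/, and ill.db's warehouse directory was created by (and is owned by) Ellery, hence the failed staging-to-destination move. A hedged sketch of the pattern; 'amire80' and 'cross_wiki_navigation' come from the conversation, but the columns are invented placeholders:

```bash
# Sketch only: run on stat1002; the table schema below is illustrative.
hive -e "
CREATE DATABASE IF NOT EXISTS amire80;
USE amire80;
CREATE TABLE cross_wiki_navigation (
  page_title  STRING,
  source_lang STRING,
  target_lang STRING
);
"
```

A table created this way lands under /user/hive/warehouse/amire80.db/, a directory the user owns, so the move from the hive-staging temp path succeeds.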
[13:40:36] huhu :D
[13:40:37] joal: but CREATE TABLE failed with the same error
[13:41:16] aharoni: definitely a permissions issue - let's wait for ottomata when he's back to double check you are in the correct group
[13:42:05] elukey: nice work with megacli!
[13:42:07] "Note for future readers: I had to create the "/var/lib/hadoop/data/hdfs" and "/var/lib/hadoop/yarn" directories"
[13:42:20] doesn't puppet create them?
[13:43:12] ottomata seems to be here
[13:43:14] :)
[13:43:16] mmmm now that I think about it I might have created it before running puppet
[13:43:41] ottomata: can you double check user amire80 is in the correct group to create hive tables?
[13:43:49] ottomata: please :)
[13:43:58] ottomata: ah yes I didn't run puppet before that because I wanted to check the hdfs daemeon
[13:44:01] *daemon
[13:44:10] aye cool
[13:44:12] checking
[13:44:45] joal: amire80 looks good
[13:44:49] looking at that ticket
[13:45:03] ottomata: he has permissions issues creating his own DB in hive :(
[13:45:15] ottomata: I'm gonna create the folder and change rights, then retest
[13:46:09] aharoni: Actually the database creation seems to have worked
[13:46:41] joal: yes, but after that something fails, and I don't see it when I run `show tables;`
[13:47:07] aharoni: have you changed the db name in the table creation query?
[13:47:42] joal: no
[13:47:44] am looking, things seem normal, although ill.db looks like it was created by ellery
[13:47:56] aharoni: classical mistake :)
[13:48:25] ottomata: looks like we should be good, sorry for having disturbed you ;)
[13:48:30] oook :)
[13:51:22] Analytics: hive / beeline query for interlanguage links fails - https://phabricator.wikimedia.org/T136295#2330065 (Amire80) Open>Resolved a:JAllemandou Resolved by creating my own database and creating the table in it. Thanks to @Ottomata and @JAllemandou!
[13:51:53] aharoni: this could be of interest to you: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/Queries
[13:53:15] joal: thanks. should it be hive or beeline? is there any difference?
[13:53:33] aharoni: beeline is better but a bit newer
[13:53:34] aharoni: there are differences, and you should go for beeline
[13:53:59] the only reason beeline might cause problems is when loading local jars
[13:54:06] joal: OK. and how do I exit the prompt?
[13:54:12] CTRL-D
[13:54:53] aharoni: CTRL+D is the usual termination shortcut for CLIs on linux
[13:55:25] joal: That's what I did, but I wondered whether there's a Real Command.
[13:55:47] hmm aharoni, I don't know
[13:56:29] Found it: !quit
[13:57:16] And actually aharoni: ! works on beeline (http://sqlline.sourceforge.net/)
[14:00:36] elukey, ottomata: Any idea on what happens on misc? Still a lot of errors
[14:01:30] Also, ottomata elukey: Shall I manually mark misc-2016-05-25T19:00Z as ok to be processed?
[14:02:46] joal lemme ask in #traffic
[14:02:51] sure ottomata
[14:03:24] yall already talked about the loss yesterday? 5.7%?
[14:03:41] milimetric: that's exactly the topic :)
[14:04:28] oh, right, I got lost in the other thread, so we don't know
[14:05:00] milimetric: ottomata is asking in traffic, double checking those warnings/errors are normal
[14:05:29] what in the world is #traffic?!
[14:05:52] wikimedia-traffic :)
[14:05:55] milimetric: on the other hand, I just realized that merging the load and refine jobs prevents me from rerunning that and forcing it - I need to modify the oozie code!
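For reference, the beeline commands joal and aharoni land on above are sqlline built-ins; a minimal session might look like the sketch below. The JDBC URL is an assumption, check the wikitech page linked at 13:51:53 for the real connection details:

```bash
# Placeholder JDBC URL; the actual host/port of the analytics Hive server may differ.
beeline -u jdbc:hive2://localhost:10000 -n amire80

# Inside the beeline prompt, the '!' commands come from sqlline:
#   !tables   -- list tables in the current database
#   !help     -- list all available '!' commands
#   !quit     -- the "Real Command" to exit (Ctrl-D also works)
```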
[14:06:26] milimetric: ops spin off for Varnish/Traffic related discussions
[14:06:36] oh wow, I had no idea, thanks elukey
[14:06:41] did it start recently?
[14:07:00] more or less when ema joined, so January/February
[14:07:20] cool, thx
[14:09:00] woah... can someone else check out store.wikipedia.org?
[14:09:19] it looks like an attack where someone hijacks my session and sends me to random places
[14:09:34] asking to get my location, to install add-ons, etc.
[14:09:42] milimetric: works for me
[14:09:45] wondering if it's just my computer or everyone
[14:09:55] woah working for me too now!
[14:10:00] ... hmmmm
[14:10:25] oh!! it was a typo!! store.wikipeda.org
[14:10:43] srsly?
[14:10:53] they're typo-squatting us!!
[14:10:54] ew
[14:11:06] Analytics-Kanban: Add a way to force oozie load job to run even if unsuccesful dataloss check. - https://phabricator.wikimedia.org/T136308#2330129 (JAllemandou)
[14:11:10] what the hell..
[14:11:11] hahahahh
[14:11:12] I thought that went out of fashion
[14:11:36] is this the kind of thing we tell ops about?
[14:11:47] milimetric: it'll never be out of fashion unfortunately :(
[14:11:51] like, do we actively fight this by buying the domains?
[14:12:05] I'll mention it just in case
[14:12:18] maybe with security
[14:12:24] reporting just in case
[14:15:01] * joal is AFK for a few minutes
[14:25:54] milimetric: joal, can't remember what we said for disk layout for the druid hosts
[14:26:01] raid 10 across all 4 ssds?
[14:26:47] that's what we said yea
[14:27:15] k
[14:27:17] disk seems unimportant in druid for two reasons
[14:27:25] 1. if we can't memory map it it's no good anyway
[14:27:32] 2. we keep backups of this in hdfs anyway
[14:27:53] so as far as I'm concerned, we can do whatever for disk for now until we get more experience
[14:29:44] mforns: you around?
[14:30:16] joal: after a bunch of TSV / JSON craziness yesterday I'm convinced we should just use sqoop
[14:30:28] you were pretty against it so do you wanna chat?
[14:32:57] milimetric, hi!
[14:33:26] hey mforns wondering if you thought more about sqoop since yesterday
[14:33:32] milimetric, I'm testing the json serde
[14:33:38] k
[14:33:41] not yet started with swoop
[14:33:49] *sqoop
[14:39:53] k, I'll do that now then
[14:40:02] Analytics-Kanban, RESTBase, Services, RESTBase-API, User-mobrovac: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2330259 (Nuria) @mobrovac I think last burst has UA as ruby, we know some popular tools on labs cause bursts too but there are many automated scr...
[14:43:46] milimetric, ok
[14:47:36] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: More solid Eventlogging alarms for raw/validated {oryx} [8 pts] - https://phabricator.wikimedia.org/T116035#2330292 (ilevy) Resolved>Open It looks like a local fork, check_graphite_until_temp, is still in use....
[14:48:26] ottomata, joal: can you point me to the oozie snippet that calculates the loss alarm?
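The "oozie snippet that calculates the loss alarm" is linked in the next messages; conceptually, it compares varnishkafka's increasing per-host sequence numbers against row counts. A hedged sketch of that idea (not the actual refinery HQL) against the real wmf_raw.webrequest table, using the misc hour under discussion:

```bash
# Hedged sketch of the dataloss check. Each varnishkafka producer stamps
# requests with an increasing per-host sequence number, so max-min+1 is
# how many requests *should* have arrived for that host in the hour.
hive -e "
SELECT
  hostname,
  MAX(sequence) - MIN(sequence) + 1              AS expected_count,
  COUNT(*)                                       AS actual_count,
  (MAX(sequence) - MIN(sequence) + 1) - COUNT(*) AS missing_count
FROM wmf_raw.webrequest
WHERE webrequest_source = 'misc'
  AND year = 2016 AND month = 5 AND day = 25 AND hour = 19
GROUP BY hostname;
"
```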
[14:48:34] Just want to add all the info in the task
[14:49:26] mmm maybe grepping webrequest-load-check_sequence_statistics in refinery
[14:50:25] elukey:
[14:50:25] https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/generate_sequence_statistics.hql
[14:50:27] and then also
[14:50:27] https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/generate_sequence_statistics_hourly.hql
[14:51:01] thanks :)
[14:51:15] :)
[14:51:53] ottomata: I've got a permissions riddle revolving around stat1002,3,4 and /etc/mysql/conf.d/research-client.cnf
[14:52:07] haha
[14:52:10] yeah?
[14:52:49] sorry, AC went weird
[14:52:50] milimetric:
[14:52:50] ls -l /etc/mysql/conf.d/
[14:52:52] can help
[14:52:52] :)
[14:53:00] what's the riddle?
[14:53:08] ottomata: ok, so on 1003 everything works, file's there and I have access
[14:53:14] on 1002 it's there but I have no access
[14:53:18] on 1004 it's not there
[14:53:40] right
[14:53:41] ideally it'd be there on 1002 or 1004 because I'd like to sqoop from the dbs it gives me access to
[14:53:50] also, on 1002 I have no access to my own home dir
[14:53:53] even though it says I do
[14:53:59] milimetric: on 1002 you can use
[14:54:03] analytics-research-client.cnf
[14:54:07] on 1004, it doesn't exist
[14:54:14] but it could!
[14:54:18] the analytics-research-client.cnf one
[14:55:12] aha, gotcha
[14:55:26] I'll try on 1002, ok last problem:
[14:55:37] on 1002, I can go to my home directory and then:
[14:55:40] touch ~/something
[14:55:42] but if I try:
[14:55:46] touch ~/test.tsv
[14:55:47] it fails
[14:55:51] haha
[14:55:55] that is weird
[14:56:00] not so if I do touch ~/test2.tsv
[14:56:02] that's fine
[14:56:11] haha, does test.tsv exist?
[14:56:15] with some weird perms?
[14:56:15] I thought my home dir was broken but it's just that one file name!
[14:56:16] no, it doesn't
[14:56:27] or if it does someone else made it and I can't even see it in ls
[14:56:35] ja it does
[14:56:36] lrwxrwxrwx 1 milimetric wikidev 24 Nov 20 2015 test.tsv -> /home/joal/json_test.csv
[14:57:25] oh! it's a link
[14:57:43] want me to remove it?
[14:57:57] hm those don't list when just doing "ls"... weird
[14:58:04] they should, that is weird
[14:58:30] haa, uh i don't see any of your test. files!
[14:58:32] in just ls
[14:58:47] oh you just removed them?
[14:58:48] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2330347 (elukey) p:Triage>High
[14:58:57] ottomata --^
[14:58:58] I just removed them
[14:59:04] yeah, I tried rm test.tsv and it didn't work
[14:59:11] but then I tried rm test* and it removed the link too
[14:59:22] it's settled, I shall never understand the command line
[14:59:27] thx ottomata
[15:02:55] :)
[15:04:35] Quarry: puppet disabled on quarry-main-01 - https://phabricator.wikimedia.org/T136315#2330367 (valhallasw)
[15:06:24] elukey: I think I have found a hack to avoid having to change the oozie code for rerunning things
[15:06:43] milimetric: yes, let's talk around sqoop :)
[15:07:06] joal: you can call it "elegant workaround" to make it more appealing :P
[15:07:21] joal: k, cave
[15:07:22] ottomata: any news on data loss?
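The test.tsv riddle above is standard symlink behavior: touch dereferences the link and tries to create its target, which fails when the target directory isn't writable by you, while plain ls shows the link name with no hint that it is one. A sketch that reproduces it, with an invented target path:

```bash
# A symlink pointing at a file you cannot create (placeholder target path):
ln -s /home/someoneelse/json_test.csv ~/test.tsv

touch ~/test.tsv   # fails: touch follows the link and tries to create the
                   # target in a directory you have no write access to
ls -l ~/test.tsv   # the '->' arrow is what finally reveals it's a symlink
rm ~/test.tsv      # operates on the link itself, not the target
```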
[15:07:26] OMW milimetric
[15:07:37] joal: I opened a phab task, --^
[15:07:55] elukey: did that as well :)
[15:08:02] ah snap
[15:08:20] mine is https://phabricator.wikimedia.org/T136314
[15:08:41] elukey: https://phabricator.wikimedia.org/T136308
[15:09:49] ahhh that's related
[15:09:57] okok non-overlapping tasks
[15:43:08] ottomata: do you know how ellery gets sqoop to work despite the problem that hadoop nodes don't have access to connect to the database?
[15:44:02] milimetric: they don't?
[15:44:09] i think they do, no? i think we poked a hole in the firewall for that
[15:44:32] hm... i get Access denied for user 'research'@'10.64.5.102'
[15:44:38] I assumed that IP was a node
[15:44:45] oh wait, access denied means it did connect
[15:44:55] so does he use a different user or something?
[15:45:16] *that IP was a hadoop node I mean
[15:46:17] hm
[15:46:18] that i'm not sure
[15:55:02] ottomata: sorry, the --password-file option in sqoop was reading my password incorrectly
[15:55:12] false alarm about the access / connection problem
[15:55:18] cool
[16:01:32] ottomata: the script yesterday was working fine - those misnamed files were just sticking around from some small bug before. the reason it wasn't adding the jars was that those jars already exist (i'm using versions that already exist in refinery to test)
[16:03:43] ah! :)
[16:29:03] Analytics-Kanban: Add a way to force oozie load job to run even if unsuccesful dataloss check. - https://phabricator.wikimedia.org/T136308#2330808 (JAllemandou) Open>Invalid
[16:36:51] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2330849 (ema) When are we seeing those inconsistencies? Any specific timeframes?
[16:39:01] Analytics-Kanban: Extract edit oriented data from MySQL for small wiki - https://phabricator.wikimedia.org/T134790#2330852 (Milimetric) a:Milimetric
[16:39:16] Analytics: Get pageview dataset ready to be loaded to druid to test querying capabilities - https://phabricator.wikimedia.org/T136330#2330856 (Nuria)
[16:40:50] Analytics: Get pageview dataset ready to be loaded to druid to test querying capabilities - https://phabricator.wikimedia.org/T136330#2330873 (Nuria) 1. start with 1 day of data and test query capabilities 2. test a month later 3. test 3 months of data This is contingent on having druid on puppet and worki...
[16:45:12] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2330888 (elukey)
[16:45:45] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2330335 (elukey) >>! In T136314#2330849, @ema wrote: > When are we seeing those inconsistencies? Any specific timeframes? Need to query Hive and Hadoop...
[16:46:58] Analytics: Enable pivot ui so non analytics engineers can query druid pageview data (poc) - https://phabricator.wikimedia.org/T136331#2330895 (Nuria)
[16:48:24] Analytics: Get pageview dataset ready to be loaded to druid to test querying capabilities - https://phabricator.wikimedia.org/T136330#2330856 (Nuria) Format? Json or tsv
[16:53:43] Analytics-Kanban: Get pageview dataset ready to be loaded to druid to test querying capabilities - https://phabricator.wikimedia.org/T136330#2330856 (Nuria)
[17:04:45] Analytics: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2330988 (Nuria)
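The --password-file false alarm at 15:55 is consistent with a well-known sqoop gotcha, offered here as an assumption rather than a diagnosis: sqoop uses the file's contents verbatim, so a trailing newline (which most editors and a plain echo add) becomes part of the password and MySQL answers "Access denied". A sketch of the safe pattern, with placeholder host, database, table, and paths:

```bash
# Write the password with no trailing newline, and lock the file down.
printf '%s' 'the-password' > ~/.mysql-pass   # placeholder path
chmod 400 ~/.mysql-pass

# Placeholder connect string and table; the key detail is a newline-free file.
sqoop import \
  --connect jdbc:mysql://db-host.example/simplewiki \
  --username research \
  --password-file file:///home/milimetric/.mysql-pass \
  --table archive \
  --target-dir /user/milimetric/simplewiki_archive
```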
[17:06:50] Analytics: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2330992 (Nuria) 1. Try to find users of files: - e-mail analytics@ - see users on unixgroup that has access to files and e-mail them individually - communicate with research-int...
[17:09:20] Analytics-Kanban: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2144772 (Nuria)
[17:09:29] Quarry: puppet disabled on quarry-main-01 - https://phabricator.wikimedia.org/T136315#2330998 (yuvipanda) Yes, when it restarted it came up trying to use python3 instead of 2, and I disabled it and hand hacked to fix it. Must fix puppet...
[17:19:31] Analytics, DBA: dbstore1002 crashed - https://phabricator.wikimedia.org/T136333#2331074 (jcrespo)
[17:26:09] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2331118 (Nuria) 1. On labs or perhaps prod: Generate lots of requests and a sighup and see if all request ids are present, try to find a repro for dropping...
[17:28:33] Analytics-Kanban, Operations, Traffic: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2331150 (Ottomata) https://gist.github.com/ottomata/7048012
[17:30:13] Analytics-Kanban: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2331163 (Nuria) a:madhuvishy
[17:39:27] Analytics-Kanban, EventBus, Patch-For-Review: Propose evolution of Mediawiki EventBus schemas to match needed data for Analytics need - https://phabricator.wikimedia.org/T134502#2331201 (Ottomata) @aaron can we find some mins to discuss above? I'm not sure if you are the right person, but if not you...
[17:44:39] logging off a-team!
[17:44:41] byeeee
[17:47:24] laters!
[18:09:44] Analytics-Kanban, RESTBase, Services, RESTBase-API, User-mobrovac: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2331321 (GWicke) Let's also document these limits for each API end point by adding a list item along `Stability`.
[18:27:08] Analytics, DBA: dbstore1002 crashed - https://phabricator.wikimedia.org/T136333#2331371 (jcrespo) The slave was not using GTIDs, which meant that at least x1 got desynced. S1, however, was stopped. Skipping a single delete on x1 seemed to do the work.
[18:27:44] Analytics, DBA: dbstore1002 crashed - https://phabricator.wikimedia.org/T136333#2331373 (jcrespo) p:Triage>Low Low because issues are no longer ongoing.
[18:46:44] mforns: I'm back working on this export, didn't get a chance to do it during metrics because the db crashed
[18:46:47] (jaime's message)
[18:47:00] mforns: it does look like at least some of the fields are weird: select * from milimetric.simplewiki_archive limit 10;
[18:47:14] those are deleted pages though so maybe they were deleted because they were nonse
[18:47:17] *nonsense
[18:47:21] milimetric, I see
[18:47:26] or maybe they just didn't get exported / imported ok
[18:47:28] let me execute
[18:47:40] yep, I'll keep doing the other tables
[18:48:36] milimetric, OK, I still have a lot of work with the queries, also started 20 mins ago
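Nuria's first repro idea in T136314 above (generate lots of requests, send a SIGHUP, check that all request ids are present) might be scripted roughly as follows on a labs instance. Every concrete detail here is an assumption about the test setup: the URL, the broker, the topic name, and the use of kafkacat/jq:

```bash
# Generate steady traffic against a test varnish instance (placeholder URL)
for i in $(seq 1 100000); do
  curl -s -o /dev/null http://localhost:80/test-page
done &

# Mid-stream, poke varnishkafka the way log rotation would
sudo pkill -HUP varnishkafka

# Afterwards, scan the topic for gaps in the sequence numbers
# (broker and topic name are placeholders for the test setup)
kafkacat -C -b localhost:9092 -t webrequest_misc -e \
  | jq -r .sequence | sort -n \
  | awk 'NR > 1 && $1 != prev + 1 { print "gap after", prev } { prev = $1 }'
```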
[18:48:36] milimetric: hmm, have you checked the db crash was unrelated to the export?
[18:48:43] yes
[18:48:49] jaime said it was fine
[18:48:49] ok :)
[18:48:58] but that queries will be slow from now on
[18:48:58] just double checking :)
[18:49:08] he was right, there was half as much data in archive as logging, but it took twice as long
[18:49:31] np mforns, I'll help you when I see how long these jobs are taking
[19:12:07] milimetric: where is the advice of "limit of 500 reqs per second" for users posted in our docs?
[19:12:28] milimetric: i updated our internal docs but cannot find this info on the external ones
[19:12:29] nuria_: looking
[19:12:35] milimetric: internal docs: https://wikitech.wikimedia.org/wiki/Analytics/AQS#2016-05-26 cc elukey
[19:15:17] aha, nuria_ it's a more general thing: https://wikimedia.org/api/rest_v1/?doc
[19:15:23] it's right in the intro text there
[19:15:24] k
[19:15:29] so that's nothing to do with us
[19:15:51] right, i will document 429 on our wikitech docs though
[19:16:09] milimetric: will send links to list with docs updated in a bit
[19:16:26] sweet
[19:16:29] thx!
[20:04:59] Analytics, Analytics-Wikistats, Internet-Archive: Total page view numbers on Wikistats do not match new page view definition - https://phabricator.wikimedia.org/T126579#2331761 (ezachte) Ah good to see there is a white list. > differences are between 1 and 4%, correct? Actually 1/10 of that, the ave...
[20:17:42] milimetric: btw, I am prepping for puppetizing druid boxes in prod
[20:17:53] gotta get some deb crap straight, but ja
[20:18:06] ottomata: thanks, I think joseph will get that pageview data tomorrow so no worries
[20:18:18] we're in the middle of some insane edit data juggling :)
[20:18:20] it's kinda fun
[20:19:05] :)
[20:19:09] milimetric: I'll ask mforns tomorrow, I want to know more on data juggling :)
[20:19:33] it's so crazy dude, I can't tell you now, you'll dream about it
[20:19:34] ottomata: I have a script to generate data
[20:19:37] tty tomorrow :)
[20:19:42] indeed
[20:19:52] :]
[20:20:22] ottomata: how should I format the druid timestamp? timestamp in seconds, in millis, other format?
[20:20:36] ISO8601! :)
[20:21:07] 2015-09-01T00:00:00Z
[20:21:11] i think druid wants them that way anyway
[20:21:14] or likes them that way
[20:21:32] ok ottomata, will do :)
[20:22:11] ottomata: one last thing: can I have fields in the json that are not used (I assume yes, but prefer to check)
[20:22:44] uhm, i don't know, that are not used in the indexing task?
[20:22:45] yes
[20:22:46] pretty sure
[20:22:50] but, um
[20:22:51] actually, i don't know
[20:23:15] ottomata: I need a date (2016-05-25) format to split data daily
[20:23:29] I'm pretty sure as well I can have that and not use it
[20:24:21] aye probably
[20:26:52] ottomata: I'll have daily and monthly data tomorrow :)
[20:27:37] i might have some problems getting the hadoop deps into jessie in prod
[20:27:49] not sure how to do it properly... and my european helpers are all offline :/
[20:28:08] hm, I can't really help I guess
[20:28:26] if not tomorrow, then next week :)
[20:49:03] joal: wanted to ask you - when we do the next refinery deployment, can we dry run the jenkins jobs?
[20:49:09] actually
[20:49:14] not dry run
[20:49:24] for rea
[20:49:26] madhuvishy: For sure!
[20:49:26] real*
[20:49:41] Probably a dry-run first, then a real one :)
[20:49:51] joal: ha ha yeah that's possible
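ottomata's answers above match Druid's documented behavior: the timestampSpec accepts format "iso" for ISO8601 timestamps, and input JSON fields not named in the spec are ignored at indexing time, so the extra date field joal wants for daily splits is harmless. A hedged fragment of what the parse spec might contain; "dt" and the dimension names are invented, not joal's actual schema:

```bash
# Fragment of a Druid batch-ingestion parse spec, written to a file for illustration.
cat > parse-spec.json <<'EOF'
{
  "format": "json",
  "timestampSpec": { "column": "dt", "format": "iso" },
  "dimensionsSpec": {
    "dimensions": ["project", "access", "agent"]
  }
}
EOF
# Extra JSON fields (e.g. a plain "date": "2016-05-25" used only to split
# files) are ignored unless listed under dimensions.
```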
[20:50:26] ottomata: do the new archiva creds work, or are you still working on it?
[20:50:49] they should work
[20:50:56] ok cool
[20:51:38] ottomata: hdfs dfs -text /user/joal/pv_druid/2016-01-01/part-00031.gz | less
[20:51:59] ottomata: then we still need https://gerrit.wikimedia.org/r/#/c/290639/8 to be merged, and https://phabricator.wikimedia.org/T136221 to be set up before doing the test run. No hurry etc
[21:05:19] hue is really slow
[21:05:28] i don't mean the underlying queries it performs -- just the web interface
[21:27:59] :) it's much much faster than it used to be
[21:28:05] but yeah, that's what you get with python
[21:28:06] :P
[23:21:08] Analytics, Pageviews-API, RESTBase-API: AQS: query multiple articles at the same time - https://phabricator.wikimedia.org/T118508#1802156 (GWicke) As with large & random time ranges, querying many titles in a single request makes individual requests more expensive & hard to effectively rate-limit, an...
[23:52:32] milimetric: y still there?
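One note on the hdfs dfs -text one-liner madhuvishy pasted above: unlike -cat, -text recognizes common compression codecs and decompresses on the fly, which is why it can read a .gz part file directly:

```bash
# -text decompresses known codecs (gzip, sequence files, ...) automatically:
hdfs dfs -text /user/joal/pv_druid/2016-01-01/part-00031.gz | head -5

# The -cat equivalent needs an explicit decompression step:
hdfs dfs -cat /user/joal/pv_druid/2016-01-01/part-00031.gz | gunzip | head -5
```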