[09:37:16] <wikibugs>	 Analytics: Pagecount-raw files missing since 27th at 22.00 - https://phabricator.wikimedia.org/T113931#1680156 (DianaArq) NEW
[09:57:40] <wikibugs>	 Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1680207 (Aklapper) >>! In T60585#1677452, @Simmimourya3107 wrote: > Is there any chance that this project gets proposed for...
[10:00:46] <wikibugs>	 Analytics-Engineering, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1680216 (Aklapper) @Analytics: Who should be the assignee of this task?  >>! In T113695#1676786, @Tgr wrote: > so that's probably not where the...
[10:25:19] <mforns>	 hi a-team!
[10:41:27] <bmansurov>	 mforns, hi
[10:41:33] <mforns>	 hi bmansurov!
[10:41:44] <bmansurov>	 mforns, do you know where i can get the list of top 1000 pages for all wikis at a certain date?
[10:41:59] <bmansurov>	 top in terms of the number of page views
[10:42:04] <mforns>	 bmansurov, aha
[10:42:35] <mforns>	 we are currenlty working on an API that will serve this kind of data
[10:42:50] <bmansurov>	 mforns, i see
[10:42:51] <mforns>	 it is a goal for this quarter actually
[10:43:20] <mforns>	 see: https://phabricator.wikimedia.org/T101792
[10:43:31] <bmansurov>	 mforns, I'm working on https://phabricator.wikimedia.org/T113236 and was wondering if that's possible using hive?
[10:43:39] <mforns>	 all tasks marked with {slug} in our kanban belong to this project
[10:43:48] <bmansurov>	 i see
[10:44:16] <bmansurov>	 mforns, when I run a query all I see is the Main_Page in different languages :))
[10:44:34] <mforns>	 aha
[10:44:51] <mforns>	 I guess hive is an option for one-off queries, yes
[10:45:30] <bmansurov>	 cool, thanks
[10:45:34] <mforns>	 seeing the main page in different languages makes sense
[10:45:46] <mforns>	 there was this service called stats.grok.se
[10:46:19] <mforns>	 it's still on air, but I ignore the update status of it
[10:46:35] <mforns>	 it probably has data for past months, but I think it will be outdated
[10:46:38] <bmansurov>	 looks cool
[10:47:14] <mforns>	 bmansurov, yes it looks outdated for 1+ year
[10:47:53] <bmansurov>	 i see
[10:47:53] <mforns>	 I would try to filter out the main page in your hive query, if this is easy?
[10:49:00] <mforns>	 or wait maybe 1 or 2 weeks (?) until our pageview API is released?
[10:49:04] <bmansurov>	 that's a great idea. i may even get the top 1000 pages per wiki and then combine the results
[10:49:30] <bmansurov>	 mforns, i'll certainly use the API in the future, but this task is time sensitive
[10:49:31] <mforns>	 yea, makes sense to me
[10:49:41] <mforns>	 aha, I understand
[11:09:04] <krrrit-wm>	 (PS1) Mforns: Fix inconsistencies when using --all-projects [analytics/aggregator] - https://gerrit.wikimedia.org/r/241620 (https://phabricator.wikimedia.org/T106554)
[11:09:39] <mforns>	 joal, hi!
[11:22:16] <wikibugs>	 Analytics-Engineering, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1680403 (JanZerebecki) Oh I missed that stats.wikimedia.org is served by stat1001 (according to the misc varnish config), so there is a 3rd hos...
[11:32:32] <wikibugs>	 Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Python Aggregator: Solve inconsistencies in data ranges when using --all-projects flag {musk} [5 pts] - https://phabricator.wikimedia.org/T106554#1680456 (mforns)
[11:37:48] <mforns>	 joal, yt?
[12:01:18] <wikibugs>	 Analytics-Engineering, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1680541 (Aklapper)
[12:01:27] <joal>	 Hi mforns
[12:02:07] <joal>	 I was travelling this morning, so not online
[12:03:19] <joal>	 Wassup ?
[12:09:34] <mforns>	 hi joal
[12:09:51] <mforns>	 do you have 5 mins for batcave?
[12:13:52] <joal>	 sure mfi
[12:13:54] <joal>	 sure mforns
[12:14:01] <mforns>	 :] owm
[12:33:26] <krrrit-wm>	 (PS3) Joal: Add oozie email sending subworkflow wrapper [analytics/refinery] - https://gerrit.wikimedia.org/r/240094 (https://phabricator.wikimedia.org/T113253)
[12:37:11] <krrrit-wm>	 (PS3) Joal: Add email sending on error in webrequest-load [analytics/refinery] - https://gerrit.wikimedia.org/r/240095 (https://phabricator.wikimedia.org/T113253)
[12:42:53] <krrrit-wm>	 (PS4) Joal: Add email sending on error in webrequest-load [analytics/refinery] - https://gerrit.wikimedia.org/r/240095 (https://phabricator.wikimedia.org/T113253)
[12:47:22] <krrrit-wm>	 (PS3) Joal: Add pageview quality check to pageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/240099 (https://phabricator.wikimedia.org/T109739)
[12:59:45] <wikibugs>	 Analytics-Engineering, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1680652 (ezachte) Some info on these Wikistats issues:  1 Wikistats runs on two servers  stat1002 for monthly reports, using stub dumps stat100...
[13:03:47] <joal>	 !log Errors on cluster, dome refine jobs have failed, investigating.
[13:44:33] <wikibugs>	 Analytics-Engineering, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1680851 (JanZerebecki) > So my understanding is those gerrit tasks are obsolete.  At least one I tested is not obsolete. https://gerrit.wikimed...
[13:50:58] <wikibugs>	 Analytics-Cluster, Datasets-Archiving, Datasets-Webstatscollector: Mediacounts stalled after 2015-09-24 - https://phabricator.wikimedia.org/T113956#1680877 (Hydriz) NEW a:Ottomata
[14:13:14] <nuria>	 holaa
[14:18:01] <nuria>	 joal,yt?
[14:28:16] <joal>	 hey nuria
[14:28:47] <nuria>	 one question, the sequence number in teh webrequest table comes from kafka host right?
[14:28:53] <nuria>	 *the
[14:29:06] <joal>	 sequence number is assigned by varnish-kafka I think
[14:29:22] <joal>	 Well in fact I am pretty sure :)
[14:29:48] <nuria>	 joal, k
[14:31:42] <wikibugs>	 Analytics, Varnish: Connect Hadoop records of the same request coming via different channels - https://phabricator.wikimedia.org/T113817#1680955 (Nuria) > Analytics/Data/Webrequest mentions a sequence field but doesn't say what that is.  This is the sequence number per host assigned by varnishkafka (have...
[14:34:29] <nuria>	 joal, do you still need CRs on the oozie e-mail code or did ottomata gave the go?
[14:35:29] <joal>	 I have a +1 on https://gerrit.wikimedia.org/r/#/c/240094/
[14:35:34] <joal>	 The next one is +2
[14:35:42] <joal>	 Then there is the whitelist one :)
[14:35:45] <joal>	 nuria: --^
[14:38:05] <nuria>	 joal: ok, so all those are ready to go then?
[14:38:45] <joal>	 I'd like to have some time to better test them, but didn't find it yet
[14:39:35] <nuria>	 ok, let's go back to those once they are tested then , right?
[14:40:00] <nuria>	 joaL; cause we alredy looked at methodology and such and it all looked good
[14:40:06] <nuria>	 sorry joal
[14:40:13] <nuria>	 *already
[14:40:15] <joal>	 np ;)
[14:40:17] <joal>	 yup
[14:40:42] <joal>	 for emails: I have not properly tested, so i'd like to do it
[14:41:20] <joal>	 for whitelist: same
[14:41:30] <joal>	 currently I am trying to have the cluster back in track
[14:48:34] <nuria>	 joal:ok, if i can help let me know
[14:48:48] <nuria>	 joal: either with cluster or testing some of your changes
[14:49:03] <joal>	 thx nuria, not easily shared work :)
[14:50:34] <joal>	 oh, testing would be awesome :)
[14:50:54] <wikibugs>	 Analytics-Cluster, Analytics-Kanban: Improve daily webrequest partition report {hawk} [5 pts] - https://phabricator.wikimedia.org/T113255#1681020 (mforns) a:mforns
[14:52:37] <joal>	 nuria: If you can try to send an email using the subworkflow I have created that would be awesome !
[14:55:37] <joal>	 nuria: actually, I can also do it, backfilling started :)
[14:56:12] <nuria>	 joal: ok, will try after standup
[14:56:13] <joal>	 !log backfilling various load jobs having failed at earlier stages than check_sequence_statistics
[14:57:36] <krrrit-wm>	 (CR) Nuria: "Looks good, thanks for writing a test." [analytics/aggregator] - https://gerrit.wikimedia.org/r/241620 (https://phabricator.wikimedia.org/T106554) (owner: Mforns)
[14:58:01] <krrrit-wm>	 (CR) Nuria: [C: 2 V: 2] Fix inconsistencies when using --all-projects [analytics/aggregator] - https://gerrit.wikimedia.org/r/241620 (https://phabricator.wikimedia.org/T106554) (owner: Mforns)
[14:58:08] <mforns>	 nuria, thx!
[14:59:15] <joal>	 thx mforns for having fixed that !
[14:59:28] <mforns>	 tech debt
[15:04:34] <joal>	 mforns: yup !
[15:14:37] <wikibugs>	 Analytics-Cluster, Analytics-Kanban: Create Kafka deployment checklist on wikitech {hawk} [5 pts] - https://phabricator.wikimedia.org/T111408#1681066 (kevinator) Open>Resolved
[15:18:43] <joal>	 ottomata: have we changed any conf on the cluster recently ?
[15:19:04] <joal>	 Some jobs are failing mysteriously, with an obscure exception
[15:19:48] <ottomata>	 not that I know of, in interview though...
[15:20:05] <joal>	 k let's talk after standup and other meetings
[15:23:59] <wikibugs>	 Analytics-Backlog, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1681107 (kevinator)
[15:45:02] <joal>	 hey ottomata, are you in the office (SF) ?
[16:00:26] <kevinator>	 ottomata: are you in the office right now?
[16:02:21] <ottomata>	 no, in interview!
[16:03:34] <kevinator>	 when does  it end?
[16:06:29] <ottomata>	 5 mins ago
[16:06:31] <ottomata>	 joinnig the thing
[16:11:14] <bearloga>	 Good morning! I have a really annoying Varnish problem. Every time I run a query in hive it outputs a lot of garbage (INFO and WARNING messages about parquet.hadoop.ParquetRecordReader and parquet.hadoop.InternalParquetRecordReader), followed by the actual results of the query, sometimes followed by some more of that same garbage. If I'm writing the output to a file, the
[16:11:14] <bearloga>	 results of the query AND the garbage both get written out. This doesn't seem to be a problem for Ironholds, only me. Please help :)
[16:12:17] <joal>	 hey bearloga, you can try hive -S -e "SELECT .... " > result_file.tsv
[16:12:32] <joal>	 -S usually silent hive logs
[16:13:15] <bearloga>	 thanks, will give that a try now
[16:14:43] <bearloga>	 joal: nope, still writes the messages out :\
[16:16:14] <joal>	 bearloga: can you detail the requst ?
[16:24:56] <bearloga>	 joal: naturally, as soon as I reach out to someone then the problem goes away. here's how some of the messages looked like:
[16:24:59] <bearloga>	 Sep 28, 2015 4:13:38 PM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
[16:24:59] <bearloga>	 Sep 28, 2015 4:13:38 PM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1984 records.
[16:24:59] <bearloga>	 Sep 28, 2015 4:13:38 PM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
[16:24:59] <bearloga>	 Sep 28, 2015 4:13:38 PM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 46 ms. row count = 1984
[16:25:11] <joal>	 We know thoses logs
[16:25:31] <joal>	 They happen regularly, and we have not yet managed to silent them :(
[16:26:28] <bearloga>	 joal: ah, okay, thanks.
[16:26:47] <joal>	 I thought hive -S would have worked, but sems not
[16:34:29] <nuria>	 joal: coming to tasking?
[16:34:41] <joal>	 Yes !
[16:38:25] <joal>	 bearloga: you can try beeline, I think looging is better in there
[16:38:36] <joal>	 bearloga: https://gist.github.com/jobar/8c8f9691996bd33ee4d9
[16:41:03] <bearloga>	 joal: I'll check that out, thanks!
[16:48:00] <wikibugs>	 Analytics-Backlog: Prepare data for Quarterly Review - https://phabricator.wikimedia.org/T113196#1681552 (kevinator)
[16:51:01] <wikibugs>	 Analytics-Backlog: Prepare data for Quarterly Review - https://phabricator.wikimedia.org/T113196#1681578 (kevinator)
[16:53:32] <wikibugs>	 Analytics-Backlog: Prepare data for Quarterly Review - https://phabricator.wikimedia.org/T113196#1681592 (kevinator)
[16:53:51] <wikibugs>	 Analytics-Backlog: Prepare data for Quarterly Review [13 pts] - https://phabricator.wikimedia.org/T113196#1681594 (kevinator)
[16:54:33] <wikibugs>	 Analytics-Backlog, Analytics-Cluster: Improve daily webrequest partition report {hawk} [5 pts] - https://phabricator.wikimedia.org/T113255#1681596 (mforns) a:mforns>None
[16:55:09] <wikibugs>	 Analytics-Kanban: Prepare data for Quarterly Review [13 pts] - https://phabricator.wikimedia.org/T113196#1681603 (mforns) a:mforns
[17:05:02] <wikibugs>	 Analytics, Analytics-Backlog, Analytics-Cluster: Setup pipeline for search logs to travel through kafka and camus into hadoop {hawk} [21 pts] - https://phabricator.wikimedia.org/T113521#1681666 (kevinator)
[17:15:42] <wikibugs>	 Analytics-Backlog: Spike: Found out what files does erik uses as feed to his definition - https://phabricator.wikimedia.org/T113981#1681720 (Nuria) NEW
[17:18:46] <wikibugs>	 Analytics-Backlog: Spike: Found out what files does erik uses as feed to his definition - https://phabricator.wikimedia.org/T113981#1681747 (Nuria) Note that erik's process consumes raw data for user agent and localization and its data source is not preagreggated so we likely need to create a dataset with our...
[17:20:27] <wikibugs>	 Analytics-Backlog: Spike: Found out what dump files does erik uses as feed to his definition - https://phabricator.wikimedia.org/T113981#1681774 (Nuria)
[17:21:12] <wikibugs>	 Analytics-Backlog: Spike: Found out what dump file format does erik uses as feed to his definition - https://phabricator.wikimedia.org/T113981#1681720 (Nuria)
[17:21:13] <wikibugs>	 Analytics-Backlog, Research management, Research-and-Data: Pipeline for data-intensive applications from research to productization to integration - https://phabricator.wikimedia.org/T105815#1681780 (Milimetric) (to make sure this is explicit: we moved this to our radar as it looks like something resea...
[17:25:36] <krrrit-wm>	 (PS1) Christopher Johnson (WMDE): adds aggregate data URI sources sets up prelim charts for latest stats adds developer tab and data table for getClaims usage [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/241697
[17:27:28] <wikibugs>	 Analytics-Backlog: Enable use of Python 3 in Spark - https://phabricator.wikimedia.org/T113419#1681833 (kevinator) [] verify that versions will work   - run py spark on current settings   - change setting, run same code   - run py spark Python 3 on new version of spark [] pupetize env variable in spark enviro...
[17:30:32] <wikibugs>	 Analytics-Backlog: Enable use of Python 3 in Spark {hawk} [8 pts] - https://phabricator.wikimedia.org/T113419#1681845 (kevinator)
[17:34:19] <wikibugs>	 Analytics-Cluster, Analytics-Kanban: webrequest_sequence_stats_hourly fails/continues Load job {hawk} [13 pts] - https://phabricator.wikimedia.org/T113252#1681872 (JAllemandou)
[17:34:28] <wikibugs>	 Analytics-Backlog: Move camus files from refinery to puppet - https://phabricator.wikimedia.org/T113990#1681873 (Aklapper)
[17:36:29] <wikibugs>	 Analytics-Backlog: Deploy the Analytics RESTBase {slug} - https://phabricator.wikimedia.org/T113991#1681884 (Milimetric) NEW a:Milimetric
[17:45:08] <joal>	 brb
[17:48:00] <nuria>	 madhuvishy:  have you use the lab hadoop cluster before?
[17:48:45] <madhuvishy>	 nuria: no, I haven't
[17:50:18] <ottomata>	 nuria:  it is mostly set up so you can just spawn your own. but i think you can use the one that is there
[17:50:30] <ottomata>	 log into hadoop-dev-worker1 and just use usual hdfs/ hadoop commands
[17:52:31] <nuria>	 ottomata: what code is the cluster setup from?  same one taht in prod?
[17:52:35] <ottomata>	 yes
[17:52:36] <nuria>	 *than
[17:53:12] <nuria>	 ottomata: so, there is nothing running:
[17:53:16] <nuria>	 https://www.irccloud.com/pastebin/TpmQmN2A/
[17:53:55] <nuria>	 ottomata: and nothing under /srv
[17:53:59] <nuria>	 https://www.irccloud.com/pastebin/0Lx0TzeJ/
[17:54:04] <ottomata>	 srv?
[17:54:07] <ottomata>	  /otto@hadoop-dev-worker1:~$ hdfs dfs -ls /
[17:54:07] <ottomata>	 Found 3 items
[17:54:07] <ottomata>	 drwxrwxrwt   - hdfs hdfs            0 2015-08-25 19:39 /tmp
[17:54:07] <ottomata>	 drwxrwxr-x   - hdfs hadoop          0 2015-08-21 19:26 /user
[17:54:07] <ottomata>	 drwxr-xr-x   - hdfs hdfs            0 2015-07-16 17:23 /var
[17:54:25] <ottomata>	  sudo ps aux | grep java | head -n 1
[17:54:25] <ottomata>	 hdfs      1032  0.2  6.2 1619932 128208 ?      Sl   Aug10 155:07 /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -Dproc_datanode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-hdfs-datanode-hadoop-dev-worker1.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true 
[17:57:27] <nuria>	 ottomata: sorry,  i wanted to see the code deployed there i thought it will be under /srv/deployment
[17:58:10] <ottomata>	 naw, no code deployed, just hadoop
[17:58:16] <ottomata>	 you can clone erfinery
[17:58:18] <ottomata>	 refinery
[17:58:34] <nuria>	 ottomata: anywhere or under srv?
[17:58:58] <lzia>	 hi nuria. question about EL: is the timestamp collected in EL tables associated with page-impression, for example, is the exact same timestamp as we see in logs, or there may be some delays in EL timestamps?
[18:00:49] <nuria>	  ottomata excuse my thickness but ... ahem ....how does this hadoop get to run refinery code? do we need puppet also to start hadoop with refinery code, or do we do that by hand?
[18:01:01] <nuria>	 lzia: one sec, lemme confirm
[18:01:06] <lzia>	 thanks, nuria.
[18:02:40] <nuria>	 lzia: events are wrapped in timestamps as they flow through system so the timestamp will not match any other one exactly
[18:02:50] <nuria>	 lzia: but "approximately" it should
[18:02:59] <lzia>	 got it. thanks nuria.
[18:03:12] <nuria>	 ottomata: brb
[18:09:23] <nuria>	 joal: btw, i run the queries for checks and balances on the 'identity' reconstruction stuff
[18:10:36] <ottomata>	 nuria:  refinery is just jars and wrappers, etc.
[18:10:38] <ottomata>	 you can clone whereever
[18:10:41] <ottomata>	 and run jars using hadoop
[18:10:46] <ottomata>	 camus is launched by a cron
[18:10:49] <ottomata>	 just a mapreduce job
[18:10:51] <ottomata>	 you can do it by hand
[18:11:04] <ottomata>	 there is anice wrapper script in refinery to make it easier
[18:11:06] <ottomata>	 refinery/bin/camus
[18:11:26] <ottomata>	 joal:  am looking at cluster...things seem ok, but text refiner is behind
[18:11:52] <joal>	 yup, I have backfilled a few load on various partitions
[18:11:56] <joal>	 ottomata: --^
[18:12:21] <joal>	 I am currently ensuring other job dependencies catch up (legacy, pagecounts etc)
[18:12:31] <joal>	 nuria: awesome :)
[18:12:40] <ottomata>	 joal:  what happened, do you know?
[18:12:46] <ottomata>	 backfilled?
[18:12:48] <joal>	 Strange errors
[18:12:57] <ottomata>	 why'd you have to backfill?
[18:13:02] <ottomata>	 did the jobs halfway complete?
[18:13:07] <ottomata>	 and refine was started when it shoudln't have?
[18:13:14] <joal>	 rerun a few that were in incorrect state
[18:14:49] <joal>	 I use that : https://gist.github.com/jobar/518a5b966d61d684bba6 to determine incorrect states
[18:14:57] <joal>	 in load bundle
[18:15:00] <joal>	 ottomata: --^
[18:16:18] <ottomata>	 oh, joal you didn't need to rerun other dependent jobs, right?
[18:16:20] <joal>	 Also there have been a camus hickup on Friday ottomata, right ?
[18:16:30] <ottomata>	 joal:  don't know i was not working on friday...
[18:16:34] <joal>	 ottomata: some of them were in failed stats
[18:16:37] <ottomata>	 have barely checked my email yet was camping most of the weekend
[18:16:39] <ottomata>	 :)
[18:16:42] <joal>	 ottomata: from charts, looks like so
[18:16:48] <ottomata>	 oh, there was a namenode hiccup, remember,
[18:16:50] <ottomata>	 not sure what that was
[18:16:52] <joal>	 awesome campnig :)
[18:16:59] <joal>	 ok
[18:17:01] <ottomata>	 from kafka charts?
[18:17:04] <joal>	 yup
[18:17:31] <ottomata>	 probably if namenode was down for > 10 minutes, there camus would have failed during that time, and subsequent runs would have to load more data
[18:17:45] <joal>	 Yeah taht makes sense
[18:19:15] <joal>	 ottomata: The errors I had in load were at partition addition, or stats creation
[18:19:18] <joal>	 Which is weird
[18:19:33] <ottomata>	 joal:  got a paste of errors?
[18:20:45] <joal>	 I looked at hadoop logs for those jobs, here is an exampl: sudo -u hdfs yarn logs -applicationId application_1441303822549_55873
[18:21:15] <joal>	 every time the same exception
[18:21:39] <joal>	 I relaunched fhe faulty ones, and they caught up
[18:21:42] <ottomata>	 bwa
[18:21:44] <ottomata>	 that is weird
[18:21:44] <joal>	 ottomata: --^
[18:21:44] <ottomata>	 java.lang.NoClassDefFoundError: org/antlr/runtime/RecognitionException
[18:21:47] <joal>	 yeah
[18:21:53] <joal>	 Plus a gzip one
[18:22:04] <ottomata>	 don't see that
[18:22:34] <joal>	 after in the list of caused by exceotions
[18:23:48] <ottomata>	 ah zip, ja
[18:23:53] <ottomata>	 HMmmm
[18:23:59] <ottomata>	 was there a deploy around then?
[18:24:03] <ottomata>	 refinery deploy
[18:24:03] <ottomata>	 ?
[18:24:13] <ottomata>	 maybe the jar file wasn't fully written..>.>> hmmm noO
[18:24:14] <joal>	 didn't investigate more, took time to bacjfill
[18:24:31] <joal>	 Last deploy was early last week if I recall
[18:25:24] <ottomata>	 aye
[18:25:26] <ottomata>	 yeah strange.
[18:25:27] <ottomata>	 indeed.
[18:25:59] <joal>	 I am really wondering about cluster overeload ...
[18:26:11] <joal>	 I'll have a look at ganglia tomorrow
[18:39:14] <joal>	 Guys I'm off for today
[18:39:21] <joal>	 See you a-team tomorrow !
[18:39:38] <kevinator>	 ciao
[18:55:23] <nuria>	 ottomata: so the cluster on labs is a machine with hadoop deployed and .. something else? or just that?
[18:55:59] <ottomata>	 nuria, i think 2 or 3 hadoop nodes, and there should be some kafka nodes there
[18:56:00] <ottomata>	 um
[18:56:00] <ottomata>	 use
[18:56:14] <nuria>	 how i can know which one is which?
[18:56:30] <ottomata>	 i think kafka203 would be a good one to use
[18:56:36] <ottomata>	 nuria:  whatcha mean?
[18:56:38] <ottomata>	 hadoop or kafka?
[18:57:09] <nuria>	 right, how can i know what is deployed to where?
[18:57:11] <ottomata>	 btw, nuria this 'cluster' is really just a one off i've used for testing a couple of things, it isn't something maintained.
[18:57:20] <ottomata>	 all you need to test camus is a working hadoop and kafka cluster
[18:57:32] <ottomata>	 but, if you go to the configure page of a node
[18:57:36] <ottomata>	 you can see what boxes are checked :)
[18:57:41] <ottomata>	 and what classes are applied there
[18:57:41] <nuria>	 ok
[18:58:03] <nuria>	 ottomata: i was under the impression that this has puppet and such but it doesn't right?
[18:58:04] <ottomata>	 hmm
[18:58:08] <ottomata>	 it does have puppet
[18:58:18] <ottomata>	 it looks like i didn't use the puppet stuff for the kafka setups though
[18:58:24] <ottomata>	 i think i was testing the upgrade of kafka on those
[18:58:33] <nuria>	 puppet that sets up hadoop deployment?
[18:58:36] <ottomata>	 the hadoop but the hadoop nodes is puppet yes
[18:58:54] <ottomata>	 nuria:
[18:58:55] <ottomata>	 https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=70e04674-cc5f-4707-83a6-81751e32dd3a&project=analytics&region=eqiad
[18:59:15] <ottomata>	 you can see which boxes are checked, and the vars that are filled out
[19:01:14] <nuria>	 ottomata: and how do you see the nodes that 'make up' the cluster? by looking at the analytics page and node names?
[19:01:47] <ottomata>	 nuria:  yeah, nodes just join the cluster by pointing themselves at a master
[19:01:56] <ottomata>	 so there's no centralized config of worker nodes
[19:19:58] <nuria>	 ottomata: in kafka-jessie01 i see connections to kafka200.analytics but that second machine does not appear on https://wikitech.wikimedia.org/wiki/Nova_Resource:Analytics
[20:03:19] <nuria>	 https://www.irccloud.com/pastebin/eUpjk4L1/
[20:05:17] <madhuvishy>	 nuria: I think, we can just test on prod
[20:05:34] <madhuvishy>	 andrew says there's a test kafka topic
[20:05:35] <nuria>	 on the real cluster?
[20:05:41] <madhuvishy>	 so we can produce to it
[20:05:44] <madhuvishy>	 yes
[20:06:07] <nuria>	 madhuvishy: for kafka ok, but we still need the labs setup to add the camus avro consumer
[20:06:13] <nuria>	 correct?
[20:06:14] <madhuvishy>	 and we can point camus to write to tmp or /user/madhuvishy etc
[20:06:49] <nuria>	 in prod you mean?
[20:06:54] <madhuvishy>	 yes
[20:07:03] <madhuvishy>	 nuria: not necessary. we can run our own version of refinery etc
[20:07:17] <madhuvishy>	 i dont know all details yet but i'm sure we can do it
[20:08:09] <nuria>	 madhuvishy: mmm.. i thought the cluster will run the given version of refinery upon startup
[20:08:27] <nuria>	 madhuvishy: so you cannot override it , but maybe i am wrong
[20:08:42] <wikibugs>	 Analytics, Developer-Relations, MediaWiki-API, Research consulting, and 3 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1682517 (Spage)
[20:09:07] <nuria>	 madhuvishy: testing in prod seems kind of wrong to me specially if we need to add new dependencies, if no new dependencies are needed then it should be ok i guess
[20:09:20] <wikibugs>	 Analytics, Developer-Relations, MediaWiki-API, Research consulting, and 3 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1398018 (Spage)
[20:10:09] <madhuvishy>	 nuria: we have a fork of camus, so we can update Java code if needed and build our own version of the jar
[20:11:32] <madhuvishy>	 nuria: I can't imagine what new dependencies we'd need
[20:12:32] <nuria>	 madhuvishy: ok, so you think we should not use the labs cluster at all?
[20:13:25] <madhuvishy>	 nuria: I think the labs cluster is not set up fully, there's a task to do it - https://phabricator.wikimedia.org/T109859. Meanwhile, it will be easier for us to test on prod
[20:13:53] <nuria>	 madhuvishy: the part i do not get is how would we use our own version of camus ? hadoop already has loaded the jars
[20:13:59] <ebernhardson>	 fyi, i'm setting up a kafka instance in deployment-prep later today (if i find the time)
[20:15:06] <nuria>	 madhuvishy:  fast hangout?
[20:15:12] <nuria>	 madhuvishy: on cave?
[20:15:18] <madhuvishy>	 sure, joining
[20:26:58] <madhuvishy>	 nuria: kafka1012.eqiad.wmnet:9092
[20:29:50] <madhuvishy>	 nuria: conf1001.eqiad.wmnet,conf1002.eqiad.wmnet,conf1003.eqiad.wmnet/kafka/eqiad
[20:31:00] <wikibugs>	 Analytics-Backlog, Analytics-Wikistats, DevRel-October-2015: Clean the code review queue of analytics/wikistats - https://phabricator.wikimedia.org/T113695#1682614 (Aklapper) >>! In T113695#1680652, @ezachte wrote: > More things are not up to par for Wikistats, but Wikistats is going to be made obsolet...
[20:31:49] <wikibugs>	 Analytics-EventLogging, Analytics-Kanban: Research sending EventLogging validation logs to Logstash {oryx} [5 pts] - https://phabricator.wikimedia.org/T111412#1682621 (Ottomata) Is this done?  I missed standup!  I want to see! :)
[20:35:46] <wikibugs>	 Analytics-EventLogging, Analytics-Kanban: Research sending EventLogging validation logs to Logstash {oryx} [5 pts] - https://phabricator.wikimedia.org/T111412#1682640 (mforns) The research part is done.  WMF's Logstash has support for native Kafka consumer plugin. So, we'll go for it in the next task, the...
[20:35:49] <ottomata>	 madhuvishy: nuria i haven't used kafkacat for producing as much, but it should work well
[20:36:03] <ottomata>	 kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_mobile
[20:36:04] <ottomata>	 etc.
[20:36:08] <ottomata>	 look at
[20:36:10] <ottomata>	 kafkacat --help
[20:36:11] <ottomata>	 is good
[20:49:52] <milimetric>	 hey - my internet died for a while, lemme know if i missed anything
[20:49:57] <wikibugs>	 Analytics-Backlog, Developer-Relations, MediaWiki-API, Research consulting, and 3 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1682730 (kevinator)
[21:02:51] <wikibugs>	 Analytics-Cluster, Datasets-Archiving, Datasets-Webstatscollector: Mediacounts stalled after 2015-09-24 - https://phabricator.wikimedia.org/T113956#1682801 (Ottomata) Hm, I see that there are actually new files:    mediacounts.2015-09-27.v00.tsv.bz2                 28-Sep-2015 17:23           4581326...
[21:06:41] <wikibugs>	 Analytics, Varnish: Connect Hadoop records of the same request coming via different channels - https://phabricator.wikimedia.org/T113817#1682817 (Ottomata) > a request might pass through multiple varnishes (edge cluster front cache, edge cluster back cache, main datacenter back cache). Which one of these...
[21:27:31] <mforns>	 ottomata, yt?
[21:30:58] <ottomata>	 yes
[21:31:01] <ottomata>	 hiayaa
[21:31:08] <mforns>	 hello
[21:31:26] <mforns>	 I'm looking at logstash from kafka
[21:31:46] <mforns>	 and the logstash config asks me for the url and port of the zookeeper instance
[21:32:05] <mforns>	 do we have a kafka instance with zookeeper in beta cluster?
[21:32:19] <mforns>	 like the one we tested EventLogging in beta?
[21:33:35] <ottomata>	 yes
[21:33:36] <ottomata>	 same one
[21:33:41] <ottomata>	 uh i think it is
[21:33:58] <ottomata>	 deployment-zookeeper.deployment-prep.eqiad.wmflabs:2181/kafka/deployment-prep
[21:34:14] <mforns>	 ok!
[21:34:21] <mforns>	 I'll try that
[21:34:26] <mforns>	 thanks!
[21:34:30] <ottomata>	 yup!
[21:37:05] <mforns>	 ottomata, are you sure it's this url? isn't it deployment-zookeeper01.eqiad.wmflabs ?
[21:37:53] <mforns>	 I can not ssh into deployment-zookeeper.deployment-prep.eqiad.wmflabs
[21:38:22] <ebernhardson>	 i connect just fine
[21:38:34] <mforns>	 mmm
[21:38:35] <ebernhardson>	 although, i could swear i asked otto 2 hours ago and he said no kafka in labs yet :P
[21:38:48] <ottomata>	 oh, mforns that is it
[21:38:49] <ottomata>	 it is 01
[21:38:50] <ottomata>	 sorry
[21:38:53] <mforns>	 np!
[21:39:15] <ottomata>	 ebernhardson: its running in deployment prep, sure!  but you can set up your own instance too
[21:39:26] <mforns>	 I think this kafka is running there occasionally but, still not official kafka in beta cluster
[21:39:28] <mforns>	 oh, ok
[21:39:43] <ottomata>	 yeah, its running in beta mainly just for eventlogging, but, because of that it has to be kinda stable
[21:39:52] <ottomata>	 but, ja we want to have a larger 'beta' analytics cluster in labs
[21:39:54] <ottomata>	 that doesn't exist yet
[21:40:11] <ottomata>	 also, ebernhardson i think i was referring to the ease of setting one up for yourself, and wondering how good my docs were
[21:40:20] <ottomata>	 if you want one in your own labs project, for instance
[21:41:56] <ebernhardson>	 ottomata: ahh, i will probably just point beta cluster to that one, should suffice for making sure the code i'm putting into prod works
[21:43:49] <ottomata>	 k cool
[21:43:50] <ottomata>	 yeha
[21:45:16] <wikibugs>	 Analytics-Backlog: Deploy the Analytics RESTBase {slug} [13 pts] - https://phabricator.wikimedia.org/T113991#1683023 (Milimetric)
[21:45:27] <wikibugs>	 Analytics-Kanban: Deploy the Analytics RESTBase {slug} [13 pts] - https://phabricator.wikimedia.org/T113991#1681884 (Milimetric)
[21:57:57] <wikibugs>	 Analytics-Tech-community-metrics, MediaWiki-Extension-Requests, Possible-Tech-Projects: A new events/meet-ups extension - https://phabricator.wikimedia.org/T99809#1683121 (Ragesoss) There could be many possible ways to tackle the needs described here.  I think the Dashboard would be one good way, but...
[22:18:58] <nuria>	 madhuvishy: back, looking into camuys
[22:19:01] <nuria>	 *camus
[22:22:21] <madhuvishy>	 nuria: cool
[22:36:49] <nuria>	 ottomata: was going to be camus jar but turns out some deps are missing on 1002
[22:37:00] <nuria>	 [ERROR] Plugin org.apache.maven.plugins:maven-compiler-plugin:3.1 or one of its dependencies could not be resolved: Failed to read artifact descriptor for org.apache.maven.plugins:maven-compiler-plugin:jar:3.1: Failure to find org.apache.maven.plugins:maven-compiler-plugin:pom:3.1 in https://archiva.wikimedia.org/repository/mirrored/ was cached in the local
[22:37:01] <nuria>	 repository, resolution will not be reattempted until the update interval of system-wide-wmf-mirrored-default has elapsed or updates are forced -> [Help 1
[22:42:00] <Krinkle>	 ottomata: What is the problem with kafka/eventlogging and precise?
[22:42:08] <Krinkle>	 Is there a reason it can't work on precise?
[22:42:15] <Krinkle>	 RE: https://phabricator.wikimedia.org/T112660
[22:50:23] <wikibugs>	 Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1683422 (Dicortazar) Hi there,  I'm having some discussion at https://phabricator.wikimedia.org/conpherence/200/ about this...
[22:52:35] <ottomata>	 Krinkle:  putting my remembering hat on, i think its just python client dep, it might work if i built python-kfak for precise
[22:52:49] <ottomata>	 will do now...
[23:00:45] <ottomata>	 Krinkle: hm, not fully sure about that, the most immediate problem is ferm
[23:00:52] <ottomata>	 hafnium has a public IP
[23:01:15] <Krinkle>	 ottomata: I'm not sure I see the connection
[23:01:24] <Krinkle>	 how is having a public IP related to consuming kafka
[23:01:27] <ottomata>	 can't connect to zookeeper on conf1001
[23:01:32] <ottomata>	 cause ferm rules there don't allow it
[23:01:44] <Krinkle>	 I was wondering about terbium, not hafnium.
[23:01:47] <ottomata>	 but, that is a superficial issue (as they probably all are)
[23:01:47] <ottomata>	 oh
[23:01:52] <Krinkle>	 is kafka not available on hafnium?
[23:01:52] <ottomata>	 ok checking there
[23:01:54] <Krinkle>	 How is it working then?
[23:02:02] <Krinkle>	 I thought that was already done and taken care of
[23:02:10] <ottomata>	 the stuff on hafnium is consuming from the zmq stream that we are maintining for yall :)
[23:02:11] <Krinkle>	 Oh, it's still using ZMQ
[23:02:13] <Krinkle>	 right
[23:02:13] <ottomata>	 ya
[23:02:28] <ottomata>	 is eventlogging deployed on terbium?
[23:02:29] <ottomata>	 no hm.
[23:04:03] <ottomata>	 Krinkle: i think it might work there, if i can make ding dang reprepo import this precise package...
[23:10:09] <Krinkle>	 not yet (el on terbium) but there's a task for it
[23:10:27] <Krinkle>	 it is installed on tin but, 1) that el package is outdated, 2) doesn't have kafka, 3) we should use terbium instead of tin for scripts
[23:11:32] <ottomata>	 Krinkle: ok, let's do like we did on stat1002 then, that ok?  deploy it there but not globally install it?
[23:12:00] <Krinkle>	 I still don't see why.
[23:12:08] <Krinkle>	 It makes it unattractive to use.
[23:12:12] <Krinkle>	 What's the reason again?
[23:12:27] <ottomata>	 its an extra deploy step, and I don't like installing these things globally in general without using debian package
[23:12:46] <ottomata>	 i don't want to have to log into 5 servers and run a command (or do it via salt, or whatever) every time it is run
[23:12:56] <Krinkle>	 Yeah, but that can be puppetised.
[23:12:58] <ottomata>	 deployed*
[23:13:02] <ottomata>	 naw, its deployed via git deploy
[23:13:03] <Krinkle>	 Anyway I'm fine with that, as long as it's documented.
[23:13:10] <Krinkle>	 We should probably use it that way by default
[23:13:17] <ottomata>	 yeah i would prefer that too
[23:13:19] <Krinkle>	 and not install it globally in general.
[23:13:34] <ottomata>	 i think the only reason its not is becuase on hafnium and on eventlog1001 it is daemonized via upstart
[23:13:44] <Krinkle>	 So that there's an example on wikitech that includes setting the os.path and then conneting
[23:14:03] <ottomata>	 yeah, that's ok if you are writing your own script
[23:14:07] <ottomata>	 i think setting PYTHONPATH is easier though
[23:25:13] <ottomata>	 AHH Krinkle i am now remembering why this doesn't work on precise.
[23:25:19] <ottomata>	 The following packages have unmet dependencies:
[23:25:19] <ottomata>	  python-pykafka : Depends: python-kazoo but it is not installable
[23:25:20] <ottomata>	                   Depends: python-tabulate but it is not installable
[23:25:49] <nuria>	 ottomata: nevermind camus build issues i can scp jar easy enough
[23:26:06] <ottomata>	 cool (nuria I must have missed a message from you)
[23:26:13] <nuria>	 ottomata: np at all
[23:31:52] <nuria>	 ottomata: let's talk tomorrow but looks to me that schemas for avro for search can either be on 1) schema registry  or 2) on our source (as in gerrit commit)
[23:32:06] <ottomata>	 what is 1) though, in camus context
[23:32:08] <ottomata>	 its not confluent.
[23:32:10] <ottomata>	 right?
[23:32:19] <nuria>	 ottomata: nah doesn't have to be
[23:32:46] <nuria>	 ottomata: you can implement an interface and retrieve schema from anywhere
[23:33:16] <gwicke>	 ottomata: hey, are you in the office?
[23:33:56] <nuria>	 ottomata: for tests this is handy: https://github.com/linkedin/camus/blob/master/camus-example/src/main/java/com/linkedin/camus/example/schemaregistry/DummySchemaRegistry.java
[23:34:02] <nuria>	 but obviously not the best
[23:34:09] <nuria>	 ottomata: will brief you tomorrow
[23:34:51] <nuria>	 go and talk to gwicke  outside the internets
[23:35:21] <ottomata>	 gwicke:  yes!
[23:35:34] <gwicke>	 ahh, nice!
[23:35:43] <gwicke>	 me too, so lets chat?
[23:36:19] <ottomata>	 yeah!  let's  want to now?  not sure if i'm ready for a debate or deciding on anything session, but all for chattin!
[23:37:05] <gwicke>	 sure
[23:37:14] <gwicke>	 I just wrote another response, so it's all paged in
[23:37:21] <gwicke>	 where are you?
[23:37:31] <ottomata>	 ja saw that, on 3rd, i think you are too, i come to you
[23:37:43] <gwicke>	 kk