[00:57:54] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2311335 (Earwig) This question was asked and answered above for me. I don't think Eranbot...
[06:39:04] <wikibugs>	 Analytics-General-or-Unknown, Language-Engineering, Mobile-Apps, Wikipedia-Android-App-Backlog, and 2 others: there should be a comparison of clicks count on interlanguage links on different platforms - https://phabricator.wikimedia.org/T78351#2311679 (Nikerabbit)
[06:39:43] <wikibugs>	 Analytics-General-or-Unknown, Language-Engineering, Mobile-Apps, Wikipedia-Android-App-Backlog, and 2 others: there should be a comparison of clicks count on interlanguage links on different platforms - https://phabricator.wikimedia.org/T78351#843122 (Nikerabbit) @Amire80 Can you provide update o...
[07:22:18] <wikibugs>	 Analytics, Pageviews-API: 20160431 produces "end timestamp is invalid, must be a valid date in YYYYMMDD format" - https://phabricator.wikimedia.org/T135812#2311749 (Nemo_bis)
[08:06:46] <elukey>	 joal: morning!
[08:06:59] <joal>	 heya elukey :)
[08:07:18] <elukey>	 if you are ok I would upgrade aqs100[23]
[08:08:34] <joal>	 elukey: I have not noticed any issue
[08:08:38] <joal>	 please go :)
[08:10:33] <elukey>	 joal: all right! Also, this morning the link between esams and eqiad went down, so there might be some data loss
[08:12:46] <moritzm>	 elukey: did you start already? It would be great to bundle that with the openjdk-8 security update (which also requires a restart of cassandra)
[08:13:43] <elukey>	 moritzm: ouch yes already started, but I can upgrade java too, it shouldn't be a problem. Now AQS won't explode anymore if one cassandra node is down :)
[08:14:09] <moritzm>	 ok, let me quickly install upgrade openjdk on aqs*, then
[08:14:16] <moritzm>	 ok, let me quickly upgrade openjdk on aqs*, then
[08:15:36] <elukey>	 moritzm: would you mind to wait a bit for me to finish the migration?
[08:16:02] <elukey>	 then I'll do the rolling restart, maybe this afternoon.. I'd like to have only few changes in flight if possible
[08:16:32] <moritzm>	 ok
[08:16:44] <elukey>	 thanks!
[08:16:49] <moritzm>	 just ping me when the time is right
[08:18:31] <icinga-wm>	 PROBLEM - Analytics Cassanda CQL query interface on aqs1002 is CRITICAL: Connection refused
[08:18:58] <elukey>	 (bootstrapping)
[08:20:49] <elukey>	 this one takes longer
[08:21:11] <elukey>	 INFO  [MemtableFlushWriter:4] 2016-05-20 08:20:22,715 Memtable.java:347 - Writing Memtable-data@777323105(884.077MiB serialized bytes, 18540448 ops, 16%/26% of on/off-heap limit)
[08:21:58] <elukey>	 INFO  [main] 2016-05-20 08:21:47,409 StorageService.java:1715 - Node /10.64.32.175 state jump to NORMAL
[08:22:01] <elukey>	 nice
[08:23:00] <icinga-wm>	 RECOVERY - Analytics Cassanda CQL query interface on aqs1002 is OK: TCP OK - 0.004 second response time on port 9042
[08:30:17] <elukey>	 !log aqs1002 migrated to cassandra 2.1.13
[08:30:19] <analytics-logbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[08:32:22] <joal>	 elukey: o:
[08:32:30] <joal>	 elukey: o/ sorry
[08:33:16] <elukey>	 joal: ah you're off right! Have a good weekend :)
[08:33:39] <joal>	 elukey: Tried my first cassandra loading job on new cluster: failed because of not being able to connect from hadoop to aqs1004-a.eqiad.wmnet
[08:34:03] <elukey>	 ah snap
[08:34:04] <joal>	 elukey: didn't manage to work enough yesterday; so putting a bit on today ;)
[08:34:18] <joal>	 will be gone at 11:30
[08:34:48] <joal>	 elukey: we should discuss with moritzm, I think it might come from the firewall
[08:35:25] <moritzm>	 joal: from which host was that?
[08:35:31] <joal>	 I remember we had to open some ports for loading on current cluster
[08:36:02] <joal>	 moritzm: hm, analytics1049, but could be any hadoop node
[08:36:35] <moritzm>	 let me setup some logging rules and I'll ask you to repeat that query in a few minutes, ok?
[08:36:48] <joal>	 sure moritzm, thanks a lot !
[08:38:06] <moritzm>	 joal: k, feel free to re-run the query whenever you're ready
[08:38:35] <joal>	 moritzm: launching (don't know from which hadoop it will happen however)
[08:41:07] <elukey>	 moritzm: would be super interested to know how you are setting up the logging rules
[08:41:12] <moritzm>	 there's indeed dropped packets, having a look
[08:41:36] <moritzm>	 elukey: I added these:
[08:41:39] <moritzm>	 iptables -N LOGGING
[08:41:40] <moritzm>	 iptables -A INPUT -j LOGGING
[08:41:42] <moritzm>	 iptables -A LOGGING -m limit --limit 2/min -j LOG --log-prefix "iptables-dropped: " --log-level 4
[08:41:43] <moritzm>	 iptables -A LOGGING -j DROP
[08:42:21] <elukey>	 ahhhh directly to iptables, all right thanks :)
[08:42:25] <moritzm>	 so sampled dropped packets are logged to syslog with the prefix "iptables-dropped:" to ease grepping
[08:42:39] <moritzm>	 these are not persistent, the next time ferm is restarted, they are gone
[08:42:57] <moritzm>	 I have a Phab task to enable this in general, but needs a little more work
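Once the sampled drops reach syslog, they could be tallied with something like the minimal sketch below. It assumes the usual kernel LOG layout of `KEY=value` fields (`SRC=`, `DPT=`). The `tally_drops` helper and the sample lines are hypothetical and were not taken from the hosts discussed above.

```python
import re
from collections import Counter

# Prefix set via --log-prefix in the rules quoted above.
PREFIX = "iptables-dropped: "
# Kernel LOG lines carry KEY=value fields; only SRC and DPT are used here.
FIELD_RE = re.compile(r"\b(SRC|DPT)=(\S+)")


def tally_drops(lines):
    """Count logged drops per (source IP, destination port)."""
    counts = Counter()
    for line in lines:
        if PREFIX not in line:
            continue  # not one of our sampled drop lines
        fields = dict(FIELD_RE.findall(line))
        if "SRC" in fields and "DPT" in fields:
            counts[(fields["SRC"], int(fields["DPT"]))] += 1
    return counts


# Hypothetical sample lines (documentation-range addresses), for illustration only.
SAMPLE = [
    "May 20 08:40:01 host kernel: iptables-dropped: IN=eth0 OUT= "
    "SRC=192.0.2.10 DST=192.0.2.20 PROTO=TCP SPT=51000 DPT=9042",
    "May 20 08:40:31 host kernel: iptables-dropped: IN=eth0 OUT= "
    "SRC=192.0.2.10 DST=192.0.2.20 PROTO=TCP SPT=51002 DPT=9042",
    "May 20 08:41:00 host sshd[123]: unrelated line",
]

if __name__ == "__main__":
    for (src, port), n in tally_drops(SAMPLE).most_common():
        print(f"{src} -> port {port}: {n} logged drops")
```

Because the rule is rate limited to 2/min, the counts are a sample and only indicate which source and port pairs are being dropped, not how often.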
[08:44:06] <joal>	 moritzm: query failed again (as expected) - I see you saw dropped packets?
[08:44:38] <moritzm>	 joal: yeah, currently trying to figure out what is going wrong :-)
[08:44:57] <joal>	 ok letting you investigate, thanks a lot !
[08:50:16] <elukey>	 joal: upgrading 1003
[08:53:31] <moritzm>	 the problem is the following: the firewall rules for port 9042 / CQL currently only allow access from the cassandra seeds, i.e. the IP addresses assigned to the Cassandra sub instances
[08:53:35] <elukey>	 !log cassandra upgraded to 2.1.13 on aqs1003
[08:53:36] <analytics-logbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[08:53:57] <moritzm>	 but aqs1004 is receiving the request from aqs1005's actual IP address (i.e. of the host)
[08:55:12] <moritzm>	 let me validate that with a quick hack
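The mismatch described here (rules that allow only the Cassandra instance addresses while the request arrives from the host's own address) can be illustrated with a small allowlist check. This is only a sketch: the addresses are hypothetical documentation-range values and `is_allowed` is an illustrative helper, not the actual ferm rule.

```python
import ipaddress


def is_allowed(source_ip, allowed):
    """Return True if source_ip falls inside any allowed address or network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in allowed)


# Hypothetical values: the per-instance IPs are allowed, the host IP is not.
INSTANCE_IPS = ["192.0.2.11/32", "192.0.2.12/32"]
HOST_IP = "192.0.2.10"

if __name__ == "__main__":
    print("instance allowed:", is_allowed("192.0.2.11", INSTANCE_IPS))
    print("host allowed:", is_allowed(HOST_IP, INSTANCE_IPS))
```

A request sourced from the host address fails the check even though every instance running on that host would pass it.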
[08:56:41] <joal>	 moritzm: from a 'cassandra inner world', system seems to work fine (could access using cqlsh)
[08:56:56] <joal>	 Problem occurred when trying to connect from hadoop
[08:57:12] <moritzm>	 joal: could you please retry?
[08:57:48] <joal>	 sure !
[09:00:06] <joal>	 moritzm: same errors: gpg --keyserver pgpkeys.mit.edu --recv-key  1397BC53640DB551
[09:00:26] <joal>	 woo, sorry moritzm pasting again
[09:00:37] <joal>	 Error: java.lang.RuntimeException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: aqs1004-a.eqiad.wmnet/10.64.0.126:9042
[09:05:11] <moritzm>	 joal: ok, so the problem on aqs1004 is fixed at least, your query now didn't flag a dropped packet. so there's a second problem to track down
[09:05:59] <moritzm>	 on which host did the above exception occur?
[09:07:28] <joal>	 moritzm: closed a wrong tab, relaunching to get the info
[09:10:15] <joal>	 moritzm: just failed from analytics10[53|37|32]
[09:10:37] <joal>	 and others actually moritzm
[09:11:43] <elukey>	 I am able to ping from hadoop to aqs1004-a so the basic routes are working, but maybe something related to accepting stuff on port 9042
[09:12:25] <moritzm>	 joal: is there a way to make it use a specific hadoop node? I'd like to avoid adding the logging code on all hadoop nodes
[09:12:26] <elukey>	 but it should be fixed that, ah reading the backlog
[09:13:04] <moritzm>	 elukey: my fix for the aqs cassandra access is currently a local hack on aqs1004, I've disabled puppet there
[09:13:16] <moritzm>	 will make a proper puppet patch from that
[09:13:21] <elukey>	 sure sure :)
[09:14:07] <joal>	 moritzm: I can restrict queries to use only one node instead of multiple, but can't choose which one :(
[09:16:45] <joal>	 moritzm: I need to log off in a couple minutes and will be off this afternoon and Monday, so won't be able to help on that anymore :(
[09:17:23] <moritzm>	 joal: ok, which query did you use?
[09:17:34] <elukey>	 joal: this is interesting - https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system?panelId=12&fullscreen
[09:17:36] <moritzm>	 so that myself or Luca can pick this up?
[09:18:11] <joal>	 moritzm: I use a oozie job launching a hadoop job which makes queries
[09:19:08] <joal>	 elukey: I think it's related to a compaction status that has been dropped
[09:19:25] <moritzm>	 joal: ok, I'll wrap what I found earlier in a puppet patch first and then we can pick this up on Tuesday. have a nice weekend
[09:20:13] <joal>	 moritzm: if elukey want to play, here is the query I use: https://gist.github.com/anonymous/41781b19ebcba233d029bcab5e76c304
[09:20:29] <joal>	 s/query/command
[09:20:49] <joal>	 moritzm: but it can wait Tuesday :)
[09:20:59] <joal>	 Thanks elukey and moritzm, have a good weekend !
[09:21:21] <elukey>	 thanks joal! you too
[09:27:52] <moritzm>	 elukey: I've re-enabled puppet on aqs1004, will make a patch later
[09:28:24] <elukey>	 moritzm: super thanks
[09:32:01] <elukey>	 also moritzm you can install openjdk whenever you prefer, I'll restart aqs this afternoon
[09:33:41] <moritzm>	 ok
[10:25:12] <mforns>	 hi team :]
[10:40:18] <elukey>	 hey mforns !
[10:40:22] * elukey lunch!
[10:40:25] <mforns>	 hi elukey
[10:40:27] <mforns>	 :]
[10:54:17] <grrrit-wm>	 (CR) Mforns: [C: -1] "Looks awesome!" (4 comments) [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/289062 (https://phabricator.wikimedia.org/T134506) (owner: Nuria)
[11:20:25] <elukey>	 mforns trolling: looks awesome! ==> -1 :P
[11:20:43] <mforns>	 xD
[11:20:58] <mforns>	 but it does look awesome :]
[11:21:20] <mforns>	 it's just a typo
[11:21:51] <elukey>	 ahahahahaha
[11:21:58] <elukey>	 yes yes I was kidding
[11:22:05] <mforns>	 hehehehe
[11:22:36] <elukey>	 how's the weather in your hometown mforns ?
[11:23:45] <mforns>	 elukey, today it's fine, like 23 deg, sunny
[11:24:02] <mforns>	 in a couple weeks it will be hoooooot
[11:25:15] <mforns>	 and in Torino?
[11:43:40] <wikibugs>	 Analytics, MediaWiki-API, Pageviews-API, RESTBase, RfC: RFC: Update profile URLs in content types to point to format documentation - https://phabricator.wikimedia.org/T128609#2312233 (Pchelolo) Open>Resolved This was merged and deployed a long time ago, profiles now point to specs on...
[11:48:24] <elukey>	 mforns: yeah same in here (Bologna), veeery nice and warm, but it'll get hot soon sadly
[11:48:50] <mforns>	 elukey, oh Bologna, sorry :P
[11:48:58] <elukey>	 :P
[12:37:30] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2312374 (Compassionate727) Yeah, I've pretty much stopped using the tool for now. Hopeful...
[12:40:53] <elukey>	 urandom: morning! Whenever you have a minute I have one question about Cassandra. When I installed the new package version I forgot to execute nodetool drain first, so memtables didn't flush gracefully before the restart.. This shouldn't be a big deal since writes for AQS are happening regularly by the hour, so not continuously, but just wanted to double check
[12:41:06] <elukey>	 the only side effect that I noticed has been https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system?panelId=12&fullscreen
[12:41:37] <elukey>	 so a reduction in the disk space used, but I guess that is due to me triggering recompaction
[12:41:59] <elukey>	 (the last small drops are due to cassandra restarts for java upgrades
[12:47:59] <elukey>	 ah no I am seeing a lot of org.apache.cassandra.db.commitlog.CommitLogReplayer INFO in logstash for the time of the package upgrade, so I guess that what I ended up losing without nodetool drain has been recovered via commit log
[12:48:10] * elukey speculates about cassandra
[12:50:23] <elukey>	 !log aqs100[123] restarted for openjdk upgrades
[12:50:24] <analytics-logbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[12:50:44] <elukey>	 (this time with nodetool drain first :P)
[12:51:09] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312406 (phuedx) @BBlack: If Varnish is the part of the stack that this is to be done, have you taken a look at [libvmod-abtest](https://github.com/Destination/libvmod-abtest)...
[12:55:04] <wikibugs>	 Analytics, Pageviews-API: 20160431 produces "end timestamp is invalid, must be a valid date in YYYYMMDD format" - https://phabricator.wikimedia.org/T135812#2311749 (Danny_B) April has only 30 days, not 31...
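The rejection of 20160431 is what a strict calendar check produces. A minimal sketch with Python's standard library follows. It is not the Pageviews API's own validator, and `is_valid_yyyymmdd` is a hypothetical helper name.

```python
from datetime import datetime


def is_valid_yyyymmdd(value):
    """Return True only for an 8-digit string naming a real calendar date."""
    # strptime alone tolerates unpadded fields, so enforce the shape first.
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False  # e.g. day out of range for the month
    return True


if __name__ == "__main__":
    for candidate in ("20160430", "20160431"):
        print(candidate, is_valid_yyyymmdd(candidate))
```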
[13:15:42] <wikibugs>	 Analytics-Kanban, Datasets-Webstatscollector, RESTBase-Cassandra, Patch-For-Review: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#2312456 (elukey)
[13:15:45] <wikibugs>	 Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2312454 (elukey) Open>Resolved
[13:55:12] <urandom>	 elukey: hi!
[13:55:23] <elukey>	 o/
[13:55:28] <urandom>	 elukey: so regarding shutdown, there are a couple ways of thinking about this
[13:55:58] <urandom>	 and one is that not every shutdown is going to be clean
[13:56:14] <urandom>	 say if the machine crashes, there is a power failure, etc
[13:56:45] <urandom>	 and one of Cassandra's strengths is that it is resilient in the face of such failures
[13:56:45] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312587 (BBlack) >>! In T135762#2312406, @phuedx wrote: > @BBlack: If Varnish is the part of the stack that this is to be done, have you taken a look at [libvmod-abtest](https...
[13:57:31] <urandom>	 elukey: so i'm not sure i think it's Bad to not drain first
[13:57:59] <urandom>	 elukey: that freed space is interesting tho
[13:58:08] <elukey>	 yeah it happened also in the past
[13:58:18] <elukey>	 but it went down a lot :)
[13:58:53] <urandom>	 so... there have been bugs in the not too distant past related to the freeing of file descriptors
[13:59:08] <urandom>	 2.1.12 might be an affected version
[13:59:51] <urandom>	 that would be my guess here, that compaction had a bunch of working files that hadn't been freed, and as soon as you restarted, they all went away
[14:02:06] <urandom>	 the way to tell, would be to look at the *-tmp-* files in the column family directories before and after a restart
[14:02:24] <urandom>	 though w/ 2.1.13 you might not see this anymore
[14:03:48] * elukey nodes
[14:03:51] <elukey>	 *nods
[14:06:04] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2312641 (eranroz) Eranbot uses only Turnitin. Based on the last days it does around 1400...
[14:11:13] <urandom>	 elukey: notice how disk space changed, but sstable disk load did not? https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system
[14:11:46] <urandom>	 sstable disk load is just the tally of all the active sstable file sizes
[14:11:58] <urandom>	 but it does not count those *-tmp-* files
[14:11:59] <elukey>	 ahhhh nice!
[14:12:13] <elukey>	 they reached the same values!
[14:12:19] <urandom>	 so yeah, it freed up a bunch of leaked temp files on restart
[14:12:20] <urandom>	 yeah
[14:12:46] <elukey>	 \o/
[14:13:10] <elukey>	 something like 1TB per host
[14:13:11] <elukey>	 woa
[14:13:51] <urandom>	 maybe it's been a while since your last restart
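The before and after comparison of `*-tmp-*` files suggested above could be scripted roughly as follows. This is a sketch under assumptions: the data directory path is the stock Cassandra default and may differ on these hosts, and `tmp_file_bytes` is a hypothetical helper.

```python
import os

# Assumption: stock default location; the real hosts may use another path.
DEFAULT_DATA_DIR = "/var/lib/cassandra/data"


def tmp_file_bytes(data_dir):
    """Sum the sizes of files whose name contains '-tmp-' under data_dir."""
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if "-tmp-" not in name:
                continue
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Compaction may delete a file between listing and stat.
                pass
    return total


if __name__ == "__main__":
    print(f"{tmp_file_bytes(DEFAULT_DATA_DIR) / 2**30:.1f} GiB in -tmp- files")
```

Running it shortly before and shortly after a restart and comparing the two totals would show whether leaked temporary files account for the freed space.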
[14:22:11] <elukey>	 one more thing to keep an eye on, need to familiarize more with the cassandra dashboards
[14:40:01] <wikibugs>	 Analytics-Kanban, Patch-For-Review: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2312786 (elukey) Verified that we have Ganglia/Graphite metrics for the new hosts (we also have metrics for every cassandra instance).  @MoritzMuehlenhoff created https://gerrit.wik...
[15:19:17] <grrrit-wm>	 (PS6) Nuria: Initial content of analytics.wikimedia.org [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/289062 (https://phabricator.wikimedia.org/T134506)
[15:20:04] <grrrit-wm>	 (CR) Nuria: Initial content of analytics.wikimedia.org (2 comments) [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/289062 (https://phabricator.wikimedia.org/T134506) (owner: Nuria)
[15:26:06] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312919 (Jdlrobson) This all sounds great and I love that its generic and can be reused again!  A few clarifications - if I'm understanding correctly experiments would be conf...
[15:33:05] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312939 (dr0ptp4kt) Quick question: how does this guarantee bucketing across browser restart?
[15:34:22] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312967 (Nuria) Non session cookies are kept after browser restarts, with an expiration set of 30 days (like last access cookie) the cookie is available.
[15:35:42] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312971 (dr0ptp4kt) I should note in the concrete cases: persistence across browser restart is probably not as important for lazy loaded images, whereas persistence across bro...
[15:36:43] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312972 (dr0ptp4kt) @nuria, should we add that to the Description as acceptance criteria?
[15:36:56] <grrrit-wm>	 (CR) Mforns: [C: 2 V: 2] "LGTM!" [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/289062 (https://phabricator.wikimedia.org/T134506) (owner: Nuria)
[15:37:23] <wikibugs>	 Analytics-Kanban: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2312977 (Nuria) @GWicke: >If you would like us to set up & deploy a config for you, then I think we can do that after setting one up for the regular RESTBase install. The puppet work for >that is likely to happ...
[15:37:55] <wikibugs>	 Analytics-Cluster, Analytics-Kanban: Configure Spark YARN Dynamic Resource Allocation - https://phabricator.wikimedia.org/T101343#2312980 (elukey)
[15:39:12] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312984 (BBlack) >>! In T135762#2312919, @Jdlrobson wrote: > A few clarifications - if I'm understanding correctly experiments would be configured in puppet?  That would be th...
[15:39:38] <nuria_>	 ottomata: Would you be so kind as to take a look at this puppet change, let me know if it is totally off and i can fix it: https://gerrit.wikimedia.org/r/#/c/289676/
[15:43:57] <elukey>	 nuria_: hola! Looks good from https://puppet-compiler.wmflabs.org/2860/, the only thing that we'd need to double check are permissions imho
[15:44:21] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2312986 (BBlack) >>! In T135762#2312939, @dr0ptp4kt wrote: > Quick question: how does this guarantee bucketing across browser restart?  >>! In T135762#2312967, @Nuria wrote: >...
[15:44:37] <nuria_>	 elukey: WOW, i forgot that EXISTED
[15:45:37] <elukey>	 it saves the day a lot :P
[15:47:16] <ottomata>	 looks good nuria_, i don't remember, you should check if/how git::clone manages the $directory
[15:47:27] <ottomata>	 if it also declares a file resource for it, those will conflict
[15:47:30] <ottomata>	 i think there might be a setting
[15:47:56] <ottomata>	 also, if it doesn't manage it (which is fine), you should move the file { $document_root ... above the git::clone
[15:47:59] <ottomata>	 and add to git::clone
[15:48:04] <ottomata>	 require => File[$document_root]
[15:48:22] <ottomata>	 (i think...)
[15:48:31] <elukey>	 good point
[15:48:39] <ottomata>	 :)
[15:48:45] <ottomata>	 milimetric: AHHHHH!  even in local mode, druid is trying to save deep storage in hdfs
[15:48:46] <ottomata>	 GAHHH
[15:51:52] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313004 (Nuria) >Since the binning is done independently of actual experiments (the binning is live all the time for all cookie-enabled agents), this actually is a problem, I...
[16:00:12] <nuria_>	 a-team: standduppp
[16:01:31] <nuria_>	 ottomata: you start!, can you hear?
[16:12:59] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313016 (BBlack) Right, but if the user deletes cookies or goes incognito, that's probably a rare event for most, and possibly associated with not re-using browser cache acros...
[16:15:34] <wikibugs>	 Analytics, ContentTranslation-Analytics, MediaWiki-extensions-ContentTranslation, Operations, Ops-Access-Requests: Add kartik to analytics-privatedata-users group - https://phabricator.wikimedia.org/T135704#2307853 (madhuvishy) Noting that analytics-privatedata-users also gives Hadoop access...
[16:15:41] <wikibugs_>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313020 (Krinkle) I think 1-100 might be a bit small. Especially considering our scale and considering most of our experiments will not have been load tested very much.  For i...
[16:17:17] <elukey>	 ottomata: I *think* I might need to add a jar to all the hadoop nodes for the spark dynamic resource allocation, so I'll need some introduction from you about archiva I guess (whenever you have time next week)
[16:17:25] <elukey>	 beers will be offered in Berlin I promise
[16:18:34] <ottomata>	 oook! :) a jar that doesn't come in  a cdh package?
[16:19:28] <elukey>	 might be already in there, did check only super quickly on one node, buuut the introduction would be useful anyway :P
[16:19:52] <elukey>	 I mean, I can find
[16:19:53] <elukey>	 elukey@analytics1048:/usr/lib/hadoop$ find | grep shuffle
[16:19:53] <elukey>	 ./client/hadoop-mapreduce-client-shuffle.jar
[16:19:54] <elukey>	 ./client/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.5.2.jar
[16:20:01] <elukey>	 but I need the spark shuffle
[16:20:22] <madhuvishy>	 ottomata: what is the difference between statistics-users and researchers?
[16:20:27] <elukey>	 elukey@analytics1048:/usr/lib$ find | grep shuffle
[16:20:31] <elukey>	 ./hadoop-yarn/lib/spark-yarn-shuffle.jar
[16:20:38] <elukey>	 guess that I won't need it :P
[16:20:54] <ottomata>	 madhuvishy:  both give access to stat003, researchers allows access to the mysql conf file with the researchers password
[16:21:09] <ottomata>	 https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups
[16:21:53] <madhuvishy>	 ottomata: ah cool
[16:23:38] <ottomata>	 elukey:  yay!  that's in hadoop-yarn?  interesting..
[16:25:43] <wikibugs>	 Analytics-Cluster, Analytics-Kanban: Configure Spark YARN Dynamic Resource Allocation - https://phabricator.wikimedia.org/T101343#2313079 (elukey) Looks like the big pre-requisite is already met:    > Locate the spark-<version>-yarn-shuffle.jar.    ``` elukey@analytics1048:/usr/lib$ find | grep shuffle ....
[16:26:31] <elukey>	 ottomata: yeah it is the shuffle service for Yarn, atm is the only one available.. There will be another one for mesos in the future probably
[16:26:44] <elukey>	 anyhow, it looks that this task should be resolved with a puppet change!
[16:27:05] <elukey>	 whenever you have time though let's chat about archiva, super interested
[16:27:21] <ottomata>	 anytime elukey, whatcha wanna know?
[16:27:26] <wikibugs_>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313080 (BBlack) >>! In T135762#2313020, @Krinkle wrote: > [...] > An experiment could start at 1 bucket (0.01%) and work its way up to 10 (0.1%). And if the experiment no lon...
[16:27:46] <elukey>	 ottomata: general intro, where we use it, how to upload a jar, etc..
[16:27:58] <elukey>	 I will read the docs before of course :)
[16:28:13] <elukey>	 very ignorant about archiva/maven/etc..
[16:29:12] <ottomata>	 ja elukey https://wikitech.wikimedia.org/wiki/Archiva is good place to start
[16:31:10] <elukey>	 yep thanks!
[16:31:17] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313083 (Nuria) >For comparison, our entire Navigation Timing data used to be based on 0.01% sampling. It is now tuned up to 0.1% (1:1000 sample; >$wgNavigationTimingSamplingF...
[16:31:29] <elukey>	 going offline a-team, have a good weekend!
[16:31:45] <mforns>	 nice weekend elukey!
[16:32:53] <nuria_>	 elukey: ciao
[16:38:31] <wikibugs>	 Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2313093 (BBlack) >>! In T135762#2313083, @Nuria wrote:. > I know you know this but just clarifying that we do not have these restrictions here though,  the restrictions come f...
[18:27:03] <ottomata>	 nuria_: https://analytics.wikimedia.org/
[18:27:19] <nuria_>	 ottomata: OOOOHHHH
[18:27:53] <nuria_>	 ottomata: well, we will wait for milimetric to be back and review dashiki changes before announcing it and removing the labs domain for browsers cc mforns_gym
[18:27:59] <nuria_>	 ottomata: thanks for all your help
[18:28:21] <nuria_>	 and reminder to self: i need to add a favicon
[18:29:47] <ottomata>	 :)
[18:29:56] <ottomata>	 also noticed the github link at the bottom just points to github.com/wikimedia
[18:29:57] <ottomata>	 not the repo
[19:02:26] <nuria_>	 ottomata: right, could not find repo cause i think it is created once it has content
[19:02:32] <nuria_>	 ottomata: will fix those two
[19:15:03] <mforns>	 right nuria_, cool
[19:57:13] <nuria_>	 the fact that all my typos are archived for ever kills me
[20:02:26] <grrrit-wm>	 (PS1) Nuria: Adding favicon and correcting link to github depot [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/289913 (https://phabricator.wikimedia.org/T134506)
[20:30:07] <wikibugs>	 Analytics-Kanban: Enable rate limiting on pageview api - https://phabricator.wikimedia.org/T135240#2313750 (Nuria) >Ahem.. i though the rate limiting for the whole api was ready to be used, is there any puppet needed? answering my own question seems that if the distributed hash table that will store the thro...
[20:54:04] <mforns>	 bye team! have a nice weekend!
[20:59:41] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2313903 (DannyH) @eranroz Oh, good -- so we don't have to worry about Eranbot, at least....
[22:06:26] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2314178 (kaldari) WMF Legal is reviewing the Google API terms of service.
[23:04:36] <dapatrick>	 It's late in the day, so I totally don't expect anyone to be around to help with this, but I'd like to query for requests from May 19th originating from a range of IP addresses. I need to examine the actual request (method, path, query), rather than count the number of requests made.
[23:05:39] <dapatrick>	 Adam pointed me to https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest and said that I should ask here because "it seems to be really important to restrict to only a few partitions."
[23:07:32] <dapatrick>	 Which I'm interpreting to mean that my search should be as narrow as possible, correct?
[23:16:41] <bd808>	 dapatrick: the webrequest data is big (like really big). Works best if you can narrow down to an hour, but it may be possible to get what you want for a whole day without blowing out of memory
[23:17:03] <bd808>	 Do you have access to stat1002 or another hive node?
[23:40:21] <dapatrick>	 bd808: Yep, I have access to stat1002. I'm logged in now, and am just reading Analytics wiki pages before I fire off my query.
[23:41:01] <bd808>	 cool. the worst thing that will happen is that it takes a long time and then crashes with an OOM
[23:42:06] <dapatrick>	 The client crashes, or some daemon that I lack adequate permission to start?
[23:42:16] <dapatrick>	 s/start/restart/
[23:42:37] <bd808>	 the job itself on the hadoop grid. nothing damaging to the world
[23:42:55] <bd808>	 "you can only hurt yourself" :)
[23:43:56] <dapatrick>	 Got it. Okay, I feel empowered.
[23:43:57] <dapatrick>	 Thanks.
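The advice to restrict the query to a few partitions could look like the sketch below, which only builds the HiveQL text. Everything named in it is an assumption to be checked against the Webrequest wiki page linked above: the table name `wmf.webrequest`, the partition columns `year`/`month`/`day`/`hour`, and the columns `http_method`, `uri_path`, `uri_query` and `ip`. The `build_webrequest_query` helper and the IP prefix are hypothetical.

```python
import re


def build_webrequest_query(year, month, day, hour, ip_prefix, limit=100):
    """Build a HiveQL string limited to one hourly partition and one IP prefix."""
    # Accept only digits and dots so the prefix cannot alter the query.
    if not re.fullmatch(r"[0-9.]+", ip_prefix):
        raise ValueError("ip_prefix must contain only digits and dots")
    return (
        "SELECT http_method, uri_path, uri_query\n"
        "FROM wmf.webrequest\n"  # assumed table name
        f"WHERE year = {int(year)} AND month = {int(month)}\n"
        f"  AND day = {int(day)} AND hour = {int(hour)}\n"  # partition pruning
        f"  AND ip LIKE '{ip_prefix}%'\n"  # assumed column name
        f"LIMIT {int(limit)};"
    )


if __name__ == "__main__":
    # Hypothetical prefix from the documentation address range.
    print(build_webrequest_query(2016, 5, 19, 0, "192.0.2."))
```

Starting with a single hour and widening only once that succeeds keeps a failed job cheap, in line with the suggestion to narrow down to an hour.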