[01:54:32] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992028 (10Dzahn)
[01:54:46] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2991994 (10bd808) +1 from me. Getting @MusikAnimal access to both Hadoop and the DB replicas will help with many projects in #c...
[01:55:07] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992033 (10MusikAnimal)
[02:40:06] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992076 (10Milimetric) +1 @MusikAnimal should have this access.  But I think you need a +1 from Danny too.  cc @DannyH
[03:31:19] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992146 (10bd808) Actually he needs @Kaldari to sign off as his manager.
[05:47:24] <wikibugs>	 10Analytics: Add namespace ID to webrequest and pageview_hourly - https://phabricator.wikimedia.org/T156993#2992233 (10Tbayer)
[06:45:21] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992309 (10kaldari) +1 from me.
[07:07:16] <wikibugs>	 10Analytics, 10DBA: Drop m3 from dbstore servers - https://phabricator.wikimedia.org/T156758#2992314 (10Marostegui) m3 has been removed from dbstore2001 as per: T156905#2991826
[07:24:10] <elukey>	 o/
[07:24:26] <elukey>	 running errand for a couple of hours, will be back at ~10:30 CEST 
[07:24:57] <elukey>	 aqs1008-a seems good, even though it still needs hours to complete :(
[07:25:25] <elukey>	 hopefully today I'll be able to bootstrap aqs1008-b
[07:27:40] <wikibugs>	 10Analytics, 10Pageviews-API: Pageview API: Better filtering of bot traffic on top enpoints - https://phabricator.wikimedia.org/T123442#2992321 (10mahmoud) Meant to post this earlier, but great work @MusikAnimal! I'm eager to see this codified into some sort of anti-spam correction, but I'm concerned by articl...
[07:36:56] <wikibugs>	 10Analytics, 10DBA, 06Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#2992331 (10Marostegui) In addition to what @jcrespo mentioned, in general, we are not completely happy if we decommission servers before running pt-table-checksum (T154485) wh...
[09:42:16] <elukey>	 joal: https://dataworkssummit.com/munich-2017/agenda/
[09:42:17] <elukey>	 :D
[09:43:49] <elukey>	 it costs a lot
[09:51:17] <joal>	 Hi elukey :)
[10:13:42] <joal>	 !log Killed-Restarted last access uniques monthly jobs to pick up new config -0097552-161121120201437-oozie-oozi-C
[10:13:43] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:14:15] <joal>	 HaeB: This setting is indeed usefull for any job that would use a big chunk of resources
[10:14:55] <joal>	 HaeB: Some more on that: When using a lot of resources, there are chances it will involve some resource competition
[10:15:39] <joal>	 And, there is no point to compete for resources that are not core to the criticial path (mappers BEFORE reducers)
[10:17:09] <joal>	 So while it is interesting to early-start some reducers when resource competition is small, on the contrary it is counter-productive when resource competition is high
[10:28:50] <icinga-wm>	 PROBLEM - Hadoop HistoryServer on analytics1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
[10:29:01] <joal>	 mwarf :(
[10:29:05] <joal>	 elukey: around for --^
[10:29:07] <joal>	 ?
[10:32:03] <moritzm>	 oom with java heap space
[10:32:25] <joal>	 moritzm: Thanks - I guess you've restarted it?
[10:33:01] <moritzm>	 I'm currenzly trying to figure out which service that is :-)
[10:33:35] <moritzm>	 is it /etc/init.d/hadoop-mapreduce-historyserver?
[10:33:46] <joal>	 It's bizzarre: We never had issues with history server before, and since a few weeks, problems have started
[10:33:54] <joal>	 moritzm: I think it is
[10:34:04] <moritzm>	 i had been looking at /etc/init before since I didn't expect a sysvinit script
[10:34:07] <joal>	 moritzm: This service is not critical, so we can wasit for elukey 
[10:34:08] <elukey>	 moritzm: working on it
[10:34:15] <elukey>	 just seen the alarm
[10:34:36] <moritzm>	 already started it a second ago
[10:34:50] <icinga-wm>	 RECOVERY - Hadoop HistoryServer on analytics1001 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
[10:35:11] <elukey>	 ok double restart :P
[10:35:16] <elukey>	 thanks
[10:35:26] <moritzm>	 so if one crashes, we're still good!
[10:35:43] <elukey>	 well there is only one of them! :P
[10:36:03] <elukey>	 but it is not critical.. we are trying to add jmx metrics to know what's happening
[10:36:12] <elukey>	 never showed an issue up to now
[10:36:13] <elukey>	 sigh
[10:36:15] <moritzm>	 backtrace is here: https://phabricator.wikimedia.org/P4868
[10:37:14] <elukey>	 thanks!
[10:42:10] <joal>	 thanks elukey and moritzm !
[10:55:42] <jand_wmde>	 hi, could one briefly point me to the location of the jupyter notebook instance we run? I can’t find the link anymore, got lost in cyberspace. 
[10:56:16] <jand_wmde>	 ahrg. Sorry, NOW I found it…
[11:30:47] <wikibugs>	 (03PS1) 10Joal: Rollback temporary variable in oozie workflow [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335628 (https://phabricator.wikimedia.org/T156668)
[11:31:08] <wikibugs>	 (03CR) 10Joal: [V: 032 C: 032] "Self merging bug correction." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335628 (https://phabricator.wikimedia.org/T156668) (owner: 10Joal)
[11:32:57] <joal>	 mforns: I'm sorry I lead you in a wrong direction :(
[11:33:21] <joal>	 mforns: I tested again the temporary variable in workflow, and it doesn't work (weirdly though)
[11:36:03] <joal>	 elukey: sorry, I broke scap :(
[11:36:14] <joal>	 elukey: stat1002 disk full :(
[11:36:24] <joal>	 elukey: I didn't check before deploying
[11:37:28] <elukey>	 joal: ah the space!
[11:38:32] * joal is having a tough beginning of day :(
[11:38:49] <elukey>	 joal: just freed some space with apt-get clean, now we have 1GB.. I am in the middle of a complex thing with memcached, will brb in 20 mins ok?
[11:39:23] <joal>	 elukey: I'll wait for you before deploying - please finish what you're doing
[11:49:56] <elukey>	 joal: here I am
[11:50:07] <joal>	 Hey elukey - I'm sorry for breaking stuff ...
[11:50:30] <joal>	 elukey: I noticed a patch I submited yesterday was actually not working, so wanted to correct it
[11:50:35] <elukey>	 you didn't break anything! it is git-fat that does this weirdness (and its friend scap)
[11:50:53] <joal>	 I know why, I just should have checked
[11:51:03] <joal>	 elukey: should I try to deploy again ?
[11:51:08] <elukey>	 joal: did you try to deploy and then hit the limit?
[11:51:09] <elukey>	 ah okok
[11:51:14] <elukey>	 so let me see if we can free space
[11:51:20] <joal>	 sure
[11:54:22] <wikibugs>	 (03CR) 10Joal: [C: 04-1] "I noticed a non-working pattern. See inline." (032 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141) (owner: 10Mforns)
[11:57:47] <elukey>	 joal: you are good to deploy
[11:58:16] <joal>	 elukey: ok trying (please stay close)
[11:59:56] <joal>	 Yay elukey, seemed to work !
[12:00:23] <joal>	 Thanks a lot elukey 
[12:00:34] <elukey>	 :)
[12:01:07] <elukey>	 joal: going afk for lunch, all good?
[12:01:28] <joal>	 all good elukey, deploying and restarting jobs
[12:01:50] <elukey>	 goooood
[12:01:55] <elukey>	 brb in 30 mins
[12:01:55] <wikibugs>	 (03PS3) 10Nschaaf: (in progress) Store sanitized data for WDQS [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335211 (https://phabricator.wikimedia.org/T146915)
[12:03:25] <wikibugs>	 (03CR) 10Nschaaf: [C: 04-1] "The implementation is still pending on the discussion in the phab task, but I removed the drop script in favor of using refinery-drop-hour" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335211 (https://phabricator.wikimedia.org/T146915) (owner: 10Nschaaf)
[12:03:46] <joal>	 !log Deployed refinery to correct bug introduced in https://gerrit.wikimedia.org/r/#/c/335067/
[12:03:48] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:07:07] <joal>	 !log Restarted daily and monthly pageview druid loading jobs
[12:07:08] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:27:01] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2992913 (10JAllemandou) @jcrespo / @Marostegui : Questions for you guys: - On the list of projets listed above that are not present in labsd-analytics (checked this morning, list is...
[12:27:07] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2992915 (10Milimetric) Sorry, should've looked that up
[12:50:12] <wikibugs>	 (03PS1) 10Joal: Update pageview definition to remove previews [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/335639 (https://phabricator.wikimedia.org/T156628)
[12:50:29] <mforns>	 hi team
[12:50:36] <joal>	 Hi mforns 
[12:52:16] <wikibugs>	 (03PS4) 10Joal: Update sqoop script with labsdb specificity [analytics/refinery] - 10https://gerrit.wikimedia.org/r/334042 (https://phabricator.wikimedia.org/T155658)
[12:53:01] <joal>	 mforns: I'm sorry about the wrong idea I had for oozie config
[12:53:54] <mforns>	 joal, reading it right now... really strange!
[12:54:01] <joal>	 indeed
[12:54:23] <mforns>	 joal, but wait... the daily job worked fine
[12:54:37] <joal>	 mforns: That's really bizzare :(
[12:54:44] <joal>	 mforns: it failed me at delete stage
[12:54:52] <mforns>	 hmmmm
[12:55:10] <joal>	 cd ..
[12:55:13] <joal>	 oops
[12:55:19] <mforns>	 xD
[12:55:32] * joal is in parent directory
[12:56:11] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993000 (10Marostegui) Hello!!  >>! In T155658#2992913, @JAllemandou wrote: > @jcrespo / @Marostegui : Questions for you guys: > - On the list of projets listed above that are not p...
[12:57:22] <mforns>	 joal, ok, will un-factor out the path
[12:57:40] <joal>	 mforns: sorry :(
[12:58:01] <mforns>	 joal, not at all, if I had seen it, I would have done the same :]
[12:58:45] <joal>	 mforns: I actually broke the druid pageview druid loading job with the same patch ...
[12:59:24] <mforns>	 aha
[13:01:50] <wikibugs>	 (03PS1) 10Joal: Update jar/version in refinery and pageview jobs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335642 (https://phabricator.wikimedia.org/T156628)
[13:11:15] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993035 (10JAllemandou) >>>! In T155658#2992913, @JAllemandou wrote: >> @jcrespo / @Marostegui : Questions for you guys: >> - On the list of projets listed above that are not presen...
[13:23:01] <wikibugs>	 (03PS11) 10Mforns: Add banner activity oozie jobs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141)
[13:28:23] <joal>	 taking a break a-team - later
[13:28:33] <mforns>	 see ya in a bit ;]
[13:38:29] <wikibugs>	 (03PS12) 10Mforns: Add banner activity oozie jobs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141)
[13:50:18] <wikibugs>	 (03CR) 10Mforns: [C: 031] Update jar/version in refinery and pageview jobs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335642 (https://phabricator.wikimedia.org/T156628) (owner: 10Joal)
[13:52:42] <wikibugs>	 (03CR) 10Mforns: [C: 031] Update pageview definition to remove previews [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/335639 (https://phabricator.wikimedia.org/T156628) (owner: 10Joal)
[13:56:43] <wikibugs>	 06Analytics-Kanban: Document the difference in aggregate data on wikistats and wikistats 2.0 - https://phabricator.wikimedia.org/T150963#2993127 (10Elitre) OK, I'll try and book a slot with you. :)
[14:38:21] <elukey>	 joal: https://github.com/Netflix/dynomite is very interesting
[14:38:39] <elukey>	 (not for us but in general)
[14:44:50] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993299 (10Marostegui) >>! In T155658#2993035, @JAllemandou wrote: >>>>! In T155658#2992913, @JAllemandou wrote: >>> @jcrespo / @Marostegui : Questions for you guys: >>> - On the li...
[15:07:59] <wikibugs>	 10Analytics: CDH upgrade. Value proposition: new spark for edit reconstruction - https://phabricator.wikimedia.org/T152714#2993432 (10Ottomata) Previous CDH 5.5 upgrade task: T119646  etherpad process for that upgrade: https://etherpad.wikimedia.org/p/analytics-cdh5.5
[15:12:01] <halfak>	 Hey folks.  I'm running into a problem with large files in a repo for some of my deployments.  Are any of ya'll using git-fat or anything like that?
[15:13:14] <ottomata>	 yeah we use git-at
[15:13:15] <ottomata>	 git-fat
[15:13:18] <elukey>	 yep we use it with scap for the refinery
[15:13:36] <elukey>	 ottomata: o/
[15:14:13] <ottomata>	 hiii
[15:14:19] <ottomata>	 halfak: https://wikitech.wikimedia.org/wiki/Archiva#Deploy_artifacts_using_scap3
[15:14:28] <ottomata>	 those instructions are for archiva and jars though
[15:14:30] <ottomata>	 but mostly should work
[15:14:40] <halfak>	 OK.  Thanks.  Will have a look
[15:14:44] <ottomata>	 i think people only use it for archiav
[15:14:46] <ottomata>	 hm
[15:15:04] <ottomata>	 so remote = archiva.wikimedia.org::archiva/git-fat
[15:15:17] <ottomata>	 should work for you, and i don't see any reason why, (other than the name) that you shouldn't use it
[15:15:23] <ottomata>	 so, if you want  something that just works, I think it would be fine
[15:15:28] <ottomata>	 maybe just add a note to that wiki page that you are using it
[15:16:05] <ottomata>	 hmm, but can you push?  hm.
[15:16:10] <ottomata>	 not certain.
[15:16:28] <ottomata>	 halfak:  we may need to talk to releng to set up a dedicate git fat remote store somewhere.  its easy to do, since its just rsync
[15:16:30] <ottomata>	 but ya hm
[15:16:50] <halfak>	 Oh.  So no versioning. 
[15:17:10] <ottomata>	 oh yeah there's versinoing
[15:17:17] <ottomata>	 but by sha
[15:17:22] <ottomata>	 there's no diffing or anything
[15:17:47] <ottomata>	 each time you change the file it will upload a whole new one
[15:24:38] <wikibugs>	 (03CR) 10Mforns: [C: 031] "Awesome patience and endurance working on this patch!" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/325312 (https://phabricator.wikimedia.org/T141548) (owner: 10Joal)
[15:40:22] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993556 (10JAllemandou) >>! In T155658#2993299, @Marostegui wrote: >>>! In T155658#2993035, @JAllemandou wrote: >>>>>! In T155658#2992913, @JAllemandou wrote: >>>> @jcrespo / @Maros...
[15:42:34] <wikibugs>	 (03CR) 10Nuria: "Yes, please , let's make sure we dry run even small changes like this one." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335628 (https://phabricator.wikimedia.org/T156668) (owner: 10Joal)
[15:43:35] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Pageview Jobs: Make workflows easier to maintain using a variable instead of repeating some complex value accross the files - https://phabricator.wikimedia.org/T156668#2993572 (10Nuria) 05Resolved>03Open
[15:46:30] <joal>	 hi nuria - I actually deployed it after patching, in order to restart failing jobs (T156668)
[15:46:31] <stashbot>	 T156668: Pageview Jobs: Make workflows easier to maintain using a variable instead of repeating some complex value accross the files - https://phabricator.wikimedia.org/T156668
[15:47:09] <nuria>	 joal: thank you
[15:47:22] <joal>	 nuria: well, I created the mess, better solve it !
[15:49:50] <halfak>	 ottomata, is git-fat versioned?  
[15:50:23] <ottomata>	 halfak:  yes?  
[15:50:26] <ottomata>	 in that
[15:50:36] <ottomata>	 each time you chagne a file, that file has a new sha
[15:50:37] <ottomata>	 hash
[15:50:49] <ottomata>	 and that is what the file is uploaded to the remote
[15:50:51] <ottomata>	 so
[15:50:56] <halfak>	 Is it associated with version of your repo?
[15:50:59] <ottomata>	 yeah
[15:50:59] <ottomata>	 so
[15:51:02] <ottomata>	 if you change a file
[15:51:04] <ottomata>	 and do git add
[15:51:14] <ottomata>	 it will not do a real git add
[15:51:20] <halfak>	 Gotcha.  This is good. 
[15:51:34] <ottomata>	 it will add a little text file with a description of the file, its size, and its sha
[15:51:40] <ottomata>	 that will be added to git
[15:51:58] <ottomata>	 the real file will be pushed to the remote with the filename changed to the sha
[15:52:04] <ottomata>	 then when you git pull elsewhere
[15:52:07] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993591 (10jcrespo) > testing how much load it puts on the machines  I would block that until we have gtid on the slaves. If for some reason the slaves crash before gtid_domain_id i...
[15:52:10] <halfak>	 OK this makes sense. 
[15:52:20] <ottomata>	 git fat will read the actual file in git, and find the filename-sha in the rsync remote
[15:52:23] <ottomata>	 and then rsync it down
[15:52:47] <ottomata>	 so, each git commit just changes the little text file, and that is versioned like normal in git
[15:52:57] <ottomata>	 so if you check out a previous commit, you'll get a different text file
[15:53:03] <ottomata>	 then git fat pull will pull down the file you are supposed to have
[15:53:59] <halfak>	 How do you push the artifacts to archiva?
[15:54:07] <halfak>	 o/ Amir1 
[15:55:36] <halfak>	 It looks like there's a web interface :S
[15:58:46] <ottomata>	 halfak:  yeah that's for archiva, 
[15:58:50] <ottomata>	 if you were just pushing your own files
[15:59:13] <ottomata>	 git fat push
[15:59:19] <ottomata>	 but you'd need an rsync remote you could rsync to
[15:59:29] <ottomata>	 i'm not sure archiva's will let you do that
[15:59:39] <ottomata>	 so that's the part we might need to talk to releng about
[15:59:44] <ottomata>	 not sure if there is already one out there
[16:01:23] <ottomata>	 ah!
[16:01:55] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993643 (10JAllemandou) Updated list of wikis not yet imported onto new servers, and where they'll be:  | fawiki | s2 | | frwiktionary | s2 | | nlwiki | s2 | | nowiki | s2 | | ptwik...
[16:02:25] <wikibugs>	 10Analytics, 10EventBus, 10Reading-Web-Trending-Service, 13Patch-For-Review, and 2 others: Compute the trending articles over a period of 24h rather than 1h - https://phabricator.wikimedia.org/T156411#2993644 (10Fjalapeno) I'm ok with integers. The ones we use will be limited in practice and I wouldn't exp...
[16:05:03] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993660 (10Marostegui) >>! In T155658#2993591, @jcrespo wrote: >> testing how much load it puts on the machines >  > I would block that until we have gtid on the slaves. If for some...
[16:05:46] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993678 (10Marostegui) This is the related task: T149418
[16:12:04] <moritzm>	 elukey: I upgraded firejail on thorium, ok to restart pivot?
[16:13:18] <elukey>	 yep!
[16:15:10] <moritzm>	 ok, done
[16:15:46] <wikibugs>	 10Analytics, 10EventBus, 10Reading-Web-Trending-Service, 13Patch-For-Review, and 2 others: Compute the trending articles over a period of 24h rather than 1h - https://phabricator.wikimedia.org/T156411#2993685 (10mobrovac) >>! In T156411#2993644, @Fjalapeno wrote: > I'm ok with integers. The ones we use wil...
[16:21:04] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993716 (10jcrespo) There are no plans to import labswiki or labstest wiki, those are special wikis that are not part of the main cluster.  No plans does not mean it will never happ...
[16:27:40] <wikibugs>	 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2993751 (10JAllemandou) Thanks a lot again @Marostegui  and @jcrespo for your answers. I understand the GTID thing (at least how it can impact), and I'm completely happy to wait unt...
[16:30:21] <wikibugs>	 (03CR) 10Joal: [C: 031] "LGTM again mforns :) Let me know when you want to merge." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141) (owner: 10Mforns)
[16:31:53] <nuria>	 fdans: we are in batcave
[16:37:16] <wikibugs>	 10Analytics: CDH upgrade. Value proposition: new spark for edit reconstruction - https://phabricator.wikimedia.org/T152714#2993803 (10Ottomata) Debian Jessie vs CDH upgrade plan # Upgrade whole cluster to CDH 5.10 as is. # Get new Hadoop nodes, install those as Debian Jessie with CDH 5.10 # Incrementally reinsta...
[16:40:35] <wikibugs>	 10Analytics: CDH upgrade. Value proposition: new spark for edit reconstruction - https://phabricator.wikimedia.org/T152714#2993808 (10Nuria) Testing steps include loading data on labs & upgrade & testing refinery jobs before starting cluster migration
[16:52:51] <wikibugs>	 10Analytics: CDH upgrade. Value proposition: new spark for edit reconstruction - https://phabricator.wikimedia.org/T152714#2993836 (10Milimetric) p:05Triage>03Normal a:03Ottomata
[16:53:01] <wikibugs>	 06Analytics-Kanban: CDH upgrade. Value proposition: new spark for edit reconstruction - https://phabricator.wikimedia.org/T152714#2857818 (10Milimetric)
[17:00:02] <wikibugs>	 10Analytics: Create purging script for analytics-slave data - https://phabricator.wikimedia.org/T156933#2993865 (10Nuria) Whitelist of fields/tables that should not be deleted https://gerrit.wikimedia.org/r/#/c/298721/
[17:05:30] <wikibugs>	 10Analytics: Create purging script for analytics-slave data - https://phabricator.wikimedia.org/T156933#2993871 (10Milimetric) a:05jcrespo>03None
[17:08:20] <wikibugs>	 10Analytics: Add namespace ID to webrequest and pageview_hourly - https://phabricator.wikimedia.org/T156993#2992233 (10Nuria) Should already be there at least for some pages? https://wikitech.wikimedia.org/wiki/X-Analytics
[17:08:28] <wikibugs>	 10Analytics: Add namespace ID to webrequest and pageview_hourly - https://phabricator.wikimedia.org/T156993#2992233 (10Milimetric) Thought during grooming: this might already be done by Ori but is not working.
[17:09:58] <wikibugs>	 10Analytics, 10Analytics-Cluster, 06Operations, 10Ops-Access-Requests: Requesting access to analytics-privatedata-users for musikanimal - https://phabricator.wikimedia.org/T156986#2991994 (10Nuria) Approved on my end.
[17:11:16] <wikibugs>	 10Analytics: Add namespace ID to webrequest and pageview_hourly - https://phabricator.wikimedia.org/T156993#2992233 (10Ottomata) Ja, it should exist, see:  https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/d08504b26b943c2cd6da85e3b1eded89f8a9b056/WikimediaEventsHooks.php#L25-L57
[17:14:39] <wikibugs>	 10Analytics: Remove user_agent_map from pageview_hourly long term - https://phabricator.wikimedia.org/T156965#2991303 (10Milimetric) Currently the pipeline is:  webrequest (60 days) -> pageview_hourly (indefinite)  Proposal is:  webrequest (60 days) -> pageview_hourly (90 days) -> pageview_hourly_sanitized (inde...
[17:14:52] <wikibugs>	 10Analytics: Remove user_agent_map from pageview_hourly long term - https://phabricator.wikimedia.org/T156965#2993930 (10Milimetric) p:05Triage>03Normal
[17:21:00] <wikibugs>	 10Analytics, 10DBA: Drop m3 from dbstore servers - https://phabricator.wikimedia.org/T156758#2985275 (10Milimetric) We don't know what m3 is.  So far we don't know of a use for Phabricator databases on the analytics slaves.
[17:24:07] <wikibugs>	 10Analytics, 10DBA: Json_extract available on analytics-store.eqiad.wmnet - https://phabricator.wikimedia.org/T156681#2983518 (10Milimetric) It was the first time we tried to use it, it would be very useful if we could get it to work though.
[17:29:54] <wikibugs>	 10Analytics: Utility that creates pageview dumps should escape new lines - https://phabricator.wikimedia.org/T156656#2982769 (10Milimetric) See parent task and see if there's anything to change on the pageview definition (but not fixing mediawiki's problem of returning 200s for malformed requests).
[17:30:37] <wikibugs>	 10Analytics: Review parent task for any potential pageview definition improvements - https://phabricator.wikimedia.org/T156656#2982769 (10Milimetric)
[17:36:14] <wikibugs>	 10Analytics: Add namespace ID to webrequest and pageview_hourly - https://phabricator.wikimedia.org/T156993#2992233 (10JAllemandou) This field is already populated in webrequest: x_analytics_map['ns'] (see https://wikitech.wikimedia.org/wiki/X-Analytics).  This task is then about adding the field to pageview_hou...
[17:41:10] <wikibugs>	 10Analytics: Add namespace ID to pageview_hourly - https://phabricator.wikimedia.org/T156993#2994086 (10JAllemandou)
[17:49:30] <wikibugs>	 (03PS1) 10Joal: Add explicit namespace to webrequest and pageview [analytics/refinery] - 10https://gerrit.wikimedia.org/r/335679 (https://phabricator.wikimedia.org/T156993)
[17:53:28] <fdans>	 awww https://drive.google.com/drive/folders/0B_GP_R6AguBITU0tRVZIcTU1MUE
[17:58:53] <wikibugs>	 10Analytics, 13Patch-For-Review: Add namespace ID to pageview_hourly - https://phabricator.wikimedia.org/T156993#2994137 (10Tbayer) Right, this task was written with pageview_hourly in mind; I added webrequest only as an afterthought without checking thoroughly - I kind of assumed from the fact that T92875 is...
[18:03:23] <HaeB>	 joal: (reduce slowstart) interesting, thanks for the explanation!
[18:12:11] <elukey>	 going afk team!
[18:12:17] <elukey>	 nuria: thanks for the pic, awesome!
[18:12:23] <elukey>	 o/
[18:12:27] <HaeB>	 joal: speaking of performance: the timeout tip seems to have worked in the sense that the query now doesn't fail - but it's taking a long time (has been at "Stage-3 map = 100%,  reduce = 100%" for about 21 hours now)
[18:12:29] * elukey afk!
[18:13:42] <HaeB>	 maybe hive gets confused performance-wise by two windowing terms in the same SELECT ? (wild guess)
[18:17:01] <joal>	 HaeB: That was expected (the timeout tip) ... Not sure how to overcome (or if not even buggy)
[18:17:37] <joal>	 HaeB: I have rewritten the request using a single window - same issue
[18:19:36] <joal>	 HaeB: I can actually tell you the query will fail
[18:20:21] <joal>	 HaeB: MapReduce has a retry policy of 4 attempts, and your request is at its 2nd 
[18:20:36] <joal>	 HaeB: First has been killed because of timeout
[18:20:40] <joal>	 hm
[18:20:48] <joal>	 It's weird it's not working at all
[19:01:28] <HaeB>	 joal: this seems someone with a similar problem (but no solution) http://grokbase.com/t/hive/user/144b76yzb2/reducer-wont-finnish-window-function
[19:02:57] <HaeB>	 and another one http://grokbase.com/t/hive/user/144xxn5t04/bug-in-hive-partition-windowing-functions ("it appears stuck in an infinite loop")
[19:06:37] <HaeB>	 joal: in any case, feel free to kill it... seems clear that it is going nowhere
[19:06:49] <joal>	 HaeB: ok, will kill
[19:07:45] <joal>	 HaeB: Can you paste your query again please
[19:08:51] <HaeB>	 https://www.irccloud.com/pastebin/ZCMZ474B/cumulative%20pageviews%20distribution%20eswiki%202017-01-31%20reducer%20timeout%201h
[19:08:57] <joal>	 Thank you
[19:40:13] <ottomata>	 ah!
[19:40:15] <ottomata>	 elukey: , joal!
[19:40:31] <ottomata>	 cdh puppet module has accidentally been reverted to an older version for 5 weeks!
[19:40:32] <ottomata>	 https://gerrit.wikimedia.org/r/#/c/316577/
[19:41:08] <ottomata>	 which reverted the nofiles_ulimit increase for spark stuff
[19:41:12] <ottomata>	 surprised that hasn't bitten us
[19:41:23] <ottomata>	 but maybe it will now with elukey's recent nodemanager restarts :o
[19:45:29] <chasemp>	 sorry guys!
[19:49:31] <chasemp>	 ottomata: what do you want to do w/ that? 
[19:54:34] <ottomata>	 chasemp:  s'ok, i only noticed because I was bumping the submodule anyway
[19:54:50] <ottomata>	 we'll watch, i'm not 100% certain anything bad will happen :)
[19:57:15] <ottomata>	 nuria: milimetric did yall get a link to my google doc eventstreams draft email?
[20:11:56] <nuria>	 ottomata: yes
[20:12:03] <nuria>	 ottomata: will look at that today
[20:12:27] <nuria>	 ottomata: need to update AB testing docfor my next meeting with Ellery
[20:15:05] <ottomata>	 k
[20:51:57] <wikibugs>	 10Analytics, 10Analytics-Wikistats, 10Android-app-Bugs, 06Wikipedia-Android-App-Backlog: Gõychi Konknni's English Wikistats translation is incorrect - https://phabricator.wikimedia.org/T156814#2986727 (10Milimetric) Translations for this project are handled on translatewiki.  I tried to find details of how...
[22:18:33] <milimetric>	 o/ for tonight
[22:18:43] <wikibugs>	 10Analytics, 10Analytics-Wikistats, 10Android-app-Bugs, 06Wikipedia-Android-App-Backlog, 07I18n: Gõychi Konknni's English Wikistats translation is incorrect - https://phabricator.wikimedia.org/T156814#2994877 (10Niedzielski) @Milimetric, hm maybe this is it except it looks correct: https://translatewiki....
[22:41:02] <wikibugs>	 06Analytics-Kanban:  CDH 5.10 upgrade - https://phabricator.wikimedia.org/T152714#2994963 (10Ottomata)
[23:03:27] <wikibugs>	 06Analytics-Kanban: CDH 5.10 upgrade - https://phabricator.wikimedia.org/T152714#2995005 (10Ottomata) Did the upgrade in labs today:  - Went smoothly. - Except I broke Hue.  I think this was not caused by the upgrade though.  Will investigate more. - I was able to run a Jessie worker node on CDH 5.10 alongside a...
[23:51:34] <wikibugs>	 06Analytics-Kanban, 06Research-and-Data: Coordinate with research to vet metrics calculated from the data lake - https://phabricator.wikimedia.org/T153923#2995119 (10ezachte) I learned from @Neil_P._Quinn_WMF yesterday that the data lake doesn't know about redirects. If indeed that is the case, I'm curious: ho...
[23:57:59] <halfak>	 We might have to allow work on redirects to count as valuable work in Wikipedia.  
[23:58:00] <halfak>	 ;)