[07:43:32] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10jcrespo) This is not something we handle- we don't decide on the table structure (this refactoring, comment storage, was owned by Platform team), wh...
[10:48:28] <wikibugs>	 10Analytics, 10Analytics-Cluster, 10User-Elukey: Update to CDH 6 or other up-to-date Hadoop distribution - https://phabricator.wikimedia.org/T203693 (10elukey)
[10:48:42] <wikibugs>	 10Analytics, 10Analytics-Cluster, 10User-Elukey: Update to CDH 6 or other up-to-date Hadoop distribution - https://phabricator.wikimedia.org/T203693 (10elukey)
[10:55:15] <joal>	 Hi elukey - I'm rwad for upgrade when you want :)
[10:55:42] <elukey>	 joal: o/
[10:56:05] <elukey>	 I upgraded all the packages (there was a subtle thing that I wasn't aware) in apt, ready as well
[10:56:15] <joal>	 \o/ :)
[10:56:22] <elukey>	 I also discovered today that cdh releases source packages!!!!!
[10:56:39] <elukey>	 this is definitely a game changer for CDH6
[10:56:46] <joal>	 elukey: Currently double checking MWH jobs etc, and will probably provide a patch for datasource update in AQS
[10:57:20] <joal>	 elukey: meaning given we have the source of CDH6, we could modify them for us
[10:57:38] <elukey>	 joal: yeah and rebuild the ubuntu ones for stretch
[10:57:45] <elukey>	 applying security patches etc..
[10:58:33] <elukey>	 about the upgrade - I announced 14 CEST so we'd probably need to wait a bit before starting
[10:58:39] <elukey>	 what do you think?
[10:58:48] <elukey>	 otherwise I can send an email now saying that we anticipate
[10:59:01] <elukey>	 but maybe given the hour better to wait post-lunch?
[10:59:08] <elukey>	 I don't have any real preference
[10:59:20] <joal>	 elukey: 14CEST is good for me, already lunched, but things to do :)
[11:00:05] <elukey>	 ack then, let's touch base at 13:something to start draining the clustee
[11:00:08] <elukey>	 *cluster
[11:00:20] <joal>	 +1
[11:00:27] <joal>	 I'll be here :)
[11:00:34] <elukey>	 super
[11:31:28] <wikibugs>	 10Analytics: stats.wikimedia.org home page should link to wikistats 2 - https://phabricator.wikimedia.org/T191555 (10fdans) This task is solved as of T203128. Wikistats 2 is now featured in the front page of Wikistats, replacing the Report Card image and link.
[11:33:50] <wikibugs>	 (03CR) 10Fdans: [C: 032] Wikistats2: Added eslintrc [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/471322 (https://phabricator.wikimedia.org/T208697) (owner: 10John Erling Blad)
[12:08:54] <joal>	 elukey: Heya :) Cluster drain?
[12:09:03] <elukey>	 yep!
[12:09:42] <joal>	 elukey: camus / spark-refine  - Anything else I forget?
[12:09:46] <elukey>	 ether pad is https://etherpad.wikimedia.org/p/analytics-cdh5.15
[12:09:59] <elukey>	 report updater
[12:10:14] <joal>	 YES - I always forget that guy
[12:12:17] <elukey>	 stopped the crons
[12:12:42] <elukey>	 so all the systemd timers are scheduled in 4h
[12:12:46] <elukey>	 or more
[12:12:52] <joal>	 elukey: We are the beginning of 11-UTC webrequest refinement, so we can assume we'll be about right to start the upgrade for 14CEST
[12:12:53] <elukey>	 so I'd say that we can leave them
[12:13:09] <elukey>	 ack
[12:15:17] <elukey>	 report updater jobs stopped on stat1007 (only hdfs stuff)
[12:15:32] <joal>	 ok
[12:32:10] <elukey>	 joal: in the meantime, do you want to test the new mw datasource on aqs1004?
[12:34:12] <joal>	 elukey: +1 !
[12:37:05] <fdans>	 joal helloooo if you're not in the middle of something, can I have a word on the bc?
[12:37:15] <joal>	 Hi fdans - We can talk :)
[12:37:22] <fdans>	 omw
[12:40:20] <elukey>	 joal: aqs1004 ready (depooled)
[12:40:37] <elukey>	 dsaez: o/
[12:40:40] <elukey>	 are you online?
[12:44:04] <elukey>	 sent an email to the owners of the spark shells, even if they are probably doing nothing
[12:44:59] <elukey>	 starting to downtime/disable-puppet/etc..
[12:46:14] <joal>	 ok elukey 
[12:49:07] <elukey>	 done!
[12:56:48] <joal>	 elukey: the spark-shells we see are coming from notebooks and should be easily relaunchable, let's not wait for answers on those
[12:57:08] <elukey>	 oh yes yes (Tiziano already replied btw)
[13:00:15] <joal>	 elukey: webrequest-text should be finished soon ,and then we have the flow of pageview etc to go for
[13:02:31] <elukey>	 yep
[13:11:40] <elukey>	 joal: there are only discovery jobs now
[13:14:43] <joal>	 elukey: yes - The job is weekly, and after the one currently running the data is loaded with Spark.
[13:15:04] <joal>	 elukey: I will disable the ES-Load one, to prevent having to kill-relaunch, ok ?
[13:15:25] <joal>	 s/disable/suspend sorry
[13:18:19] <joal>	 !log Suspend discovery 0060527-180705103628398-oozie-oozi-C coordinator for it not to block upgrade
[13:18:21] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:18:23] <elukey>	 okok
[13:19:10] <joal>	 elukey: we've been VERY unlucky about the timing: the discovery job happen once a week, on monday, after hour 11UTC
[13:20:11] <elukey>	 ahahhaha
[13:21:49] <elukey>	 grabbing a coffee then
[13:25:47] <joal>	 elukey: WE'RE READY :)
[13:26:37] <elukey>	 gooood
[13:26:53] <elukey>	 joal: bc?
[13:26:57] <joal>	 elukey: something else I noticed while babysitting cluster: webrequest-refine take ~1h for webrequest_text, and the following jobs take less than 15minutes (all of them, the full flow) 
[13:27:10] <joal>	 There might be some improvement to look for in refine
[13:27:18] <joal>	 Batcave indeed
[13:31:28] <elukey>	 we are upgrading the cluster folks
[14:34:40] <mforns>	 heya team
[14:37:31] <fdans>	 joal elukey we're all gonna get clusterphobic :D
[14:37:33] <fdans>	 :(
[14:37:35] <fdans>	 sorry
[14:42:05] <elukey>	 hahaha
[14:42:15] <elukey>	 mforns: hola Marcelo, we are in bc, almost done with the upgrade
[14:42:25] <mforns>	 hey elukey 
[14:42:33] <mforns>	 omw
[14:44:17] <moritzm>	 elukey: furud and flerovium still have the old CDH packages installed, BTW
[14:44:35] <elukey>	 moritzm: yes! need to follow up on these as well, thanks!
[14:44:46] <moritzm>	 ack
[15:01:33] <ottomata>	 o/ elukey you cdh upgrading?
[15:02:26] <elukey>	 ottomata: hola yes almost completed, hue upgraded to 4 and seems not working fine now, the rest looks good
[15:02:37] <elukey>	 saw the failure on an1039 but it seems due to read only fs
[15:04:06] <ottomata>	 right ok, started looking at that and noticed upgrade in progress
[15:04:09] <ottomata>	 wanted to check with you first
[15:04:10] <ottomata>	 will wait
[15:04:12] <ottomata>	 :)
[15:04:13] <ottomata>	 nice btw!
[15:04:21] <ottomata>	 lemme know if i can help
[15:04:32] <elukey>	 thanks! aren't you on holidays now?
[15:05:27] <elukey>	 if not we are on bc :)
[15:05:28] <elukey>	 otherwise I'll call you if anything goes on fire :D
[15:07:27] <ottomata>	 today is a holiday technically but i'm going to work today (or at least some of it) and save it for later
[15:07:35] <elukey>	 ahhh nice!
[15:08:02] <ottomata>	 The next couple of weeks are gonna be crazy for me with both a friends thanksgiving and a family thanksgiving, so i'll need (and have) some saved up extra days off :)
[15:09:35] <elukey>	 :)
[15:21:53] <elukey>	 ok people hue should work now
[15:22:04] <elukey>	 the "old" version is at https://hue.wikimedia.org/oozie/list_oozie_bundles/
[15:22:17] <elukey>	 (basically it is sufficient to remove '/hue/' from the path)
[15:27:30] <fdans>	 thank you lucaaaaa
[15:31:00] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Update to cloudera 5.15 - https://phabricator.wikimedia.org/T204759 (10elukey) Cluster upgraded!
[15:33:20] <bearloga>	 ottomata: morning! sorry to bother you. when you have a brief moment, would you be able to reset my venv on notebook1003 and notebook1004, please?
[15:36:46] <milimetric>	 o/
[15:40:01] <fdans>	 !log Restarting per project family unique generation jobs (daily and monthly)
[15:40:02] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:40:36] <ottomata>	 bearloga:  ha oook, give me a few.  did you bork it all up?! :p
[15:40:37] <milimetric>	 joal: need any help with mwh?
[15:42:52] <wikibugs>	 10Analytics, 10Analytics-EventLogging: eventlogging Dockerfile doesn't work - https://phabricator.wikimedia.org/T208679 (10Ottomata) Ha yea Petr doesn't really have anything to do with this.  Hm.  I'm more inclined to get rid of the EventLogging Docker stuff.  None of us use it, and it doesn't seem particularl...
[15:45:09] <wikibugs>	 10Analytics, 10Analytics-EventLogging: eventlogging Dockerfile doesn't work - https://phabricator.wikimedia.org/T208679 (10Addshore) >>! In T208679#4739776, @Ottomata wrote: > I'm more inclined to get rid of the EventLogging Docker stuff.  None of us use it, and it doesn't seem particularly useful.  If it's no...
[15:45:23] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (10Ottomata) I'd prefer if we used 'analytics' instead of 'data lake'.  Can we do cloudvirtanXXXX?  cloudvirt-anXXXX?
[15:48:17] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10Nuria) Pinging @bd808 and @Fjalapeno and @tstarling per above comment.
[15:49:14] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10Milimetric) @Nuria, I'm catching up on the task that Jaime recommended and will comment there.
[15:50:44] <wikibugs>	 10Analytics, 10Analytics-EventLogging: eventlogging Dockerfile doesn't work - https://phabricator.wikimedia.org/T208679 (10Ottomata) Could you get it up and running in just a python virtualenv instead of Docker?  Or, mediawiki-vagrant has an eventlogging role!
[15:52:43] <nuria>	 milimetric: do you know today is WMF holiday?
[15:53:01] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10Pintoch) I have taken the liberty to remove "Cloud Services" as a subscriber to this ticket as I do not think every toollabs user wants to receive n...
[15:53:02] <milimetric>	 nuria: yes, I'm sorry I didn't realize early enough to tell you Friday, but my plan was to work and take off Friday instead
[15:53:10] <nuria>	 milimetric: sounds good
[15:53:13] <nuria>	 joal: yt?
[15:53:15] <milimetric>	 because I have to bake a whole bunch of things for ottomata's party :)
[15:53:23] <nuria>	 nice
[15:53:32] <ottomata>	 :D
[15:54:55] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10Cyberpower678) Why am I getting emails to this task?
[15:55:34] <bearloga>	 ottomata: I borked it up very badly on notebook1004 and then on notebook1003 I just want it reset because I previously used virtualenv in a way where it installs into venv's site-packages, not the separate environment
[15:56:35] <ottomata>	 i should make some reset my venv button/script for people to use
[15:56:36] <ottomata>	 hmm
[15:59:19] <joal>	 team - my wife is late home, will miss the beginning of standup
[16:00:28] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: bothersome output in hive when querying events database - https://phabricator.wikimedia.org/T208550 (10Nuria)
[16:01:11] <ottomata>	 bearloga:  did you remove your venv on notebook1004
[16:01:19] <ottomata>	 ?
[16:01:47] <wikibugs>	 10Analytics, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10jcrespo) Nuria apparently subscribed 120 cloud users to this task by mistake- please be careful when using Phabricator to not annoy (with spam) our...
[16:03:38] <wikibugs>	 (03PS6) 10Mforns: Add bin/refinery-drop-older-than [analytics/refinery] - 10https://gerrit.wikimedia.org/r/471279 (https://phabricator.wikimedia.org/T199836)
[16:04:29] <nuria>	 ping fdans 
[16:05:07] <fdans>	 nuria: sorryyyy lost track of time, be there in 2 min!!
[16:08:44] <bearloga>	 ottomata: I accidentally did :( sorry!!!
[16:09:14] <bearloga>	 ottomata: also +Inf to that reset button/script idea :D
[16:12:04] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Update to cloudera 5.15.0 - https://phabricator.wikimedia.org/T204759 (10elukey)
[16:12:54] <bearloga>	 ottomata: afterward I ran `python3 -m venv vent`, then `venv/bin/pip install --upgrade --no-index --ignore-installed --find-links=/srv/jupyterhub/deploy/artifacts/stretch/wheels --requirement=/srv/jupyterhub/deploy/frozen-requirements.txt` (from https://wikitech.wikimedia.org/wiki/SWAP#Updating_user_virtualenvs) but got: "FileNotFoundError: [Errno 2] No such file or directory: '/srv/home/bearloga/toree-0.2.0.tar.gz'"
[16:14:43] <bearloga>	 ooooh new Hue
[16:17:31] <ottomata>	 hmmmm
[16:17:53] <ottomata>	 ah that makes sense
[16:18:03] <ottomata>	 actually bearloga that will be run (in a slightly different way) when you log in.
[16:18:07] <ottomata>	 i just shut down your notebook
[16:18:08] <ottomata>	 on 1004
[16:18:11] <ottomata>	 can you try logging in there?
[16:18:14] <ottomata>	 let's see what happens
[16:18:19] <bearloga>	 thanks! will do!
[16:18:20] <ottomata>	 logging into jupyterhub ^
[16:19:10] <bearloga>	 ottomata: "Spawner failed to start [status=3]. The logs for bearloga may contain details."
[16:19:48] <ottomata>	 bearloga:  ok i'm watching try again
[16:20:01] <bearloga>	 ottomata: I'm really sorry, I didn't expect this to be longer than "a moment" :\
[16:20:17] <ottomata>	 hm not much info
[16:20:18] <ottomata>	 hm
[16:20:18] <ottomata>	  ok
[16:21:09] <bearloga>	 okay it went into spawning screen and then "Failed to reach your server. Please try again later. Contact admin if the issue persists."
[16:21:57] <ottomata>	 hm
[16:22:30] <ottomata>	 bearloga:  ok try again too
[16:23:02] <bearloga>	 btw I don't have anything of value on notebook1004 rn so if you just wanna nuke my whole dir so puppet recreates it from scratch that's a-OK :)
[16:23:21] <icinga-wm>	 PROBLEM - Check the last execution of refinery-drop-webrequest-raw-partitions on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-webrequest-raw-partitions
[16:23:25] <ottomata>	 naw puppet won't do it, the venv is created on login to jupyterhub
[16:23:26] <ottomata>	 kinda weird
[16:23:34] <bearloga>	 ottomata: and it's working! :D
[16:23:38] <ottomata>	 great!
[16:23:41] <bearloga>	 ottomata: thank you so much!!!
[16:23:58] <ottomata>	 ok now for 1003
[16:24:16] <elukey>	 so refinery drop partitions fails with
[16:24:17] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]: Traceback (most recent call last):
[16:24:21] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:   File "/srv/deployment/analytics/refinery/bin/refinery-drop-webrequest-partitions", line 130, in <module>
[16:24:25] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:     for partition_spec in hive.partition_specs(table):
[16:24:28] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:   File "/srv/deployment/analytics/refinery/python/refinery/util.py", line 271, in partition_specs
[16:24:32] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:     for p in partition_descs
[16:24:35] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:   File "/srv/deployment/analytics/refinery/python/refinery/util.py", line 322, in partition_spec_from_partition_desc
[16:24:39] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]:     (key,value) = p.split('=')
[16:24:42] <elukey>	 Nov 12 16:15:13 an-coord1001 refinery-drop-webrequest-partitions[115260]: ValueError: need more than 1 value to unpack
[16:25:29] <ottomata>	 bearloga:  try 1003 now
[16:26:23] <bearloga>	 ottomata: perfect, thank you so much!!!
[16:26:30] <icinga-wm>	 PROBLEM - Check the last execution of refinery-drop-apiaction-partitions on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-apiaction-partitions
[16:26:51] <ottomata>	 great, added https://wikitech.wikimedia.org/wiki/SWAP#Resetting_user_virtualenvs for future admins
[16:28:00] <icinga-wm>	 PROBLEM - Check the last execution of refinery-drop-cirrussearchrequestset-partitions on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-cirrussearchrequestset-partitions
[16:30:02] <elukey>	 this is surely due to the upgrade
[16:30:18] * bearloga bakes virtual cookies for ottomata
[16:33:31] <icinga-wm>	 RECOVERY - Check the last execution of refinery-drop-webrequest-raw-partitions on an-coord1001 is OK: OK: Status of the systemd unit refinery-drop-webrequest-raw-partitions
[16:34:15] <ottomata>	 elukey:  hm i bet the output format of some hdfs dfs -ls or -stat has changed
[16:34:25] <elukey>	 probably, I am checking
[16:34:26] <ottomata>	 the scripts are just parsing the output
[16:34:38] <elukey>	 show partitions indeed work
[16:35:13] <elukey>	 it returns 
[16:35:14] <elukey>	 webrequest_source=upload/year=2018/month=10/day=17/hour=5
[16:35:14] <elukey>	 webrequest_source=upload/year=2018/month=10/day=17/hour=6
[16:35:15] <elukey>	 etc..
[16:36:20] <bearloga>	 ottomata: sorry to bother you again. one minor request. there's still stat1003:/home/bearloga/venv_backup that has a bunch of files owned by root from when madhu was helping me with a similar issue last year. can you please delete that dir?
[16:36:29] <ottomata>	 sure
[16:36:35] <ottomata>	 on notebook1003?
[16:36:36] <ottomata>	 you mean?
[16:37:12] <ottomata>	 bearloga: ?
[16:37:27] <bearloga>	 oh, d'oh. yeah notebook1003
[16:37:28] <bearloga>	 sorry
[16:37:40] <ottomata>	 done
[16:37:44] <bearloga>	 thanks!
[16:38:51] <bearloga>	 OMFG YAS Hue 4 fixed the problem with downloading CSVs and now includes .csv extension yesssssss
[16:39:02] <elukey>	 :)
[16:39:40] <bearloga>	 I do not have the words to describe that annoying that was and how happy I am that it's no longer a thing
[16:39:59] <bearloga>	 thanks for the upgrade, y'all
[16:40:09] <bearloga>	 much love and cookies <3
[16:44:50] <icinga-wm>	 PROBLEM - Check the last execution of refinery-drop-webrequest-raw-partitions on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-webrequest-raw-partitions
[16:48:30] <icinga-wm>	 PROBLEM - Check the last execution of refinery-drop-webrequest-refined-partitions on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-webrequest-refined-partitions
[16:50:03] <elukey>	 I am trying to add logger.infos to refinery utils on the host
[16:50:50] <elukey>	 Nov 12 16:49:53 an-coord1001 refinery-drop-webrequest-partitions[144824]: 2018-11-12T16:49:53 INFO   {'mediawiki_private_cu_changes': {}, 'WARN: The method class org.apache.commons.loggin,etc..
[16:51:42] <elukey>	 WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
[16:51:48] <elukey>	 interesting
[16:51:54] <elukey>	 so hive returns garbage indeed
[16:53:07] <ottomata>	 ahhh hm
[16:53:18] <ottomata>	 elukey:  is that maybe related to the hive cli logging change we made?
[16:55:14] <joal>	 elukey, ottomata - Could be, but I didn't notice - I'm very sorry if it is :(
[16:55:31] <elukey>	 nah it started to alarm after the upgrade
[16:55:40] <elukey>	 so hive must now complain about something
[16:56:56] <joal>	 :(
[16:59:13] <mforns>	 team, back from electrician, are we doing tasking still?
[17:03:20] <nuria>	 ping fdans 
[17:16:13] <wikibugs>	 10Analytics, 10Analytics-Kanban: Final steps to expose project family unique devices data - https://phabricator.wikimedia.org/T167539 (10fdans) Before adding project families we have to backfill the other two fields for per domain data. Then we'll load data to cassandra and move on to T205665.
[17:17:29] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (10fdans)
[17:17:49] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (10fdans) p:05Triage>03High
[17:18:17] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (10fdans) a:03mforns
[17:18:49] <wikibugs>	 10Analytics, 10Analytics-Kanban: Long term solution for sqooping comments - https://phabricator.wikimedia.org/T209178 (10fdans) a:03JAllemandou
[17:18:57] <wikibugs>	 10Analytics, 10Analytics-Kanban: Long term solution for sqooping comments - https://phabricator.wikimedia.org/T209178 (10fdans) p:05Triage>03High
[17:20:37] <wikibugs>	 10Analytics: unique devices monthly should be configured  with default "monthly" granularity in turnilo - https://phabricator.wikimedia.org/T209103 (10fdans) a:03Nuria
[17:21:03] <wikibugs>	 10Analytics: unique devices monthly should be configured  with default "monthly" granularity in turnilo - https://phabricator.wikimedia.org/T209103 (10fdans) p:05Triage>03Normal
[17:22:33] <wikibugs>	 10Analytics, 10Product-Analytics, 10Reading-analysis: [EventLogging Sanitization] Update EL sanitization white-list for field renames in EL schemas - https://phabricator.wikimedia.org/T209087 (10mforns)
[17:25:29] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10herron) p:05Triage>03Normal
[17:26:38] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10fdans) a:03JAllemandou
[17:27:05] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10fdans) p:05Triage>03High
[17:27:31] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10herron) @Ottomata and @elukey what do you think?
[17:29:00] <wikibugs>	 10Analytics, 10Research: Cannot connect to Spark with Jupyter notebook on stat1007 - https://phabricator.wikimedia.org/T208896 (10fdans) a:03fdans
[17:29:29] <wikibugs>	 10Analytics, 10Research: Cannot connect to Spark with Jupyter notebook on stat1007 - https://phabricator.wikimedia.org/T208896 (10fdans) p:05Triage>03High
[17:29:55] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research: Cannot connect to Spark with Jupyter notebook on stat1007 - https://phabricator.wikimedia.org/T208896 (10fdans)
[17:30:34] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: [EventLoggingToDruid] Add explicit types to numeric dimensions so that they are ingested as such - https://phabricator.wikimedia.org/T208872 (10fdans) 05Open>03declined
[17:30:56] <wikibugs>	 (03Abandoned) 10Mforns: Add explicit types to numeric dimensions in DataFrameToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/472013 (https://phabricator.wikimedia.org/T208872) (owner: 10Mforns)
[17:30:57] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) I believe we had this problem  (and discussion) before...and we de...
[17:32:41] <wikibugs>	 10Analytics: Consider changing Phab projects to sub-projects - https://phabricator.wikimedia.org/T208798 (10fdans) a:03Milimetric
[17:35:20] <wikibugs>	 10Analytics: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (10fdans) a:05fdans>03None
[17:37:35] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Get sane column names in hive queries without table name - https://phabricator.wikimedia.org/T208774 (10fdans) 05Open>03Resolved
[17:37:55] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10herron) >>! In T209300#4740111, @Ottomata wrote: > I believe we had this pro...
[17:39:30] <elukey>	 so the issue is that on line 321 we need to parse
[17:39:31] <elukey>	  WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
[17:39:40] <elukey>	 in (key,value) = p.split('=')
[17:40:07] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) I'm fairly certain there shouldn't be any streth hosts using 0.9.3...
[17:41:36] <wikibugs>	 10Analytics: Consider changing Phab projects to sub-projects - https://phabricator.wikimedia.org/T208798 (10Milimetric) @Aklapper ok, team agrees that this is ok to do, so here is the list of projects that we're fine changing to sub-projects.  How do we proceed from here?  #analytics-dashiki  #analytics-eventlog...
[17:42:29] <wikibugs>	 (03PS1) 10Joal: Set int namespace inhive mediawiki schemas [analytics/refinery] - 10https://gerrit.wikimedia.org/r/473052
[17:42:34] <elukey>	 joal: do you think that we could try to quicky remove the logging change that you did on an-coord1001 to see if it is the culprit?
[17:42:44] <joal>	 sure we can
[17:43:08] <wikibugs>	 (03PS2) 10Joal: Set int namespace in hive mediawiki schemas [analytics/refinery] - 10https://gerrit.wikimedia.org/r/473052
[17:43:26] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10MoritzMuehlenhoff) >>! In T209300#4740139, @Ottomata wrote: > I'm fairly cer...
[17:44:30] <elukey>	 joal: so I'd try removing /etc/hive/conf/java-logging.properties, do you think it would be enough?
[17:44:32] <wikibugs>	 (03CR) 10Ottomata: [C: 031] "Hm, too bad that doesn't work in spark.  Also too bad we can't just cast it to bigint when sqooping.  Oh well, this is fine, these are mos" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/473052 (owner: 10Joal)
[17:44:42] <milimetric>	 a-team: take a look at https://phabricator.wikimedia.org/T208798#4740142 in case you disagree with any of the projects I'm suggesting to convert to sub-projects
[17:44:57] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10MoritzMuehlenhoff) >>! In T209300#4740111, @Ottomata wrote: > I believe we h...
[17:45:14] <mforns>	 lgtm
[17:46:09] <joal>	 elukey: not enough I think - The file is referenced is hive-end.sh
[17:46:10] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) Oh hm.  There are no prod services running on the stat boxes.  We...
[17:46:17] <joal>	 Thanks ottomata for the quick review
[17:46:40] <elukey>	 ahhh okok
[17:46:59] <elukey>	 I'll remove it from there
[17:47:16] <joal>	 Yup I think it's the correct way elukey
[17:47:50] <elukey>	 nope seems not to be the issue
[17:47:58] <joal>	 :(
[17:48:01] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) The only one there that we should check on for sure is wdqs1009.eq...
[17:48:31] <elukey>	 but on the internet there seems to be a ton of people complaining for it
[17:49:04] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10MoritzMuehlenhoff) I think there are also some inconsistencies in the applic...
[17:49:06] <joal>	 elukey: https://community.hortonworks.com/questions/34311/warning-message-in-hive-output-after-upgrading-to.html
[17:49:47] <elukey>	 https://www.ericlin.me/2017/12/hive-cli-prints-slf4j-error-to-standard-output/
[17:49:53] <elukey>	 aahhaah
[17:50:16] <elukey>	 "Luckily, the latest CDH release, in fact from CDH 5.12.0, Cloudera has backported an upstream JIRA HIVE-12179, which added a checking for environment variable called “HIVE_SKIP_SPARK_ASSEMBLY”. So we can use this variable to disable the loading of Spark JARs for Hive CLI if you do not need to use Hive on Spark."
[17:51:19] <joal>	 +1 for trying that elukey 
[17:52:47] <elukey>	 works!!
[17:52:52] <joal>	 \o/ !!!!
[17:53:03] <joal>	 Well done elukey :)
[17:55:20] <icinga-wm>	 RECOVERY - Check the last execution of refinery-drop-webrequest-raw-partitions on an-coord1001 is OK: OK: Status of the systemd unit refinery-drop-webrequest-raw-partitions
[17:58:01] <aharoni>	 hallo
[17:58:12] <aharoni>	 milimetric: around? can you please take another look at https://gerrit.wikimedia.org/r/#/c/analytics/limn-language-data/+/469390/ ?
[17:59:00] <wikibugs>	 (03PS3) 10Joal: Set int namespace in hive mediawiki schemas [analytics/refinery] - 10https://gerrit.wikimedia.org/r/473052 (https://phabricator.wikimedia.org/T209179)
[17:59:38] <milimetric>	 hi aharoni 
[18:00:17] <joal>	 elukey: can we merge the wikitext timer patch?
[18:00:27] <wikibugs>	 (03CR) 10Milimetric: [C: 032] Add scheduling for Content Translation MT engine data [analytics/limn-language-data] - 10https://gerrit.wikimedia.org/r/469390 (https://phabricator.wikimedia.org/T207765) (owner: 10Amire80)
[18:00:43] <milimetric>	 looks good aharoni, merged
[18:01:06] <joal>	 also elukey, can we finalize aqs dpeloy?
[18:01:07] <elukey>	 joal: yep, I am about to roll restart aqs first
[18:01:08] <elukey>	 :)
[18:01:18] <joal>	 Right :) Better order :)
[18:01:23] <joal>	 Thanks :)
[18:03:29] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) > It has pros and cons: The downside of using backports is that it...
[18:03:47] <aharoni>	 milimetric: oh, that was easy :)
[18:04:07] <aharoni>	 so will you deploy it? I think I'll be able to set up the Dashiki JSON page myself
[18:04:29] <milimetric>	 aharoni: once merged, reportupdater changes are deployed automatically
[18:04:32] <elukey>	 joal: aqs updated
[18:04:38] <milimetric>	 oh, wait, is this a new repository, sorry
[18:05:00] <milimetric>	 ah, yes, sorry aharoni, I have to add a puppet definition for it, will do that soon
[18:05:09] <aharoni>	 milimetric: not a new Gerrit repo. a new directory in an existing repo.
[18:05:20] <milimetric>	 aharoni: yeah, that counts as a new job
[18:05:50] <milimetric>	 aharoni: btw, you might want to consider consolidating these folders, seems like they're measuring similar things
[18:06:07] <aharoni>	 milimetric: yeah, we'll probably do a cleanup some time soon.
[18:06:09] <milimetric>	 and if you deploy a new report in an existing folder, it's automatic
[18:06:20] <milimetric>	 k, I'll do the puppet change for this then
[18:06:47] <aharoni>	 milimetric: oh, OK. didn't know it.
[18:07:30] <icinga-wm>	 RECOVERY - Check the last execution of refinery-drop-apiaction-partitions on an-coord1001 is OK: OK: Status of the systemd unit refinery-drop-apiaction-partitions
[18:07:48] <elukey>	 ottomata,joal - https://gerrit.wikimedia.org/r/#/c/operations/puppet/cdh/+/473053/
[18:08:02] <elukey>	 this fixes the recent issue after the upgrade
[18:08:08] <elukey>	 lemme know if it is ok
[18:08:18] <elukey>	 if so I am going to merge now
[18:08:21] <elukey>	 (brb)
[18:09:00] <icinga-wm>	 RECOVERY - Check the last execution of refinery-drop-cirrussearchrequestset-partitions on an-coord1001 is OK: OK: Status of the systemd unit refinery-drop-cirrussearchrequestset-partitions
[18:09:10] <icinga-wm>	 RECOVERY - Check the last execution of refinery-drop-webrequest-refined-partitions on an-coord1001 is OK: OK: Status of the systemd unit refinery-drop-webrequest-refined-partitions
[18:11:17] <milimetric>	 k aharoni, added the change https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/473056/ and when someone in ops merges it, the job will start creating data (unless there's something wrong I missed in which case do take a look and submit a patch)
[18:12:12] <elukey>	 milimetric: do you want me to merge?
[18:12:38] <milimetric>	 elukey: sure, thanks!
[18:15:23] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Copy monthly XML files from public-dumps to HDFS - https://phabricator.wikimedia.org/T202489 (10ArielGlenn) Excuse me for butting in at this late date but these files are already available from labstore1006,7 to labs instances and on stats100...
[18:15:35] <elukey>	 milimetric: Notice: /Stage[main]/Profile::Reportupdater::Jobs::Mysql/Reportupdater::Job[mt_engines]/Cron[reportupdater_limn-language-data-mt_engines]/ensure: created
[18:15:39] <elukey>	 on stat1006
[18:15:49] <milimetric>	 elukey: thanks very much
[18:16:01] <milimetric>	 aharoni: this means your job will start within an hour or so ^
[18:17:12] <elukey>	 joal: your patch for xml dumps is merged but puppet is disabled on an-coord, I'll re-enable once andrew/you review the hive-env.sh change
[18:17:23] <elukey>	 I am going to run now for ~1h, then I'll recheck :)
[18:18:11] * elukey off
[18:21:33] <joal>	 Looks like we have an error on pageview-hourly job
[18:23:04] <joal>	 I recall having seen that error when we were testing with Luca
[18:25:56] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10Anomie) Note the `actor` view will likely turn out to have similar issues.  As suggested in T209031#4736006, one solution woul...
[18:32:54] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Copy monthly XML files from public-dumps to HDFS - https://phabricator.wikimedia.org/T202489 (10JAllemandou) >>! In T202489#4740221, @ArielGlenn wrote: > Excuse me for butting in at this late date but these files are already available from la...
[18:44:15] <ottomata>	 a-team, i'm 2nd guessing what I just said about time field naming conventions
[18:44:24] <ottomata>	 maybe we shouldn't bother having a convention here
[18:44:35] <ottomata>	 it gets a little awkard
[18:44:43] <milimetric>	 sounds fine, yeah, like, maybe suggest that each team try to stay consistent
[18:44:49] <mforns>	 fine by me too
[18:46:32] <ottomata>	 mforns:  in El2druid, you assume everything is  a dimension, yes?
[18:46:35] <ottomata>	 unless marked as a measure?
[18:48:44] <joal>	 something is failing between oozie and hive for pageviews :(
[18:50:00] <joal>	 I successfully ran the query manually in hive and beeline, so it must be something else
[18:51:06] <nuria>	 joal: how can help? are those the errors we are seeing on e-mail?
[18:51:20] <joal>	 They are the ones we see on email
[18:51:54] <nuria>	 joal: i see, we did do the update today for 5.15  right?
[18:52:00] <joal>	 nuria: correct
[18:52:01] <ottomata>	 i can help in a bit...
[18:52:08] <joal>	 nuria: I assume it is related
[18:52:48] <nuria>	 joal: ok, did the "real" error in application Id  had any useful info?
[18:52:48] <fdans>	 joal nuria: I'm looking at it too
[18:53:04] <joal>	 https://hue.wikimedia.org/jobbrowser/jobs/job_1542030691525_0212/single_logs
[18:53:26] <fdans>	 nuria joal this is the weird serialization error right?
[18:54:10] <joal>	 could be fdans - It seems to ring a bell for you :)
[18:54:30] <fdans>	 joal: nono I was looking at the logs and just checking that we're on the same page
[18:55:11] <ottomata>	 hm, does refinery need rebuilt with new version of hive deps?
[18:55:46] <joal>	 ottomata: webrequest succeeds, and it uses a lot more UDFs than pageview
[18:55:59] <ottomata>	 OH, joal... maybe oozie sharelib?...?
[18:56:08] <ottomata>	 you say this query is fine in general?
[18:56:27] <joal>	 ottomata: we thought about that with elukey, but said it was acutally not needed because there was not global oozie upgrade
[18:56:33] <joal>	 ottomata: maybe it is the case
[18:56:55] <joal>	 ottomata: However I don't understand how webrequest-refine works, and not pageview ???
[18:56:56] <ottomata>	 right but not oozie upgrade, if hive version changed then maybe protocol changed too?
[18:57:10] <ottomata>	 the fact that this is some thrift serialization error hints to me that there is some version mismatch somewhere
[18:57:17] <joal>	 k
[18:57:34] <ottomata>	 joal:  maybe it just just some particular part of the thrift protocl that changed that webrequest refine doesn't encounter?
[18:57:35] <ottomata>	 dunno
[18:57:42] <nuria>	 joal: are pageview and webrequest using the same jar version for refine?
[18:57:45] <ottomata>	 you say this query works outside of oozie, right?
[18:57:46] <nuria>	 joal: let me look
[18:57:58] <joal>	 correct ottomata - query works outside of oozie
[18:58:03] <ottomata>	 hm, right wait, these are just oozie hive queries
[18:58:07] <ottomata>	 no refinery source here
[18:58:12] <joal>	 exact
[18:59:48] <joal>	 ottomata: Sonething interestingly related - medicounts-load doesn't fail
[19:00:20] <joal>	 It is the exact same job typology
[19:00:28] <ottomata>	 hm
[19:00:40] <ottomata>	 does this job consistently fail or is it just this one?
[19:00:51] <joal>	 ottomata: fails since dpeloy
[19:01:03] <joal>	 s/deploy/upgrade
[19:01:36] <nuria>	 joal: i see, pageview hourly does not have  a refinery version 
[19:02:22] <joal>	 First thing - Trying to rerun one of the jobs (in
[19:02:25] <joal>	  case)
[19:03:29] <ottomata>	 nuria:  ya it doesn't use refinery source
[19:03:35] <ottomata>	 its just an oozie job + a hive query
[19:04:18] <ottomata>	 joal:  i'm thikning this is an oozie sharelib problem.  the hive jars in the oozie sharelib are different than the ones in /usr/lib/hive
[19:04:35] <joal>	 ottomata: Let's try
[19:04:56] <nuria>	 ottomata: and how do we rebuild the jars?
[19:05:23] <ottomata>	 its not a rebuild, its an oozie sharelib update
[19:05:34] <ottomata>	 nuria:  there is a directory in hdfs /user/oozie/share/...
[19:05:41] <ottomata>	 that oozie uses to load shared deps when it runs
[19:05:53] <ottomata>	 there is an oozie cli tool that helps
[19:05:59] <ottomata>	 but i'm looking for the right thing to do here...
[19:06:25] <ottomata>	 i can't tell if the hive version itself has changed tho
[19:06:36] <ottomata>	 let's check that, if it hasn't this shouldn't be the problem... 'shouldn't'
[19:06:46] <ottomata>	 joal:  do you know which version of cdh we were on before?
[19:06:49] <joal>	 ottomata: not sure about the version, but a schema update (small) was needed
[19:06:58] <ottomata>	 a schema update for what?  hive?
[19:07:00] <joal>	 ottomata: we were on 5.10
[19:07:03] <joal>	 yes ottomata 
[19:07:09] <joal>	 schema update for hive
[19:07:35] <joal>	 ottomata: technically, how do you want to proceed - Suspend oozie?
[19:07:37] <ottomata>	 hm, weird.  afaict the hive version hasn't changed
[19:07:39] <ottomata>	 still 1.1.0
[19:07:48] <ottomata>	 joal:  not sure yet...
[19:07:51] <ottomata>	 still reading/googling
[19:07:56] <joal>	 k
[19:08:03] <joal>	 nuria: 1-1 in the meatime?
[19:09:14] <nuria>	 joal: sure
[19:13:48] <ottomata>	 joal, i think we should make a new oozie sharelib version using the cli, and then also re run our spark2_oozie_sharelib_install
[19:14:03] <ottomata>	 this step needs to happen during upgrades it hink:
[19:14:03] <ottomata>	 https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_ig_oozie_configure.html#concept_zpq_5j5_cn
[19:14:06] <ottomata>	 $ sudo oozie-setup sharelib create -fs <FS_URI> -locallib /usr/lib/oozie/oozie-sharelib-yarn
[19:14:17] <ottomata>	 and then since we install a custom spark2 lib, we do that too after we make the new one
[19:14:29] <ottomata>	 good news is that running this command will create a new (timestamp versioned) lib dir
[19:14:36] <ottomata>	 so the old one will still be around if we need to roll back
[19:14:43] <ottomata>	 lemme know when you are ready and we can try
[19:14:59] <joal>	 ottomata: it was in our plan with elukey, but we thought it would not be needed
[19:15:37] <ottomata>	 i could be wrong, it might be needed, but it is probably always a good idea
[19:15:45] <ottomata>	 to make sure that the jars oozie uses are the same ones we  use
[19:15:46] <joal>	 +1 elukey 
[19:19:11] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Copy monthly XML files from public-dumps to HDFS - https://phabricator.wikimedia.org/T202489 (10ArielGlenn) I mean, it's fine, but maybe it's better to just provide them as is done on stat100? (5? 7?) via nfs mount from labstore1006 (7?). Loo...
[19:23:26] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Copy monthly XML files from public-dumps to HDFS - https://phabricator.wikimedia.org/T202489 (10Ottomata) @ArielGlenn, they need to be copied into HDFS inside of Hadoop, not just available on a regular filesystem.  @JAllemandou, I think it wo...
[19:23:33] <ottomata>	 joal:  lemme know whne you wanna...
[19:23:39] <ottomata>	 or, we can wait for elukey
[19:24:17] <elukey>	 I am here!
[19:24:23] <elukey>	 sorry just got back from the run
[19:24:26] <elukey>	 reading 
[19:24:55] <joal>	 I'm here ottomata 
[19:26:03] <elukey>	 ah it makes sense! I didn't see that part
[19:26:25] <elukey>	 if it doesn't work we can merge my change
[19:26:30] <ottomata>	 elukey:  what is your change?
[19:26:51] <elukey>	 ottomata: you +1ed it https://gerrit.wikimedia.org/r/473053 :D
[19:27:16] <elukey>	 I haven't merged it yet, but only tried it manually
[19:27:18] <elukey>	 on an-coord
[19:27:20] <elukey>	 and it seems to work
[19:27:35] <elukey>	 (the drop jobs got unblocked)
[19:28:14] <elukey>	 it is on an-coord now (puppet disabled)
[19:28:20] <elukey>	 (brb in 10 mins)
[19:28:22] <ottomata>	 ph yes yes
[19:28:25] <ottomata>	 not related cool
[19:28:48] <ottomata>	 ok joal i'm going to run the sharelib update stuff..
[19:28:55] <joal>	 k ottomata 
[19:30:15] <ottomata>	 !log running oozie-setup sharelib create and then spark2_oozie_sharelib_install
[19:30:16] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:32:50] <ottomata>	 joal:  done, looks good i think
[19:32:56] <aharoni>	 milimetric: thanks!
[19:32:57] <ottomata>	 spark 2.3.1 was recreated as expected too
[19:33:09] <ottomata>	 try rerunnign job and cross your fingers!
[19:33:28] <joal>	 Thanks ottomata - rerunin 1 job to test
[19:35:27] <elukey>	 all rigt merging!
[19:36:42] <joal>	 ottomata: SUCCESS !
[19:36:43] <elukey>	 ah sorry ottomata I lost the pageview reference that joal was mentioning, sorry for the noise :(
[19:37:01] <joal>	 plenty thanks agains :)
[19:38:28] <ottomata>	 great, glad that was it!
[19:38:33] <ottomata>	 was kinda just a guess but felt right!~
[19:39:46] <ottomata>	 a-team, i made changes to the schema guidelines doc, mostly just removed measure_ naming convention
[19:40:08] <mforns>	 k
[19:40:22] <elukey>	 ottomata: is it a thing that needs to be done for each upgrade? if so, can you add it to https://etherpad.wikimedia.org/p/analytics-cdh5.15 ?
[19:40:47] <elukey>	 joal: refinery-import-page-history-dumps.service deployed
[19:40:54] <joal>	 Thanks elukey !
[19:40:59] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (10Ottomata) Let's make this happen!  @Andrew are you ok with cloudvirt-anXXXX?  @Cmjohnson would prefer to coordinate racki...
[19:41:19] <joal>	 elukey: We configured it to run at 3am, so we'll see tomorrow morning :)
[19:41:35] <elukey>	 ack
[19:41:41] <elukey>	 going to dinner people!
[19:41:49] <elukey>	 sorry for the oozie shlib issue :(
[19:41:55] <joal>	 Bye elukey - Ssame for me - diner
[19:42:02] <ottomata>	 laters!
[19:43:04] <milimetric>	 ottomata: looks good to me, like how it mentions the Druid ingestion now, I think it's ready to not be a "Draft" now
[19:53:14] <wikibugs>	 (03CR) 10Ottomata: "This is great, nice code Marcel!" (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/471279 (https://phabricator.wikimedia.org/T199836) (owner: 10Mforns)
[19:53:43] <wikibugs>	 (03CR) 10Ottomata: "Ah yes Joal let's talk about this I think I don't understand." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/471693 (https://phabricator.wikimedia.org/T164020) (owner: 10Joal)
[19:57:51] <nuria>	 elukey: added steps to doc
[19:57:57] <nuria>	 elukey: for lib creation
[20:06:46] <nuria>	 joal: we have not yet deployed snapshot to aqs right? 
[20:06:51] <nuria>	 joal: https://wikimedia.org/api/rest_v1/metrics/bytes-difference/net/aggregate/wikidata/all-editor-types/all-page-types/daily/20180901/20181001
[20:07:00] <nuria>	 joal: as this returns data up to sep
[20:14:23] <wikibugs>	 (03CR) 10Framawiki: [C: 032] view.css: add space between resultsets [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/472870 (owner: 10Framawiki)
[20:14:41] <wikibugs>	 (03CR) 10Framawiki: [C: 032] default_config.yaml: set default maintenance msg [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/472871 (https://phabricator.wikimedia.org/T205221) (owner: 10Framawiki)
[20:15:17] <wikibugs>	 (03Merged) 10jenkins-bot: view.css: add space between resultsets [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/472870 (owner: 10Framawiki)
[20:15:30] <wikibugs>	 (03Merged) 10jenkins-bot: default_config.yaml: set default maintenance msg [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/472871 (https://phabricator.wikimedia.org/T205221) (owner: 10Framawiki)
[20:16:06] <wikibugs>	 (03PS11) 10Framawiki: Handle bad output_result endpoint params [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/468866 (https://phabricator.wikimedia.org/T205222)
[20:16:34] <wikibugs>	 10Quarry, 10Patch-For-Review: Add a visual differentiation between prod and dev - https://phabricator.wikimedia.org/T205221 (10Framawiki) 05Open>03Resolved a:03Framawiki
[20:19:07] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10herron) >>! In T209300#4740170, @Ottomata wrote: > The only one there that w...
[20:19:51] <wikibugs>	 10Analytics, 10Operations, 10Wikimedia-Logstash, 10Core Platform Team Backlog (Watching / External), and 2 others: Review and make librdkafka-0.11.6 installable from stretch-wikimedia - https://phabricator.wikimedia.org/T209300 (10Ottomata) Ok +1
[20:20:42] <wikibugs>	 (03CR) 10Framawiki: [C: 032] Handle bad output_result endpoint params [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/468866 (https://phabricator.wikimedia.org/T205222) (owner: 10Framawiki)
[20:20:49] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: bothersome output in hive when querying events database - https://phabricator.wikimedia.org/T208550 (10Nuria) This worked great and bogus output is no longer there.
[20:20:58] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: bothersome output in hive when querying events database - https://phabricator.wikimedia.org/T208550 (10Nuria) 05Open>03Resolved
[20:21:14] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Deprecate reportcard: https://analytics.wikimedia.org/dashboards/reportcard/ - https://phabricator.wikimedia.org/T203128 (10Nuria) 05Open>03Resolved
[20:21:25] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Update to cloudera 5.15.0 - https://phabricator.wikimedia.org/T204759 (10Nuria) 05Open>03Resolved
[20:21:35] <wikibugs>	 (03Merged) 10jenkins-bot: Handle bad output_result endpoint params [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/468866 (https://phabricator.wikimedia.org/T205222) (owner: 10Framawiki)
[20:25:55] <wikibugs>	 10Quarry, 10Patch-For-Review: The absence of resultset/unknown number case is not handled - https://phabricator.wikimedia.org/T205222 (10Framawiki) 05Open>03Resolved
[20:31:09] <wikibugs>	 10Analytics-Kanban, 10Patch-For-Review: Fix refinery-source jenkins build/release jobs - https://phabricator.wikimedia.org/T208377 (10Nuria)
[20:31:20] <wikibugs>	 10Analytics-Kanban: Update alert email address in oozie mediawiki-load job - https://phabricator.wikimedia.org/T208294 (10Nuria) 05Open>03Resolved
[20:32:36] <nuria>	 joal: ah no, wait data is deployed now for october
[20:37:17] <joal>	 right nuria - new snapshot has been dpeloyed
[20:43:47] <wikibugs>	 (03PS7) 10Framawiki: app.py: EXPLAIN needs to be executed on the good server [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/462286 (https://phabricator.wikimedia.org/T205214)
[20:43:51] <wikibugs>	 (03CR) 10Framawiki: app.py: EXPLAIN needs to be executed on the good server (032 comments) [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/462286 (https://phabricator.wikimedia.org/T205214) (owner: 10Framawiki)
[20:44:33] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] app.py: EXPLAIN needs to be executed on the good server [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/462286 (https://phabricator.wikimedia.org/T205214) (owner: 10Framawiki)
[20:46:35] <wikibugs>	 10Quarry: Create a beta host - https://phabricator.wikimedia.org/T209119 (10Framawiki) Created a "quarry-beta-01.quarry.eqiad.wmflabs" instance, I'll manually populate it w/o puppet nor nfs for a first test.
[20:56:46] <ottomata>	 milimetric:  if you have some moments, would love to brain bounce with ya some stream intake stuff
[20:57:01] <milimetric>	 ottomata: yeah, cave?
[20:57:32] <ottomata>	 ya 5 mins...
[20:57:46] <milimetric>	 k, I'll wait in there
[20:58:03] <ottomata>	 (making some tea and a snack...)\
[21:05:00] <joal>	 milimetric: May I take 5 minutes of uour time?
[21:05:12] <milimetric>	 hey joal in cave with otto
[21:05:18] <joal>	 cool, joining
[21:21:06] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10JAllemandou) Thanks @Anomie. We (analytics team) also had thought of a third potential solution. I list the 3 solutions below...
[21:22:49] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Copy monthly XML files from public-dumps to HDFS - https://phabricator.wikimedia.org/T202489 (10JAllemandou) @Ottomata , @ArielGlenn  - I'm ok with copying the dumps from a stat machine. Let's see what @elukey thinks of it
[21:30:21] <wikibugs>	 (03CR) 10Zhuyifei1999: app.py: EXPLAIN needs to be executed on the good server (031 comment) [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/462286 (https://phabricator.wikimedia.org/T205214) (owner: 10Framawiki)
[22:49:00] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (10Andrew) I'd prefer without the dash -- just cloudvirtan1XXX if cloudvirtanalytics1xxx won't fit.
[22:53:09] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (10Ottomata) Ok, @Cmjohnson your call then: we'd prefer cloudvirtanalytics1xxx, but if that is too long, then use cloudvirta...
[23:03:07] <wikibugs>	 10Quarry: Create a beta host - https://phabricator.wikimedia.org/T209119 (10Krenair) It may be worth making the puppet manifest that can be told which environment it's in and choose appropriately.
[23:16:30] <wikibugs>	 (03PS7) 10Mforns: Add bin/refinery-drop-older-than [analytics/refinery] - 10https://gerrit.wikimedia.org/r/471279 (https://phabricator.wikimedia.org/T199836)
[23:21:41] <wikibugs>	 (03CR) 10Mforns: "OK, unit tests are complete." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/471279 (https://phabricator.wikimedia.org/T199836) (owner: 10Mforns)