[00:38:49] <AndyRussG>	 Hey... anyone know about how quickly the Hive EventLogging tables get populated? Thanks!!! :)
[00:45:07] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4188820 (10Denny) Re Pintoch:  No, I was seriously not aware that we are uploading datasets   > I think it is fair to say that this is not exactly an isolated case (but I am surprised t...
[00:50:28] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4188828 (10Denny) Re Psychoslave:  Having a statement in Wikidata with a reference, where the referenced work is not published under CC-0, is entirely fine in my understanding.  As a co...
[01:19:55] <milimetric>	 AndyRussG: I'm not sure off the top of my head, but you can just select the max timestamp from the table you're interested to see
[01:27:10] <icinga-wm>	 PROBLEM - Number of segments reported as unavailable by the Druid Coordinators -Analytics cluster- on einsteinium is CRITICAL: 16 gt 10 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[01:27:50] <AndyRussG>	 milimetric: ah hey great idea, thx :)
[01:28:29] <AndyRussG>	 (The table I'm really interested in is not being populated exactly now, but I'm sure I can find one that is, to get an idea...)
[01:29:51] <milimetric>	 AndyRussG: oh I see, yeah, let me know when you find out, should be a few hours I'm assuming
[01:35:47] <AndyRussG>	 milimetric: I imagine it's pretty similar to webrequest, eh?
[01:36:29] <milimetric>	 AndyRussG: no, that I know more intimately, but the process here is very different
[01:36:33] <AndyRussG>	 At this point I see just a 1hr 30 minute delay about
[01:36:40] <AndyRussG>	 oh hmm interesting
[01:36:41] <milimetric>	 it's a totally different pipeline because it comes through python
[01:36:50] <milimetric>	 yeah, so webrequest has at least 2 hours delay
[01:39:30] <AndyRussG>	 milimetric: I guess lemme bother you quickly for some more questions regarding this task? T186048
[01:39:31] <stashbot>	 T186048: Adapt Druid banenr_activity jobs to EventLogging-based impression recording - https://phabricator.wikimedia.org/T186048
[01:39:54] <AndyRussG>	 Basically we're porting our old special snowflake banner logging to EL
[01:40:23] <milimetric>	 ok
[01:40:27] <AndyRussG>	 :)
[01:40:51] <AndyRussG>	 So the existing Druid ingress pipeline queries webrequest
[01:40:56] <AndyRussG>	 in Hive
[01:41:43] <AndyRussG>	 https://github.com/wikimedia/analytics-refinery/blob/master/oozie/banner_activity/druid/daily/generate_daily_druid_banner_activity.hql
[01:42:05] <AndyRussG>	 So the idea is just to port this over to pull data from the EL Hive tables instead
[01:42:13] <milimetric>	 oh cool, ok
[01:43:00] <AndyRussG>	 Though we could instead continue to query webrequest, since EL events are logged via an HTTP request, in any case
[01:43:19] <AndyRussG>	 I assume the recommended path is to go to the EL Hive tables, though
[01:43:40] <AndyRussG>	 That means there's less data for the Druid ingress jobs to sift through
[01:44:04] <AndyRussG>	 So I guess less Hive query cycles consumed
[01:44:32] <milimetric>	 yeah, querying webrequest is definitely a needle in a haystack
[01:44:43] <AndyRussG>	 I mean, it works so far
[01:44:56] <AndyRussG>	 Though we don't have much visibility into how much of a draw it may be
[01:44:57] <milimetric>	 a lot of people querying webrequest at the same time for different things is definitely going to end up in a lot of people getting stabbed with needles
[01:45:13] <AndyRussG>	 heheh awww
[01:45:18] <AndyRussG>	 It looks like all the data we need is in the EL Hive tables
[01:45:20] <milimetric>	 it's terrabytes of data and ridiculous amounts of CPU
[01:45:39] <milimetric>	 you had mentioned something on the task about some fields not being available, that's all sorted out?
[01:45:59] <AndyRussG>	 Yeah all the fields we really need are there
[01:46:25] <AndyRussG>	 There were a few tidbits of data points in webrequest that I had my eye on for future Druid dimensions
[01:46:41] <milimetric>	 ok, great, yeah, this will help a lot, plus it should be a lot more agile for you because we actively discourage sifting through webrequest but you can build more confidently on top of the EL pipeline
[01:46:58] <AndyRussG>	 Yeah that's what I imagined
[01:47:17] <milimetric>	 well, as long as you plan changes to your schema that are backwards compatible, you can just evolve the schema.  Like, only adding and not renaming fields
[01:47:26] <AndyRussG>	 right
[01:48:40] <AndyRussG>	 nocookie, proxy, other fancy queries for filtering out bots, looked nice for future stuff to add into Druid... However, as you suggest, better to build on the EL pipeline, no?
[01:48:51] <AndyRussG>	 as in, if we need more stuff, try to get those added to the EL Hive tables
[01:51:09] <wikibugs_>	 (03PS1) 10Milimetric: Update geowiki aggregation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T191343)
[01:51:11] <wikibugs_>	 (03PS1) 10Milimetric: Rename geowiki to geoeditors, step 1 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431689
[01:51:13] <wikibugs_>	 (03PS1) 10Milimetric: Rename geowiki to geoeditors, step 2 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431690
[01:51:16] <AndyRussG>	 make sens?
[01:51:19] <AndyRussG>	 sense
[01:51:26] <AndyRussG>	 thx!
[01:51:51] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Vet new geo wiki data - https://phabricator.wikimedia.org/T191343#4188995 (10Milimetric) Also renamed geowiki to geoeditors in https://gerrit.wikimedia.org/r/#/c/431690/
[01:52:31] <milimetric>	 AndyRussG: yeah, exactly
[01:53:13] <AndyRussG>	 :)
[01:53:17] <milimetric>	 and AndyRussG, we're going to do a big project soon where we look at identifying bots in a more sophisticated way, so everyone will be able to benefit from that, it may not be worth it for you all to di it separately
[01:53:36] <AndyRussG>	 oooh fun
[01:54:06] <AndyRussG>	 route out those spammy scrapers!
[01:56:18] <AndyRussG>	 yeah we had to dig into that a few times when we saw anomalous banner impression rates (i.e. banner impressions/pageviews for a specific segments of users)
[01:56:41] <icinga-wm>	 RECOVERY - Number of segments reported as unavailable by the Druid Coordinators -Analytics cluster- on einsteinium is OK: (C)10 gt (W)5 gt 1 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[01:58:00] <milimetric>	 AndyRussG: yeah, it's a tough problem in general, but in specific cases you can kind of build decent heuristics
[01:58:19] <milimetric>	 k, nitey nite time for me, ping me on the task if you have more questions
[01:58:38] <AndyRussG>	 milimetric: thx again, cya :)
[02:52:33] <wikibugs_>	 10Analytics, 10Discovery-Analysis, 10Product-Analytics: Get 'sparklyr' working on stats1005 - https://phabricator.wikimedia.org/T139487#4189097 (10GoranSMilovanovic) @Ottomata In relation to T139487#4161142 ("Anyone can kill YARN jobs that they own"), from **stat1005**:  ```yarn kill -applicationId applicati...
[03:04:45] <wikibugs_>	 10Analytics, 10Discovery-Analysis, 10Product-Analytics: Get 'sparklyr' working on stats1005 - https://phabricator.wikimedia.org/T139487#4189106 (10chelsyx) @GoranSMilovanovic This may work: `yarn application -kill application_1523429574968_100322`
[03:11:58] <wikibugs_>	 10Analytics, 10Discovery-Analysis, 10Product-Analytics: Get 'sparklyr' working on stats1005 - https://phabricator.wikimedia.org/T139487#4189109 (10GoranSMilovanovic) @chelsyx Thanks! I'll try it out.
[03:52:50] <icinga-wm>	 PROBLEM - Number of segments reported as unavailable by the Druid Coordinators -Analytics cluster- on einsteinium is CRITICAL: 36 gt 10 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[04:20:20] <icinga-wm>	 RECOVERY - Number of segments reported as unavailable by the Druid Coordinators -Analytics cluster- on einsteinium is OK: (C)10 gt (W)5 gt 5 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[06:20:43] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4189219 (10Psychoslave) >>! In T193728#4188828, @Denny wrote: > Re Psychoslave: >  > Having a statement in Wikidata with a reference, where the referenced work is not published under CC...
[06:26:11] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4189225 (10Psychoslave) >>! In T193728#4188820, @Denny wrote: > I will start a discussion on the Project Chat, in order to gather more visibility.  Thank you very much for that. I see t...
[06:41:10] <elukey>	 !log rolling restart of druid-historicals on druid100[456] due to half of the segments not avaiable
[06:41:11] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:52:10] <elukey>	 still recovering, it happened the same on the analytics cluster before
[07:08:59] <elukey>	 the metric looks good, the coordinator's UI is now showing less than 100 segments to load (~19G)
[07:09:36] <elukey>	 now the logs are clear
[07:09:54] <elukey>	 exactly the same (hopefully) one time problem that happened before
[07:10:22] <elukey>	 I am pretty sure the segments format change from 0.9 to 0.10 has something to do with this
[07:10:46] <elukey>	 because after a rolling restart the historicals start to realize about missing segments, the need for a delete/reload, etc..
[07:10:52] <elukey>	 and then they reach a clear state
[07:16:25] <elukey>	 in https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&orgId=1&var-cluster=druid_public&var-druid_datasource=All&from=now-24h&to=now&var-datasource=eqiad%20prometheus%2Fanalytics&panelId=46&fullscreen
[07:16:30] <elukey>	 I added also the load queue count
[07:19:06] <elukey>	 mw-history-reduced is now 100% loaded on the coordinator's UI console \o/
[07:27:13] <icinga-wm>	 RECOVERY - Number of segments reported as unavailable by the Druid Coordinators -Public cluster- on einsteinium is OK: (C)10 gt (W)5 gt 5 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_public&var-druid_datasource=All
[07:27:20] <joal>	 Hi elukey 
[07:27:24] <joal>	 Thanks for the rolling restart
[07:27:32] <elukey>	 morning!
[07:27:42] <joal>	 elukey: I don' underdstand the segment format change thing though :(
[07:28:04] <elukey>	 ah me too, it is only a speculation from the log reading, might be something else entirely
[07:28:19] <joal>	 mwarf :(
[07:29:42] <joal>	 elukey: Have you managed to test a batch indexation on druid 0.11, or do you wish me to do so?
[07:32:11] <elukey>	 joal: I didn't manage to and I found a weird error in the log (it is reported in the task) but I am pretty sure it is me not fully aware of how indexation works
[07:32:21] <elukey>	 so if you have time it would be great
[07:32:25] <joal>	 elukey: doing it now :)
[07:32:29] <elukey>	 are you working today? If not please don't :)
[07:32:54] <joal>	 elukey: Working today (even if bank holiday), swapping it for tomorrow :)
[07:33:27] <joal>	 elukey: Melissa doesn';t work today, so she cares the kids, but she'll working tomorrow, so I'll do it :)
[07:37:12] <elukey>	 ack :)
[07:38:32] <elukey>	 joal: very interesting https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&orgId=1&var-cluster=druid_public&var-druid_datasource=All&from=now-3h&to=now&var-datasource=eqiad%20prometheus%2Fanalytics&panelId=46&fullscreen
[07:39:01] <joal>	 wow
[07:39:43] <joal>	 indeed elukey, yesterday was the first time we did an indexation on druid-public
[07:40:38] <elukey>	 yesterday I thought it was due to that but I didn't recheck in my late evening, and this morning I saw that the alarm wasn't clear yet, so I immediately checked the historicals and I found what I was expecting :(
[07:40:55] <joal>	 mwarf :(
[07:41:00] <elukey>	 in theory, this glitch happens only the first time after the upgrade
[07:41:04] <elukey>	 on the analytics cluster we are good
[07:41:09] <joal>	 that's what we have experienced yes
[07:42:20] <joal>	 !log Rerun mediawiki-history-druid-wf-2018-04 in a non-sync way with mediawiki-reduced
[07:42:21] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:42:58] <joal>	 elukey: except for the 2 druid jobs that fail when launched at the same time, MWH was a success this month !!
[07:43:08] <elukey>	 \o/
[07:43:14] * joal is feeling better, if not yet good
[07:43:46] <elukey>	 is mediawiki-history-druid-wf-2018-04 going to trigger another indexation?
[07:44:41] <joal>	 elukey: on analytics cluster, yes
[07:45:09] <elukey>	 ahhh okok
[07:45:10] <elukey>	 ack
[07:48:02] <joal>	 elukey: currently importing/refining recent fake data in hadoop labs, then index
[07:48:44] <wikibugs_>	 10Analytics, 10Patch-For-Review, 10Wikipedia-Android-App-Backlog (Android-app-release-v2.7.24x-I-Ice-lolly): Include HTTP Referer header when navigating through internal links - https://phabricator.wikimedia.org/T192779#4189348 (10Tbayer)
[07:50:00] <wikibugs_>	 10Analytics, 10Patch-For-Review, 10Wikipedia-Android-App-Backlog (Android-app-release-v2.7.24x-I-Ice-lolly): Include HTTP Referer header when navigating through internal links - https://phabricator.wikimedia.org/T192779#4149695 (10Tbayer) Great! Adding #Analytics so that the Analytics Engineering team is awa...
[08:01:24] <elukey>	 !log removed cassandra-metrics-collector (graphite) from aqs nodes
[08:01:25] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:05:05] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4178076 (10Gnom1) Hi, as the author of [[ https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights | the Wikilegal page on Wikidata ]], I'd love to help with this one.
[08:16:55] <elukey>	 so joal, let me know if the procedure that I followed yesterday made sense
[08:17:53] <elukey>	 1) I executed the camus job manually as described in https://gist.github.com/jobar/c40cd46ef86d832f9f1c1f8b4e5bb98f for the same hour but belonging to May 6th
[08:18:55] <elukey>	 in order to verify, I'd need to check for the IMPORTED flag (for the pull from kafka part) and then SUCCESS (ack for refinement?)
[08:19:21] <elukey>	 2) run the Druid indexation for the same hour and then check via query if the results are ok
[08:19:50] <joal>	 That looks good elukey - Some comments:
[08:20:16] <joal>	  - amus doesn't run for any given hour - It checks previously imported offset, and starts from there
[08:20:29] <elukey>	 that makes sense ok
[08:20:52] <joal>	 After camus, webrequest-refine is needed, for the _SUCCESS file to be there
[08:21:26] <joal>	 Actually, in between and associated to camus, we have the CamusPartitionChecker, that does the job of creating the _IMPORTED flag for hours fully imported
[08:23:30] <joal>	 elukey: I have done the same exact
[08:23:59] <joal>	 elukey: I'm now finalizing some indexation - 1 failed (I need to investigate), but looks like the next one is going to succeed
[08:26:56] <elukey>	 ack
[08:27:54] <elukey>	 so the PARTITIONED/SUCCESS Flags in raw are related to Hive, meanwhile the the SUCCESS one on refined is related to refinery succeeded for that hour right?
[08:28:19] <elukey>	 I keep confusing those
[08:28:23] <joal>	 elukey: _SUCCESS flags are the ones we use for simple datasets
[08:28:40] <joal>	 before being processed, no flag, once finished successfully , _SUCCESS flag
[08:28:57] <joal>	 For some datasets, the status is more complex
[08:30:01] <joal>	 For webrequest raw data, data is first flagged as _IMPORTED (by the CamusPartitionChecker) when a full hour is done - This triggers the webrequest.load oozie job, that creates the hive partition and refines the data
[08:30:06] <elukey>	 yep yep, I meant the meaning of those flags in  raw vs refined
[08:30:31] <joal>	 in webrequest refine, only _SUCCESS
[08:31:02] <joal>	 in mediawiki-history, _SUCCESS when the data is computed, then _PARTITIONED when hive tables know about the data (partitions added)
[08:31:22] <joal>	 elukey: It looks I have issues with indexation :(
[08:32:08] <joal>	 elukey: 2 failed reducers for my indexation task :(
[08:33:22] <elukey>	 I am checking index_hadoop_webrequest_2018-05-08T08:16:11.222Z.log in /var/lib/druid/indexing-logs
[08:33:29] <elukey>	 and it seems the same error that I got
[08:33:33] <joal>	 indeed
[08:33:48] <joal>	 I think it's classpath versions issues
[08:33:55] <elukey>	 Error in custom provider, java.lang.VerifyError: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/lang/Object;
[08:34:02] <joal>	 elukey: have you managed to get rid of them?
[08:34:23] <elukey>	 nope, yesterday I didn't have time to properly investigate
[08:34:32] <joal>	 ok elukey - Will do so
[08:35:17] <elukey>	 but one thing that I noticed is that 0.11 sources include a new version of the hadoop-client, 2.7.x (vs 2.3 used previously). Now we do the trick of adding our own jars via the symlinked extension 
[08:35:22] <elukey>	 (at least, IIUC)
[08:35:35] <elukey>	 that is one of the solutions described in the doc that I linked in the task
[08:35:45] <joal>	 elukey: we run hadoop 2.6, we probably shouldn't use 2.7
[08:36:02] <elukey>	 in theory though it is not used
[08:36:36] <elukey>	 we have /usr/share/druid/extensions/druid-hdfs-storage-cdh
[08:37:41] <elukey>	 druid.extensions.loadList=["druid-datasketches","druid-hdfs-storage-cdh","druid-histogram","druid-lookups-cached-global","mysql-metadata-storage"]
[08:38:03] <elukey>	 mmm but maybe that part is only for loading from hdfs, not indexation?
[08:39:00] <elukey>	 http://druid.io/docs/0.11.0/operations/other-hadoop.html
[08:41:41] <joal>	 elukey: testing a ack right now
[08:42:55] <elukey>	 elukey@d-1:/usr/share/druid/hadoop-dependencies/hadoop-client$ ls
[08:42:56] <elukey>	 2.7.3  cdh
[08:43:13] <elukey>	 elukey@druid1003:~$ ls /usr/share/druid/hadoop-dependencies/hadoop-client
[08:43:16] <elukey>	 2.3.0  cdh
[08:43:18] <elukey>	 yeppa
[08:44:07] <elukey>	 do we use the hadoopDependencyCoordinates ?
[08:44:23] <joal>	 elukey: I have no clue
[08:45:31] <elukey>	 I am reading "Preferred: Load using Druid's standard mechanism"
[08:45:53] <elukey>	 in theory using 'cdh' should do the right thing for the map reduce jobs
[08:46:49] <elukey>	 it should be in the jobs specs sent to the overlord
[08:49:13] <elukey>	 afaics we don't set any version in /srv/deployment/analytics/refinery/oozie/webrequest/druid/hourly/load_webrequests_hourly.json.template
[08:49:51] <elukey>	 so up to 0.10, by default 2.3 was used, meanwhile now with 2.7.3 we are in trouble
[08:50:00] <elukey>	 but if we use cdh we should be fine
[08:51:40] <joal>	 elukey: following https://github.com/druid-io/druid/issues/2087, I tried to add ""mapreduce.job.user.classpath.first": "true"" to the template
[08:51:54] <joal>	 elukey: I can try with the exact version
[08:52:36] <elukey>	 joal: I'd do a simple run of the indexation job with hadoopDependencyCoordinates: "cdh"
[08:54:30] <joal>	 elukey: Looks like "hadoopDependencyCoordinates" is expecting a java-style  dependency
[08:56:01] <joal>	 elukey: I'm gonna try to various potential solutions suggested in the page you linked 
[08:56:35] <joal>	 elukey: Currently on "user.classpath.first", then will try "mapreduce.job.classloader = true"
[08:56:49] <elukey>	 joal: wait a sec
[08:57:11] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4178076 (10Tgr) I think this proposal mixes two different things in an unhelpful way: * Wikidata (AIUI) largely ignores legal protections on databases ("sui generis database rights") as...
[08:57:16] <elukey>	 in the doc it is suggested to test the hadoopdepCoordinates and Andrew already did the groundwork
[08:57:23] <elukey>	 what about "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:cdh"] ?
[08:57:27] <elukey>	 can we test it>
[08:57:27] <elukey>	 ?
[08:57:56] <joal>	 elukey: we can test it
[08:58:18] <joal>	 elukey: tip#2 is about classPath properties change though
[08:59:13] <elukey>	 yeah but I am reading the example for "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.4.0"] and it seems doing what we need
[08:59:25] <elukey>	 "This instructs Druid to load hadoop-client 2.4.0 when processing the task. What happens behind the scene is that Druid first looks for a folder called hadoop-client underneath druid.extensions.hadoopDependenciesDir, then looks for a folder called 2.4.0 underneath hadoop-client, and upon successfully locating these folders, hadoop-client 2.4.0 is loaded."
[08:59:31] <elukey>	 I might be super wrong of course :)
[08:59:41] <joal>	 elukey: look also after --> Notes on specific Hadoop distributions --> CDH
[09:00:44] <elukey>	 joal: yep yep you are right, we can probably test both
[09:01:08] <elukey>	 as you prefer :)
[09:01:24] <joal>	 I'll test both
[09:01:33] <joal>	 the one with user.classpath just failed
[09:13:47] <joal>	 The one with classloader fialed as well
[09:14:30] <joal>	 elukey: I'm super glad we are testing his on labs :)
[09:19:21] <elukey>	 yes definitely :)
[09:21:53] <joal>	 failed with hadoopDependencyCoordinates as well hm - Need more investifation
[09:28:29] <elukey>	 joal: did you get the same error in the indexing-log logfile each time?
[09:29:52] <elukey>	 yeah seems so
[09:29:54] <elukey>	 super weird
[09:45:56] <elukey>	 joal: https://github.com/druid-io/druid/releases -> "Deprecation of support for Hadoop versions < 2.6.0"
[09:46:08] <elukey>	 apparently it happened from 0.10.1 onwards
[09:46:15] <joal>	 WHATTTT???
[09:46:30] <elukey>	 there seems to be a workaround though
[09:47:22] <elukey>	 If users wish to use Hadoop 2.7.3 as default for ingestion tasks, users should double check any existing druid.indexer.task.defaultHadoopCoordinates configurations.
[09:48:09] <elukey>	 maybe we can also test druid.indexer.task.defaultHadoopCoordinates in midddlemanager's config
[09:48:23] <joal>	 elukey: I put that in the config
[09:48:32] <joal>	 elukey: it still fails :(
[09:49:14] <elukey>	 joal: in the middlemanager's config + restart of the daemon?
[09:49:20] <elukey>	 or in the json submitted to the overlord?
[09:50:06] <joal>	 elukey: json sent to overlord
[09:50:58] <joal>	 elukey: still the same error in hadoop job on my side
[09:51:33] <elukey>	 let's try to see if adding the option to the mm works, but I think that the hadoop.compile setting in druid's pom is the root cause of the issues
[09:53:45] <joal>	 elukey: I was also messing things woth my template - testing again
[09:54:17] <joal>	 elukey: I'm assuming that hdp-2.7.3 is expected from druid0.12 onward, no?
[09:55:02] <elukey>	 they don't clearly state it afaics, but it is suggested yes
[09:55:18] <elukey>	 joal: lemme run puppet on druid to restart mm
[09:55:21] <joal>	 elukey: I'd like to make it work f v0.11 if feasible
[09:55:35] <joal>	 elukey: test launch, can you wait a minute?
[09:55:50] <elukey>	 already done sorry :(
[09:56:57] <elukey>	 all readh
[09:57:00] <elukey>	 ready
[09:57:10] <elukey>	 the mm will spawn peons running with druid.indexer.task.defaultHadoopCoordinates=org.apache.hadoop:hadoop-client:cdh
[10:01:18] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Druid clusters to 0.11 - https://phabricator.wikimedia.org/T193712#4189781 (10elukey) https://github.com/druid-io/druid/releases for 0.10.1 mention this:  ``` Deprecation of support for Hadoop versions < 2.6.0 To add support for Am...
[10:02:18] <elukey>	 also, re-reading the statement shows that they are deprecating versions OLDER than 2.6.0 joal, so it shouldn't be our case
[10:02:39] <joal>	 elukey: seems that your last patch broke something - I don't even get uindexation jobs anymore (and oozie doesn't fail - super weird)
[10:03:42] <elukey>	 2018-05-08T09:55:42,841 ERROR io.druid.cli.CliMiddleManager: Error when starting up.  Failing.
[10:03:45] <elukey>	 com.google.inject.ProvisionException: Unable to provision, see the following errors:
[10:03:48] <elukey>	 1) Problem parsing object at prefix[druid.indexer.task]: Can not deserialize instance of java.util.ArrayList out of VALUE_STRING token
[10:04:27] <joal>	 Ah elukey - Sorry, must be on my side :(
[10:05:28] * elukey hopes that the next run will work without exploding :(
[10:05:38] <joal>	 :S
[10:11:47] <elukey>	 joal: any luck in launching the indexation job?
[10:12:24] <joal>	 elukey: nope, looks like my task doesn't even get started on hadoop :(
[10:12:53] <joal>	 indexation was asked at 10:07 (UTC hadoop time)
[10:13:24] <joal>	 on d-1, overlord has: 2018-05-08T10:07:25,767 INFO io.druid.indexing.overlord.MetadataTaskStorage: Inserting task index_hadoop_webrequest_2018-05-08T10:07:25.766Z with status: TaskStatus{id=index_hadoop_webrequest_2018-05-08T10:07:25.766Z, status=RUNNING, duration=-1}
[10:14:35] <joal>	 elukey: it's super weird: no realtime job is running, and the cluster still seem to be willing to start indexation tasks :(
[10:17:08] <joal>	 elukey: I do think the patch you had broke the thing actually ...
[10:19:03] <elukey>	 joal: could be yes, but I don't see any sign of errors now on d1's middlemanager log
[10:19:30] <elukey>	 last one was 2018-05-08T09:55:42,841 ERROR io.druid.cli.CliMiddleManager: Error when starting up.  Failing.
[10:19:31] <joal>	 elukey: no more MM logs after your last restart on either d-1,2,3 
[10:19:47] <joal>	 I'm assuming they are just broken
[10:20:44] <elukey>	 yep they seems broken indeed
[10:21:38] <elukey>	 I restarted them and now on d-1 it works
[10:21:55] <joal>	 ok - I'm not completely mad :)
[10:22:41] <elukey>	 joal: let's retry!
[10:27:47] <elukey>	 nope middle managers die
[10:28:20] <elukey>	 it only says 2018-05-08 10:21:51,578 Thread-5 ERROR Unable to register shutdown hook because JVM is shutting down.
[10:28:23] <elukey>	 that doesn't make sense
[10:28:52] <elukey>	 ah no wait
[10:28:52] <elukey>	 2018-05-08T10:21:51,485 ERROR io.druid.cli.CliMiddleManager: Error when starting up.  Failing.
[10:28:55] <elukey>	 com.google.inject.ProvisionException: Unable to provision, see the following errors:
[10:28:59] <elukey>	 this is more recent
[10:29:43] <elukey>	 ahhh wait now I think I know what Caused by: java.lang.IllegalArgumentException: Can not deserialize instance of java.util.ArrayList out of VALUE_STRING token is
[10:30:21] <elukey>	 the setting that I added needs to be an array probably
[10:30:33] <elukey>	 the docs have a bad example to copy/paste
[10:30:35] * elukey grumbles
[10:37:34] <elukey>	 joal: if you don't hate me yet, let's retry one more time
[10:41:13] <joal>	 elukey: no hate at all- just left for a few minues sorry
[10:41:17] <joal>	 Trying o resindex now !
[10:47:55] <elukey>	 failing :(
[10:49:30] <joal>	 elukey: same exact error about versions mismatch
[10:49:31] <joal>	 hm
[10:50:14] <joal>	 elukey: Can we review the hadoopDependencyCoordinates value I should use?
[10:52:30] <joal>	 elukey: I have the feeling that "org.apache.hadoop:hadoop-client:cdh" is not what it should be
[10:57:53] <elukey>	 joal: I think that one is good.. reading from http://druid.io/docs/latest/operations/other-hadoop.html
[10:58:07] <elukey>	 "This instructs Druid to load hadoop-client 2.4.0 when processing the task. What happens behind the scene is that Druid first looks for a folder called hadoop-client underneath druid.extensions.hadoopDependenciesDir, then looks for a folder called 2.4.0 underneath hadoop-client, and upon successfully locating these folders, hadoop-client 2.4.0 is loaded."
[10:58:43] <joal>	 ok then
[10:58:45] <elukey>	 so we set druid.extensions.hadoopDependenciesDir=/usr/share/druid/hadoop-dependencies
[10:59:07] <elukey>	 elukey@d-1:/var/lib/druid/indexing-logs$ ls /usr/share/druid/hadoop-dependencies/hadoop-client/
[10:59:10] <elukey>	 2.7.3  cdh
[10:59:39] <elukey>	 on paper it should work
[10:59:59] <joal>	 it doesn't fail per say - But indexation doesn't work :(
[11:02:07] <elukey>	 in https://github.com/druid-io/druid/blob/0.10.0/pom.xml
[11:02:15] <elukey>	         <!-- Watch out for Hadoop compatibility when updating to >= 2.5; see https://github.com/druid-io/druid/pull/1669 -->
[11:02:18] <elukey>	         <jackson.version>2.4.6</jackson.version>
[11:03:41] <elukey>	 2.4.6 is also on 0.10, so it is weird
[11:03:56] <elukey>	 <hadoop.compile.version>2.7.3</hadoop.compile.version> seems to be the issue
[11:06:36] <elukey>	 there might be druid.extensions.hadoopContainerDruidClasspath to try
[11:07:25] <elukey>	 "Hadoop Indexing launches hadoop jobs and this configuration provides way to explicitly set the user classpath for the hadoop job. By default this is computed automatically by druid based on the druid process classpath and set of extensions. However, sometimes you might want to be explicit to resolve dependency conflicts between druid and hadoop."
[11:09:06] <elukey>	 I found the thing from http://druid.io/docs/latest/ingestion/batch-ingestion.html
[11:10:40] <joal>	 not sure how to set this up though :(
[11:11:39] <elukey>	 me too
[11:12:08] <joal>	 doing a last test for indexation, but I don't feel we're gonna have this working without recompiling druid :(
[11:12:52] <wikibugs_>	 10Analytics: Add legacy per-article pagecounts data (prior to 2015) - https://phabricator.wikimedia.org/T193759#4189974 (10CristianCantoro) >>! In T193759#4179085, @Milimetric wrote: > In the meantime, maybe we can host the data on dumps as files?  I am wondering if I can put this data in a format that would be...
[11:14:41] <elukey>	 joal: something interesting - in the middlemanager's log I always find "hadoopDependencyCoordinates" : null,
[11:15:02] <joal>	 Interesting !!!!
[11:15:17] <joal>	 elukey: in previous indexation tasks?
[11:15:52] <wikibugs_>	 10Analytics, 10Analytics-Wikistats, 10Accessibility, 10Easy, 10Patch-For-Review: Wikistats Beta: Fix accessibility/markup issues of Wikistats 2.0 - https://phabricator.wikimedia.org/T185533#4189980 (10mforns) @Volker_E Thanks, I understand it better now. We'll work on this task in the upcoming weeks.
[11:16:19] <elukey>	 joal: yeah last one was 2018-05-08T11:12:15,535 INFO
[11:17:04] <joal>	 interestingly enough, in MM logs, in the last indexation I just did, I didn't specify the parameter on purpose, the see if your change would do anything
[11:17:44] <elukey>	 but grepping through I am not seeing it in all the mm logs
[11:17:55] <elukey>	 maybe for some reason we didn't fully test it?
[11:19:07] <joal>	 There must be something worng about how I pass the param
[11:19:14] <joal>	 I'm gonna double check again
[11:19:31] <joal>	 it's null everywhere elukey as you said - Shouldn't be so, woudln'tit?
[11:19:56] <elukey>	 joal: so in theory I'd expect to see it populated with our hadoop-client:cdh
[11:20:05] <joal>	 +1, or something similar
[11:20:33] <joal>	 ok I think I know
[11:21:34] <joal>	 trying again
[11:23:21] <joal>	 YAY ! At least i's set now
[11:23:30] <joal>	 Let's see if indexing work
[11:25:50] <joal>	 hm - doesn't smell good - stuck at reducer results writing
[11:25:55] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Present a page view metric description to the user that they are likely to understand - https://phabricator.wikimedia.org/T182109#4190020 (10mforns) a:03sahil505
[11:26:07] <joal>	 need to test with other parameters added after that one
[11:26:12] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Present a page view metric description to the user that they are likely to understand - https://phabricator.wikimedia.org/T182109#3813007 (10mforns) a:05sahil505>03None
[11:50:13] <elukey>	 so joal better but different issue? (reduce stuck?)
[11:50:43] <joal>	 elukey: tried with various settings - Best I what you describe yes
[11:51:02] <joal>	 Works the same without MapReduce param or with user.classPath.first set
[11:51:13] <joal>	 However with mapreduce.job.classloader, no chance
[11:51:26] <joal>	 I'm waiting foir the current trial to fail
[11:51:57] <elukey>	 so I can see only mapreduce.job.classloader = true in the logs, did we test this param without it?
[11:53:08] <joal>	 I didn't set it to false explicitely elukey, but I have ran tests without setting it
[11:53:14] <elukey>	 ack
[11:53:42] <joal>	 elukey: grep for "jobProperties" - You'll see times where it is not set
[11:55:00] <elukey>	 I can try to remove the middlemanager setting that I put
[11:55:22] <joal>	 elukey: we can try (at that point, I'm surely ok to test anything)
[11:55:40] <elukey>	 but do we have any idea why it stops in the redue?
[11:55:42] <elukey>	 reduce?
[11:56:06] <joal>	 I'm waiting for it to fail to be sure, but I think it;s again the same issue
[11:56:13] <joal>	 (need confirmation)
[11:58:49] <elukey>	 what a mess
[11:59:45] <joal>	 pfff - Between a few PBCAK and the problem, it's not a nice day ...
[12:00:46] <joal>	 elukey: Different error !!!!
[12:00:56] <joal>	 /user/druid/deep-storage-analytics_test0/webrequest/2018-05-08T00:00:00.000Z_2018-05-08T01:00:00.000Z/2018-05-08T11:47:44.005Z/0/index.zip.1 from hdfs://analytics-hadoop-labs/user/druid/deep-storage-analytics_test0/webrequest/2018-05-08T00:00:00.000Z_2018-05-08T01:00:00.000Z/2018-05-08T11:47:44.005Z/0/index.zip.1 is not a valid DFS filename
[12:03:29] <elukey>	 where is this log file?
[12:03:40] <joal>	 yarn logs --applicationId application_1524125463609_0701 --appOwner druid |less
[12:03:44] <elukey>	 ahh
[12:03:53] <joal>	 :)
[12:04:15] <joal>	 elukey: I love how you implicitely look at app logs, and me at hadoop logs ;)
[12:07:07] <elukey>	 I found https://community.hortonworks.com/questions/177584/what-would-be-the-right-command-to-start-druid-had.html
[12:07:15] <elukey>	 they talk about 0.12 though
[12:08:51] <joal>	 elukey: could we be missing the HDFS jars?
[12:09:56] <elukey>	 in the extensions?
[12:13:29] <elukey>	 or do you mean in /usr/share/druid/hadoop-dependencies/hadoop-client/cdh ?
[12:13:47] <joal>	 in /usr/share/druid/hadoop-dependencies/hadoop-client/cdh
[12:15:09] <elukey>	 mmmm don't think so, the links are fine afaics
[12:15:48] <elukey>	 all right I am going to have a quick lunch and then prepare for the interview and the Kafka upgrade that we'll do in ~1h :(
[12:16:00] <joal>	 mwarf
[12:16:06] <joal>	 Have a good lunch elukey 
[12:21:11] <elukey>	 joal: thanks! Worst case scenario I'll try to modify the pom and build Druid from source
[12:21:18] <joal>	 :(
[12:21:21] <elukey>	 it is a pain since we are using the jars that they ship
[12:35:34] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Beta: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#4190148 (10mforns)
[12:36:22] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Beta: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3974963 (10mforns)
[12:38:28] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Wikistats2 line chart and map displacement bugs in Chrome+Ubuntu - https://phabricator.wikimedia.org/T189197#4190156 (10mforns)
[12:38:37] <joal>	 taking a break -team
[13:14:02] <wikibugs_>	 10Analytics, 10EventBus, 10Services (doing), 10User-Elukey: Kafka sometimes misses to rebalance topics properly - https://phabricator.wikimedia.org/T179684#4190314 (10elukey) Can you report a graph or a time window so we can investigate in the logs? main-eqiad?
[13:14:28] <elukey>	 ottomata: o/
[13:16:52] <ottomata>	 yoohoo
[13:25:14] <wikibugs_>	 10Analytics: Add legacy per-article pagecounts data (prior to 2015) - https://phabricator.wikimedia.org/T193759#4190363 (10CristianCantoro) Anyway, I am totally ok with uploading these data, I think I just need a server where to save  them.
[13:33:04] <wikibugs_>	 10Analytics: Add legacy per-article pagecounts data (prior to 2015) - https://phabricator.wikimedia.org/T193759#4190449 (10Milimetric) @CristianCantoro: I'm sorry I didn't think of this, but isn't this what pagecounts-ez already did? https://dumps.wikimedia.org/other/pagecounts-ez/merged/  Oh but you have them g...
[13:38:07] <wikibugs_>	 10Analytics, 10EventBus, 10Services (doing), 10User-Elukey: Kafka sometimes misses to rebalance topics properly - https://phabricator.wikimedia.org/T179684#4190496 (10Pchelolo) Oh, sorry. It actually just happened again at 07:13 UTC:  {F18080061}  {F18080063}  The important thing here is in eqiad, codfw is...
[13:50:29] <ottomata>	 a-team elukey and i will be missing meetings today to do the kafka upgrade
[13:50:38] <ottomata>	 (hard to schedule, petr in SF and alex and luca in euro)
[13:50:56] <milimetric>	 cool, good luck
[14:18:29] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Improve scoping of CSS - https://phabricator.wikimedia.org/T190915#4190647 (10mforns) a:03sahil505
[14:18:31] <wikibugs_>	 10Analytics: Generate pagecounts-ez data back to 2008 - https://phabricator.wikimedia.org/T188041#3994247 (10CristianCantoro) I coming from [[ https://phabricator.wikimedia.org/T193759 | T193759 ]], I can help with this. Is the script doing the merge available? I can run it on one of my machines and let it run e...
[14:18:33] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Improve scoping of CSS - https://phabricator.wikimedia.org/T190915#4087497 (10mforns) p:05Low>03Normal
[14:20:19] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats: Upgrading Wikistats 2.0 footer UI/design - https://phabricator.wikimedia.org/T191672#4190660 (10mforns) p:05Triage>03Normal
[14:21:26] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4190673 (10Ottomata)
[14:21:44] <wikibugs_>	 10Analytics, 10Analytics-Wikistats, 10Patch-For-Review: Hide "Load more rows..." once all the data is visible in Table Chart - https://phabricator.wikimedia.org/T192407#4190674 (10mforns) p:05Triage>03Normal
[14:32:11] <wikibugs_>	 10Analytics: Add legacy per-article pagecounts data (prior to 2015) - https://phabricator.wikimedia.org/T193759#4190757 (10CristianCantoro)
[14:32:13] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Add wikistats metric about "pagecounts" - https://phabricator.wikimedia.org/T189619#4190758 (10mforns) a:03sahil505
[14:32:15] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Add wikistats metric about "pagecounts" - https://phabricator.wikimedia.org/T189619#4047832 (10mforns) p:05Triage>03Normal
[14:32:18] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Add wikistats metric about "pagecounts" - https://phabricator.wikimedia.org/T189619#4047832 (10mforns)
[14:35:07] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Present a page view metric description to the user that they are likely to understand - https://phabricator.wikimedia.org/T182109#4190767 (10mforns) Let's first work on 1) and 2). And once this is done, we can commit some time to having a popup that pulls the first paragraph...
[14:35:44] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Present a page view metric description to the user that they are likely to understand - https://phabricator.wikimedia.org/T182109#4190768 (10mforns) p:05Triage>03Normal a:03sahil505
[14:37:19] <wikibugs_>	 10Analytics, 10Analytics-Wikistats, 10Accessibility, 10Easy, 10Patch-For-Review: Wikistats Beta: Fix accessibility/markup issues of Wikistats 2.0 - https://phabricator.wikimedia.org/T185533#4190774 (10mforns) p:05Triage>03Normal
[14:40:06] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Beta: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#4190788 (10mforns) p:05Triage>03High a:03sahil505
[14:42:52] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Wikistats2 line chart and map displacement bugs in Chrome+Ubuntu - https://phabricator.wikimedia.org/T189197#4190805 (10mforns)
[14:42:54] <wikibugs_>	 10Analytics-Kanban, 10Google-Summer-of-Code (2018): [Analytics] Improvements to Wikistats2 front-end - https://phabricator.wikimedia.org/T189210#4190804 (10mforns)
[14:44:01] <wikibugs_>	 10Analytics-Kanban, 10Google-Summer-of-Code (2018): [Analytics] Improvements to Wikistats2 front-end - https://phabricator.wikimedia.org/T189210#4035243 (10mforns) a:03sahil505
[14:48:47] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Consider adding breadcrumbs to Wikistats 2 - https://phabricator.wikimedia.org/T178018#4190816 (10mforns) Hey @Milimetric, @fdans, @Nuria My initial guess on this is that it's not needed. Wikistats has only 2 pages: Dashobard and Detail, and lots of navigation options alread...
[14:51:23] <wikibugs_>	 10Analytics-Kanban, 10Google-Summer-of-Code (2018): [Analytics] Improvements to Wikistats2 front-end - https://phabricator.wikimedia.org/T189210#4190832 (10Milimetric)
[14:51:25] <wikibugs_>	 10Analytics, 10Analytics-Wikistats: Consider adding breadcrumbs to Wikistats 2 - https://phabricator.wikimedia.org/T178018#4190830 (10Milimetric) 05Open>03declined yeah, the logo being linked to go back to the home page might have been the bulk of the problem here.  We can close and reconsider later after...
[15:00:05] <fdans>	 hey a-team I'm feeling unwell right now, I'm going to skip standup and I'll send an e-scrum later
[15:00:55] <nuria_>	 fdans: k
[15:00:58] <nuria_>	 ping mforns 
[15:01:51] <mforns>	 coming!
[15:04:10] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4190895 (10Pchelolo)
[15:06:29] <ottomata>	 !log beginnng Kafka upgrade of main-codfw: T167039
[15:06:32] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:06:33] <stashbot>	 T167039: Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039
[15:11:08] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4190926 (10Psychoslave) Thank you @Tgr for bringing more clarification through distinction in the discussion. (also I'm not sure if thanks messages are welcome on phabricator tickets, b...
[15:19:41] <fdans>	 nuria_: the patch you uploaded yesterday for the UA stuff, is it ready to be reviewed?
[15:21:26] * fdans there's like 4 different people construction workers banging a hammer at the walls and I'm going insane 
[15:28:44] <elukey>	 joal: https://github.com/druid-io/druid/issues/2476 - somebody familiar already having a jackson issue before :)
[15:29:06] <joal>	 I have seen that elukey :)
[15:29:08] <joal>	 huhu
[15:29:20] <wikibugs_>	 10Analytics, 10Analytics-Kanban: 2018-03 snapshot still broken - https://phabricator.wikimedia.org/T194075#4190975 (10Nuria) a:03JAllemandou
[15:29:37] <joal>	 elukey: the problem I face now is really weird - seems due to druid not escaping HDFS path :(
[15:29:38] <wikibugs_>	 10Analytics, 10Analytics-Kanban: 2018-03 snapshot still broken - https://phabricator.wikimedia.org/T194075#4187918 (10Nuria) Assigning to @JAllemandou  to rename snapshot name and swap by good one.
[15:30:35] <elukey>	 joal: ah you mean the last one that we got?
[15:30:39] <joal>	 yes
[15:31:04] <elukey>	 lovely
[15:35:20] <wikibugs_>	 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10MediaWiki-extensions-Translate, 10Services (done): Unable to mark pages for translation in Meta - https://phabricator.wikimedia.org/T192107#4190989 (10mobrovac)
[15:35:28] <wikibugs_>	 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4190990 (10mobrovac)
[15:37:41] <wikibugs_>	 10Analytics-Kanban: Use user agent + IP to group anonymous users - https://phabricator.wikimedia.org/T194170#4190995 (10Milimetric)
[15:37:55] <wikibugs_>	 10Analytics-Kanban: Use user agent + IP to group anonymous users - https://phabricator.wikimedia.org/T194170#4191007 (10Milimetric) p:05Triage>03Normal
[15:38:19] <joal>	 !log Try again (last time) to rerun mediawiki-history-druid-wf-2018-04
[15:38:20] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:39:03] <wikibugs_>	 (03PS2) 10Milimetric: Update geowiki aggregation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170)
[15:41:32] <nuria_>	 fdans: i want to change one more thingand i will be done
[15:41:45] <fdans>	 cool!
[15:43:25] <joal>	 elukey: Do you have a minute for me?
[15:43:31] <elukey>	 sure
[15:43:59] <joal>	 elukey: cave? I'd like a pair while moving prod data
[15:46:59] <elukey>	 joal: I think that there is a little issue during the kafka upgrade, checking with Andrew sorry.. can we do after the meetings?
[15:47:06] <wikibugs_>	 10Analytics, 10Analytics-Wikistats, 10Patch-For-Review: Display of radio buttons in Wikistats 2 is somewhat confusing - https://phabricator.wikimedia.org/T183185#4191032 (10Milimetric) @Amitjoki yeah, that sounds good and I think it'll improve those buttons, I'll review when you submit, thank you!
[15:47:10] <joal>	 elukey: ping me when you have a minute :)
[15:47:30] <joal>	 Thanks mate - good luck elukey and ottomata with the kafka stuff - I'm here if needed (mehhh?)
[15:47:45] <wikibugs_>	 (03PS2) 10Milimetric: Rename geowiki to geoeditors, step 1 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431689
[15:47:50] <wikibugs_>	 (03PS2) 10Milimetric: Rename geowiki to geoeditors, step 2 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431690
[15:57:04] <ottomata>	 joal:  FYI, i have a semi working solution (but probably not worth it) do the PartiionedDataFrame method proxy stuff
[15:57:15] <ottomata>	 case class PartitionedDataFrame(df: DataFrame, partition: HivePartition) extends Proxy {
[15:57:15] <ottomata>	     // implement self member for Proxy
[15:57:15] <ottomata>	     val self: DataFrame = df
[15:57:15] <ottomata>	 ...
[15:57:24] <ottomata>	 object PartitionedDataFrame {
[15:57:24] <ottomata>	     implicit def toDataFrame(partDf: PartitionedDataFrame): DataFrame = {
[15:57:24] <ottomata>	         partDf.self
[15:57:24] <ottomata>	     }
[15:57:25] <ottomata>	     implicit def toHivePartition(partDf: PartitionedDataFrame): HivePartition = {
[15:57:25] <ottomata>	         partDf.partition
[15:57:26] <ottomata>	     }
[15:57:26] <ottomata>	 }
[15:57:32] <ottomata>	 this let's you do things like
[15:57:37] <ottomata>	 partDf.select("...")
[15:57:38] <ottomata>	 or
[15:57:47] <ottomata>	 partDf.hiveQL
[15:59:45] <joal>	 Nice ottomata !! I think it still leads to the same problem for chaining functions over a PartitionedDataframe --> first step you get a dataframe back, then you use your datarframe but you're not in PartitionedDataframe mode
[15:59:50] <ottomata>	 yeah
[15:59:51] <ottomata>	 indeed
[16:00:07] <joal>	 mwarf
[16:00:10] <ottomata>	 which is why i'm not sure it is worht it
[16:03:49] <joal>	 ottomata: If we need to explicitely rebuild PartitionedDataframe, then let's keep it explicit, no? The current version of the patch does so
[16:04:17] <ottomata>	 agree
[16:04:19] <ottomata>	 yeah
[16:04:31] <ottomata>	 was just trying stuff
[16:04:48] <ottomata>	 it also got annoying because you can't call the DataFrameExtensions on the partDf implicitly :/
[16:04:49] <ottomata>	 so no
[16:04:54] <ottomata>	 partDf.normalize
[16:05:01] <ottomata>	 you'd have to do partDf.df.normalize anyway
[16:05:43] <joal>	 Oh! Scala doesn't chain explicit calls - Interesting!
[16:05:49] <joal>	 implicits
[16:06:38] <ottomata>	 yeha,i was trying to figure out some way of defineing an implicit to get from partDf to DataFrame extensions, but whoa i got real confused fase
[16:06:39] <ottomata>	 fast
[16:06:52] <joal>	 :)
[16:07:40] <joal>	 I actually find it super funny that a very strictly-typed language like scala builts around execution-implicits, which are super not clean :)
[16:09:52] <ottomata>	 yeah
[16:12:42] <milimetric>	 oozie oozie oozie can't you see,
[16:12:46] <milimetric>	 sometimes your xml just hypnotizes me,
[16:12:55] <milimetric>	 I can barely stand your ancient ways,
[16:13:00] <milimetric>	 guess that's why you broke and I get paid
[16:13:23] * milimetric drops the mic
[16:14:52] <elukey>	 https://giphy.com/gifs/clapping-clap-standing-ovation-qIXVd1RoKGqlO
[16:16:27] <ottomata>	 hah
[16:17:03] <ottomata>	 bouncing 2003
[16:18:53] <ottomata>	 alllrighhhhty
[16:19:03] <ottomata>	 codfw upgraded.
[16:19:13] <ottomata>	 oops wrong chat :)
[16:20:10] <joal>	 Going for diner team - Back after
[16:31:15] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Druid clusters to 0.11 - https://phabricator.wikimedia.org/T193712#4191223 (10elukey) So the indexing error seems to be related to a Jackson version updated indirectly by 0.10.1+ versions because of hadoop-client bumped from 2.3 to...
[16:32:02] <ottomata>	 oh boy jackson versions again
[16:34:42] <elukey>	 yeah, the bumped the hadoop-client to 2.7.3
[16:35:26] <elukey>	 I wasn't aware that we had to do -Dhadoop.mapreduce.job.user.classpath.first=true
[16:36:08] <elukey>	 "This instructs Hadoop to prefer loading Druid's version of a library when there is a conflict."
[16:36:55] <elukey>	 so now I am wondering if using "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:cdh"] and removing that one could be an option
[16:37:29] <elukey>	 they also suggest to test mapreduce.job.classloader = true but not with mapreduce.job.user.classpath.first=true
[16:45:54] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4178076 (10Mateusz_Konieczny) It would be also helpful to create equivalent of https://commons.wikimedia.org/wiki/Commons:Copyright_rules - about how copyright affects what can be added...
[16:45:58] <elukey>	 ok removed in labs
[16:51:23] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4191324 (10Mateusz_Konieczny) Another example of "lets ignore copyright": https://www.wikidata.org/wiki/Wikidata:OpenStreetMap had no mention whatsoever that imports from OSM to Wikidat...
[17:14:16] <joal>	 elukey: tried both the suggestion you made :(
[17:15:12] <elukey>	 joal: I removed hadoop.mapreduce.job.user.classpath.first=true from the -D options that we were passing to the middlemanager
[17:15:19] <elukey>	 because it was implicitly adding them
[17:15:34] <elukey>	 I am trying now to test some indexing jobs again so we can test without that
[17:15:47] <joal>	 elukey: ah, it was in conf !
[17:16:14] <elukey>	 yeah in puppet!
[17:16:24] <joal>	 ok I hear that
[17:16:26] <elukey>	 apparently it made the previous jackson issues go away
[17:17:05] <joal>	 elukey: Do you use oozie indexation?
[17:17:45] <elukey>	 joal: yes I do, modifying /srv/deployment/analytics/refinery/^Czie/webrequest/druid/hourly/coordinator.properties
[17:17:58] <elukey>	 it is more just to learn how to use it
[17:18:08] <elukey>	 you were away and I decided to try :)
[17:18:23] <joal>	 elukey: this oozie job uses the template I modified, so the jackson error is gone thanks to that I htink
[17:19:28] <elukey>	 does it? which ones? I just tested mapreduce.job.classloader = true but it ended up hanging, didn't see the hadoopDependencyCoordinates setting though
[17:19:54] <joal>	 elukey: Ok I think I get it
[17:21:05] <joal>	 Here is how the oozie workflow works: first, it prepares the data using a hive query that generates json files in hdfs, then it uses a python script to launch and monitor the druid indexation task, using the load_....json.template file as a template and replacing some valuesi n it
[17:21:30] <elukey>	 yes
[17:21:49] <joal>	 elukey: if you want to change the values sent to druid for indexation, it happens in the template file
[17:22:09] <joal>	 makes sense?
[17:22:39] <elukey>	 yep I was doing that
[17:22:50] <elukey>	 ahh sorry I pasted the wrong file name
[17:22:55] <elukey>	 now I get your confusion
[17:22:57] <joal>	 Ah - you modified and uploaded the template file then?
[17:22:58] <elukey>	 just realized it sorry :P
[17:23:08] <elukey>	 /srv/deployment/analytics/refinery/oozie/webrequest/druid/hourly/load_webrequests_hourly.json.template
[17:23:53] <joal>	 ok - And you uploaded it to hdfs in /wmf/refinery/current/oozie/webrequest/druid/hourly
[17:24:42] <elukey>	 I thought I needed to but the setting ended up in the middlemanager's log anyhow, is it necessary?
[17:25:06] <joal>	 I don't get it :)
[17:25:32] <joal>	 Ah maybe I do - the file in HDFS has been changed by me a lot
[17:25:42] <joal>	 the version in /srv/deployment/analytics/refinery/oozie/webrequest/druid/hourly/load_webrequests_hourly.json.template is not the one in HDFS
[17:25:58] <elukey>	 ahhhh okok because I saw "mapreduce.job.user.classpath.first" : "true" popping up in middlemanager's log
[17:26:01] <elukey>	 okok
[17:26:07] <joal>	 :)
[17:26:10] <joal>	 Here we go
[17:26:25] <elukey>	 anyhow, I'll let you do the work since it is clearly not my area of expertise :D
[17:27:01] <joal>	 elukey: do you mind if we do the quick moving of data before that?
[17:27:07] <elukey>	 yep definitely
[17:27:42] <joal>	 ok - to the cave?
[17:27:57] <elukey>	 ack
[17:41:00] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4191481 (10Ottomata)
[17:42:34] <elukey>	 going offline o/
[17:42:36] * elukey off!
[17:57:54] <joal>	 !log Mvoe recomputed 2018-03 history snapshot in place of old one (T194075)
[17:57:57] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:57:57] <stashbot>	 T194075: 2018-03 snapshot still broken - https://phabricator.wikimedia.org/T194075
[17:57:57] <wikibugs_>	 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4191495 (10Psychoslave) Also I think that if we want to really expand our community outside Europe and North America, it would be important to provide an infrastructure that ease contri...
[17:59:23] <wikibugs_>	 10Analytics, 10Analytics-Kanban: 2018-03 snapshot still broken - https://phabricator.wikimedia.org/T194075#4191498 (10JAllemandou) Fixed today. ``` select snapshot, event_entity, count(*) from mediawiki_history where snapshot in ('2018-03', '2018-02', '2018-04') group by snapshot, event_entity;  snapshot event...
[18:10:06] <ottomata>	 milimetric:  you ok if i try to schedule a few event platform meetings next week?
[18:10:41] <milimetric>	 ottomata: totally, I don't work fridays but free otherwise
[18:11:26] <ottomata>	 k
[18:26:02] <wikibugs_>	 10Analytics-Kanban, 10MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), 10Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4191626 (10mforns) The data set is regenerating into mforns.virtualpageview_hourly. From my manual checks, it looks good. It...
[18:38:36] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Services (watching): Update to latest kafkacat - https://phabricator.wikimedia.org/T182163#4191695 (10Ottomata)
[18:38:39] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Services (watching): Update to latest kafkacat - https://phabricator.wikimedia.org/T182163#3814846 (10Ottomata) Thanks Faidon!
[18:40:50] <wikibugs_>	 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform (with EventLogging of the Future (EoF)) - https://phabricator.wikimedia.org/T185233#4191701 (10Ottomata)
[18:40:52] <wikibugs_>	 10Analytics, 10Analytics-EventLogging, 10Reading Epics (Analytics): Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#4191700 (10Ottomata)
[19:17:40] <wikibugs_>	 (03PS3) 10Milimetric: Update geowiki aggregation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170)
[19:25:34] <wikibugs_>	 (03CR) 10Nuria: Update geowiki aggregation (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170) (owner: 10Milimetric)
[19:31:59] <wikibugs_>	 (03PS6) 10Mforns: Add jobs for druid indexing of virtualpageviews [analytics/refinery] - 10https://gerrit.wikimedia.org/r/427696 (https://phabricator.wikimedia.org/T192305)
[19:33:20] <wikibugs_>	 (03PS5) 10Mforns: Add source page fields to wmf.virtualpageview_hourly [analytics/refinery] - 10https://gerrit.wikimedia.org/r/430889 (https://phabricator.wikimedia.org/T186728)
[19:38:21] <wikibugs_>	 (03CR) 10Nuria: Update geowiki aggregation (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170) (owner: 10Milimetric)
[19:40:27] <mforns>	 nuria_, you +2'd https://gerrit.wikimedia.org/r/#/c/430889/, but it is dependent on https://gerrit.wikimedia.org/r/#/c/427696/. You reviewed that one also, but it is still unmerged, can you have a look please? There have been few changes since your last review.
[19:40:48] <nuria_>	 mforns: looking
[19:41:47] <mforns>	 the changes are basically adding compatibility on data set names between the virtualpageview/hourly job and the virtualpageview/druid job
[19:42:36] <mforns>	 thanks :]
[19:42:51] <wikibugs_>	 (03PS4) 10Milimetric: Update geowiki aggregation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170)
[19:45:02] <nuria_>	 mforns: i though we decided that having daily and hourly nad monthly was too many granularities
[19:45:05] <nuria_>	 *and
[19:45:13] <milimetric>	 nuria_: ok, I think that patch is ready to look at.  I tested the full pipeline including loading into druid, and I just tested the md5 change in isolation to make sure it works
[19:45:14] <nuria_>	 mforns: maybe i got that wrong...
[19:45:44] <mforns>	 I thought we were leaving as it was
[19:45:56] <mforns>	 maybe I too got that wrong
[19:46:28] <nuria_>	 milimetricc: and you removed the "merging marks" right?
[19:46:30] <milimetric>	 I queried the numbers and the ones that shouldn't have changed match exactly.  The anon count is up, which makes sense with UA making those more unique, and the namespace-zero only counts look logical, and very close to the all-namespace counts
[19:47:03] <mforns>	 nuria_, I commented on top of your comments on patch 2 just after the meeting, but maybe I was confused
[19:47:03] <nuria_>	 mforns: cause seems overkill to index this data every hour into druid if we are doing so daily
[19:47:40] <wikibugs_>	 (03CR) 10Nuria: [V: 032 C: 032] Update geowiki aggregation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431688 (https://phabricator.wikimedia.org/T194170) (owner: 10Milimetric)
[19:47:52] <nuria_>	 mforns: right?
[19:48:42] <nuria_>	 mforns: i mean, do we need data on druid per hour ? the only reason why we have pageview hourly there per hour is to help troubleshoot ops issues if any
[19:48:50] <nuria_>	 mforns: it used to be a daily index
[19:49:23] <nuria_>	 mforns: amaybe I am forgetting teh results of our discussion
[19:49:40] <mforns>	 nuria_, no problem by me, it's just removing the hourly folder
[19:49:46] <mforns>	 will do that now
[19:49:53] <nuria_>	 mforns: do you see a strong reason to do it per hour?
[19:49:58] <wikibugs_>	 (03PS3) 10Milimetric: Rename geowiki to geoeditors, step 1 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431689
[19:50:58] <mforns>	 no no, will change
[19:51:01] <milimetric>	 nuria_: did you want to take a look at the rename patches or shall I just merge those?
[19:51:05] <milimetric>	 I tested with all of them together: https://gerrit.wikimedia.org/r/#/c/431689/
[19:52:13] <nuria_>	 milimetric: the patch did not chnaged name of files , see
[19:52:24] <nuria_>	 milimetric: "create_geowiki_daily_table.hql"
[19:52:30] <nuria_>	 milimetric: shoudl be chnaged right?
[19:52:46] <milimetric>	 nuria_: there's a follow-up patch that changes those
[19:52:56] <milimetric>	 nuria_: I did it that way because otherwise it's hard to see the changes in the files 
[19:53:02] <nuria_>	 milimetric: ah ya
[19:54:00] <nuria_>	 milimetric: is that patch also in gerrit?
[19:54:50] <milimetric>	 nuria_: yes, all related patches are on the right side of the screen
[19:54:55] <icinga-wm>	 PROBLEM - Hadoop NodeManager on analytics1036 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[19:54:56] <milimetric>	 in this case the title is the same + part 2
[19:55:03] <milimetric>	 https://gerrit.wikimedia.org/r/#/c/431690/
[19:55:45] <icinga-wm>	 PROBLEM - Hadoop NodeManager on analytics1034 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[19:55:45] <ottomata>	 that's me ^^
[19:55:48] <ottomata>	 sorry dt expired
[19:56:21] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Use user agent + IP to group anonymous users in geowiki (now geoeditors) - https://phabricator.wikimedia.org/T194170#4191998 (10Nuria)
[19:56:28] <wikibugs_>	 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Notifications, and 3 others: Make EchoNotification job JSON-serializable - https://phabricator.wikimedia.org/T192945#4191999 (10SBisson) a:03SBisson
[19:56:32] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4192001 (10Ottomata)
[19:57:00] <wikibugs_>	 10Analytics-Kanban: Renamed geowiki to geoeditors - https://phabricator.wikimedia.org/T194207#4192002 (10Nuria) p:05Triage>03Normal
[19:57:05] <wikibugs_>	 (03PS7) 10Mforns: Add jobs for druid indexing of virtualpageviews [analytics/refinery] - 10https://gerrit.wikimedia.org/r/427696 (https://phabricator.wikimedia.org/T192305)
[19:57:12] <nuria_>	 milimetric: let's add this bug to those two patches: https://phabricator.wikimedia.org/T194207
[19:57:19] <nuria_>	 milimetric: the rename 1 and rename2
[19:57:32] <milimetric>	 sure, ok
[19:57:54] <nuria_>	 ottomata: did teh node manager went kaput?
[19:57:57] <nuria_>	 *the
[19:58:19] <ottomata>	 nuria_:  rreimaging
[19:58:25] <ottomata>	 i stopped nm to drain jobs
[19:58:29] <ottomata>	 still waiting for a few to finish
[19:58:30] <milimetric>	 (I had linked them from the vet task manually)
[19:58:36] <wikibugs_>	 (03PS4) 10Milimetric: Rename geowiki to geoeditors, step 1 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431689 (https://phabricator.wikimedia.org/T194207)
[19:59:38] <wikibugs_>	 (03CR) 10Nuria: [V: 032 C: 032] Rename geowiki to geoeditors, step 1 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431689 (https://phabricator.wikimedia.org/T194207) (owner: 10Milimetric)
[20:00:46] <nuria_>	 milimetric: right, i think teh renaming does not have to with vetting per se, which is about numbers
[20:01:15] <milimetric>	 agree, that's just where people were talking about it
[20:01:23] <milimetric>	 I should've been more organized, not my strong suit
[20:01:29] <wikibugs_>	 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Discovery, 10Patch-For-Review: Migrate Mediawiki Monolog Kafka producer to Kafka Jumbo - https://phabricator.wikimedia.org/T188136#4192026 (10Ottomata) Hopefully @EBernhardson can help us out somehow with this one? We want to swap the Mediawiki Kafk...
[20:02:27] <wikibugs_>	 (03CR) 10Nuria: [V: 032 C: 032] Add jobs for druid indexing of virtualpageviews [analytics/refinery] - 10https://gerrit.wikimedia.org/r/427696 (https://phabricator.wikimedia.org/T192305) (owner: 10Mforns)
[20:03:24] <nuria_>	 milimetric: this also needs the new ticket: https://gerrit.wikimedia.org/r/#/c/431690/
[20:03:49] <milimetric>	 yeah, I'm fighting some failed gerrit merge nightmare with that one :)
[20:05:25] <wikibugs_>	 (03PS6) 10Mforns: Add source page fields to wmf.virtualpageview_hourly [analytics/refinery] - 10https://gerrit.wikimedia.org/r/430889 (https://phabricator.wikimedia.org/T186728)
[20:05:42] <wikibugs_>	 (03PS3) 10Milimetric: Rename geowiki to geoeditors, step 2 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431690 (https://phabricator.wikimedia.org/T194207)
[20:05:57] <wikibugs_>	 (03CR) 10Mforns: [V: 032] Add source page fields to wmf.virtualpageview_hourly [analytics/refinery] - 10https://gerrit.wikimedia.org/r/430889 (https://phabricator.wikimedia.org/T186728) (owner: 10Mforns)
[20:05:59] <milimetric>	 nuria_: k, https://gerrit.wikimedia.org/r/#/c/431690/
[20:06:19] <wikibugs_>	 (03CR) 10Nuria: [V: 032 C: 032] Rename geowiki to geoeditors, step 2 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/431690 (https://phabricator.wikimedia.org/T194207) (owner: 10Milimetric)
[20:06:53] <milimetric>	 k, I'll do a deploy now then, so I have some time to fix everything else like the superset and docs
[20:08:47] <milimetric>	 !log deploying refinery to relaunch geoeditors job
[20:08:48] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:10:33] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4192072 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1036.eqiad.wmnet'] ``` T...
[20:10:37] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4192073 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1034.eqiad.wmnet'] ``` T...
[20:12:59] <milimetric>	 !log aborting deployment, will deploy data truncation script too
[20:12:59] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:18:33] <milimetric>	 (eh, sorry, changing my mind again because the script to drop partitions is already there so it's only a puppet change I can do later, my bad)
[20:18:42] <milimetric>	 !log deploying geoeditors for real now
[20:18:43] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:24:35] <bmansurov>	 milimetric: o/ Are Wikidata dumps available in the analytics cluster?
[20:24:43] <bmansurov>	 any idea?
[20:25:09] <milimetric>	 bmansurov: there's some stufffff... one sec
[20:26:36] <milimetric>	 bmansurov: hm... no, I might be remembering wrong, I don't see anything
[20:26:43] <milimetric>	 bmansurov: what you lookin for
[20:27:10] <bmansurov>	 milimetric: ok, thanks for checking. I'm trying to parse wikidata dumps in order to get sitelinks count for each item.
[20:34:48] <milimetric>	 bmansurov: I remember Ellery did some work on sqooping and processing that stuff, but I'm not sure he saved the code anywhere
[20:35:01] <milimetric>	 I found ellery.wikidata in hive as proof but not much else
[20:35:14] <milimetric>	 !log refinery deploy complete
[20:35:14] <bmansurov>	 milimetric: ok, I'll look around. thanks!
[20:35:14] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:35:44] <milimetric>	 bmansurov: might also ask leila, she worked with him on that as part of the recommender (they needed interlanguage links)
[20:36:14] <bmansurov>	 ok, i will
[20:38:27] <icinga-wm>	 RECOVERY - Hadoop NodeManager on analytics1036 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[20:38:48] <icinga-wm>	 RECOVERY - Hadoop NodeManager on analytics1034 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[20:39:06] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4192190 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1034.eqiad.wmnet'] ```  and were **ALL** successful.
[20:41:46] <bmansurov>	 milimetric: I think I've found ellery's work: https://github.com/ewulczyn/wmf/blob/master/util/wikidata_utils.py
[20:42:37] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4192207 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1036.eqiad.wmnet'] ```  and were **ALL** successful.
[20:43:52] <milimetric>	 bmansurov: yep, nice, so that's all I know :)
[20:44:08] <bmansurov>	 that's what i needed ;)
[21:17:42] <AndyRussG>	 Hi!!! Quick question about EventLogging and Hive... How are the data types in the event structure column determined? Here is what Hive says about the structure for the schema (CentralNoticeImpression): struct<anonymous:boolean,banner:string,bannerCategory:string,bucket:bigint,campaign:string,campaignCategory:string,campaignCategoryUsesLegacy:boolean,country:string,db:string,debug:b
[21:17:44] <AndyRussG>	 oolean,device:string,impressionEventSampleRate:bigint,project:string,randombanner:double,randomcampaign:double,recordImpressionSampleRate:double,result:string,status:string,statusCode:string
[21:17:46] <AndyRussG>	 ,uselang:string,reason:string,bannerCanceledReason:string,bannersNotGuaranteedToDisplay:boolean,debugInfo:string,errorMsg:string,alterFunctionMissing:boolean>
[21:18:03] <AndyRussG>	 https://meta.wikimedia.org/wiki/Schema:CentralNoticeImpression
[21:18:24] <AndyRussG>	 Specifically impressionEventSampleRate shows up as bigint
[21:18:35] <AndyRussG>	 though it should be some kind of floating point value
[21:18:53] <AndyRussG>	 On the other hand, recordImpressionSampleRate is set as a double
[21:19:21] <AndyRussG>	 However, in the schema definition, both are "number"
[21:19:34] <AndyRussG>	 thx in advance!
[21:39:02] <nuria_>	 AndyRussG: it infers types based on what values you pass along , number is a javascript type  not a java type  it needs to be coerced into an int or a double
[21:39:49] <nuria_>	 milimetric: did you stop teh geowiki calculations such they do not happen for April?
[21:41:36] <milimetric>	 nuria_: nah, they already happened, and it was good 'cause I could check the numbers.  I'll stop and clean up after them once I launch the new ones, but I don't have time left today
[21:41:46] <nuria_>	 milimetric: k
[21:41:49] <milimetric>	 but everything's deployed, so I'll just resume tomorrow by simply starting jobs
[21:42:14] <nuria_>	 milimetric: ok, let's kill old jobs tomorrow too
[21:42:21] <milimetric>	 yep
[21:42:47] <milimetric>	 (it's on my checklist: https://phabricator.wikimedia.org/T190409
[21:43:31] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Checklist for geowiki pipeline - https://phabricator.wikimedia.org/T190409#4072764 (10Milimetric)
[21:46:56] <wikibugs_>	 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10MediaWiki-extensions-Translate, 10Services (done): Unable to mark pages for translation in Meta - https://phabricator.wikimedia.org/T192107#4192479 (10Pchelolo) I think it's time to close this one. Please reopen if that breaks again during the transition p...
[21:47:07] <wikibugs_>	 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10MediaWiki-extensions-Translate, 10Services (done): Unable to mark pages for translation in Meta - https://phabricator.wikimedia.org/T192107#4192480 (10Pchelolo) 05Open>03Resolved
[22:14:29] <wikibugs_>	 (03PS7) 10Nuria: UA parser specification changes [analytics/ua-parser/uap-java] (wmf) - 10https://gerrit.wikimedia.org/r/429527 (https://phabricator.wikimedia.org/T189230)
[22:42:19] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikimetrics, 10Patch-For-Review, 10Software-Licensing: Add a license file to wikimetrics - https://phabricator.wikimedia.org/T60753#4192650 (10Nuria) 05Open>03Resolved
[22:42:35] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikimetrics, 10Patch-For-Review: Wikimetrics database list fetching is broken - https://phabricator.wikimedia.org/T193742#4192651 (10Nuria) 05Open>03Resolved
[22:42:48] <wikibugs_>	 10Analytics-Dashiki, 10Analytics-Kanban, 10Patch-For-Review: TimeseriesData tests broken in Dashiki - https://phabricator.wikimedia.org/T193513#4192652 (10Nuria) 05Open>03Resolved
[22:43:01] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Add a --dry-run option to the sqoop script - https://phabricator.wikimedia.org/T188556#4192653 (10Nuria) 05Open>03Resolved
[22:43:15] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Wikistats Bug:  all but 2018 data missing? - https://phabricator.wikimedia.org/T192841#4192654 (10Nuria) 05Open>03Resolved
[22:43:31] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Add Ecosia and Startpage to list of search engines - https://phabricator.wikimedia.org/T191714#4192655 (10Nuria) 05Open>03Resolved
[22:43:43] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Update pivot to latest source - https://phabricator.wikimedia.org/T164007#4192657 (10Nuria)
[22:43:47] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Update druid to 0.10 - https://phabricator.wikimedia.org/T164008#4192656 (10Nuria) 05Open>03Resolved
[22:44:02] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikistats: Make the Wikistats 2 UI responsive - https://phabricator.wikimedia.org/T186812#4192659 (10Nuria)
[22:44:06] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Replace all hover actions with touch-compatible UI - https://phabricator.wikimedia.org/T188277#4192658 (10Nuria) 05Open>03Resolved
[22:44:18] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4192677 (10Nuria)
[22:44:21] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade to Stretch and Java 8 for Kafka main cluster - https://phabricator.wikimedia.org/T192832#4192676 (10Nuria) 05Open>03Resolved
[22:45:05] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hadoop HDFS Namenode shutdown on 26/04/2018 - https://phabricator.wikimedia.org/T193257#4192678 (10Nuria) 05Open>03Resolved
[22:45:12] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4134387 (10Nuria)
[22:45:15] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Use profile and prometheus for role::kafka::main::broker - https://phabricator.wikimedia.org/T192831#4192680 (10Nuria) 05Open>03Resolved
[22:45:58] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikistats: Please install JSON.pm at stat1005 for Wikistats_1 - https://phabricator.wikimedia.org/T192760#4192685 (10Nuria) 05Open>03Resolved
[22:46:56] <wikibugs_>	 10Analytics: Wikistats. Bug on title "wikistats 2" is not shown - https://phabricator.wikimedia.org/T194224#4192686 (10Nuria)
[22:47:09] <wikibugs_>	 10Analytics, 10Analytics-Kanban: Wikistats. Bug on title "wikistats 2" is not shown - https://phabricator.wikimedia.org/T194224#4192696 (10Nuria)
[22:47:32] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: SEO Friendly HTML titles for Wikistats 2.0 - https://phabricator.wikimedia.org/T182718#4192698 (10Nuria) 05Open>03Resolved
[22:47:48] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Add defaults section to WhitelistSanitization.scala - https://phabricator.wikimedia.org/T190202#4192699 (10Nuria) 05Open>03Resolved
[22:48:02] <wikibugs_>	 10Analytics-Kanban: [EL Sanitization] Translate TSV whitelist into new YAML whitelist - https://phabricator.wikimedia.org/T189690#4192700 (10Nuria) 05Open>03Resolved
[22:48:23] <wikibugs_>	 10Analytics-Kanban: Check that pagecounts-ez cron is ok - https://phabricator.wikimedia.org/T192110#4192714 (10Nuria) 05Open>03Resolved
[22:48:47] <wikibugs_>	 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Add Vue Filters to make the code clean and use them as necessary - https://phabricator.wikimedia.org/T191824#4192716 (10Nuria) 05Open>03Resolved
[22:50:14] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Report updater setting log ownership incorrectly (leading to cronspam) - https://phabricator.wikimedia.org/T191871#4192734 (10Nuria) 05Open>03Resolved
[22:51:47] <wikibugs_>	 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: [EL sanitization] Ensure presence of EL YAML whitelist in analytics1003 - https://phabricator.wikimedia.org/T189691#4192743 (10Nuria) 05Open>03Resolved
[22:52:06] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10EventBus, 10Patch-For-Review, 10Services (watching): Upgrade Kafka on main cluster with security features - https://phabricator.wikimedia.org/T167039#4192745 (10Nuria)
[22:52:08] <wikibugs_>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4192744 (10Nuria) 05Open>03Resolved
[22:52:29] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Make 'metric' field not a partition in mediawiki_metrics - https://phabricator.wikimedia.org/T190058#4192746 (10Nuria) 05Open>03Resolved
[22:52:38] <wikibugs_>	 10Analytics, 10Analytics-Kanban: 2018-03 snapshot still broken - https://phabricator.wikimedia.org/T194075#4192747 (10Nuria)
[22:53:06] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Improve mediwiki-history performance - https://phabricator.wikimedia.org/T189449#4192748 (10Nuria) 05Open>03Resolved
[22:53:30] <wikibugs_>	 10Analytics, 10Analytics-Kanban: Wikistats. Bug on title "wikistats 2" is not shown - https://phabricator.wikimedia.org/T194224#4192750 (10Nuria)
[22:54:45] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Vet new geo wiki data - https://phabricator.wikimedia.org/T191343#4192755 (10Nuria) closing as vetting is done, work to rename datasource plus adding UA to anonymous user calculation is ongoing
[22:54:57] <wikibugs_>	 10Analytics-Kanban: Private geo wiki data in new analytics stack - https://phabricator.wikimedia.org/T176996#4192757 (10Nuria)
[22:55:00] <wikibugs_>	 10Analytics-Kanban, 10Patch-For-Review: Vet new geo wiki data - https://phabricator.wikimedia.org/T191343#4192756 (10Nuria) 05Open>03Resolved
[22:58:48] <wikibugs_>	 10Analytics, 10Analytics-Kanban: Sesssion reconstruction - evaluate  privacy threat - https://phabricator.wikimedia.org/T194058#4192762 (10Nuria) I think we can probably do couple things here:  - the page previews should not agreggate hourly, data for small wikis will be too sparse, rather it should agreggated...
[23:15:03] <AndyRussG>	 nuria_: thanks!
[23:17:08] <wikibugs_>	 10Analytics: Create ops dashboard with info like ipv6 traffic split - https://phabricator.wikimedia.org/T138396#4192771 (10Nuria) Sorry I did not answer to this one earlier. Ahem, not super pretty but you can filter via regex , so anything that is .*:.* will match ipv6 (you can get  a total number) and that can...