[05:48:53] <icinga-wm>	 PROBLEM - Check the last execution of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:10:21] <icinga-wm>	 RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[07:00:47] <wikibugs>	 10Analytics, 10Analytics-Kanban: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10elukey)
[07:07:52] <wikibugs>	 10Analytics, 10Analytics-Kanban: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10elukey)
[07:23:04] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10ayounsi) Thanks, it looks great!  >>! In T254332#6649704, @mforns wrote: > Note that the last couple hours do not yet have t...
[07:31:26] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10elukey) @mforns I had a chat with Arzhel, the el-to-druid job is configured like this:  ` --since $(date --date '-6hours' -u...
[07:36:19] <elukey>	 goood morning
[07:37:13] <elukey>	 joal: the hive patch seems working fine, no more oozie failures \o/
[07:41:49] <joal>	 Hi elukey 
[07:42:13] <joal>	 elukey: I am misunderstanding something I think
[07:42:24] <joal>	 elukey: were there regular oozie failures before the patch?
[07:51:37] <joal>	 Dentist appointment - back in a bit
[07:53:07] <elukey>	 joal: there were regular failures (basically for all hive2 actions) in oozie after enabling the db token store
[07:53:36] <elukey>	 after the hive patch, all good
[07:53:59] <elukey>	 I added the patch to the bigtop 1.4 package manually (after rebuilding them)
[07:54:19] <elukey>	 and now I am building for bigtop 1.5, and I'll send a patch later on to ask upstream to include it before the release
[07:55:08] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10Nikerabbit) I don't really know which one (if any) I should be in. IIRC I was added there for eventlogging access, which I still occasionally use.
[08:03:23] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10elukey) >>! In T268801#6650551, @Nikerabbit wrote: > I don't really know which one (if any) I should be in. IIRC I was added there for eventlogging access, which I still...
[08:15:42] <wikibugs>	 10Analytics-Radar: Superset getting slower as usage increases - https://phabricator.wikimedia.org/T239130 (10Aklapper) a:05Nuria→03None Resetting assignee (inactive account)
[08:15:59] <wikibugs>	 10Analytics, 10Analytics-EventLogging, 10Research-Backlog: 20K events by a single user in the span of 20 mins - https://phabricator.wikimedia.org/T202539 (10Aklapper) a:05Nuria→03None Resetting assignee (inactive account)
[08:16:09] <wikibugs>	 10Analytics-Radar, 10Analytics-Wikistats, 10Internet-Archive: Feedback on Wikistats 2 new edits pages - https://phabricator.wikimedia.org/T210306 (10Aklapper) a:05Nuria→03None Resetting assignee (inactive account)
[08:16:14] <wikibugs>	 10Analytics, 10Data-release, 10Privacy Engineering, 10Research, 10Privacy: Evaluate a differentially private solution to release wikipedia's  project-title-country data - https://phabricator.wikimedia.org/T267283 (10Aklapper) a:05Nuria→03None Resetting assignee (inactive account)
[08:16:18] <wikibugs>	 10Analytics-Radar, 10Data-release, 10Privacy Engineering, 10Privacy: An expert panel to produce recommendations on open data sharing for public good - https://phabricator.wikimedia.org/T189339 (10Aklapper) a:05Nuria→03None Resetting assignee (inactive account)
[08:17:03] <wikibugs>	 10Analytics-Radar, 10Analytics-Wikistats, 10Internet-Archive: Feedback on Wikistats 2 new edits pages - https://phabricator.wikimedia.org/T210306 (10Aklapper) (Also, this task looks unactionable - a bunch of many different things; instead of dedicated tickets for each task.)
[08:24:23] <elukey>	 the thing that worries me a bit is https://phabricator.wikimedia.org/T268733
[08:24:32] <elukey>	 I am wondering if it is a new weird hive 2 feature
[08:36:56] <wikibugs>	 10Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10elukey)
[08:40:48] <elukey>	 !log roll restart cassandra on aqs10* for openjdk upgrades
[08:40:53] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:41:55] <wikibugs>	 10Analytics-Clusters, 10WMDE-Analytics-Engineering, 10Patch-For-Review, 10User-GoranSMilovanovic: Downscale Wikidata-analysis pyspark scripts to analytics limits - https://phabricator.wikimedia.org/T268684 (10JAllemandou) @GoranSMilovanovic I looked at all the config files mentioned above, they look good :...
[08:43:12] <joal>	 Back1
[08:43:26] <elukey>	 bonjour
[08:43:32] <joal>	 elukey: About T268733
[08:43:32] <stashbot>	 T268733: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733
[08:44:08] <joal>	 elukey: Refine uses a weird strategy to communicate with Hive, so I'm not surprised of the error
[08:44:29] <joal>	 elukey: Is there a patch for the refine version you run with bigtop on hive2?
[08:44:52] <elukey>	 joal: ah!!!!
[08:45:04] <elukey>	 right the jdbc creds!
[08:45:13] <elukey>	 but IIRC we source hive-site.xml no??
[08:45:29] <joal>	 elukey: we source hive-site I think - which one I can't tell
[08:45:59] <elukey>	 the one where refine runs in theory, but in this case joal it seems more a DDL issue rather than a kerb one no?
[08:46:02] <joal>	 but mainly we use a JDBC connection with dedicated hive-driver jar included in refine-run command
[08:46:04] <elukey>	 (trying to understand)
[08:46:24] <joal>	 elukey: I think the problem comes from driver version mismatch
[08:46:36] <elukey>	 interesting
[08:46:52] <joal>	 elukey: all our spark jobs except for refine use Spark to connect to Hive - Th
[08:47:04] <joal>	 elukey: The version stuff is handled by Spark
[08:47:28] <joal>	 elukey: For refine, Spark refuse to let us run the DDL we want to over the hive tables
[08:47:52] <elukey>	 yep and we open a jdbc conn and issue an explicit alter
[08:47:53] <joal>	 elukey: So we use a JDBC connection to hive, as it is less picky
[08:48:40] <joal>	 elukey: Now for this JDBC connection the hive-jdbc driver in use differs between versions
[08:49:54] <elukey>	 ah right I see /srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar
[08:50:29] <joal>	 elukey: looking at the refine start command also helps (we add the jars manually to the spark job)
[08:51:53] <elukey>	 joal: so I could try to pull a newer hive-jdbc from maven and replace it in the spark cmd line
[08:52:17] <joal>	 elukey: That would be a first good try
[08:52:28] <elukey>	 perfect as always you are awesome
[08:52:29] <elukey>	 will test
[08:53:15] <elukey>	 (the cassandra oozie failure is me doing a restart of aqs, will re-run as soon as the cookbook finishes)
[08:53:37] <joal>	 ack elukey 
[08:53:45] <joal>	 elukey: I was wondering
[08:54:02] <joal>	 elukey: also, it seems there is no data for Maradona pageviews from yesterdya :(
[08:56:22] <joal>	 elukey: oozie has been working - I assume it's a caching problem
[08:59:40] <elukey>	 joal: is there a specific page that you are checking?
[09:00:00] <joal>	 elukey: of course! The big one from yesterday :)
[09:01:01] <joal>	 elukey: cassandra has data - I confirm it's a caching problem
[09:01:10] <joal>	 s/problem/feature
[09:01:39] <joal>	 Maradona page yesterday: 2.5M hits
[09:01:46] <joal>	 almost 2.6
[09:02:46] <elukey>	 joal: no I mean if a specific UI / Tool / etc.. doesn't show the correct graph, or if it is somethign else
[09:02:56] <elukey>	 I am not getting what you are checking :)
[09:03:10] <joal>	 elukey: I was checking the pageview tools (labs) for the Maradona page
[09:03:25] <joal>	 elukey: And despite asking for data up to yesterday, the site shows yesterday empty
[09:03:45] <joal>	 This is due to AQS sending no data for yesterday
[09:03:58] <joal>	 And I think this is a caching thing
[09:04:24] <joal>	 There must have been requests to the tool earlier in the day, when the data was not yet loaded, and this got cached
[09:04:51] <elukey>	 joal: do you have an http link for the maradona api page handy?
[09:04:57] <elukey>	 we can check headers
[09:05:12] <joal>	 elukey: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Diego_maradona/daily/2020110500/2020112500
[09:05:35] <joal>	 elukey: wrong link sorry
[09:05:43] <joal>	 https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Diego_Maradona/daily/2020110500/2020112500
[09:05:47] <joal>	 (capital M
[09:07:47] <joal>	 elukey: cache-control says max-age 86400
[09:07:54] <elukey>	 !log force purging https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Diego_Maradona/daily/2020110500/2020112500 from caches
[09:07:56] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:08:13] <elukey>	 joal: better now :)
[09:08:16] <joal>	 <3 elukey 
[09:08:44] <elukey>	 is the pageview tool ok now?
[09:08:52] <joal>	 elukey: arh, nope :(
[09:08:56] <joal>	 https://pageviews.toolforge.org/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=0&range=latest-20&pages=Diego_Maradona
[09:09:24] <elukey>	 it works for me, I see a huge blue bar :D
[09:09:27] <joal>	 elukey: manual call to pageview-api works - I got data
[09:09:39] <joal>	 Ok great, must be caching on my side
[09:09:51] <joal>	 Thanks a lot elukey 
[09:09:57] <elukey>	 super thanks for checking :)
[09:10:06] <elukey>	 better that we have done it and not the community asking us
[09:10:18] <elukey>	 should we follow up to have a different caching time?
[09:11:05] <joal>	 I don't think so elukey - We've never been asked to change it - In most cases it's not problematic
[09:11:55] <elukey>	 joal: not sure if we set any header, but if not varnish/ats uses a default I think of a day.. maybe controlling it specifically wouldn't be bad
[09:12:03] <elukey>	 say 12 hours, or similar
[09:12:27] <elukey>	 it would change a lot in these cases, since we wouldn't need to manually purge
[09:12:28] <joal>	 works for me elukey 
[09:12:37] <joal>	 12h seems fine
[09:12:38] <elukey>	 I can open a task and then see what the team thinks about
[09:12:52] <elukey>	 also there is another aqs-related issue that I was thinking
[09:13:15] <elukey>	 the other day I stopped druid1005 and aqs started to alert like crazy
[09:13:39] <elukey>	 that is not acceptable, one out of 5 druid nodes down shouldn't cause any trouble
[09:13:54] <joal>	 I agree
[09:14:10] <joal>	 elukey: let me triple check on something
[09:14:13] <elukey>	 my theory is that aqs -> druid-lvs has a long timeout, so if a broker is stopped 
[09:14:23] <elukey>	 then 1/5th of connections for the edit api are piling up
[09:14:29] <joal>	 ahhhhh - possible
[09:14:49] <elukey>	 but not sure where to check in the aqs code :(
[09:14:49] <joal>	 I'm going to check druid data redundancy policy for our datasets
[09:14:54] <elukey>	 ack
[09:21:46] <wikibugs>	 10Analytics, 10Release-Engineering-Team (Development services): Unable to clone git repo from stat1008 - https://phabricator.wikimedia.org/T268290 (10hashar)
[09:21:54] <joal>	 elukey: I confirm data should be available - We have replication 2 by default
[09:25:22] <joal>	 elukey: it made a long time I had not checked datasources in Druid-prod - I'm gonna make a change on the druid loading job, splitting segments more:currently we have 6 segments per month, and recent segments weight 1G each (too much) - Will ask for 16 segments for each month (this will give us time before having to change)
[09:25:30] <wikibugs>	 10Analytics: AQS pageview default caching is one day - https://phabricator.wikimedia.org/T268809 (10elukey)
[09:25:54] <elukey>	 super
[09:30:23] <wikibugs>	 10Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (10elukey)
[09:30:29] <elukey>	 there you go
[09:30:42] <elukey>	 joal: if you want to add your thoughts --^
[09:31:23] <joal>	 sure elukey 
[09:32:22] <elukey>	 <3
[09:37:37] <wikibugs>	 10Analytics, 10Analytics-Kanban: Make druid mediawiki-history-reduced segments smaller - https://phabricator.wikimedia.org/T268813 (10JAllemandou)
[09:37:55] <wikibugs>	 10Analytics, 10Analytics-Kanban: Make druid mediawiki-history-reduced segments smaller - https://phabricator.wikimedia.org/T268813 (10JAllemandou) a:03JAllemandou
[09:38:12] <wikibugs>	 (03PS1) 10Joal: Update mediawiki-history-reduced druid loading (shards) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813)
[09:43:12] <wikibugs>	 (03CR) 10Elukey: [C: 03+1] Update mediawiki-history-reduced druid loading (shards) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813) (owner: 10Joal)
[09:48:04] <wikibugs>	 10Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (10JAllemandou) @elukey your idea makes sense. I quickly looked at Hyperswitch and didn't find an obvious way to set a client call timeout. Let's ask @Pchelolo if he can enlighten us.
[10:05:38] <joal>	 gmodena: :(
[12:46:48] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) @ayounsi @elukey The reason of the gap is that the streaming job that is ingesting the data into Druid does not have...
[12:48:23] <elukey>	 mforns: o/
[12:48:39] <elukey>	 I am not super clear about the 4 hours thing :(
[12:58:37] <mforns>	 elukey: I just checked using hdfs dfs -ls, that it takes about 1 hour to ingest netflow raw from kafka
[12:58:58] <mforns>	 and then the data is refined and ready only 3 hours later
[12:59:34] <mforns>	 at least that's what I can se by looking at hdfs dfs -ls /wmf/data/wmf/netflow/year=2020/month=11/day=26
[13:00:10] <mforns>	 maybe the refine job is also too conservative in the since until params
[13:01:09] <mforns>	 elukey: yes, the default refine_job.pp time window config is: --since 28 --until 4
[13:01:12] <mforns>	 so 4 hours from now
[13:02:08] <mforns>	 if we changed that for netflow to say --until 2, then we could change the druid_load config to --until 3
[13:02:19] <wikibugs>	 10Analytics-Clusters, 10WMDE-Analytics-Engineering, 10Patch-For-Review, 10User-GoranSMilovanovic: Downscale Wikidata-analysis pyspark scripts to analytics limits - https://phabricator.wikimedia.org/T268684 (10GoranSMilovanovic) @joal Thank you. The `--num-executors` parameter is intentionally still there t...
[13:02:20] <mforns>	 and reduce the gap by 2 hours...
[13:02:46] <mforns>	 but I believe the real solution would be to add the new fields to the streaming job, it should be possible
[13:04:12] <elukey>	 mforns: ah yes yes right, I was only puzzled by the 4 hours :)
[13:04:25] <elukey>	 I agree that changing the streaming job is the right way
[13:04:48] <elukey>	 I didn't follow the whole set of changes and I thought that the new fields were the result of refine or augmentation
[13:04:53] <elukey>	 (so not in kafka)
[13:05:08] <elukey>	 but if they are, let's also modify the streaming job now no?
[13:05:14] <elukey>	 it should take 10 mins
[13:05:22] <joal>	 mforns: quick note: We would need a new streaming job (none exist now), sending augmented data to a new kafka topic, and have druid ingest from this new topic 
[13:05:35] <joal>	 elukey: --^ as well sorry
[13:05:43] <elukey>	 ah ok then this is indeed an issue
[13:06:34] <mforns>	 joal: no streaming job???
[13:06:55] <joal>	 mforns: nope - Druid ingests straight from Kafka (as format is json)
[13:07:05] <mforns>	 aah! I see
[13:07:55] <mforns>	 ok, well this will be a bit more difficult, but yea!
[13:08:49] <elukey>	 !log force umount/mount of all /mnt/hdfs mountpoints to pick up opendjdk upgrades
[13:08:50] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:08:57] <mforns>	 elukey: yea, new fields are not in kakfa
[13:14:44] <wikibugs>	 10Analytics: AQS pageview default caching is one day - https://phabricator.wikimedia.org/T268809 (10JAllemandou) Thanks @elukey for this ticket. Given that the loading of the data usually finishes before 2AM (UTC) of the current day for the previous day, high-traffic pages would most probably have been requested...
[13:52:44] <elukey>	 !log roll restart druid daemons on druid analytics to pick up new openjdk upgrades
[13:52:45] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:07:20] <wikibugs>	 (03PS2) 10Joal: [WIP] Update sqoop adding tables and removing timestamps [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077)
[14:12:43] <wikibugs>	 (03PS3) 10Joal: [WIP] Update sqoop adding tables and removing timestamps [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077)
[14:19:05] <elukey>	 joal: as FYI, I am merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/643446
[14:19:39] <joal>	 awesome elukey -- thanks a lot
[14:40:50] <wikibugs>	 10Analytics, 10Patch-For-Review: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10elukey) 05Open→03Resolved a:03elukey We solved the problem, on stat1005 there was a wrong log4j root logge...
[14:55:37] <elukey>	 joal: I am playing a bit with hive versions in hadoop test, but one thing that I noticed is that if I execute the alter in say beeline
[14:55:52] <elukey>	 (so on a hive 2.3.3 client) it returns the same error
[14:56:25] <elukey>	 if it was a problem of refine I'd have expected the alter to succeed in beeline
[14:56:28] <elukey>	 does it make sense?
[15:00:20] <wikibugs>	 10Analytics, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10elukey) The idea that I have is the following:  1) We move Superset to an-tool1010, with Turnilo 2) We deploy a on-host memcached instance, we have already a lot of puppet code + monitoring + metrics to re...
[15:01:05] <elukey>	 going afk for some errands, be back later :)
[15:01:30] <joal>	 ack elukey - same for me let's talk post standup about refine
[15:35:38] <mforns>	 helloo teamm
[15:35:50] <mforns>	 joal: can you do the pairing on netflow in short?
[15:40:50] <wikibugs>	 10Analytics: [data quality alarms] Reduce the K to generate more reports - https://phabricator.wikimedia.org/T246682 (10ssingh) Hi! Following up on these tickets now; sorry for the delay.  I think the current levels are reasonable and it's fine to leave them at the present value.
[15:45:02] <wikibugs>	 10Analytics: [data quality alarms] Reduce the K to generate more reports - https://phabricator.wikimedia.org/T246682 (10mforns) 05Open→03Resolved a:03mforns Great! Will close this task then.
[15:45:04] <wikibugs>	 10Analytics, 10Analytics-Kanban: Traffic anomaly alarms - https://phabricator.wikimedia.org/T267355 (10mforns)
[16:32:25] <joal>	 Hi mforns - I can do it now if you wish
[16:36:14] <mforns>	 heya joal, yes, gimme 2 mins
[16:36:16] <mforns>	 :]
[16:36:26] <joal>	 sure
[16:38:57] <mforns>	 joal: ah! I just realized there will be a conflict in the name of the druid datasource
[16:39:35] <mforns>	 when we move to event db, HiveToDruid will automagically ingest into event_netflow as opposed of wmf_netflow...
[16:39:56] <mforns>	 you know if there's a way to rename datasources in Druid?
[16:40:30] <joal>	 mforns: hm, I don't know an
[16:41:16] <joal>	 mforns: the only way I can think of is thrgough reindexing based on existing data
[16:41:20] <joal>	 There might be a way though
[16:41:52] <mforns>	 joal: or else, we can add a new argument to HiveToDruid to let us override the datasource name...
[16:42:06] <mforns>	 but that would take a refinery-source deployment
[16:42:34] <mforns>	 whatcha think?
[16:42:40] <joal>	 Being able to override the druid datasource seems like a good option to have
[16:42:54] <joal>	 in HiveToDruid I mean
[16:42:56] <joal>	 mforns: --^
[16:43:03] <mforns>	 yea
[16:47:38] <joal>	 mforns: I assume this means we'll do the db change after the needed deplo, right?
[16:47:48] <mforns>	 yes
[16:47:50] <joal>	 elukey: nearby by any chance?
[16:52:29] <joal>	 nevermind elukey - I managed myself :)
[16:53:35] <wikibugs>	 (03PS1) 10Mforns: Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339)
[16:54:11] <mforns>	 joal: ^
[16:54:16] <joal>	 yessir
[16:57:07] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns)
[16:57:12] <mforns>	 I think we could deploy today, it'd be just the packaging to archiva
[16:57:24] <mforns>	 and adding the links
[16:57:29] <joal>	 feasible mforns - We'd need approval from elukey though
[16:57:35] <mforns>	 yea yea ofc
[16:57:50] <mforns>	 uou -1
[16:59:31] <wikibugs>	 (03PS2) 10Mforns: Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339)
[16:59:41] <elukey>	 sorry joal I was in a meeting!
[16:59:56] <joal>	 np elukey :)
[17:01:47] <joal>	 klausman: standup if you wish :)
[17:08:12] <wikibugs>	 (03PS4) 10Joal: Update sqoop adding tables and removing timestamps [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077)
[17:09:02] <wikibugs>	 (03CR) 10Joal: [V: 03+2] "Tested on some small wikis" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal)
[17:26:24] <wikibugs>	 10Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10elukey)
[17:49:13] <wikibugs>	 (03CR) 10Joal: [C: 03+1] "Ok for me - I'd have liked to use an option instead of empty-string as no-value default, but I'm not sure it can be easily done with the c" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns)
[17:52:57] <elukey>	 joal:  refine completed :)
[17:53:03] <joal>	 \o/
[17:53:06] <joal>	 Let me check the data
[17:53:09] <wikibugs>	 10Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10elukey) I had a chat during post stand up today with Marcel and Joseph about the problem, and after some brain-bounce and research this new shiny parameter came up:  ` hive.metastore.disallow.i...
[17:53:58] <joal>	 elukey: I can't see new partition :(
[17:54:38] <mforns>	 joal, thanks for the review :] I think that ConfigHelper does support Option[String] params: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/scala/org/wikimedia/analytics/refinery/core/config/ConfigHelper.scala#L64
[17:55:11] <joal>	 mforns: if you don't mind, we can try it!
[17:55:17] <joal>	 mforns: otherwise it's ok :)
[17:55:26] <mforns>	 joal: of course! tryinh
[17:55:48] <joal>	 mforns: I think it'll also be worth tring an execution with logs, just in case ;)
[17:56:05] <mforns>	 yea yea
[17:56:18] <joal>	 elukey: could it be that there was no data?
[17:59:19] <elukey>	 no idea
[17:59:32] <elukey>	 lemme check
[18:01:15] <elukey>	 previously failed refinement and does not have new data since the last refin etc..
[18:01:24] <elukey>	 I have to run it with the flag to don't care :)
[18:01:26] <joal>	 right
[18:01:35] <joal>	 I imagined it could have been
[18:01:46] <joal>	 Thanks for checking that elukey - I was trying to do so as well
[18:02:02] * joal is slower than elukey - truth is stable
[18:02:45] <elukey>	 yes sure
[18:02:54] <elukey>	 you are SLOWER than me
[18:03:16] <elukey>	 I don't buy it :D
[18:04:14] <elukey>	 ok re-running with the --ignore_failure_flag=true flag
[18:04:29] <joal>	 Ack!
[18:15:01] <joal>	 elukey: refined suceeded
[18:15:04] <joal>	 checking data :)
[18:15:19] <joal>	 MOAR DATAZ!
[18:16:12] <elukey>	 :D
[18:16:19] <joal>	 And importantly, queryable data!
[18:16:23] <joal>	 That's a win elukey :)
[18:17:42] <elukey>	 \o/
[18:18:34] <mforns>	 :]
[18:23:45] <joal>	 Ok - Leaving for tonight - see you folks :)
[18:24:19] <mforns>	 byeeee
[18:27:39] <wikibugs>	 (03PS1) 10Elukey: oozie: move all hive2 actions settings to analytics-hive.eqiad.wmnet [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028)
[18:27:46] <elukey>	 the moment has come! :D
[18:28:35] <elukey>	 ah no it is wrong
[18:28:51] <elukey>	 uff fixing
[18:29:14] <wikibugs>	 10Analytics, 10Product-Analytics, 10Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (10SBisson) a:05SBisson→03None
[18:30:52] <wikibugs>	 (03PS2) 10Elukey: oozie: move all hive2 actions settings to analytics-hive.eqiad.wmnet [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028)
[18:37:10] <elukey>	 weird, in some files I get an extra newline at the end
[18:40:15] <wikibugs>	 (03PS3) 10Elukey: oozie: move all hive2 actions settings to analytics-hive.eqiad.wmnet [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028)
[18:42:59] <elukey>	 all right ready to go :)
[18:51:12] <elukey>	 also, I think that only oozie on bigtop supports multiple metastores
[18:51:14] <elukey>	 sigh
[18:51:28] <elukey>	 we'll need to test and see
[18:51:37] <elukey>	 anyway, logging off, ttl!
[18:51:59] <wikibugs>	 (03PS3) 10Mforns: Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339)
[18:56:02] <wikibugs>	 (03CR) 10Mforns: [V: 03+2] "Tested it with real ingestion. Seems to work!" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns)
[20:22:55] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Set up automatic deletion/snitization for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (10mforns) @ayounsi   As discussed in earlier comments, we Analytics will migrate netflow data in HDFS from `/wmf/data/wmf/netflow` to `/wmf/data...
[20:44:24] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Set up automatic deletion/snitization for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (10mforns) @jallemandou @elukey  Here's a sketch of the migration plan, mostly a reference for myself! Please raise flags if something is missing...
[21:18:38] <wikibugs>	 (03CR) 10Mforns: [C: 03+1] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal)