[05:57:07] good morning [05:57:25] !log stop timers on an-launcher1002 to allow a reboot of an-coord1001 [05:57:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [05:57:52] some of the kernel settings are not applied, I need to quickly reboot it [06:20:52] !log reboot an-coord1001 to pick up kernel security settings [06:20:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [06:23:49] host up [06:25:37] !log re-enable timers [06:25:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [06:32:18] !log roll restart of hadoop-yarn-nodemanagers to pick up new log4j settings - T276906 [06:32:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [06:32:24] T276906: Configure the HDFS Namenodes to use the log4j rolling gzip appender - https://phabricator.wikimedia.org/T276906 [06:37:21] Good morning [06:37:28] Am I already too late elukey? [06:38:27] joal: bonjour! nono I was progressing on other tasks, nothing related to capacity :) [06:38:46] Ah :) [06:38:55] when I reimaged an-coord1001 I didn't reboot, so kernel settings (security mostly) were not picked up, so I did it today [06:39:05] ack - all good so far? [06:39:07] and the roll restart is to apply the log4j settings [06:39:09] yes yes [06:39:16] awesome :) [06:39:23] I am happy about the gzip rolling appender, works nicely :) [06:39:53] I understand :) About 10x less storage used on root partition is a big win! [06:40:41] also we are keeping 10x2GB files for the hdfs audit! \o/ [06:40:51] elukey: shall we take advantage of timers having been stopped for some time to deploy the capacity scheduler? [06:41:07] joal: already restarted them! :( [06:41:07] audit files: \o/ as well!
[06:41:29] elukey: almost no job kicked in - if you stop them anew we should be good to go in a matter of minutes [06:41:35] elukey: this is why I ask :) [06:42:16] elukey: we can also let the cluster recover and apply later - let me know what you prefer [06:44:48] !log stop timers on an-launcher1002 again as prep step for capacity scheduler changes [06:44:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [06:45:32] ok ready then [06:45:53] going to send an email to announce@ [06:46:04] elukey: there is a camus job running currently, let's wait for that one to finish, then +1 [06:46:08] ack [06:46:09] for email [06:46:44] * joal is happy to see users' leftover kernels disappear :) [06:50:21] email sent [06:51:19] elukey: Arf- ongoing jobs are supposed to stay up, right? [06:51:50] joal: yes yes, it shouldn't impact people [06:51:57] Actually elukey webrequest kicked in - shall I kill it manually and restart it after? [06:52:07] only a small amount of compute done [06:52:20] joal: sure but I think we can leave it running as well [06:52:26] and kill it in case [06:52:39] ok, leaving it for now - time for coffee elukey :) [06:52:46] +1 [06:53:11] brb as well [07:22:17] joal: ok if I merge the patch etc..? [07:22:39] elukey: yessir - no impact expected before restart, right?
[07:25:32] yes [07:28:34] elukey: production queue empty - let's move :) [07:30:18] joal: doing it now [07:31:02] !log restart hadoop RM on an-master* to pick up capacity scheduler changes [07:31:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [07:35:45] joal: we are on capacity :) [07:36:12] \o/ [07:36:25] let's restart the timers elukey :) [07:36:32] I'm gonna monitor [07:36:47] ack [07:36:57] !log re-enable timers after setting the capacity scheduler [07:37:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [07:37:10] joal: I haven't added any label yet [07:37:14] ack [07:39:07] aaand timers are back [07:55:22] elukey: I had not noticed UI info about preemption - I foresee this to be very useful! [07:56:13] joal: where is it? [07:56:27] At the bottom of the scheduler page [07:56:44] ahhh nice! [07:57:13] I am going to prep the changes to add the GPU labels [07:57:24] for the moment it is manual, every host is added/removed with commands [07:57:31] and the list is saved on hdfs [07:57:34] but [07:57:47] there is the possibility to transfer the labels to NM with scripts [07:57:55] that then are communicated to the RM etc.. [07:58:02] but it was a little overkill :D [07:59:00] -- [07:59:07] :) [07:59:11] I also like a lot the scheduler breakdowns for used capacity etc.. [07:59:15] very detailed [07:59:28] so happy that we did it joal! [08:01:31] !log restart hadoop-mapreduce-historyserver on an-master1001 after changes to the yarn ui user [08:01:33] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:03:19] verified that Hue can read yarn logs [08:05:37] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Review the Yarn Capacity scheduler and see if we can move to it - https://phabricator.wikimedia.org/T277062 (10elukey) Capacity scheduler deployed, all good up to now. In order to add labels, the following commands are needed: ` sudo -u yarn kerb... 
[08:05:50] joal: will execute --^ when you give me the green light [08:24:18] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Product-Data-Infrastructure, 10Readers-Web-Backlog (Kanbanana-FY-2020-21): VirtualPageView should use EventLogging api to send virtual page view events - https://phabricator.wikimedia.org/T279382 (10ovasileva) [08:25:21] elukey: two options about GPU labels - If you want us to do it today, we can do it now - if you don't mind, we can wait until tomorrow (monitoring of daily jobs) [08:26:18] joal: ack sure! [08:41:07] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Configure the HDFS Namenodes to use the log4j rolling gzip appender - https://phabricator.wikimedia.org/T276906 (10elukey) The last step is to roll restart all hdfs daemons (journal and data nodes), but we can wait for the next openjdk upgrade (th... [09:06:07] 10Analytics-Radar, 10SRE, 10Patch-For-Review, 10Services (watching), 10User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (10elukey) @herron ping, we should start working on this :) [09:21:20] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Product-Data-Infrastructure, 10Readers-Web-Backlog (Kanbanana-FY-2020-21): VirtualPageView should use EventLogging api to send virtual page view events - https://phabricator.wikimedia.org/T279382 (10phuedx) >>! In T279382#7030279, @mforns wrote: > If...
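The node-label commands in the truncated task comment above are left as-is; for orientation only, the standard YARN admin commands for cluster node labels look roughly like this. The label name and hostname are hypothetical examples, and running them as the yarn user with a Kerberos ticket is an assumption about the local setup, not the exact recipe from T277062:

```shell
# Hypothetical sketch (not the exact commands from T277062, which are truncated
# above). Assumes you are the yarn user with a valid Kerberos ticket.
yarn rmadmin -addToClusterNodeLabels "GPU(exclusive=false)"        # define the label
yarn rmadmin -replaceLabelsOnNode "an-worker1100.eqiad.wmnet=GPU"  # attach it to a NodeManager (example host)
yarn cluster --list-node-labels                                    # verify
```

This matches the "manual, every host is added/removed with commands" workflow elukey describes at 07:57, with the label list persisted on HDFS by the ResourceManager.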
[09:48:44] deployment-eventlog08 was signed and puppeted over the weekend, seems like it just needed time to properly end up in puppet's cert list - looks like eventlog works fine on python 3.7 based on the logs [09:49:28] event creation is so infrequent in deployment-prep it's not really obvious when the new instance came online on the graphs though https://grafana-labs.wikimedia.org/d/JAX5JD9Gk/hnowlan-eventlogging?orgId=1 [09:50:09] gooood [09:50:20] if nothing fires up in the logs we should be ok :) [10:10:21] 10Analytics-Clusters, 10Analytics-Kanban: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (10Volans) What is the current status of `eventlog1003`? It's reported by a cumin check that ensures that all hosts matching the alias `A:all` are part of one of the datacenters, and `eventlog10... [10:14:19] 10Analytics-Clusters, 10Analytics-Kanban: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (10hnowlan) >>! In T278137#7033264, @Volans wrote: > What is the current status of `eventlog1003`? > It's reported by a cumin check that ensures that all hosts matching the alias `A:all` are par... [10:36:27] * elukey lunch time! [13:14:08] hi teammm! [13:14:42] hola Marcel [13:14:56] hey elukey :] [13:21:01] heyhey! [13:35:45] mforns: If you have a moment, I'm afraid that the reportupdater codemirror job failed. The other three seem to be healthy. [13:36:02] ok [13:36:07] gimme 1 min [13:39:30] awight: yes looking into it [13:47:47] awight: I cannot see any error, remember that reportupdater executes all reports that belong to 1 call in sequence, so what I see now is that for codemirror, the toggles report is done, and the sessions report is nearly done, but still executing. when that finishes, reportupdater will start with the users_codemirror_and_wikitext reports [13:48:07] this happens when we have backfilling [13:48:24] Thanks!
[13:49:21] maybe we could change the execution order of reportupdater, instead of running: Report1(t1), Report1(t2), ..., Report1(tN), Report2(t1), Report2(t2), ... [13:50:06] we could make it run like: Report1(t1), Report2(t1), ..., ReportM(t1), Report1(t2), Report2(t2), ... [13:50:07] In this case, just knowing that it's incomplete is plenty to work with :-) I'll wait a few hours before sanity-checking the data. [13:50:49] s/incomplete/pending/ [13:50:55] awight: yes, and the queries for users_codemirror_and_wikitext take a while no? It's possible that backfilling for codemirror takes 1 or 2 days. [13:55:51] Looks like the Grafana boards need adjustments anyway, to deal with a new dimension in the metrics paths. So not the fault of the RU job, anyway. [14:00:40] 10Analytics, 10EventStreams, 10Services: To provide performer array in RC stream - https://phabricator.wikimedia.org/T218063 (10Ottomata) There isn't a lot of priority around this, but there is talk about making revision-tags-change public in {T280538}. I'm not sure, but https://gerrit.wikimedia.org/r/c/679... [14:04:50] Gone for kids, back for standup [14:08:01] *grmbl* how do I get spark to save my result DF as JSON? all I get is *.snappy. [14:17:31] mforns: Confirmed that the codemirror.toggles job is healthy! [14:18:36] awight: cool! :] [14:18:56] awight: are graphite metrics landing correctly? [14:29:40] hm, something's up with the metastore, I'm getting all kinds of errors running simple spark checks on webrequest, and some Hive operations like "desc" are taking forever [14:42:53] milimetric: o/ did you use --master yarn? We changed the scheduler today [14:43:51] can you post in here the errors that you get?
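The execution-order change mforns sketches at 13:49-13:50 amounts to swapping the nesting of two loops: iterate timestamps in the outer loop instead of reports, so every report makes progress on t1 before any report moves to t2. A minimal Python sketch of the two orderings, with hypothetical report and timestamp names (this is an illustration of the idea, not reportupdater's actual scheduler code):

```python
from itertools import product

def report_major(reports, timestamps):
    """Current behaviour: all timestamps of one report before the next report."""
    return [(r, t) for r, t in product(reports, timestamps)]

def timestamp_major(reports, timestamps):
    """Suggested behaviour: every report for t1, then every report for t2, ..."""
    return [(r, t) for t, r in product(timestamps, reports)]

reports = ["toggles", "sessions"]   # hypothetical report names
timestamps = ["t1", "t2", "t3"]

print(report_major(reports, timestamps))
print(timestamp_major(reports, timestamps))
```

With the timestamp-major ordering, a long backfill of one report no longer starves the others, which is exactly the symptom discussed above for the codemirror jobs.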
[14:45:42] (I don't see signs of queues being overloaded though) [14:49:43] 10Analytics: [event sanitization] Add a "mask" functionality to purge string fields - https://phabricator.wikimedia.org/T281144 (10mforns) [14:54:10] 10Analytics, 10FR-Tech-Analytics: event.WikipediaPortal referer modification - https://phabricator.wikimedia.org/T279952 (10mforns) A couple comments. 1) The other day, talking with the team, we thought Analytics could take this task, as sanitizing a full URL by applying a mask could be useful to other da... [14:55:25] elukey: I did use --master yarn, here's what I see: [14:56:03] some ooms: Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread [14:56:40] actually... I think it's just the ooms and then a bunch of resulting errors [14:59:38] milimetric: yeah but maybe those are on the client side, because of too much data returned? [14:59:53] mforns: Yes, the data is landing just as hoped :-) Thanks for getting us over the many obstacles! [15:00:24] klausman: have you found a solution? [15:01:21] klausman: when reading from HDFS you can do 'hdfs dfs -text <file>' (instead of 'hdfs dfs -cat <file>') - they get decompressed for you [15:09:15] elukey: I don't know... for sure... or how I could tell... but it shouldn't be, I'm just doing the webrequest sequence stats false positive checker [15:14:03] (03CR) 10Ottomata: [C: 03+1] "Should I merge, or does this depend on https://gerrit.wikimedia.org/r/c/integration/config/+/681988" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/682093 (owner: 10Gehel) [15:15:09] (03CR) 10Gehel: "> Patch Set 1: Code-Review+1" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/682093 (owner: 10Gehel) [15:31:09] (03CR) 10Ottomata: [C: 03+2] Upgrade Findbugs to Spotbugs and integrate with Sonar.
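klausman's `*.snappy` files from 14:08 are Spark's default output compression, not a different file format; elukey's `hdfs dfs -text` suggestion decompresses them on read. Both directions, sketched with hypothetical paths:

```shell
# Read Snappy-compressed output transparently (-text decompresses, -cat does not):
hdfs dfs -text /user/klausman/out/part-00000.snappy | head

# Or write uncompressed JSON from Spark in the first place (PySpark):
#   df.write.option("compression", "none").json("/user/klausman/out_json")
```

The paths above are illustrative; the `compression` option on the JSON writer accepts `none`, `gzip`, `snappy` and a few other codecs.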
[analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/682093 (owner: 10Gehel) [15:31:11] milimetric: in theory the OOM errors should be coming from the jvm that runs your spark client, maybe the default settings for Xmx etc.. are too low? [15:33:17] hm... never had a problem running that script before... I'll run it with more power [15:34:00] milimetric: more POWA! :) [15:35:08] o/ elukey should i just remove any usage of `SET mapreduce.job.queuename=nice;` from scripts or does it need to be replaced with something? it looks like with the new scheduler, that just leaving it as default is fine? [15:36:38] isaacj: hi! For huge jobs you can use "fifo" instead, otherwise default is fine! [15:36:53] the mapping user -> queue for default is automatic for analytics-privatedata-users [15:36:58] so you should be good even without it [15:36:58] 10Analytics-Radar: Reportupdater output can be corrupted by hive logging - https://phabricator.wikimedia.org/T275757 (10awight) [15:37:01] 10Analytics-Radar, 10observability, 10Graphite, 10WMDE-TechWish-Sprint-2021-04-14: Broken reportupdater queries: edit count bucket label contains illegal characters - https://phabricator.wikimedia.org/T279046 (10awight) [15:37:11] 10Analytics-Radar, 10observability, 10Graphite, 10WMDE-TechWish-Sprint-2021-04-14: Broken reportupdater queries: edit count bucket label contains illegal characters - https://phabricator.wikimedia.org/T279046 (10awight) 05Open→03Resolved a:03awight Thanks to the many people who helped with this! [15:37:19] elukey: great -- this one isn't big, i think i was trying to be nice because it was a background job [15:40:12] FYI Olja's having some technical difficulties with WMF accounts, so there will probably be some delay in getting responses to you. Please invite her personal account to any meetings you may have scheduled for her today. 
[15:41:30] ack tltaylor - thanks for letting us know [15:49:25] 10Analytics, 10Analytics-Kanban: Crunch and delete many old dumps logs - https://phabricator.wikimedia.org/T280678 (10fdans) p:05High→03Medium [15:58:42] 10Analytics, 10WMCZ-Stats: Review request: New datasets for WMCZ published under analytics.wikimedia.org - https://phabricator.wikimedia.org/T279567 (10fdans) p:05Triage→03Medium [16:01:55] 10Analytics: Use inclusive language - https://phabricator.wikimedia.org/T280268 (10fdans) p:05Triage→03High [16:03:18] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Consolidate labs / production sqoop lists to a single list - https://phabricator.wikimedia.org/T280549 (10fdans) p:05Triage→03High [16:03:45] 10Analytics: [reportupdater] add --no-graphite flag - https://phabricator.wikimedia.org/T280823 (10fdans) 05Open→03Resolved [16:06:39] 10Analytics: [Reportupdater] Support category of jobs that cannot be backfilled - https://phabricator.wikimedia.org/T280997 (10fdans) p:05Triage→03Low [16:09:06] 10Analytics: [event sanitization] Add a "mask" functionality to purge string fields - https://phabricator.wikimedia.org/T281144 (10fdans) p:05Triage→03Low [16:09:19] elukey (cc joal): I ran it with "--executor-memory 8G --executor-cores 1 --driver-memory 4G" and it worked... what are the defaults? That seems low [16:14:27] elukey: the pcc for eventlog1003 looks okay to me, I'm okay to merge it if it seems okay to you. I was wondering - should I add the new instance to https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/role/common/kafka/jumbo/broker.yaml#80 in the same CR or would it make sense to spin it up isolated from kafka to begin with? [16:23:19] hnowlan: let's add it before, and run puppet on all jumbo nodes, otherwise eventlog1003 will not be able to pull from kafka [16:23:28] it is a very good point [16:23:34] razzi: --^ [16:23:52] do you have time to follow up with Hugh? 
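The defaults milimetric asks about at 16:09 are indeed low upstream: Spark ships with 1g for both `spark.executor.memory` and `spark.driver.memory` (and 1 executor core on YARN) unless the site configuration overrides them, which is consistent with the OOMs going away once the flags were raised. An illustrative invocation; the script name is hypothetical:

```shell
spark2-submit \
  --master yarn \
  --executor-memory 8G \
  --executor-cores 1 \
  --driver-memory 4G \
  sequence_stats_check.py   # hypothetical script name
```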
It is a good use case to work together [16:24:21] basically eventlog1002 and 1003 will both consume from the same topics, splitting the partitions [16:24:42] so they can both work together for a bit, and then 1002 can be shut down (to allow 1003 to fully take over) [16:24:49] it is nice to check logs/metrics/etc.. [16:25:22] Yeah, let me do some reading on this [16:26:11] Cool, I'll hang on for a bit - could even do these changes together tomorrow razzi? [16:26:16] on a screenshare or whatever [16:27:10] Yeah, I'd be happy to work together tomorrow hnowlan [16:30:10] sounds great [17:13:14] 10Analytics, 10Analytics-Wikistats, 10Inuka-Team, 10Language-strategy, and 2 others: Have a way to show the most popular pages per country - https://phabricator.wikimedia.org/T207171 (10JFishback_WMF) @Htriedman [18:04:40] * elukey afk! [18:05:41] joal: we have an over-capacity event right now for production, really nice to see it [18:05:48] https://yarn.wikimedia.org/cluster/scheduler?openQueues=Queue:%20production [18:06:31] (jobs are flowing but we are over capacity for production) [18:06:41] * elukey afk for real :) [18:17:46] (03PS1) 10Awight: Remove all instances of deprecated `funnel` attribute [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682702 (https://phabricator.wikimedia.org/T193170) [18:19:15] 10Analytics, 10Patch-For-Review: [reportupdater] eliminate the funnel parameter - https://phabricator.wikimedia.org/T193170 (10awight) [18:20:43] 10Analytics, 10Patch-For-Review: [reportupdater] eliminate the funnel parameter - https://phabricator.wikimedia.org/T193170 (10awight) [18:20:54] 10Analytics, 10Patch-For-Review: [reportupdater] eliminate the funnel parameter - https://phabricator.wikimedia.org/T193170 (10awight) [18:23:51] how do i find the list of running coordinators and their status in the new hue?
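The partition-splitting behaviour elukey describes at 16:24 is standard Kafka consumer-group rebalancing: two eventlogging processors subscribing with the same group id each receive a disjoint subset of each topic's partitions, and when one member leaves the group its partitions are reassigned to the survivor. A toy Python simulation of a range-style assignment for a single topic (the partition count and the exact strategy the brokers use are illustrative assumptions, not taken from the eventlogging config):

```python
def assign_partitions(consumers, num_partitions):
    """Toy range-style assignment: split one topic's partitions across
    the sorted members of a consumer group, earlier members getting
    any remainder."""
    consumers = sorted(consumers)
    n = len(consumers)
    per, extra = divmod(num_partitions, n)
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[c] = list(range(start, start + count))
        start += count
    return assignment

# Both hosts consuming: partitions are split between them.
print(assign_partitions(["eventlog1002", "eventlog1003"], 12))
# After eventlog1002 shuts down: eventlog1003 takes over everything.
print(assign_partitions(["eventlog1003"], 12))
```

This is why 1002 and 1003 can run side by side for a while and then 1002 can simply be stopped, as described above, with no gap in consumption beyond the rebalance itself.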
Apparently I haven't used it in a while [18:26:34] thinking maybe the problem is it only wants to show me things owned by ebernhardson, and not analytics-search? Unclear [18:27:05] ebernhardson: removing the filter is needed indeed [18:27:48] joal: but where? I can't find a filter, just buttons for workflows and bundles, that don't list anything, but coordinators are missing [18:28:10] In new hue they are 'schedules' [18:28:17] ebernhardson: [18:28:50] joal: ahh, and where would the filter be? Selecting `choose a workflow` from the schedules page gives a blank box with 'There are no workflows matching your search term' [18:29:03] sorry, i'm having an old person moment and can't figure out this new software :P [18:29:12] :D [18:29:37] joal: if you take a look at my wmf branch (https://github.com/milimetric/incubator-gobblin/tree/wmf) I have some cleanup in a separate commit and the extractor working with a list of fields. It didn't need changing outside of the extractor after all, pretty self-contained, but tell me if I missed something [18:29:42] (added some tests) [18:30:14] joal: oh! It's hidden a few levels in, first Jobs, then the nested tabs inside that (multi-level tabs in separate ui's isn't exactly expected..) [18:30:27] ebernhardson: in the "Schedule" page, next to the boxes for "Succeeded" [18:30:30] indeed ebernhardson [18:30:56] ebernhardson: you found it faster than I typed :) [18:31:15] joal: i must have seen and completely ignored that header a few times....but at least now i'll hopefully remember :) [18:31:33] ebernhardson: empty lists make me remember :) [18:32:11] ack milimetric, will embed in my current changes - thank you! [18:32:38] joal: I guess now I start on the topic detection thing? [18:32:59] milimetric: if you wish [18:33:29] joal: I just don't wanna conflict with something you're working on, if I got it right you're doing the offset synchronization part, right?
[18:33:52] milimetric: I'm revamping our code to use string instead of byte[] [18:34:01] milimetric: Then I'll go to the checker [18:34:21] ok, cool, sounds good [18:46:06] (03PS1) 10Awight: Convert `browser` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682730 (https://phabricator.wikimedia.org/T193169) [18:54:23] (03PS2) 10Awight: Convert `browser` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682730 (https://phabricator.wikimedia.org/T193169) [18:59:12] milimetric: I have on purpose not changed the Guava optionals to java API ones, to match gobblin ways - I'll keep it as is if you don't mind [18:59:50] joal: ah, I remember you saying that now. I kind of don't like the compiler warnings, more noise in which real problems can hide [19:00:02] consistency is nice though, so I see your point [19:00:05] milimetric: I have incorporated your change for multi-field parsing, making the json-parser take a list as an argument (better to have more precise APIs) [19:01:08] milimetric: also, I have created a KafkaSource class in our wmf package, but have not included any code in there nor tests - This can be our base I imagine :) [19:01:09] joal: agreed [19:01:16] milimetric: pushing my patch now [19:01:26] k, I'll work on the topic detection on top of that [19:15:20] milimetric: commit pushed to jobar/wmf [19:15:24] (forced) [19:15:30] Ok team - gone for tonight :) [19:15:57] o/ [20:30:24] (03PS3) 10Awight: Convert some `browser` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682730 (https://phabricator.wikimedia.org/T193169) [20:41:07] (03PS1) 10Awight: Convert `cx` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682742 (https://phabricator.wikimedia.org/T193169) [20:44:26] (03PS1) 10Awight: Convert `interlanguage` queries to native HiveHQL [analytics/reportupdater-queries] -
10https://gerrit.wikimedia.org/r/682743 (https://phabricator.wikimedia.org/T193169) [20:47:11] (03PS1) 10Awight: Convert `pingback` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682744 (https://phabricator.wikimedia.org/T193169) [20:54:49] (03PS1) 10Awight: Convert `published_cx2_translations` to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682747 (https://phabricator.wikimedia.org/T193169) [20:58:07] (03PS1) 10Awight: Convert `reference-previews` to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682748 (https://phabricator.wikimedia.org/T193169) [21:00:46] (03PS1) 10Awight: Convert `structured-data` to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682749 (https://phabricator.wikimedia.org/T193169) [21:07:42] (03PS1) 10Awight: Whitespace-only [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682752 [21:07:44] (03PS1) 10Awight: Convert `wmcs` to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682753 (https://phabricator.wikimedia.org/T193169) [21:08:49] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Product-Data-Infrastructure, and 2 others: VirtualPageView should use EventLogging api to send virtual page view events - https://phabricator.wikimedia.org/T279382 (10Jdlrobson) a:05mforns→03None [21:18:30] (03PS2) 10Awight: Whitespace-only [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682752 [21:18:32] (03PS2) 10Awight: Convert `interlanguage` queries to native HiveHQL [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682743 (https://phabricator.wikimedia.org/T193169) [21:23:11] (03PS3) 10Awight: Whitespace-only [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682752 [21:23:13] (03PS4) 10Awight: Convert some `browser` queries to native HiveHQL [analytics/reportupdater-queries] - 
10https://gerrit.wikimedia.org/r/682730 (https://phabricator.wikimedia.org/T193169) [21:44:34] (03PS1) 10Awight: [WIP] Provide to_year, to_month, and to_day [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/682761 (https://phabricator.wikimedia.org/T193169) [21:45:20] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Provide to_year, to_month, and to_day [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/682761 (https://phabricator.wikimedia.org/T193169) (owner: 10Awight) [21:48:05] (03PS2) 10Awight: [WIP] Provide to_year, to_month, and to_day [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/682761 (https://phabricator.wikimedia.org/T193169) [22:51:21] (03CR) 10Razzi: [V: 03+2 C: 03+2] Combine labs_grouped_wikis and prod_grouped_wikis to grouped_wikis [analytics/refinery] - 10https://gerrit.wikimedia.org/r/681496 (https://phabricator.wikimedia.org/T280549) (owner: 10Razzi) [23:30:33] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 3 others: Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events - https://phabricator.wikimedia.org/T251609 (10Milimetric)