[08:07:26] <joal>	 Good morning
[08:12:28] <elukey>	 bonjour
[08:13:27] <elukey>	 joal: an-presto1004 is down again /o'
[08:13:31] <elukey>	  /o\
[08:13:37] <joal>	 yeah I've seen that :(
[08:14:29] <joal>	 I really don't know what we should do
[08:15:16] <elukey>	 burn the host down
[08:15:44] <elukey>	 I'll have a chat with DCops, DELL replaced everything basically, so we'll probably need to swap it
[08:15:52] <joal>	 If we go for that solution let's make sure we are physically invited
[08:20:29] <joal>	 today is test-day for me, to deploy before tomorrow
[08:22:03] <elukey>	 joal: super, if you are ok I'd merge later on my patch to move all the hive creds to analytics-hive
[08:22:41] <joal>	 elukey: I'll take time to review if ok for you - double pair of eyes 
[08:23:12] <elukey>	 yep yep
[08:23:22] <elukey>	 I wanted to spare you from the review of a ton of text :)
[08:24:08] <joal>	 Thanks mate :)
[08:24:08] <elukey>	 I am going to review it again, I think I found a corner case
[08:27:11] <wikibugs>	 (CR) Elukey: [C: -1] "In cases like:" [analytics/refinery] - https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028) (owner: Elukey)
[08:27:13] <elukey>	 yes
[08:37:02] <elukey>	 so one thing that is weird
[08:37:17] <elukey>	 we have 3 hive-related settings in coordinators
[08:37:18] <elukey>	 hive_principal
[08:37:22] <elukey>	 hive2_jdbc_url
[08:37:26] <elukey>	 hive_metastore_uri
[08:37:54] <elukey>	 the first is shared by hive2 actions (jdbc) and spark actions (that need the metastore)
[08:38:11] <elukey>	 and there is a wikidata coordinator with all 3
[08:38:29] <elukey>	 so I'd propose to just, for the moment, split hive_principal into two
[08:38:44] <elukey>	 hive_metastore_principal and hive_server_principal
[08:38:53] <elukey>	 and modify the coords xml accordingly
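A minimal sketch of the split proposed above, assuming the coordinator properties can be modelled as a plain dict. The function name is hypothetical, and the real change would be made in the coordinator properties and XML files, not in Python.

```python
def split_hive_principal(props):
    """Return a copy of props with hive_principal split into two keys."""
    updated = dict(props)
    shared = updated.pop("hive_principal")
    # Both new keys start from the old shared value; they can diverge later.
    updated["hive_metastore_principal"] = shared
    updated["hive_server_principal"] = shared
    return updated


# Example values modelled on the settings quoted in the conversation.
legacy = {
    "hive_principal": "hive/an-coord1001.eqiad.wmnet@WIKIMEDIA",
    "hive_metastore_uri": "thrift://an-coord1001.eqiad.wmnet:9083",
}
print(split_hive_principal(legacy))
```

Keeping the two principals separate lets hive2 (JDBC) actions and Spark actions point at different services later without touching each other's settings.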
[08:42:14] <joal>	 Makes complete sense elukey - Good catch!
[08:43:59] * elukey cries in a corner
[08:44:18] <wikibugs>	 (Abandoned) Elukey: oozie: move all hive2 actions settings to analytics-hive.eqiad.wmnet [analytics/refinery] - https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028) (owner: Elukey)
[08:53:54] <elukey>	 in any case, there is something weird
[08:54:14] <elukey>	 say that we have
[08:54:15] <elukey>	 hive_metastore_principal          = hive/an-coord1001.eqiad.wmnet@WIKIMEDIA
[08:54:18] <elukey>	 hive_metastore_uri                = thrift://an-coord1001.eqiad.wmnet:9083
[08:54:47] <elukey>	 I know what hive_metastore_uri looks like with 2 metastores, but what about the principal?
[08:56:06] <joal>	 hm - interesting elukey 
[08:57:58] <elukey>	 I am starting to think that using analytics-hive is ok
[08:58:16] <joal>	 as a metastore with cname you mean?
[08:58:22] <elukey>	 exactly yes
[08:58:30] <joal>	 That would make sense
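A small sketch of what pointing both settings at the cname could look like. It only reuses the string formats quoted above (`hive/<host>@WIKIMEDIA` and `thrift://<host>:9083`). The helper name is hypothetical, and the log does not confirm that a Kerberos principal for the cname is actually accepted by the metastore.

```python
def hive_metastore_settings(host, realm="WIKIMEDIA", port=9083):
    """Build the two metastore properties for a given host name."""
    return {
        "hive_metastore_principal": f"hive/{host}@{realm}",
        "hive_metastore_uri": f"thrift://{host}:{port}",
    }


# Today: both settings name the coordinator host directly.
current = hive_metastore_settings("an-coord1001.eqiad.wmnet")
# Assumption: behind the cname, both settings follow the alias.
behind_cname = hive_metastore_settings("analytics-hive.eqiad.wmnet")
print(current)
print(behind_cname)
```

With this shape a failover only changes where the cname points; the client-side settings stay the same.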
[08:59:13] <joal>	 In any case, cname change means manual intervention, right elukey ?
[08:59:24] <elukey>	 joal: yes exactly
[09:00:11] <joal>	 We could document in the procedure to force mysql transactions on metastore DB to be closed before switching (not sure how though)
[09:00:19] <joal>	 elukey: --^
[09:01:07] <joal>	 And we should also force the passive metastore to NOT answer calls while not behind cname, to prevent potential mistakes
[09:01:17] <elukey>	 joal: in theory it shouldn't be needed, the session is kept on the db so both metastores should be able to handle the same set of clients (in theory)
[09:01:46] <joal>	 Ah - metastore handles sessions - that's great :)
[09:07:04] <elukey>	 yes this is my understanding
[09:08:25] <elukey>	 !log force execution of refinery-drop-pageview-actor-hourly-partitions on an-launcher1002 (after args fixup from Joseph)
[09:08:26] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:08:39] <elukey>	 INFO   Dropping 2217 Hive partitions from table wmf.pageview_actor.
[09:11:33] <joal>	 \o/
[09:11:38] <joal>	 Thanks elukey 
[09:15:05] <elukey>	 np :)
[09:15:33] <elukey>	 correction from the above - the sessions' tokens are stored in the db, so multiple clients can in theory auth to multiple metastores
[09:15:56] <elukey>	 keeping their session alive I mean
[09:26:17] <wikibugs>	 (PS2) Joal: Add tables to mediawiki-history-load [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077)
[09:26:41] <joal>	 elukey: this means that we could actually do an active-active metastore
[09:27:07] <joal>	 elukey: we won't, but this means there is no need for us to make the passive metastore not listen to connections - right?
[09:28:37] <elukey>	 joal: in theory yes :D
[09:28:48] <joal>	 elukey: I'll go with the theory ;)
[09:29:02] <elukey>	 but I think that there is no need to have the multi-uris thing
[09:29:25] <elukey>	 I mean, the purpose is for a client to decide what metastore to target, using a certain principal
[09:29:54] <elukey>	 so keeping metastore with different creds is not needed, but I don't see any reference of this in docs
[09:30:08] <joal>	 ack
[09:30:17] <joal>	 this still is bizarre :(
[09:33:49] <joal>	 elukey: launching a test of my oozie refactor - Normally the metastore overload should be solved, but I prefer to let you know anyhow, in case :)
[09:33:56] <elukey>	 ack!
[09:42:03] <Amir1>	 elukey: hey, what do you suggest for me to do on this? https://gerrit.wikimedia.org/r/644002
[09:42:10] <joal>	 also elukey - the job you restarted this weekend is expected to be very long - it is the conversion of xml to avro
[09:44:50] <elukey>	 joal: yes yes I recall that, but the tons of errors were weird :(
[09:45:02] <joal>	 indeed!!
[09:45:26] <elukey>	 Amir1: hi! I am going to follow up and see
[09:47:04] <Amir1>	 Thanks!
[10:04:07] <elukey>	 Amir1: I created https://phabricator.wikimedia.org/T268978, let's wait for John to comment to see if we can solve this issue before merging, is it ok?
[10:06:15] <Amir1>	 sure
[10:06:29] <Amir1>	 It's Monday, my plate is more than full already 
[10:31:43] <wikibugs>	 Analytics, Analytics-Kanban, Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (elukey) While reviewing my change, I realized that there is something more problematic for the Hive Metastore, that we should probably resolve sooner ra...
[10:34:54] <wikibugs>	 (PS3) Joal: Add tables to mediawiki-history-load [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077)
[10:48:53] <wikibugs>	 (CR) Joal: [C: +1] "Tested on cluster! (with new table addition, will merge both together)" [analytics/refinery] - https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[10:49:01] <wikibugs>	 (CR) Joal: [V: +2] Refactor oozie mediawiki-history-load job [analytics/refinery] - https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[10:49:11] <wikibugs>	 (CR) Joal: [V: +2] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[10:49:39] <joal>	 Gone to eat with kids - back after - elukey with your permission I'll merge and deploy sqoop + oozie patches
[10:49:49] <elukey>	 +1
[10:49:55] <elukey>	 should I merge the puppet change?
[11:13:53] <wikibugs>	 Analytics, Analytics-Kanban, Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (elukey) I think I found a good compromise for our use case. We could have a config for each of the following use case:  * on clients, hive-site.xml and...
[11:17:25] <elukey>	 Amir1: deployed :)
[11:17:30] <Amir1>	 \o/
[11:17:31] <Amir1>	 Thanks
[11:17:38] <elukey>	 thank you again
[11:17:49] <Amir1>	 I'll create more patches tomorrow 
[11:18:23] <Amir1>	 Thank you for deploying I just fell on my keyboard
[11:32:20] <klausman>	 Hey everyone. Migraine has struck again :( Not sure how useful I'll be today
[11:36:15] <wikibugs>	 Analytics: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (elukey)
[11:39:53] <elukey>	 klausman: np, take care!
[11:39:57] * elukey lunch!
[12:19:26] <wikibugs>	 (PS1) Gilles: ServerTiming has been folded into NavigationTiming [analytics/refinery] - https://gerrit.wikimedia.org/r/644201 (https://phabricator.wikimedia.org/T264987)
[12:32:56] <fdans>	 good morning!
[12:34:58] <joal>	 Hi fdans - Recovery wishes klausman :(
[13:52:19] <elukey>	 fdans: hola!
[13:52:37] <fdans>	 elukey: o/
[13:58:31] <wikibugs>	 Analytics, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (Ottomata) > We deploy a on-host memcached instance, we have already a lot of puppet code + monitoring + metrics to re-use. Sounds good.  Q: Is there a reason we couldn't/shouldn't use the prod memcached cl...
[14:02:44] <wikibugs>	 Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (Ottomata) This is very similar to an issue in Spark: https://issues.apache.org/jira/browse/SPARK-23890, which is why we are using the Hive session to alter the table in the first place.  The ch...
[14:05:00] <wikibugs>	 Analytics, Operations: Backport kafkacat 1.6.0 from bullseye to buster-backports or buster-wikimedia - https://phabricator.wikimedia.org/T268936 (Ottomata) THAT IS AWESOME YES PLEASE!
[14:05:42] <wikibugs>	 Analytics, Analytics-Kanban: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (Ottomata) Thanks Luca!
[14:06:22] <wikibugs>	 Analytics, Analytics-Kanban, Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (Ottomata) Cool!
[14:06:58] <wikibugs>	 Analytics: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (Ottomata) > execute kinit -R automatically upon login for every user  THIS WOULD BE AWESOME!
[14:08:41] <wikibugs>	 (CR) Ottomata: [C: +1] "One nit, feel free to ignore if you like yours better. :)" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: Mforns)
[14:18:11] <wikibugs>	 Analytics, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (elukey) >>! In T268784#6655721, @Ottomata wrote: >> We deploy a on-host memcached instance, we have already a lot of puppet code + monitoring + metrics to re-use. > Sounds good.  Q: Is there a reason we co...
[14:18:38] <wikibugs>	 Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Investigate sporadic failures in oozie hive actions due to Kerberos auth - https://phabricator.wikimedia.org/T241650 (elukey) Happened again today.
[14:22:10] <wikibugs>	 Analytics, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (Ottomata) +1 sounds good!
[14:26:55] <joal>	 ok, going to deploy refinery for sqoop and related oozie jobs
[14:27:23] <joal>	 Adding a fix for mediawiki-history-reduced number of shards
[14:27:45] <joal>	 For folks around, anything special to deploy in refinery (not source)?
[14:29:58] <joal>	 fdans: given that https://gerrit.wikimedia.org/r/c/analytics/refinery/+/642079 has been merged, should the job be restarted?
[14:31:02] <wikibugs>	 (PS8) Joal: Update sqoop adding tables [analytics/refinery] - https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077)
[14:31:04] <fdans>	 joal: yessir, are you doing the train now?
[14:31:19] <joal>	 fdans: outside-of-usual hours train :)
[14:31:24] <joal>	 train before 1st of month
[14:31:33] <fdans>	 ah understood
[14:31:36] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:31:55] <wikibugs>	 (CR) Joal: [V: +2 C: +2] "Merging for deploy before 1st of month" [analytics/refinery] - https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[14:32:32] <fdans>	 joal: I can take care of restarting since I was also going to run the 2015 missing period
[14:32:49] <joal>	 ok fdans - I'll ping you when deploy is done :)
[14:32:56] <fdans>	 joal: perfect
[14:33:22] <mforns>	 hello teammm
[14:33:29] <joal>	 Hi mforns 
[14:33:39] <wikibugs>	 (PS5) Joal: Refactor oozie mediawiki-history-load job [analytics/refinery] - https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077)
[14:34:27] <wikibugs>	 (CR) Joal: [V: +2 C: +2] "Merging for deploy before 1st of month" [analytics/refinery] - https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[14:35:22] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (Ottomata) I propose we do not migrate this schema and mark it as unused.  There hasn't been a schema edit on metawiki since Jan 2017, and there isn't even a Hive table for thi...
[14:35:59] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:36:19] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (Ottomata) Open→Declined FYI @sdkim I'm declining this one and marking it as To Deprecate on our audit sheet.
[14:36:40] <wikibugs>	 (PS4) Joal: Add tables to mediawiki-history-load [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077)
[14:38:07] <wikibugs>	 (CR) Joal: [V: +2 C: +2] "Merging for deploy before 1st of month" [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) (owner: Joal)
[14:39:50] <milimetric>	 o/
[14:41:02] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: SpecialMuteSubmit Event Platform Migration - https://phabricator.wikimedia.org/T267350 (Ottomata)
[14:41:04] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: SpecialInvestigate Event Platform Migration - https://phabricator.wikimedia.org/T267349 (Ottomata)
[14:41:06] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: AutoblockIpBlock Event Platform Migration - https://phabricator.wikimedia.org/T267340 (Ottomata)
[14:41:08] <wikibugs>	 Analytics, Anti-Harassment, Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (Ottomata)
[14:41:49] <wikibugs>	 (PS2) Joal: Update mediawiki-history-reduced druid loading (shards) [analytics/refinery] - https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813)
[14:42:14] <wikibugs>	 (CR) Joal: [V: +2 C: +2] "Merging for deploy before 1st of month" [analytics/refinery] - https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813) (owner: Joal)
[14:42:38] <milimetric>	 joal: don't forget to update the puppet systemd commands with the new tables 
[14:42:58] <joal>	 patch ready milimetric :) Thanks for the ping!
[14:43:08] <elukey>	 joal: should I merge it btw?
[14:43:23] <joal>	 elukey: after I deployed please (minimal change but I still prefer :)
[14:43:57] <joal>	 actually elukey, if you prefer you can go now - the code will run tomorrow and in the meantime I'll have deployed it
[14:45:17] <elukey>	 nono that's ok, we can do it later :)
[14:45:38] <joal>	 !log Deploy refinery using scap
[14:45:39] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:46:38] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:49:42] <joal>	 !log Create new hive tables for newly sqooped data
[14:49:43] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:51:03] <elukey>	 all right sounds like a green light for puppet :)
[14:52:23] <elukey>	 joal: puppet change deployed
[14:52:23] <wikibugs>	 (PS4) Mforns: Add datasource argument to HiveToDruid [analytics/refinery/source] - https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339)
[14:52:40] <wikibugs>	 (CR) Mforns: Add datasource argument to HiveToDruid (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: Mforns)
[14:53:06] <joal>	 awesome elukey 
[14:53:08] <joal>	 thanks for that
[14:53:43] <wikibugs>	 (CR) Mforns: [V: +2] Add datasource argument to HiveToDruid [analytics/refinery/source] - https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: Mforns)
[14:56:51] <joal>	 !log Deploying refinery onto hdfs
[14:56:52] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:00:16] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:02:11] <joal>	 ok almost ready - a few jobs restarts and we're good - doing that post-standup
[15:02:19] <joal>	 Gone for kids, back at standup
[15:08:02] <wikibugs>	 Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (elukey) Sounds good, I am going to rebuild hive with https://github.com/apache/hive/commit/e542f7f3cb103b7d33914d8b7510fbb294d8369c on top and I'll test if alters can be done at a session level.
[15:09:24] <wikibugs>	 Analytics, Product-Analytics, Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (hueitan) a:hueitan
[15:15:41] <mforns>	 ottomata: where can I pick up from re. migration of anti-harassment schemas?
[15:17:00] <ottomata>	 ah mforns sorry should have pinged you about it, i'm getting it all the way to testwiki now, for the two Special* schemas
[15:17:06] <ottomata>	 i think the other ones don't actually need to be migrated
[15:17:40] <ottomata>	 mforns:  it's kind of hard to do individual ones together; i was kind of hoping we could do them in parallel... but now we have a schedule...
[15:17:52] <ottomata>	 mforns:  you can do the growth ones (you already started!) for next week?
[15:17:59] <ottomata>	 you could prep all the patches for those now
[15:18:08] <mforns>	 ottomata: cool, will do
[15:18:24] <mforns>	 :]
[15:21:15] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:21:45] <razzi>	 Hi team, g'day
[15:22:44] <ottomata>	 razzi: hellllOoOOOO
[15:23:29] <razzi>	 hi ottomata, long time no chat! I see we're on ops week...!
[15:24:02] <ottomata>	 OH yeahhHHhh
[15:24:09] <ottomata>	 it's a first time for both of us!
[15:24:14] <razzi>	 Oh yeah!!! ok
[15:24:43] <ottomata>	 reading https://wikitech.wikimedia.org/wiki/Analytics/Ops_week for the first time...
[15:25:46] <elukey>	 hello folks :)
[15:25:50] <ottomata>	 yoohoo
[15:26:10] <ottomata>	 mforns: what else is needed for us to merge your UI patch?
[15:26:12] <ottomata>	 one more review from dan?
[15:26:31] <mforns>	 ottomata: I was writing about that
[15:26:37] <mforns>	 the problem with Jenkins
[15:26:41] <ottomata>	 oh right
[15:26:45] <mforns>	 I tried to solve that on Friday
[15:26:46] <ottomata>	 i'll look into that today
[15:26:47] <mforns>	 but no luck
[15:27:00] <ottomata>	 first: finish this migration, look into ops week, then UI patch
[15:27:01] <ottomata>	 :)
[15:27:06] <ottomata>	 well, put meetings in there somewhere
[15:27:10] <mforns>	 I believe it's the same that was happening to me with node-rdkafka, but I solved it by upgrading the lib locally
[15:27:17] <mforns>	 xD
[15:27:39] <mforns>	 well, I can solve it, I was just asking if it rang a bell to you? It seems Jenkins has a different setup than locally?
[15:28:00] <ottomata>	 hm, jenkins will run the tests in the docker image it builds
[15:28:19] <ottomata>	 OH, yeah, and i think it might be using the librdkafka deb to build node-rdkafka
[15:28:20] <ottomata>	 not sure.
[15:28:23] <mforns>	 ok, maybe I can try building the docker locally
[15:28:24] <ottomata>	 see  the .pipeline/ dir
[15:28:32] <ottomata>	 you can do that
[15:28:46] <mforns>	 ok, thanks
[15:28:57] <ottomata>	 blubber .pipeline/blubber.yaml <env_name> > Dockerfile
[15:29:01] <ottomata>	 then docker build ...
[15:29:10] <ottomata>	 oh but you need blubber :)
[15:29:43] <ottomata>	 !log migrated EventLogging schemas  SpecialMuteSubmit and SpecialInvestigate to EventGate - T268517
[15:29:46] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:29:46] <stashbot>	 T268517: Migrate Anti-Harassment EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T268517
[15:30:50] <wikibugs>	 (PS2) Fdans: Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597
[15:33:54] <wikibugs>	 (CR) jerkins-bot: [V: -1] Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597 (owner: Fdans)
[15:48:02] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:49:32] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:56:45] <wikibugs>	 (PS3) Fdans: Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597
[16:02:40] <ottomata>	 milimetric:  standup?
[16:09:38] <wikibugs>	 (PS20) Sbisson: Oozie job for Wikipedia Preview stats [analytics/wmf-product/jobs] - https://gerrit.wikimedia.org/r/635578 (https://phabricator.wikimedia.org/T261953)
[16:10:08] <wikibugs>	 (CR) Sbisson: Oozie job for Wikipedia Preview stats (1 comment) [analytics/wmf-product/jobs] - https://gerrit.wikimedia.org/r/635578 (https://phabricator.wikimedia.org/T261953) (owner: Sbisson)
[16:12:57] <ottomata>	 elukey:  what happens if kinit -R with no active ticket?
[16:13:27] <elukey>	 ottomata: it fails saying that there is no ticket
[16:13:43] <ottomata>	 hmm ok great!
[16:13:44] <elukey>	 at least this is what I got from my tests
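Since `kinit -R` simply fails when there is no ticket, a login hook could attempt the renewal and treat a failure as "nothing to renew". The sketch below assumes that behaviour. The function name and the injectable `run` parameter are hypothetical, added so the logic can be exercised without a Kerberos setup.

```python
import subprocess


def renew_ticket(run=subprocess.run):
    """Attempt `kinit -R`; return True if the ticket was renewed."""
    try:
        result = run(["kinit", "-R"], capture_output=True, text=True)
    except FileNotFoundError:
        # kinit is not installed on this host.
        return False
    # A non-zero exit code covers the "no ticket" case described above.
    return result.returncode == 0
```

A caller would invoke `renew_ticket()` at login and ignore a `False` result, leaving the user to run a manual `kinit` when prompted.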
[16:38:16] <wikibugs>	 Analytics-Features, Analytics-Radar: Feature request: Keeping track of time spent in phases of edits for users - https://phabricator.wikimedia.org/T268385 (fdans) @jlinehan any thoughts on this?
[16:39:33] <wikibugs>	 Analytics-Features, Analytics-Radar: Feature request: Keeping track of time spent in phases of edits for users - https://phabricator.wikimedia.org/T268385 (Milimetric) (@Jukeboksi: basically, if Jason & team are working on this already, then maybe he can point you in the right direction.  If not, you'd h...
[16:39:44] <wikibugs>	 Analytics, WMDE-Analytics-Engineering, Patch-For-Review, User-GoranSMilovanovic: Downscale Wikidata-analysis pyspark scripts to analytics limits - https://phabricator.wikimedia.org/T268684 (Ottomata)
[16:40:03] <wikibugs>	 Analytics-Clusters: Create kafka test cluster - https://phabricator.wikimedia.org/T268074 (Ottomata)
[16:40:28] <wikibugs>	 Analytics-Clusters: Create kafka test cluster - https://phabricator.wikimedia.org/T268074 (Ottomata) a:razzi
[16:42:15] <wikibugs>	 Analytics, Analytics-EventLogging: Delete tofu table from staging database after research is done - https://phabricator.wikimedia.org/T70441 (fdans) Open→Resolved closing since it seems this table doesn't exist anymore
[16:42:17] <wikibugs>	 Analytics, Analytics-EventLogging: UniversalLanguageSelector-tofu logging too much data - https://phabricator.wikimedia.org/T69463 (fdans)
[16:46:14] <wikibugs>	 Analytics, Analytics-Kanban: Fix purging pageview_actor data - https://phabricator.wikimedia.org/T268382 (fdans) Open→Resolved
[16:46:54] <wikibugs>	 Analytics-Clusters: Review recurrent Hadoop worker disk saturation events - https://phabricator.wikimedia.org/T265487 (Ottomata) p:Triage→Medium a:elukey
[16:50:22] <wikibugs>	 Analytics, Event-Platform, MassMessage, WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (fdans) cc @Milimetric @Ottomata
[16:51:05] <wikibugs>	 Analytics, Analytics-Kanban: Fix pageview title accepted values (trailing EOL) - https://phabricator.wikimedia.org/T268630 (fdans) patch: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/643255
[16:51:46] <wikibugs>	 Analytics, Analytics-Kanban: Fix pageview title accepted values (trailing EOL) - https://phabricator.wikimedia.org/T268630 (fdans) p:Triage→High
[16:53:03] <wikibugs>	 Analytics, Analytics-Kanban: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (fdans) p:Triage→High
[16:56:05] <wikibugs>	 Analytics-Clusters, Patch-For-Review: Create a temporary hadoop backup cluster - https://phabricator.wikimedia.org/T260411 (Ottomata) a:razzi→elukey
[16:56:41] <wikibugs>	 Analytics-Clusters: Put 24 Hadoop worker nodes in service (cluster expansion) - https://phabricator.wikimedia.org/T255146 (Ottomata) a:elukey
[16:58:32] <wikibugs>	 Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Review an-coord1001's usage and failover plans - https://phabricator.wikimedia.org/T257412 (Ottomata)
[16:58:55] <wikibugs>	 Analytics: AQS pageview default caching is one day - https://phabricator.wikimedia.org/T268809 (fdans) agreed on 4 during triage
[17:03:52] <wikibugs>	 Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (Milimetric) thoughts: configure a 10 second timeout in the call from AQS to Druid, and retry 3 times (is there a retry mechanism already or do we just juggle the promise?)
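The timeout-and-retry idea in the message above can be sketched generically. This is an illustration only: AQS is not written in Python, and `call_with_retries` and the `query` callable are hypothetical names. The caller is assumed to pass a function that accepts a timeout in seconds and raises `TimeoutError` when it is exceeded.

```python
def call_with_retries(query, attempts=3, timeout_s=10.0):
    """Call query(timeout_s) up to `attempts` times, retrying on timeout."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return query(timeout_s)
        except TimeoutError as err:
            # Remember the failure and try again until attempts run out.
            last_error = err
    raise last_error
```

Only timeouts are retried here; any other exception propagates immediately, which keeps genuine query errors visible.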
[17:04:01] <wikibugs>	 Analytics, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (Ottomata) a:razzi
[17:04:12] <wikibugs>	 Analytics-Clusters, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (Ottomata)
[17:04:23] <wikibugs>	 Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (Milimetric) p:Triage→High
[17:04:37] <wikibugs>	 Analytics-Clusters, Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (Ottomata) p:Triage→Medium
[17:05:08] <wikibugs>	 Analytics, Analytics-Kanban: Make druid mediawiki-history-reduced segments smaller - https://phabricator.wikimedia.org/T268813 (fdans) p:Triage→High
[17:05:24] <wikibugs>	 Analytics-Clusters, Operations: Backport kafkacat 1.6.0 from bullseye to buster-backports or buster-wikimedia - https://phabricator.wikimedia.org/T268936 (Ottomata)
[17:05:43] <wikibugs>	 Analytics-Clusters, Operations, ops-eqiad: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (Ottomata)
[17:06:01] <wikibugs>	 Analytics-Clusters: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (Ottomata) p:Triage→Medium a:elukey
[17:06:25] <wikibugs>	 Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Refactor puppet profiles to reduce hiera pollution - https://phabricator.wikimedia.org/T268220 (Ottomata)
[17:28:49] <joal>	 MEH - I just realized I messed up some naming
[17:29:32] <wikibugs>	 (PS1) Joal: Fix names of updated mediawiki-history-load jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/644293
[17:29:33] <joal>	 I'm gonna deploy again :(
[17:29:52] <wikibugs>	 (CR) Joal: [V: +2 C: +2] "Hotfix - merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/644293 (owner: Joal)
[17:32:11] <joal>	 !log Deploy refinery using scap for naming hotfix
[17:32:12] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:32:45] <joal>	 !log Kill-restart mediawiki-history-reduced job for druid-public datasource number of shards update
[17:32:46] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:47:01] <elukey>	 joal: out of band deployments are 5 euros in the jar each, remember it :P
[17:47:51] * elukey runs away
[17:47:56] * joal owes beers to the ops team even without that :)
[17:49:01] <joal>	 !log Kill-restart mediawiki-history-load job after refactor (1 coordinator per table) and tables addition
[17:49:03] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:51:05] <wikibugs>	 Analytics, ChangeProp, Event-Platform, MassMessage, WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (Ottomata)
[17:51:24] <wikibugs>	 Analytics-Radar, ChangeProp, Event-Platform, MassMessage, WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (Ottomata)
[17:51:59] <joal>	 !log Deploy refinery onto hdfs
[17:52:01] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:53:54] <wikibugs>	 Analytics-Radar, ChangeProp, Event-Platform, MassMessage, WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (Pchelolo) We could add a method to JobQueue checking whether reporting statistic like this is supported an...
[18:03:26] * razzi afk for lunch
[18:10:35] <wikibugs>	 Analytics, Analytics-Kanban, Operations, netops, Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (mforns) @ayounsi Regarding netflow data size in Druid:  We Analytics took a look at the data size after adding the new field...
[18:16:50] * elukey afk!
[19:15:18] <wikibugs>	 Analytics: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (Ottomata) Hm, this looks to be related to or caused by {T222603}.  The failed months timed out waiting for a dependency 2 months and 19 days after the month in question.  See: https://hue.wikimedia.org/oozie...
[19:17:09] <wikibugs>	 Analytics: Improve Refine - https://phabricator.wikimedia.org/T266872 (Ottomata) a:Ottomata
[19:30:05] <wikibugs>	 Analytics, Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (Ottomata) This looks like a problem with Firefox, it works ok in Chrome.
[19:31:12] <wikibugs>	 Analytics: Address refinery security vulnerabilities with jackson and netty - https://phabricator.wikimedia.org/T237774 (Ottomata) I don't think this should be ops week, right?  This is a regular task that should be scheduled, no?  Moving back to incoming.
[19:32:34] <wikibugs>	 Analytics: Add folder creation for sqoop initial installation in puppet - https://phabricator.wikimedia.org/T251788 (Ottomata) a:razzi So, the sqoop puppetization should just ensure that this directory exists?
[19:41:05] <wikibugs>	 Analytics, Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (SerDIDG) >>! In T255840#6656905, @Ottomata wrote: > This looks like a problem with Firefox, it works ok in Chrome.    Chrome (87.0.4280.67) and Safari (14.0.1) just does not show broken icons...
[19:51:51] <wikibugs>	 Analytics: mediawiki-wikitext-history-2020-10 failed - https://phabricator.wikimedia.org/T269032 (Ottomata)
[19:52:00] <mforns>	 ottomata: I don't manage to make blubber work, maybe it's because it's for debian stretch, and I use ubuntu 18 which is based on buster...?
[19:52:36] <mforns>	 when I run it it says it doesn't support v4 config
[19:53:21] <mforns>	 when I run `blubber -v` it returns `+`
[19:53:27] <mforns>	 which seems strange
[19:54:16] <mforns>	 oh, I'll try to install from source
[20:01:28] <mforns>	 ok, got it to work
[20:19:21] <ottomata>	 hmm weird
[20:19:24] <ottomata>	 i use it on mac
[20:19:27] <ottomata>	 ok
[20:20:52] <ottomata>	 mforns:  lemme see if i can help
[20:23:51] <ottomata>	 yeah that seems like an rdkafka compatibility problem
[20:23:54] <ottomata>	 lemme bang on it
[20:26:01] <wikibugs>	 Analytics, Operations, SRE-Access-Requests: Requesting access to production shell groups for JAnstee - https://phabricator.wikimedia.org/T266249 (JAnstee_WMF) Excellent, @Dzahn  -- thank you - I will reach out if I need further support!
[20:27:34] <ottomata>	 mforns: i'm going to change the version of node-rdkafka to the one we use for eventgate
[20:27:49] <ottomata>	 that seems to work for me
[20:31:14] <ottomata>	 mforns:  patch passes with version ~2.4.2
[20:31:18] <ottomata>	 ok if I merge?
[20:31:21] <ottomata>	 ...and deploy?! :o
[20:37:19] <ottomata>	 i'm merging because I have a moment to do so!
[20:37:21] <ottomata>	 hope that's ok!
[21:13:03] <mforns>	 ottomata: hey! sorry was having dinner
[21:13:19] <mforns>	 ottomata: yes, sure, can merge on my side
[21:13:34] <mforns>	 ottomata: did you deploy?
[21:14:16] <mforns>	 ottomata: and btw, thanks for solving the issue
[21:16:23] <ottomata>	 mforns:  yes, but i realized we need a build step to build the ui/dist folder
[21:16:26] <ottomata>	 trying to figure that out now
[21:16:50] <mforns>	 oh yes, just `npm run build`
[21:17:17] <mforns>	 hm, should we commit that?
[21:17:48] <ottomata>	 well, i need to run it from the parent as part of or after npm install
[21:17:55] <ottomata>	 since there is a separate package.json
[21:17:59] <mforns>	 aha
[21:18:07] <ottomata>	 so i'm trying to make it a postinstall script
[21:18:15] <ottomata>	 it works locally, but in docker it's being weird
[21:18:19] <ottomata>	 trying to figure it out now
[21:26:37] <ottomata>	 mforns: i dunno...
[21:26:47] <mforns>	 hm
[21:26:47] <ottomata>	 it might be easier to just get rid of the extra sub package
[21:26:54] <ottomata>	 we can keep the ui dir
[21:27:02] <ottomata>	 but just merge its package.json into the parent one
[21:27:12] <ottomata>	 not sure
[21:27:14] <ottomata>	 hmm lemme try
[21:27:17] <ottomata>	 maybe that is not my problem
[21:27:27] <mforns>	 but we'd still have to build
[21:27:43] <ottomata>	 yes, i'm having an issue running npm for npm in another package
[21:27:46] <ottomata>	 or something
[21:27:51] <ottomata>	 it might be a blubber issue...
[21:27:54] <mforns>	 ok
[21:28:03] <ottomata>	 hmm i think it might be blubber/docker
[21:28:05] <ottomata>	 not sure what is going on
[21:28:08] <ottomata>	 let me try it and see
[21:28:15] <mforns>	 let me think if there's any problem with merging both package.jsons
[21:29:31] <mforns>	 mm, I think it should work
[21:29:47] <mforns>	 it has few dependencies
[21:29:57] <mforns>	 I don't think there's any comflict
[21:30:53] <mforns>	 *conflict
[21:34:52] * mforns ottomata: https://github.com/lerna/lerna
[21:40:45] <ottomata>	 aye yai yai
[21:41:24] <ottomata>	 mforns:  can i use vue-cli-service to build a target dir?
[21:41:27] <ottomata>	 trying to do e.g.
[21:41:33] <ottomata>	 vue-cli-service --dest ui/dist ./ui
[21:41:35] <ottomata>	 or something
[21:41:39] <ottomata>	 but not succeeding
[21:41:53] <mforns>	 I see
[21:42:04] <mforns>	 lookin
[21:46:30] <ottomata>	 oh maybe vue-cli-service build --dest ui/dist ./ui/src/main.js
[21:46:31] <ottomata>	 trying
[21:47:19] <mforns>	 I read you can specify the target directory, but can't find the way to specify the project root directory
[21:49:57] <ottomata>	 yeahhHhH i dunno, this seems like the wrong approach anyway, i was trying it to see if it was a problem with blubber or not
[21:49:59] <ottomata>	 but i'm pretty sure it is
[21:50:02] <ottomata>	 i should be able to do
[21:50:08] <ottomata>	 npm --prefix ./ui install && npm --prefix ./ui run build
[21:50:49] <mforns>	 aha
[21:51:15] <ottomata>	 but it doesn't work
[21:51:22] <ottomata>	 as a postinstall
[21:51:23] <mforns>	 yes, I was wondering about the index.html file if you did `vue-cli-service build --dest ui/dist ./ui/src/main.js`
[21:51:32] <ottomata>	 that fails ^ somehow
[21:51:33] <mforns>	 what's the error
[21:51:35] <mforns>	 ?
[21:51:40] <ottomata>	 lots
[21:51:40] <ottomata>	 but
[21:51:41] <ottomata>	 * @/apis/EventStreamsApi.js in ./node_modules/cache-loader/dist/cjs.js??ref--12-0!./node_modules/thread-loader/dist/cjs.js!./node_modules/babel-loader/lib!./node_modules/cache-loader/dist/cjs.js??ref--0-0!./node_modules/vue-loader/lib??vue-loader-options!./ui/src/views/Home.vue?vue&type=script&lang=js&
[21:51:55] <ottomata>	 oh ^ that's for the vue-cli-service build trick
[21:52:06] <ottomata>	 for just using postinstall
[21:52:09] <ottomata>	 it's missing the ui dir
[21:52:13] <ottomata>	 after the first npm install
[21:52:15] <ottomata>	 OHHHH
[21:52:16] <ottomata>	 i know why
[21:52:17] <ottomata>	 yes
[21:52:32] <ottomata>	 the dockerfile does npm install with just package.json, before it copies the local src code into the image
[21:52:32] <mforns>	 O.o
[21:52:46] <mforns>	 aaah
[21:56:01] * mforns ottomata: https://cloudnweb.dev/2019/10/crafting-multi-stage-builds-with-docker-in-node-js/
[21:57:20] <ottomata>	 cloudnweb.dev refused to connect.
[22:07:56] <mforns>	 ?
[22:14:50] <wikibugs>	 Analytics, Analytics-Kanban, Growth-Team, Product-Analytics, Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (nettrom_WMF) >>! In T267333#6642129, @Ottomata wrote: > Hiya @nettrom_WMF, we'd like to migrate these schemas durin...
[22:18:19] <ottomata>	 yeah mforns this is a blubber issue 
[22:18:25] <ottomata>	 the dockerfile it's building only copies package.json in
[22:18:35] <ottomata>	 i don't know how or when it gets the actual source directories and files
[22:18:40] <ottomata>	 been trying to make it do it
[22:18:44] <ottomata>	 but i'll have to pick it up tomorrow
[22:18:50] <mforns>	 ok
[22:19:34] <mforns>	 I feel a bit useless here, no big experience in npm or docker, zero in blubber
[23:06:46] <ottomata>	 mforns:  i think i have to switch to a new pipeline format
[23:06:51] <ottomata>	 https://wikitech.wikimedia.org/wiki/PipelineLib
[23:06:55] <ottomata>	 got some advice from releng
[23:06:59] <ottomata>	 i'll figure it out thank you!
[23:26:09] <wikibugs>	 Analytics-Radar, Discovery: Display automata and humans separately on zero results rate graph - https://phabricator.wikimedia.org/T112846 (Deskana) Open→Declined Per the notice on https://discovery.wmflabs.org/ these dashboards are now in maintenance mode. This feature request is no longer relevant.