[08:07:26] Good morning
[08:12:28] bonjour
[08:13:27] joal: an-presto1004 is down again /o'
[08:13:31] /o\
[08:13:37] yeah I've seen that :(
[08:14:29] I really don't know what we should do
[08:15:16] burn the host down
[08:15:44] I'll have a chat with DCops, DELL replaced everything basically, so we'll probably need to swap it
[08:15:52] If we go for that solution let's make sure we are physically invited
[08:20:29] today is test-day for me, to deploy before tomorrow
[08:22:03] joal: super, if you are ok I'd merge later on my patch to move all the hive creds to analytics-hive
[08:22:41] elukey: I'll take time to review if ok for you - double pair of eyes
[08:23:12] yep yep
[08:23:22] I wanted to spare you from the review of a ton of text :)
[08:24:08] Thanks mate :)
[08:24:08] I am going to review it again, I think I found a corner case
[08:27:11] (CR) Elukey: [C: -1] "In cases like:" [analytics/refinery] - https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028) (owner: Elukey)
[08:27:13] yes
[08:37:02] so one thing that it is weird
[08:37:17] we have 3 hive-related settings in coordinators
[08:37:18] hive_principal
[08:37:22] hive2_jdbc_url
[08:37:26] hive_metastore_uri
[08:37:54] the first is shared by hive2 actions (jdbc) and spark actions (that need the metastore)
[08:38:11] and there is a wikidata coordinator with all 3
[08:38:29] so I'd propose to just, for the moment, split hive_principal into two
[08:38:44] hive_metastore_principal and hive_server_principal
[08:38:53] and modify the coords xml accordingly
[08:42:14] Makes complete sense elukey - Good catch!
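Editor's sketch of the split proposed at 08:38, not code from the channel. It assumes the coordinator settings are available as a plain Python dict; the function name `split_hive_principal` is hypothetical, and only the property names come from the conversation. Both new settings start out with the old shared value, so the hive2 (jdbc) and metastore sides can then be changed independently.

```python
def split_hive_principal(props):
    """Return a copy of props where hive_principal is replaced by two settings.

    Hypothetical helper: the real change was made by hand in the coordinator XML.
    """
    updated = dict(props)
    principal = updated.pop("hive_principal", None)
    if principal is not None:
        # Keep explicit values if a coordinator already defines the new names.
        updated.setdefault("hive_metastore_principal", principal)
        updated.setdefault("hive_server_principal", principal)
    return updated


example = {
    "hive_principal": "hive/an-coord1001.eqiad.wmnet@WIKIMEDIA",
    "hive_metastore_uri": "thrift://an-coord1001.eqiad.wmnet:9083",
}
result = split_hive_principal(example)
```

The input dict is left untouched, which makes it easy to diff the old and new settings before editing a coordinator.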
[08:43:59] * elukey cries in a corner
[08:44:18] (Abandoned) Elukey: oozie: move all hive2 actions settings to analytics-hive.eqiad.wmnet [analytics/refinery] - https://gerrit.wikimedia.org/r/643762 (https://phabricator.wikimedia.org/T268028) (owner: Elukey)
[08:53:54] in any case, there is something weird
[08:54:14] say that we have
[08:54:15] hive_metastore_principal = hive/an-coord1001.eqiad.wmnet@WIKIMEDIA
[08:54:18] hive_metastore_uri = thrift://an-coord1001.eqiad.wmnet:9083
[08:54:47] I know how hive_metastore_uri can become with 2 metastores, but what about the principal?
[08:56:06] hm - interesting elukey
[08:57:58] I am starting to think that using analytics-hive is ok
[08:58:16] as a metastore with cname you mean?
[08:58:22] exactly yes
[08:58:30] That would make sense
[08:59:13] In any case, cname change means manual intervention, right elukey ?
[08:59:24] joal: yes exactly
[09:00:11] We could document in the procedure to force mysql transactions on metastore DB to be closed before switching (not sure how though)
[09:00:19] elukey: --^
[09:01:07] And we should also force the passive metastore to NOT answer calls while not behind cname, to prevent potential mistakes
[09:01:17] joal: in theory it shouldn't be needed, the session is kept on the db so both metastores should be able to handle the same set of clients (in theory)
[09:01:46] Ah - metastore handles sessions - that's great :)
[09:07:04] yes this is my understanding
[09:08:25] !log force execution of refinery-drop-pageview-actor-hourly-partitions on an-launcher1002 (after args fixup from Joseph)
[09:08:26] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:08:39] INFO Dropping 2217 Hive partitions from table wmf.pageview_actor.
[09:11:33] \o/
[09:11:38] Thanks elukey
[09:15:05] np :)
[09:15:33] correction from the above - the sessions' tokens are stored in the db, so multiple clients can in theory auth to multiple metastores
[09:15:56] keeping their session alive I mean
[09:26:17] (PS2) Joal: Add tables to mediawiki-history-load [analytics/refinery] - https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077)
[09:26:41] elukey: this means that we could actually do an active-active metastore
[09:27:07] elukey: we won't, but this means there is no need for us to make the passive metastore not listen to connections - right?
[09:28:37] joal: in theory yes :D
[09:28:48] elukey: I'll go with the theory ;)
[09:29:02] but I think that there is no need to have the multi-uris thing
[09:29:25] I mean, the purpose is for a client to decide what metastore to target, using a certain principal
[09:29:54] so keeping metastore with different creds is not needed, but I don't see any reference of this in docs
[09:30:08] ack
[09:30:17] this still is bizarre :(
[09:33:49] elukey: launching a test of my oozie refactor - Normally the metastore overload should be solved, but I prefer to let you know anyhow, in case :)
[09:33:56] ack!
[09:42:03] elukey: hey, what do you suggest for me to do on this? https://gerrit.wikimedia.org/r/644002
[09:42:10] also elukey - the job you restarted this weekend is expected to be very long - it is the conversion of xml to avro
[09:44:50] joal: yes yes I recall that, but the tons of errors were weird :(
[09:45:02] indeed!!
[09:45:26] Amir1: hi! I am going to follow up and see
[09:47:04] Thanks!
[10:04:07] Amir1: I created https://phabricator.wikimedia.org/T268978, let's wait for John to comment to see if we can solve this issue before merging, is it ok?
[10:06:15] sure [10:06:29] It's Monday, my plate is more than full already [10:31:43] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10elukey) While reviewing my change, I realized that there is something more problematic for the Hive Metastore, that we should probably resolve sooner ra... [10:34:54] (03PS3) 10Joal: Add tables to mediawiki-history-load [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) [10:48:53] (03CR) 10Joal: [C: 03+1] "Tested on cluster! (with new table addition, will merge both together)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [10:49:01] (03CR) 10Joal: [V: 03+2] Refactor oozie mediawiki-history-load job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [10:49:11] (03CR) 10Joal: [V: 03+2] "Tested on cluster" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [10:49:39] Gone to eat with kids - back after - elukey with your permission I'll merge and deploy sqoop + oozie patches [10:49:49] +1 [10:49:55] should I merge the puppet change? [11:13:53] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10elukey) I think I found a good compromise for our use case. We could have a config for each of the following use case: * on clients, hive-site.xml and... [11:17:25] Amir1: deployed :) [11:17:30] \o/ [11:17:31] Thanks [11:17:38] thank you again [11:17:49] I'll create more patches tomorrow [11:18:23] Thank you for deploying I just fell on my keyboard [11:32:20] Hey everyone. 
Migraine has struck again :( Not sure how useful I'll be today [11:36:15] 10Analytics: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (10elukey) [11:39:53] klausman: np, take care! [11:39:57] * elukey lunch! [12:19:26] (03PS1) 10Gilles: ServerTiming has been folded into NavigationTiming [analytics/refinery] - 10https://gerrit.wikimedia.org/r/644201 (https://phabricator.wikimedia.org/T264987) [12:32:56] good morning! [12:34:58] Hi fdans - Recovery wishes klausman :( [13:52:19] fdans: hola! [13:52:37] elukey: o/ [13:58:31] 10Analytics, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10Ottomata) > We deploy a on-host memcached instance, we have already a lot of puppet code + monitoring + metrics to re-use. Sounds good. Q: Is there a reason we couldn't/shouldn't use the prod memcached cl... [14:02:44] 10Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10Ottomata) This is very similar to an issue in Spark: https://issues.apache.org/jira/browse/SPARK-23890, which is why we are using the Hive session to alter the table in the first place. The ch... [14:05:00] 10Analytics, 10Operations: Backport kafkacat 1.6.0 from bullseye to buster-backports or buster-wikimedia - https://phabricator.wikimedia.org/T268936 (10Ottomata) THAT IS AWESOME YES PLEASE! [14:05:42] 10Analytics, 10Analytics-Kanban: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10Ottomata) Thanks Luca! [14:06:22] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10Ottomata) Cool! [14:06:58] 10Analytics: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (10Ottomata) > execute kinit -R automatically upon login for every user THIS WOULD BE AWESOME! 
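Editor's sketch of the "run kinit -R at login" idea from T268985, not the actual implementation. It assumes MIT Kerberos client tools, where `klist -s` exits non-zero when there is no valid ticket cache and `kinit -R` renews an existing ticket; the function name `renew_ticket` and the injectable `run` parameter are hypothetical and exist only to keep the sketch testable.

```python
import subprocess


def renew_ticket(run=subprocess.run):
    """Try to renew the current Kerberos ticket; return True on success."""
    # `klist -s` prints nothing; its exit status says whether a valid
    # (readable, unexpired) credentials cache exists.
    if run(["klist", "-s"]).returncode != 0:
        return False
    # `kinit -R` asks the KDC to renew the existing ticket-granting ticket.
    return run(["kinit", "-R"]).returncode == 0
```

Checking with `klist -s` first keeps a login hook quiet for users who have never run kinit on the host, instead of printing an error at every login.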
[14:08:41] (03CR) 10Ottomata: [C: 03+1] "One nit, feel free to ignore if you like yours better. :)" (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns) [14:18:11] 10Analytics, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10elukey) >>! In T268784#6655721, @Ottomata wrote: >> We deploy a on-host memcached instance, we have already a lot of puppet code + monitoring + metrics to re-use. > Sounds good. Q: Is there a reason we co... [14:18:38] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Investigate sporadic failures in oozie hive actions due to Kerberos auth - https://phabricator.wikimedia.org/T241650 (10elukey) Happened again today. [14:22:10] 10Analytics, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10Ottomata) +1 sounds good! [14:26:55] ok, going to deploy refinery for sqoop and related oozie jobs [14:27:23] Adding a fix for mediawiki-history-reduced number of shards [14:27:45] For folks around, anything special to deploy in refinery (not source)? [14:29:58] fdans: given that https://gerrit.wikimedia.org/r/c/analytics/refinery/+/642079 has been merged, should the job been restarted? [14:31:02] (03PS8) 10Joal: Update sqoop adding tables [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) [14:31:04] joal: yessir, are you doing the train now? 
[14:31:19] fdans: outside-of-usual hours train :) [14:31:24] train before 1st of month [14:31:33] ah understood [14:31:36] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [14:31:55] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy before 1st of month" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [14:32:32] joal: I can take care of restarting since I was also going to run the 2015 missing period [14:32:49] ok fdans - I'll ping you when deploy is done :) [14:32:56] joal: perfect [14:33:22] hello teammm [14:33:29] Hi mforns [14:33:39] (03PS5) 10Joal: Refactor oozie mediawiki-history-load job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) [14:34:27] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy before 1st of month" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [14:35:22] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) I propose we do not migrate this schema and mark it as unused. There hasn't been a schema edit on metawiki since Jan 2017, and there isn't even a Hive table for thi... [14:35:59] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [14:36:19] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) 05Open→03Declined FYI @sdkim I'm declining this one and marking it as To Deprecate on our audit sheet. 
[14:36:40] (03PS4) 10Joal: Add tables to mediawiki-history-load [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) [14:38:07] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy before 1st of month" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643985 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal) [14:39:50] o/ [14:41:02] 10Analytics, 10Anti-Harassment, 10Event-Platform: SpecialMuteSubmit Event Platform Migration - https://phabricator.wikimedia.org/T267350 (10Ottomata) [14:41:04] 10Analytics, 10Anti-Harassment, 10Event-Platform: SpecialInvestigate Event Platform Migration - https://phabricator.wikimedia.org/T267349 (10Ottomata) [14:41:06] 10Analytics, 10Anti-Harassment, 10Event-Platform: AutoblockIpBlock Event Platform Migration - https://phabricator.wikimedia.org/T267340 (10Ottomata) [14:41:08] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) [14:41:49] (03PS2) 10Joal: Update mediawiki-history-reduced druid loading (shards) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813) [14:42:14] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy before 1st of month" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643689 (https://phabricator.wikimedia.org/T268813) (owner: 10Joal) [14:42:38] joal: don't forget to update the puppet systemd commands with the new tables [14:42:58] patch ready milimetric :) Thanks for the ping! [14:43:08] joal: should I merge it btw? 
[14:43:23] elukey: after I deployed please (minimal change but I still prefer :) [14:43:57] actually elukey, if you prefer you can go now - the code will run tomorrow and and in the meantime I'll have deployed it [14:45:17] nono that's ok, we can do it later :) [14:45:38] !log Deploy refinery using scap [14:45:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:46:38] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [14:49:42] !log Create new hive tables for newly sqooped data [14:49:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:51:03] all right sounds a green light for puppet :) [14:52:23] joal: puppet change deployed [14:52:23] (03PS4) 10Mforns: Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) [14:52:40] (03CR) 10Mforns: Add datasource argument to HiveToDruid (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns) [14:53:06] awesome elukey [14:53:08] thanks for that [14:53:43] (03CR) 10Mforns: [V: 03+2] Add datasource argument to HiveToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643748 (https://phabricator.wikimedia.org/T231339) (owner: 10Mforns) [14:56:51] !log Deploying refinery onto hdfs [14:56:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [15:00:16] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [15:02:11] ok almost ready - a few jobs restarts and we're good - doing that post-standup [15:02:19] Gone for kids, back at standup [15:08:02] 
10Analytics: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10elukey) Sounds good, I am going to rebuild hive with https://github.com/apache/hive/commit/e542f7f3cb103b7d33914d8b7510fbb294d8369c on top and I'll test if alters can be done at a session level. [15:09:24] 10Analytics, 10Product-Analytics, 10Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (10hueitan) a:03hueitan [15:15:41] ottomata: where can I pick up from re. migration of anti-harrasment schemas? [15:17:00] ah mforns sorry should have pinged you about it, i'm getting it all the way to testwiki now, for the two Special* schemas [15:17:06] i think the other ones don't actually need migrated [15:17:40] mforns: its kind of hard to do individual ones together, which i was kind of hoping we could do them in parallel,...but now we have a schedule... [15:17:52] mforns: you can do the growth ones (you already started!) for next week? [15:17:59] you could prep all the patches for those now [15:18:08] ottomata: cool, will do [15:18:24] :] [15:21:15] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [15:21:45] Hi team, g'day [15:22:44] razzi: hellllOoOOOO [15:23:29] hi ottomata, long time no chat! I see we're on ops week...! [15:24:02] OH yeahhHHhh [15:24:09] its a first time for both of us! [15:24:14] Oh yeah!!! ok [15:24:43] reading https://wikitech.wikimedia.org/wiki/Analytics/Ops_week for the first time... [15:25:46] hello folks :) [15:25:50] yoohoo [15:26:10] mforns: what else is needed for us to merge your UI patch? [15:26:12] one more review from dan? 
[15:26:31] ottomata: I was writing about that
[15:26:37] the problem with Jenkins
[15:26:41] oh right
[15:26:45] I tried to solve that on Friday
[15:26:46] i'll look into that today
[15:26:47] but no luck
[15:27:00] first: finish this migration, look into ops week, then UI patch
[15:27:01] :)
[15:27:06] well, put meetings in there somewhere
[15:27:10] I believe it's the same that was happening to me with node-rdkafka, but I solved it by upgrading the lib locally
[15:27:17] xD
[15:27:39] well, I can solve it, I was just asking if it rang a bell to you? It seems Jenkins has a different setup than locally?
[15:28:00] hm, jenkins will run the tests in the docker image it builds
[15:28:19] OH, yeah, and i think it might be using the librdkafka deb to build node-rdkafka
[15:28:20] not sure.
[15:28:23] ok, maybe I can try building the docker locally
[15:28:24] see the .pipeline/ dir
[15:28:32] you can do that
[15:28:46] ok, thanks
[15:28:57] blubber .pipeline/blubber.yaml > Dockerfile
[15:29:01] then docker build ...
[15:29:10] oh but you need blubber :)
[15:29:43] !log migrated EventLogging schemas SpecialMuteSubmit and SpecialInvestigate to EventGate - T268517
[15:29:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:29:46] T268517: Migrate Anti-Harassment EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T268517
[15:30:50] (PS2) Fdans: Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597
[15:33:54] (CR) jerkins-bot: [V: -1] Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597 (owner: Fdans)
[15:48:02] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:49:32] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:56:45] (PS3) Fdans: Expand EZ project conversion to adapt to raw format [analytics/refinery/source] - https://gerrit.wikimedia.org/r/632597
[16:02:40] milimetric: standup?
[16:09:38] (PS20) Sbisson: Oozie job for Wikipedia Preview stats [analytics/wmf-product/jobs] - https://gerrit.wikimedia.org/r/635578 (https://phabricator.wikimedia.org/T261953)
[16:10:08] (CR) Sbisson: Oozie job for Wikipedia Preview stats (1 comment) [analytics/wmf-product/jobs] - https://gerrit.wikimedia.org/r/635578 (https://phabricator.wikimedia.org/T261953) (owner: Sbisson)
[16:12:57] elukey: what happens if kinit -R with no active ticket?
[16:13:27] ottomata: it fails saying that there is no ticket
[16:13:43] hmm ok great!
[16:13:44] at least this is what I got from my tests [16:38:16] 10Analytics-Features, 10Analytics-Radar: Feature request: Keeping track of time spent in phases of edits for users - https://phabricator.wikimedia.org/T268385 (10fdans) @jlinehan any thoughts on this? [16:39:33] 10Analytics-Features, 10Analytics-Radar: Feature request: Keeping track of time spent in phases of edits for users - https://phabricator.wikimedia.org/T268385 (10Milimetric) (@Jukeboksi: basically, if Jason & team are working on this already, then maybe he can point you in the right direction. If not, you'd h... [16:39:44] 10Analytics, 10WMDE-Analytics-Engineering, 10Patch-For-Review, 10User-GoranSMilovanovic: Downscale Wikidata-analysis pyspark scripts to analytics limits - https://phabricator.wikimedia.org/T268684 (10Ottomata) [16:40:03] 10Analytics-Clusters: Create kafka test cluster - https://phabricator.wikimedia.org/T268074 (10Ottomata) [16:40:28] 10Analytics-Clusters: Create kafka test cluster - https://phabricator.wikimedia.org/T268074 (10Ottomata) a:03razzi [16:42:15] 10Analytics, 10Analytics-EventLogging: Delete tofu table from staging database after research is done - https://phabricator.wikimedia.org/T70441 (10fdans) 05Open→03Resolved closing since it seems this table doesn't exist anymore [16:42:17] 10Analytics, 10Analytics-EventLogging: UniversalLanguageSelector-tofu logging too much data - https://phabricator.wikimedia.org/T69463 (10fdans) [16:46:14] 10Analytics, 10Analytics-Kanban: Fix purging pageview_actor data - https://phabricator.wikimedia.org/T268382 (10fdans) 05Open→03Resolved [16:46:54] 10Analytics-Clusters: Review recurrent Hadoop worker disk saturation events - https://phabricator.wikimedia.org/T265487 (10Ottomata) p:05Triage→03Medium a:03elukey [16:50:22] 10Analytics, 10Event-Platform, 10MassMessage, 10WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (10fdans) cc @Milimetric @Ottomata 
[16:51:05] 10Analytics, 10Analytics-Kanban: Fix pageview title accepted values (trailing EOL) - https://phabricator.wikimedia.org/T268630 (10fdans) patch: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/643255 [16:51:46] 10Analytics, 10Analytics-Kanban: Fix pageview title accepted values (trailing EOL) - https://phabricator.wikimedia.org/T268630 (10fdans) p:05Triage→03High [16:53:03] 10Analytics, 10Analytics-Kanban: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (10fdans) p:05Triage→03High [16:56:05] 10Analytics-Clusters, 10Patch-For-Review: Create a temporary hadoop backup cluster - https://phabricator.wikimedia.org/T260411 (10Ottomata) a:05razzi→03elukey [16:56:41] 10Analytics-Clusters: Put 24 Hadoop worker nodes in service (cluster expansion) - https://phabricator.wikimedia.org/T255146 (10Ottomata) a:03elukey [16:58:32] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Review an-coord1001's usage and failover plans - https://phabricator.wikimedia.org/T257412 (10Ottomata) [16:58:55] 10Analytics: AQS pageview default caching is one day - https://phabricator.wikimedia.org/T268809 (10fdans) agreed on 4 during triage [17:03:52] 10Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (10Milimetric) thoughts: configure a 10 second timeout in the call from AQS to Druid, and retry 3 times (is there a retry mechanism already or do we just juggle the promise?) 
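Editor's sketch of the timeout-and-retry idea from the T268811 comment above, written in Python for illustration only (AQS itself is not a Python service). The name `call_with_retries` and the `query` callable are hypothetical; "retry 3 times" is read here as one initial call plus up to three retries, which is an assumption about the intent.

```python
import time


def call_with_retries(query, retries=3, timeout_s=10.0, pause_s=0.0):
    """Call query(timeout_s); on TimeoutError, retry up to `retries` times.

    `query` is a hypothetical callable that raises TimeoutError when the
    backend does not answer within timeout_s seconds.
    """
    last_error = None
    for _ in range(retries + 1):  # first attempt plus the retries
        try:
            return query(timeout_s)
        except TimeoutError as err:
            last_error = err
            time.sleep(pause_s)  # optional pause before the next attempt
    raise last_error
```

Only timeouts are retried; any other exception propagates immediately, so a genuine query error is not repeated against a node that is up.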
[17:04:01] 10Analytics, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10Ottomata) a:03razzi [17:04:12] 10Analytics-Clusters, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10Ottomata) [17:04:23] 10Analytics: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (10Milimetric) p:05Triage→03High [17:04:37] 10Analytics-Clusters, 10Product-Analytics: Configure superset cache - https://phabricator.wikimedia.org/T268784 (10Ottomata) p:05Triage→03Medium [17:05:08] 10Analytics, 10Analytics-Kanban: Make druid mediawiki-history-reduced segments smaller - https://phabricator.wikimedia.org/T268813 (10fdans) p:05Triage→03High [17:05:24] 10Analytics-Clusters, 10Operations: Backport kafkacat 1.6.0 from bullseye to buster-backports or buster-wikimedia - https://phabricator.wikimedia.org/T268936 (10Ottomata) [17:05:43] 10Analytics-Clusters, 10Operations, 10ops-eqiad: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (10Ottomata) [17:06:01] 10Analytics-Clusters: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (10Ottomata) p:05Triage→03Medium a:03elukey [17:06:25] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Refactor puppet profiles to reduce hiera pollution - https://phabricator.wikimedia.org/T268220 (10Ottomata) [17:28:49] MEH - I just realized I messed up some naming [17:29:32] (03PS1) 10Joal: Fix names of updated mediawiki-history-load jobs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/644293 [17:29:33] I'm gonna deploy again :( [17:29:52] (03CR) 10Joal: [V: 03+2 C: 03+2] "Hotfix - merging for deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/644293 (owner: 10Joal) [17:32:11] !log Deploy refinery using scap for naming hotfix [17:32:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log 
[17:32:45] !log Kill-restart mediawiki-history-reduced job for druid-public datasource number of shards update [17:32:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:47:01] joal: out of band deployments are 5 euros in the jar each, remember it :P [17:47:51] * elukey runs away [17:47:56] * joal owes beers to the ops team even without that :) [17:49:01] !log Kill-restart mediawiki-history-load job after refactor (1 coordinator per table) and tables addition [17:49:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:51:05] 10Analytics, 10ChangeProp, 10Event-Platform, 10MassMessage, 10WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (10Ottomata) [17:51:24] 10Analytics-Radar, 10ChangeProp, 10Event-Platform, 10MassMessage, 10WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (10Ottomata) [17:51:59] !log Deploy refinery onto hdfs [17:52:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:53:54] 10Analytics-Radar, 10ChangeProp, 10Event-Platform, 10MassMessage, 10WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (10Pchelolo) We could add a method to JobQueue checking whether reporting statistic like this is supported an... [18:03:26] * razzi afk for lunch [18:10:35] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) @ayounsi Regarding netflow data size in Druid: We Analytics took a look at the data size after adding the new field... [18:16:50] * elukey afk! 
[19:15:18] 10Analytics: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (10Ottomata) Hm, this looks to be related to or caused by {T222603}. The failed months timed out waiting for a dependency 2 months and 19 days after the month in question. See: https://hue.wikimedia.org/oozie... [19:17:09] 10Analytics: Improve Refine - https://phabricator.wikimedia.org/T266872 (10Ottomata) a:03Ottomata [19:30:05] 10Analytics, 10Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (10Ottomata) This looks like a problem with Firefox, it works ok in Chrome. [19:31:12] 10Analytics: Address refinery security vulnerabilities with jackson and netty - https://phabricator.wikimedia.org/T237774 (10Ottomata) I don't think this should be ops week, right? This is a regular task that should be scheduled, no? Moving back to incoming. [19:32:34] 10Analytics: Add folder creation for sqoop initial installation in puppet - https://phabricator.wikimedia.org/T251788 (10Ottomata) a:03razzi So, the sqoop puppetization should just ensure that this directory exists? [19:41:05] 10Analytics, 10Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (10SerDIDG) >>! In T255840#6656905, @Ottomata wrote: > This looks like a problem with Firefox, it works ok in Chrome. Chrome (87.0.4280.67) and Safari (14.0.1) just does not show broken icons... [19:51:51] 10Analytics: mediawiki-wikitext-history-2020-10 failed - https://phabricator.wikimedia.org/T269032 (10Ottomata) [19:52:00] ottomata: I don't manage to make blubber work, maybe it's because it's for debian stretch, and I use ubuntu 18 which is based on buster...? 
[19:52:36] when I run it it says it doesn't support v4 config
[19:53:21] when I run `blubber -v` it returns `+`
[19:53:27] which seems strange
[19:54:16] oh, I'll try to install from source
[20:01:28] ok, got it to work
[20:19:21] hmm weird
[20:19:24] i use on on mac
[20:19:27] ok
[20:20:52] mforns: lemme see if i can help
[20:23:51] yeah that seems like an rdkafka compatibility problem
[20:23:54] lemme bang on it
[20:26:01] Analytics, Operations, SRE-Access-Requests: Requesting access to production shell groups for JAnstee - https://phabricator.wikimedia.org/T266249 (JAnstee_WMF) Excellent, @Dzahn -- thank you - I will reach out if I need further support!
[20:27:34] mforns: i'm going to change the version of node-rdkafka to the one we use for eventgate
[20:27:49] that seems to work for me
[20:31:14] mforns: patch passes with version ~2.4.2
[20:31:18] ok if I merge?
[20:31:21] ...and deploy?! :o
[20:37:19] i'm merging because I have a moment to do so!
[20:37:21] hope that's ok!
[21:13:03] ottomata: hey! sorry was having dinner
[21:13:19] ottomata: yes, sure, can merge on my side
[21:13:34] ottomata: did you deploy?
[21:14:16] ottomata: and btw, thanks for solving the issue
[21:16:23] mforns: yes, but i realized we need a build step to build the ui/dist folder
[21:16:26] trying to figure that out now
[21:16:50] oh yes, just `npm run build`
[21:17:17] hm, should we commit that?
[21:17:48] well, i need to run it from the parent as part of or after npm install
[21:17:55] since there is a separate package.json
[21:17:59] aha
[21:18:07] so i'm trying to make it a postinstall script
[21:18:15] it works locally, but in docker its being weird
[21:18:19] trying to figure it out now
[21:26:37] mforns: i dunno...
[21:26:47] hm
[21:26:47] it might be easier to just get rid of the extra sub package
[21:26:54] we can keep the ui dir
[21:27:02] but just merge its package.json into the parent one
[21:27:12] not sure
[21:27:14] hmm lemme try
[21:27:17] maybe that is not my problem
[21:27:27] but we'd still have to build
[21:27:43] yes, i'm having an issue running npm for npm in another package
[21:27:46] or something
[21:27:51] it might be a blubber issue...
[21:27:54] ok
[21:28:03] hmm i think it might be blubber/docker
[21:28:05] not sure what is going on
[21:28:08] let me try it and see
[21:28:15] let me think if there's any problem with merging both package.jsons
[21:29:31] mm, I think it should work
[21:29:47] it has few dependencies
[21:29:57] I don't think there's any comflict
[21:30:53] *conflict
[21:34:52] * mforns ottomata: https://github.com/lerna/lerna
[21:40:45] aye yai yai
[21:41:24] mforns: can i use vue-cli-service to build a target dir?
[21:41:27] trying to do e.g.
[21:41:33] vue-cli-service --dest ui/dist ./ui
[21:41:35] or something
[21:41:39] but not succeeding
[21:41:53] I see
[21:42:04] lookin
[21:46:30] oh maybe vue-cli-service build --dest ui/dist ./ui/src/main.js
[21:46:31] trying
[21:47:19] I read you can specify the target directory, but can't find the way to specify the project root directory
[21:49:57] yeahhHhH i dunno, this seems like the wrong approach anyway, i was trying it to see if it was a problem with blubber or not
[21:49:59] but i'm pretty sure it is
[21:50:02] i should be able to do
[21:50:08] npm --prefix ./ui install && npm --prefix ./ui run build
[21:50:49] aha
[21:51:15] but it doesn't work
[21:51:22] as a postinstall
[21:51:23] yes, I was wondering about the index.html file if you did `vue-cli-service build --dest ui/dist ./ui/src/main.js`
[21:51:32] that fails ^ somehow
[21:51:33] what's the error
[21:51:35] ?
[21:51:40] lots
[21:51:40] but
[21:51:41] * @/apis/EventStreamsApi.js in ./node_modules/cache-loader/dist/cjs.js??ref--12-0!./node_modules/thread-loader/dist/cjs.js!./node_modules/babel-loader/lib!./node_modules/cache-loader/dist/cjs.js??ref--0-0!./node_modules/vue-loader/lib??vue-loader-options!./ui/src/views/Home.vue?vue&type=script&lang=js&
[21:51:55] oh ^ that's for the vue-cli-service build trick
[21:52:06] for just using postinstall
[21:52:09] it's missing the ui dir
[21:52:13] after the first npm install
[21:52:15] OHHHH
[21:52:16] i know why
[21:52:17] yes
[21:52:32] the dockerfile does npm install with just package.json, before it copies the local src code into the image
[21:52:32] O.o
[21:52:46] aaah
[21:56:01] * mforns ottomata: https://cloudnweb.dev/2019/10/crafting-multi-stage-builds-with-docker-in-node-js/
[21:57:20] cloudnweb.dev refused to connect.
[22:07:56] ?
[22:14:50] Analytics, Analytics-Kanban, Growth-Team, Product-Analytics, Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (nettrom_WMF) >>! In T267333#6642129, @Ottomata wrote: > Hiya @nettrom_WMF, we'd like to migrate these schemas durin...
[22:18:19] yeah mforns this is a blubber issue
[22:18:25] the dockerfile it's building only copies package.json in
[22:18:35] i don't know how or when it gets the actual source directories and files
[22:18:40] been trying to make it do it
[22:18:44] but i'll have to pick it up tomorrow
[22:18:50] ok
[22:19:34] I feel a bit useless here, no big experience in npm or docker, zero in blubber
[23:06:46] mforns: i think i have to switch to a new pipeline format
[23:06:51] https://wikitech.wikimedia.org/wiki/PipelineLib
[23:06:55] got some advice from releng
[23:06:59] i'll figure it out thank you!
[23:26:09] Analytics-Radar, Discovery: Display automata and humans separately on zero results rate graph - https://phabricator.wikimedia.org/T112846 (Deskana) Open→Declined Per the notice on https://discovery.wmflabs.org/ these dashboards are now in maintenance mode. This feature request is no longer relevant.