[05:55:07] Analytics, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (elukey) @leila Usually if there are free slots for extra ram banks or if we can replace the existing ones with bigger banks it may work, but 1-2 TB of RAM is not used even for the most powerful Databas...
[05:56:36] good morning
[09:24:39] So, this report updater thingy...
[09:25:30] ... ? :D
[09:30:01] https://www.logicalclocks.com/hopsworks
[09:30:11] I find the naming something great
[09:30:26] hortonworks gone, let's go for hopsworks
[09:34:10] very interesting feature store though
[09:40:50] Analytics, Analytics-Kanban, Analytics-Wikistats: Combine filters and splits on wikistats UI - https://phabricator.wikimedia.org/T249758 (fdans) Mocks created: {F31870212} {F31870211}
[09:48:18] * elukey bbiab
[10:15:40] Analytics, Analytics-Kanban, Analytics-Wikistats: Refactor breakdowns so they allow more than one dimension to be active - https://phabricator.wikimedia.org/T255757 (fdans)
[10:18:22] Analytics, Analytics-Kanban, Analytics-Wikistats: Combine filters and splits on wikistats UI - https://phabricator.wikimedia.org/T249758 (fdans) One thing to consider is that this is not the final form and feedback is appreciated. @Milimetric gave two pieces of feedback that aren't fully implemented...
[10:19:00] elukey: bibimbap? https://en.wikipedia.org/wiki/Bibimbap
[10:19:06] god I'm so hungry
[10:23:24] fdans: awwwwwww
[10:23:29] so, reportupdater, for a single wiki, I want to do some SQL queries, and pull some metrics out and send them to graphite, I should be able to do that right?
[10:24:00] so hungry now
[10:24:40] addshore: I suggest to follow up with mforns
[10:24:46] * elukey lunch!
[10:24:53] okay! what timezone? :D
[10:25:03] This afternoon :S
[10:25:06] err :D
[10:25:07] cool! np :D
[10:25:13] this afternoon eu time? :P
[10:25:15] xD
[10:25:16] ahhaha yes
[11:46:05] Hi team
[12:07:54] Analytics, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (JAllemandou) @elukey beat me to comment (as usual :) Just as a comparison, the hadoop cluster has 4Tb total over 54 nodes. Machines with more than 256G or 512G (which is already a lot!) are usually sm...
[12:32:18] Analytics, Analytics-Cluster: Upgrade schema[12]00[12] to Debian Buster - https://phabricator.wikimedia.org/T255026 (elukey) @Ottomata this should be easy to do right? Or is there any special consideration to make to avoid fireworks? :D
[12:32:37] Analytics, Analytics-Kanban, Patch-For-Review: Move the Analytics infrastructure to Debian Buster - https://phabricator.wikimedia.org/T234629 (elukey)
[12:54:55] (CR) Joal: Make ActorSignatureGenerator non-singleton (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606162 (https://phabricator.wikimedia.org/T255660) (owner: Joal)
[13:04:18] Analytics, Analytics-Cluster: Upgrade schema[12]00[12] to Debian Buster - https://phabricator.wikimedia.org/T255026 (Ottomata) Should be easy peasy, there's nothing special or fancy here, and the hosts are HA LB-ed, so you should be able to just depool, reinstall, puppetize, repool.
[13:08:32] Analytics, AbuseFilter, Cognate, ConfirmEdit (CAPTCHA extension), and 28 others: Replace PageContent(Insert|Save)Complete hooks - https://phabricator.wikimedia.org/T250566 (DannyS712)
[13:25:07] Analytics, Analytics-Cluster, Operations, ops-eqiad, User-Elukey: replace onboard NIC in kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T236327 (Jclark-ctr) @elukey Sounds good. I will be taking a vacation in August so July would be best
[13:31:43] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (Ottomata) Bump
[13:51:09] (PS4) Joal: Add pageview_actor_hourly table and oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467)
[14:00:09] (PS5) Joal: Add pageview_actor_hourly table and oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467)
[14:06:45] elukey: I've hit again the heisenbug!?!
[14:06:49] * joal is afraid
[14:13:38] Analytics, Analytics-Wikistats: Trends for editor types, and new editors in particular (in Wikistats 2.0) - https://phabricator.wikimedia.org/T186791 (jeblad) Two years later it does not seem likely that this will be implemented.
[14:23:13] (CR) Joal: Update unique-devices jobs to use pageview_actor (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/606233 (https://phabricator.wikimedia.org/T250744) (owner: Joal)
[14:23:47] (PS2) Joal: Update unique-devices jobs to use pageview_actor_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/606233 (https://phabricator.wikimedia.org/T250744)
[14:27:05] joal: can you give me the link to the error?
[14:27:44] elukey: https://hue.wikimedia.org/oozie/list_oozie_workflow/0002577-200616151022463-oozie-oozi-W/?coordinator_job_id=0002576-200616151022463-oozie-oozi-C
[14:28:26] elukey: as I said, I'm afraid: plenty instances worked great - two random ones failed
[14:29:10] :(
[14:29:29] maybe let's open a task
[14:29:50] elukey: I also corrected a bug in the patch since the failure - could be related to the bug I corrected, even if error is unrelated (not nice either)
[14:36:07] Analytics, Analytics-Kanban: Update clickstream and interlanguage jobs to use `pageview_actor_hourly` table instread of webrequest - https://phabricator.wikimedia.org/T255779 (JAllemandou)
[14:36:24] Analytics, Analytics-Kanban: Update clickstream and interlanguage jobs to use `pageview_actor_hourly` table instread of webrequest - https://phabricator.wikimedia.org/T255779 (JAllemandou) a:JAllemandou
[14:45:16] (PS1) Joal: Update clickstream to read from pageview_actor_hourly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606443 (https://phabricator.wikimedia.org/T255779)
[14:56:52] (PS1) Joal: Update clickstream and interlanguage jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/606449
[14:57:01] (CR) Nuria: [C: +2] Update clickstream to read from pageview_actor_hourly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606443 (https://phabricator.wikimedia.org/T255779) (owner: Joal)
[14:58:29] Analytics, Analytics-Wikistats: Trends for editor types, and new editors in particular (in Wikistats 2.0) - https://phabricator.wikimedia.org/T186791 (Nuria) >Two years later it does not seem likely that rolling averages will be implemented. That's correct, the work we expect to tackle in the mid term...
[15:01:50] ottomata: heya :)
[15:03:19] (Merged) jenkins-bot: Update clickstream to read from pageview_actor_hourly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606443 (https://phabricator.wikimedia.org/T255779) (owner: Joal)
[15:33:41] Analytics, Analytics-Wikistats: Annotations in wikistats2 can't be split on project and language - https://phabricator.wikimedia.org/T208665 (jeblad) Not sure, but has anyone done anything on this?
[15:40:06] Analytics, Analytics-Wikistats: Annotations in wikistats2 can't be split on project and language - https://phabricator.wikimedia.org/T208665 (fdans) @jeblad hi! Unfortunately right now we have a pretty big backlog and given the relatively low usage of annotations I don't think we'll be able to dedicate t...
[15:43:07] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (ema) >>! In T242767#6207395, @Ottomata wrote: > Hm, I'm pretty sure the connection is terminated even whe...
[15:49:04] Analytics-Radar, AbuseFilter, Cognate, ConfirmEdit (CAPTCHA extension), and 28 others: Replace PageContent(Insert|Save)Complete hooks - https://phabricator.wikimedia.org/T250566 (fdans)
[15:49:47] Analytics-Radar, Growth-Team (Current Sprint), Product-Analytics (Kanban): Newcomer tasks: update schema whitelist for Guidance - https://phabricator.wikimedia.org/T255501 (fdans)
[15:55:48] Analytics, Cloud-VPS, Puppet: Puppet failing on wikistats.analytics.eqiad.wmflabs: /usr/local/sbin/x509-bundle error - https://phabricator.wikimedia.org/T255464 (fdans) a:elukey For this site, the puppet configuration needs to skip TLS deployment.
[15:56:15] * addshore waits for mforns to ask questions about reportupdater to :)
[15:56:31] Analytics, Analytics-Kanban, Cloud-VPS, Puppet: Puppet failing on wikistats.analytics.eqiad.wmflabs: /usr/local/sbin/x509-bundle error - https://phabricator.wikimedia.org/T255464 (fdans) a:elukey→fdans
[15:56:40] addshore: Marcel sent an email that he'll be out for the next couple of days :(
[15:56:43] Ah addshore - I completely forgot - mforns sent an email saying he'll be oof this end of week :S
[15:56:46] bis
[15:56:51] :D
[15:56:56] always later than elukey - I should nopw
[15:57:03] okay! :P
[15:57:05] Analytics, Analytics-Kanban, Patch-For-Review: Create intermediate dataset: pageview with actor information - https://phabricator.wikimedia.org/T255467 (fdans) p:Triage→High
[15:57:25] * addshore will leave looking at that ticket until next week
[15:57:51] addshore: we send some wikilove
[15:58:10] (to you) :D
[15:58:28] Analytics, Analytics-Kanban: Update skewed-join strategy in Mediawiki-history to prevent errors in case of task-retry - https://phabricator.wikimedia.org/T255548 (fdans) p:Triage→High
[16:01:18] Analytics-EventLogging, Analytics-Radar, NewcomerTasks 1.2, Product-Analytics, and 2 others: NewcomerTask EventLogging schema has invalid array items type specification - https://phabricator.wikimedia.org/T255597 (fdans) @Tgr can you confirm the correct data is there?
[16:07:24] Analytics, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (Nuria) Other things we talked about: - can't this use case benefit from usage of GPU resources (cc @diego ) - could we work on these models running distributed on hadoop?
[16:09:08] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (fdans)
[16:13:34] Analytics, Product-Analytics: Remove COUNT(*) from datasets when not useful in Superset & Turnilo - https://phabricator.wikimedia.org/T255725 (fdans) we're not sure this can be removed, let's look into it
[16:14:18] Analytics, Analytics-Kanban, Analytics-Wikistats: Refactor breakdowns so they allow more than one dimension to be active - https://phabricator.wikimedia.org/T255757 (fdans) p:Triage→High
[16:22:11] wow a-team, really too bad we aren't using kafka connect
[16:22:18] people are forking my connector and improving it
[16:22:20] https://github.com/ottomata/kafka-connect-jsonschema/network/members
[16:22:22] also
[16:22:31] https://madhead.me/posts/kafka-connect-wikipedia/
[16:32:40] (CR) Nuria: [C: -1] "Per conversation and after looking at: https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/ql/src/java/org/apache" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606162 (https://phabricator.wikimedia.org/T255660) (owner: Joal)
[16:32:49] (CR) Nuria: [C: +1] Make ActorSignatureGenerator non-singleton [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606162 (https://phabricator.wikimedia.org/T255660) (owner: Joal)
[16:34:59] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (Ottomata) Just parking a crazy idea I just had, mostly irrelevant to this ticket. > Large downloads are v...
[16:35:08] ottomata: :(
[16:35:23] elukey: ?
[16:40:22] the kafka connect stuff
[16:41:15] (PS3) Joal: Add a corrected bzip2 codec for spark [analytics/refinery/source] - https://gerrit.wikimedia.org/r/603590 (https://phabricator.wikimedia.org/T243241)
[16:42:37] just finished the procedure to upgrade hdfs in test, all good
[16:42:43] will not finalize, and rollback tomorrow
[16:42:57] I did everything with cumin, so I think I can make it a cookbook
[16:43:10] (with some manual intervention)
[16:45:10] \o/ this is awesome elukey :)
[16:49:25] nuria: there is more variability on per-project family when we apply the change - variation is bigger both ways (some removal, some addition)
[16:50:19] nuria: I think the addition comes mostly from the gain in precision due to the hashes
[16:50:30] nuria: and the removal from the bots
[16:50:52] nuria: I have numbers in a notebook if you want to look at them - Otherwise could works :)
[16:51:38] (CR) Joal: [V: +2] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/606233 (https://phabricator.wikimedia.org/T250744) (owner: Joal)
[16:52:11] (CR) Joal: [V: +2] "Re-tested on cluster after bugfix" [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467) (owner: Joal)
[16:52:26] oh oh :P
[16:52:35] ?
[16:54:03] Gone for dinner - back after
[17:32:58] * elukey off!
[17:36:08] (CR) Nuria: [C: +1] "Looks good." [analytics/refinery] - https://gerrit.wikimedia.org/r/606233 (https://phabricator.wikimedia.org/T250744) (owner: Joal)
[17:40:43] (CR) Nuria: Add pageview_actor_hourly table and oozie job (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467) (owner: Joal)
[18:23:28] (CR) Joal: [V: +2] Add pageview_actor_hourly table and oozie job (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467) (owner: Joal)
[18:26:02] (PS6) Joal: Add pageview_actor_hourly table and oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467)
[18:26:23] joal: before you sign off today i wanna talk to you about null $schema in refine, i know why it happens, still thinking for a bit tho...
[18:26:41] Heya ottomata - wanna talk now?
[18:27:08] if you are about to leave sure! otherwise gimme 10ish minutes to wrap my brain around something
[18:27:28] I'll be there for ~half an hour - can wait 10 minutes :)
[18:28:52] k
[18:28:53] (PS3) Joal: Update unique-devices jobs to use pageview_actor_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/606233 (https://phabricator.wikimedia.org/T250744)
[18:30:20] (PS2) Joal: Update clickstream and interlanguage jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/606449
[18:31:21] ok joal ready
[18:31:28] joining
[19:07:16] Gone for tonight
[19:11:08] a-team: thanks for your comments on T255716. I'm preparing a response for you. Please bear with me.
[19:11:08] T255716: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716
[19:25:02] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (leila) Thanks for chiming in, all. I'm coordinating the ask on behalf of my team to make sure I'm capturing the needs of all the folks on the team. I'm going to not ping specific...
[19:33:21] Analytics: Refine drops $schema field values - https://phabricator.wikimedia.org/T255818 (Ottomata)
[22:34:59] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Ottomata) Ah! Everything looked great on group0 today so I proceeded with group1 and th...