[05:13:29] (PS1) Zhuyifei1999: Make surrogateescape the global default error handler for Unicode issues [analytics/quarry/web] - https://gerrit.wikimedia.org/r/472377
[05:14:43] (CR) Zhuyifei1999: "@Framawiki, do you have time to test this and make sure things aren't badly broken?" [analytics/quarry/web] - https://gerrit.wikimedia.org/r/472377 (owner: Zhuyifei1999)
[05:29:25] Analytics, Analytics-Kanban, New-Readers: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (Dzahn) >>! In T202592#4730348, @Nuria wrote: > Not sure if site is active for users yet. Technically it's live. Nothing should hold users back besides.. knowing the URL exists.
[09:39:26] Analytics, Analytics-Kanban, User-Elukey: Return to real time banner impressions in Druid - https://phabricator.wikimedia.org/T203669 (elukey) @AndyRussG @Jseddon Hi! So I have something to show to you in: https://turnilo.wikimedia.org/#event_centralnoticeimpression We have a tool called Eventloggin...
[09:51:33] Analytics, Analytics-Kanban, User-Elukey: Update to cloudera 5.15 - https://phabricator.wikimedia.org/T204759 (elukey) Upgrade scheduled for Nov 12 14:00 CEST :)
[09:53:17] elukey: Good morning :) Would now being a good testing time for my hive-parquet patch?
[09:55:13] Analytics, Analytics-Kanban, Patch-For-Review: Drop old mediawiki_history_reduced snapshots - https://phabricator.wikimedia.org/T197888 (fdans) Right now there are 3 snapshots, so it'll take another 4 months for this script to actually delete snapshots as per our mediawiki snapshot deletion rules (by...
[09:59:14] joal: morning! sure
[09:59:28] (sorry I was updating tasks)
[10:00:16] IIUC you tested it and it is ready for prime time no?
[10:00:41] Analytics, Cloud-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (JAllemandou)
[10:00:48] correct elukey :)
[10:00:57] elukey: I also subscribed you to --^
[10:01:50] elukey: about hive-parquet, do you prefer a manual test on stat1004, or do we merge?
[10:02:03] I feel comfortable merging, but you decide :)
[10:02:48] joal: if you have tested it I'd merge it
[10:02:57] let's go :)
[10:06:11] Analytics, Cloud-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (JAllemandou)
[10:06:37] Analytics, Cloud-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (JAllemandou)
[10:06:56] joal: deployed on stat1004 :)
[10:07:05] Checking elukey :)
[10:16:08] elukey: I think we're all god :)
[10:16:20] s/god/good/
[10:16:22] :)
[10:17:41] niceeee
[10:18:59] cdh upgrade scheduled for monday
[10:40:37] (PS1) Fdans: Replace references to the Report Card with Wikistats 2 [analytics/wikistats] - https://gerrit.wikimedia.org/r/472407 (https://phabricator.wikimedia.org/T203128)
[10:41:27] (PS2) Fdans: Replace references to the Report Card with Wikistats 2 [analytics/wikistats] - https://gerrit.wikimedia.org/r/472407 (https://phabricator.wikimedia.org/T203128)
[10:59:34] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (fdans) If it's ok I'll create subtasks to this one so that the owners of the schemas can respond there and choose which field to keep.
[11:12:50] joal ani chance you got 2 min on the batcave? I have a couple doubts related to PII
[11:15:13] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Update to cloudera 5.15 - https://phabricator.wikimedia.org/T204759 (elukey) Reference about how to upgrade reprepro: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration#Updating_Cloudera_Packages
[12:32:44] elukey, hiiii
[12:34:27] elukey, is it OK if we shift our meeting to 16:30h?
[12:35:15] the other day you said it was better for you a bit later, but I didn't get if it was just that day or in general
[12:35:32] if in general, I think it will be better for me to make it later
[12:44:50] mforns: fine to me!
[12:50:30] Analytics, Analytics-Kanban: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (fdans) p:Triage>High
[12:50:38] Analytics, Analytics-Kanban: Print schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209050 (fdans) p:Triage>High
[12:50:47] Analytics, Analytics-Kanban: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (fdans) p:Triage>High
[12:51:45] Hey fdans
[12:51:48] I have time ;0
[12:52:13] joal: sorryyy I always ping you when I should know you're out :)
[12:52:27] joal: mforns helped me already so no need, thank you!
[12:52:49] fdans: unfortunately I'm not always out at the same time of days :)
[12:56:14] Analytics: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (fdans)
[12:56:20] Analytics: Print schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209050 (fdans)
[12:56:42] Analytics: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (fdans)
[14:00:26] Analytics, Cloud-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (JAllemandou) After a chat in #wikimedia-cloud chan on IRC, here is what we found with @Krenair : The `comment` view is defined through linking to all parent tables (the t...
[14:29:09] Analytics, EventBus, Growth-Team, MediaWiki-Watchlist, and 6 others: Clear watchlist on enwiki only removes 50 items at a time - https://phabricator.wikimedia.org/T207329 (CCicalese_WMF)
[14:50:38] Analytics, Data-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (Krenair) I'm wondering if MySQL's query plan for this (full table scan) is as efficient as it could be. That said I doubt it's something that could realistically be change...
[14:53:33] Analytics, Data-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (Krenair) By the way, the query discussed on IRC was `SELECT MIN(comment_id), MAX(comment_id) FROM (select comment_id, convert(comment_text using utf8) as comment_text from...
[15:46:56] Analytics, Data-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (Nuria) @Krenair: we are looking how to best import the public dataset from labs, we have already looked into scooping data from the non public data hosts and the sanitizat...
[15:47:20] joal: hello, should we tag the DBas on this ticket? https://phabricator.wikimedia.org/T209031
[15:50:26] Analytics, EventBus, Core Platform Team (Security, stability, performance and scalability (TEC1)), Core Platform Team Backlog (Later), Services (later): revision-create events are sometimes emitted in a secondary DC - https://phabricator.wikimedia.org/T207994 (CCicalese_WMF)
[15:51:03] fdans: hello, please do cc tilman in all tickets with sessionId/pageid as he can follow up with pms
[15:51:42] Analytics, DBA, Data-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (Nuria)
[15:55:55] Analytics, Analytics-Kanban, New-Readers: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (Nuria) Sounds fine, traffic is just real small, < 10 users per day.
[15:59:01] (CR) Nuria: [C: 1] Replace references to the Report Card with Wikistats 2 [analytics/wikistats] - https://gerrit.wikimedia.org/r/472407 (https://phabricator.wikimedia.org/T203128) (owner: Fdans)
[16:03:07] Analytics, DBA, Data-Services: Add index to `comment_id` field in `comment` table (all wikis) - https://phabricator.wikimedia.org/T209031 (Krenair) >>! In T209031#4732318, @Nuria wrote: > @Krenair: we are looking how to best import the public dataset from labs, we have already looked into scooping da...
[16:09:02] Analytics: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (Jdlrobson) This schema is marked as inactive. We're not currently using it.. what am I missing? cc @Tbayer
[16:19:39] nuria: on it
[16:26:37] Analytics: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (fdans) @Jdlrobson there's data from this schema that is persisted beyond the 90 day window, which contains the fields above mentioned. Even if the schema is no longer active, we...
[16:46:51] Analytics: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (Jdlrobson) Got it. I guess @tbayer is the person to talk to here.
[16:53:58] nuria: we can flag DBAs on the ticket - I'm not sure however how they'll help :(
[16:58:00] joal: wait, aren't they the ones that can create the indexes? Well nvm we can talk about it at standup
[16:58:45] nuria: bstorm in cloud chan was talking about creating indices - not sure who's who though- let's talk at standup :)
[17:00:57] ping joal
[17:19:33] Analytics: Print schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209050 (bmansurov) Sorry, @fdans, I won't be able to help. I no longer maintain that schema. I'll update the wiki page.
[17:59:57] (PS1) Joal: Prevent comment loss on sqooped logging table [analytics/refinery] - https://gerrit.wikimedia.org/r/472506
[18:00:44] mforns, fdans, nuria --^ please :)
[18:00:56] No need to merge - +1 for validity
[18:04:09] (CR) Nuria: [C: -1] Prevent comment loss on sqooped logging table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:07:42] (CR) Joal: Prevent comment loss on sqooped logging table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:08:01] joal: maybe i am totally off but don't column names need to change?
[18:08:18] I don't think so nuria, why?
[18:08:26] Analytics, Analytics-Kanban, New-Readers: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (atgo) That traffic is probably mostly me :P We aren't promoting the page yet. That'll go live next week (likely the 14th). @Prtksxna is on leave this week returning next, so I'm sure I'...
[18:09:13] joal: but don't we want to get also the comment_text from comment table?
[18:09:18] joal: ah sorry , i see it below
[18:09:30] joal: BOTH fields exists , totally my mistake
[18:09:45] joal: i thought we were NOT using the older column at all
[18:09:48] nuria: we use old comment when it has a value, comment from new table when no comment
[18:09:53] joal: k
[18:10:02] (CR) Nuria: [C: 1] Prevent comment loss on sqooped logging table [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:10:08] well, I have seen values in old comment field :S
[18:10:25] Ok trying to sqoop with that patch
[18:14:00] Analytics, EventBus, user-Clarakosi, MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), and 3 others: Convert all hooks to EventFactory - https://phabricator.wikimedia.org/T204575 (Clarakosi)
[18:14:13] Analytics, EventBus, user-Clarakosi, Services (later), goodfirstbug: EventBus should not use service container in application logic - https://phabricator.wikimedia.org/T204296 (Clarakosi)
[18:14:23] Analytics, EventBus, user-Clarakosi, Services (later), goodfirstbug: EventBus should make better use of DI - https://phabricator.wikimedia.org/T204295 (Clarakosi)
[18:15:36] what? I just noticed irc kicked me out hours ago! sorry teammm..
[18:17:00] mforns, nuria: can you folks double check this please: https://gist.github.com/jobar/abe0aa79f7a1e02853da82430743ef30
[18:17:07] sure
[18:17:19] wow actually one error - destination path
[18:17:34] joal: yes
[18:17:56] joal: destination path will override what we already have scooped
[18:18:07] Changing for my folder
[18:18:39] joal: where is command loging to?
[18:18:59] last line, I'll run onto an-coord1001
[18:19:15] (PS1) Milimetric: [HOTFIX] [do not merge] add logging_with_comment [analytics/refinery] - https://gerrit.wikimedia.org/r/472508
[18:20:06] joal: you are missing -o /var/log/refinery/sqoop-mediawiki.log
[18:20:07] joal, -s snapshot is like a prefix?
[18:20:13] joal: right?
[18:20:17] correct
[18:20:27] to have a different prefix for monthly cu_changes
[18:20:29] (CR) Fdans: [C: 1] Prevent comment loss on sqooped logging table [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:21:02] Thanks fdans ;)
[18:21:24] nuria: last line is -o o?
[18:21:27] no?
[18:21:57] joal: ah i see it now, my reader was cutting it
[18:22:05] (PS2) Milimetric: [HOTFIX] [do not merge] add logging_with_comment [analytics/refinery] - https://gerrit.wikimedia.org/r/472508
[18:24:14] nuria, mforns: starting the command?
[18:24:27] joal, makes sense to me, the only comment...
[18:24:48] the query is preferring to keep existing log_comment, if it is available
[18:25:07] a-team: going to dinner, ping me if you need me! (felt pretty useless.. BUT I may have found the memcached key that I was looking for! - https://phabricator.wikimedia.org/T203786)
[18:25:22] Enjoy diner elukey :)
[18:25:24] are all present log_comments complete and correct? or might it be better to default the other way round?
[18:25:26] Thanks for listening
[18:25:54] (CR) Nuria: [C: 1] [HOTFIX] [do not merge] add logging_with_comment [analytics/refinery] - https://gerrit.wikimedia.org/r/472508 (owner: Milimetric)
[18:26:21] (CR) Mforns: [C: 1] Prevent comment loss on sqooped logging table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:26:27] mforns: Good idea :) Patching
[18:28:38] (PS2) Joal: Prevent comment loss on sqooped logging table [analytics/refinery] - https://gerrit.wikimedia.org/r/472506
[18:28:52] mforns: --^
[18:28:59] lookin
[18:29:42] (CR) Mforns: [C: 1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/472506 (owner: Joal)
[18:29:55] joal, LGTM
[18:30:23] ok, starting the sqooping
[18:37:07] (CR) Mforns: [C: 1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/472508 (owner: Milimetric)
[18:47:49] (CR) Amire80: Add scheduling for Content Translation MT engine data (2 comments) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/469390 (https://phabricator.wikimedia.org/T207765) (owner: Amire80)
[18:48:12] (PS3) Amire80: Add scheduling for Content Translation MT engine data [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/469390 (https://phabricator.wikimedia.org/T207765)
[18:51:57] Analytics, Product-Analytics, Reading-analysis: [EventLogging Sanitization] Update EL sanitization whit-elist for field renames in EL schemas - https://phabricator.wikimedia.org/T209087 (mforns)
[18:52:47] joal, milimetric have meeting in 10 minutes, will check after how are we doing.
[18:53:20] I'm still struggling with very bizarre sqoop errors, but joseph got the labs sqoop started
[18:53:33] milimetric: want to bc, i ahve 10 mins
[18:53:37] ?
[18:53:38] sure
[19:29:35] (PS3) Milimetric: [HOTFIX] [do not merge] add logging_with_comment [analytics/refinery] - https://gerrit.wikimedia.org/r/472508
[19:33:31] milimetric: back!
[19:33:37] milimetric: bc?
[19:33:57] omw
[19:34:27] nuria: in bc
[20:17:15] Analytics: unique devices monthly should be configured - https://phabricator.wikimedia.org/T209103 (Nuria)
[20:24:52] Analytics: unique devices monthly should be configured with default "monthly" granularity in turnilo - https://phabricator.wikimedia.org/T209103 (Nuria)
[20:58:38] bye teaaam
[21:02:39] leaving for tonight as well - stuff seems going gently
[21:02:48] see you tomorrow team
[22:04:27] (PS4) Milimetric: [HOTFIX] [do not merge] add logging_with_comment [analytics/refinery] - https://gerrit.wikimedia.org/r/472508
[22:05:43] (CR) Milimetric: "Joseph / Nuria: the hql file I just pushed here is the select that would join the privately sqooped file along with the currently sqooped " [analytics/refinery] - https://gerrit.wikimedia.org/r/472508 (owner: Milimetric)
[22:10:41] Quarry: Create a beta host - https://phabricator.wikimedia.org/T209119 (Framawiki)
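
Note on the 18:09 to 18:28 exchange: the discussion there is about which comment source wins when a `logging` row has both an inline `log_comment` and a row in the `comment` table, and whether to "default the other way round". The log does not show the query from change 472506, so the following is only a minimal sketch of the two orderings, assuming a join on `log_comment_id` = `comment_id`, an empty string meaning "no inline comment", and made-up sample rows. It uses an in-memory SQLite database purely for illustration; the helper names `build_demo_db` and `resolve_comments` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE comment (
            comment_id   INTEGER PRIMARY KEY,
            comment_text TEXT
        );
        CREATE TABLE logging (
            log_id         INTEGER PRIMARY KEY,
            log_comment    TEXT,     -- old inline comment, '' when unused (assumption)
            log_comment_id INTEGER   -- reference into comment, may be NULL
        );
        INSERT INTO comment VALUES (1, 'new only'), (2, 'new value');
        INSERT INTO logging VALUES
            (10, 'old only', NULL),  -- only the old field is filled
            (11, '', 1),             -- only the comment table has text
            (12, 'old value', 2);    -- both sources have text
        """
    )
    return conn


def resolve_comments(conn, prefer_new=False):
    """Return (log_id, comment) rows, falling back between the two sources.

    prefer_new=False keeps the old log_comment when it has a value;
    prefer_new=True takes comment_text first and falls back to log_comment.
    """
    old = "NULLIF(l.log_comment, '')"  # treat empty string as missing
    new = "c.comment_text"
    first, second = (new, old) if prefer_new else (old, new)
    query = f"""
        SELECT l.log_id, COALESCE({first}, {second}) AS log_comment
        FROM logging l
        LEFT JOIN comment c ON c.comment_id = l.log_comment_id
        ORDER BY l.log_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(resolve_comments(conn))                   # old field wins when present
    print(resolve_comments(conn, prefer_new=True))  # comment table wins when present
```

The two orderings differ only for rows where both sources carry text (row 12 in the sample data), which is the case the 18:25:24 question about whether existing `log_comment` values are complete and correct is concerned with.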