[00:08:07] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (Bstorm) Sounds like that takes care of my concerns.
[04:58:11] Analytics, Core Platform Team, DBA, Blocked-on-schema-change: Schema change for refactored actor and comment storage - https://phabricator.wikimedia.org/T233135 (Marostegui)
[07:24:58] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (elukey) @MoritzMuehlenhoff @ArielGlenn any concerns from your side?
[07:25:11] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (elukey) a:Milimetric→elukey
[09:21:04] (CR) Awight: New report for Reference Previews (2 comments) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529) (owner: Awight)
[09:24:14] (PS5) Awight: New report for Reference Previews [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529)
[09:24:25] (CR) Awight: New report for Reference Previews (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529) (owner: Awight)
[10:02:44] Hi mgerlach - I have seen your question in T234188 - I'm at the ApacheCon conference this week but will try to answer this evening
[10:02:45] T234188: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188
[11:40:23] Analytics, Product-Analytics: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (Neil_P._Quinn_WMF) We discussed this in a meeting yesterday, and the consensus was that setting up a second system of syncing folders would be too much work. However, @mfor...
[14:47:24] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (ArielGlenn) @elukey Just want to doublecheck that CPU resources used by the hadoop client won't be so much....
[14:58:08] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (CCicalese_WMF) Sorry, for the delay, @Nuria. Yes, that sounds correct. I will create a sep...
[15:01:18] (CR) Mforns: [C: +1] "LGTM!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529) (owner: Awight)
[15:03:05] (CR) Mforns: [V: +2 C: +1] Removing editCountBucket from Popup schema [analytics/refinery] - https://gerrit.wikimedia.org/r/542540 (owner: Nuria)
[15:09:27] Analytics, Analytics-EventLogging: Update pingback reports to use heartbeat pings to filter data - https://phabricator.wikimedia.org/T236178 (CCicalese_WMF)
[15:16:13] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (Ottomata) The CPU usage will be minimal. It will mostly just be network just like rsync is.
[15:17:30] fyi a-team: gerrit lost some patch sets, releng is working on it, best to not push new ones until they're done
[15:18:08] ok, thanks
[15:23:57] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (ArielGlenn) Rsync can be a big drain both on CPU and memory resources, depending on the size and number of f...
[15:26:30] (CR) Nuria: New report for Reference Previews (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529) (owner: Awight)
[15:27:11] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (Ottomata) The hadoop client itself shouldn't cause any extra CPU usage, it is just a dumb file transfer agen...
[15:38:42] Analytics, Better Use Of Data, Event-Platform, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Create client side error schema - https://phabricator.wikimedia.org/T229442 (jlinehan) Taking over on this as per WG meeting. Waiting until we get a better idea of what the client wi...
[15:40:38] Analytics, Better Use Of Data, Event-Platform, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Create client side error schema - https://phabricator.wikimedia.org/T229442 (jlinehan) a:Tgr→jlinehan
[15:47:39] !log start backfilling of mediarequests per file from 2015-01-02 to 2019-05-17 after ok vetting of 2015-01-01
[15:47:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:48:52] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (RStallman-legalteam) NDA is signed and on file. Thanks!
[15:51:13] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (ArielGlenn) Thumbs up from me then.
[15:59:49] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Nuria) Thank you! @lexnasser: please ping @Dzahn with your e-mail address/user password for wikitech
[16:15:07] Analytics, Android-app-Bugs, Wikipedia-Android-App-Backlog (Android-app-release-v2.7.29x-N-Nanaimo-Bar): App requests classified as pageviews that probably should not be so - https://phabricator.wikimedia.org/T229068 (Charlotte) Hi @Nuria - is this issue still occurring?
[19:07:44] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Isaac) > Do we think those were the reason for the missing events? What are the next steps?...
[19:23:08] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Jdlrobson) @Isaac having some data post Thursday would be useful!
[19:52:51] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (JAllemandou) Hi @MGerlach - There are two things I can think of to try to help: - Technical trick - Increase the default value of Spark-SQL partitioner. By default, spark uses 200 partitions for datas...
[19:58:55] Analytics, CPT Initiatives (MCR), Multi-Content-Revisions (Tech Debt), Schema-change: Once MCR is deployed, drop the rev_text_id, rev_content_model, and rev_content_format fields from the revision table - https://phabricator.wikimedia.org/T184615 (JAllemandou)
[20:00:10] Analytics, Core Platform Team, DBA, Blocked-on-schema-change: Schema change for refactored actor and comment storage - https://phabricator.wikimedia.org/T233135 (JAllemandou) >>! In T233135#5592951, @Nuria wrote: > Pinging analytics temporarily so we know these changes are happening, it should no...
[20:18:46] Analytics, LDAP-Access-Requests, Operations, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (EdErhart-WMF)
[20:22:17] Analytics, LDAP-Access-Requests, Operations, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Nuria) Is CherRaye Glenn a contractor? If so when does the contract expire?
[21:06:18] Analytics, LDAP-Access-Requests, Operations, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Varnent) >>! In T236209#5596762, @Nuria wrote: > Is CherRaye Glenn a contractor? If so when does the contract expire? Ch...
[21:06:52] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Nuria)
[21:08:45] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Nuria) Then she should be added to wmf LDAP group after @Heather's approval ping @Dzahn which I...
[21:14:45] Analytics, Android-app-Bugs, Wikipedia-Android-App-Backlog (Android-app-release-v2.7.29x-N-Nanaimo-Bar): App requests classified as pageviews that probably should not be so - https://phabricator.wikimedia.org/T229068 (Nuria) In the 2.7 app, yes, this continues to happen but i cannot see it happen on...
[21:22:34] Analytics, Product-Analytics: Wikistats API for legacy pagecounts does not have mobile data before October 2014 - https://phabricator.wikimedia.org/T235143 (Milimetric) > - Erik compiled data from sampled logs which are no longer available yes, exactly > - We may have some aggregate data that Erik...
[21:40:45] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Dzahn) @lexnasser Within max. 30 minutes this should work for you now. Please take a look at https://wikitech.wikimedia.org/wiki/Production_access...
[21:41:09] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Dzahn) Open→Resolved If any unexpected issues please just reopen the ticket.
[21:41:33] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Nuria) +1 , also let's make sure to go over the Data guidelines before working with the data.
[21:46:29] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Dzahn) >>! In T235688#5587345, @Nuria wrote: > And also we need to add lex to nda group for access to turnilo and superset Done! @lexnasser You...
[21:49:26] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Dzahn) Actually that is @colewhite this week but we are on it.
[21:59:00] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Heather) Approved. Thanks, everyone!
[22:16:23] Analytics, Product-Analytics: Wikistats API for legacy pagecounts does not have mobile data before October 2014 - https://phabricator.wikimedia.org/T235143 (Nuria) >Yes, definitely, his aggregated data is available there and in CSV files, Do we know where are these files? It seems https://github.com/...
[22:27:13] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, wikimediafoundation.org: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (colewhite) a:colewhite
[22:34:20] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, and 2 others: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (colewhite) p:Triage→Normal
[23:35:39] Analytics, LDAP-Access-Requests, Operations, SRE-Access-Requests, and 2 others: WikimediaFoundation.org analytics access for CherRaye Glenn - https://phabricator.wikimedia.org/T236209 (Dzahn) Open→Resolved done. she has been added to the "wmf" group
[23:37:04] Analytics, Analytics-Kanban, LDAP-Access-Requests, Operations, and 2 others: Analytics Access for Grant (groups cn=wmf and analytics-privatedata-users) - https://phabricator.wikimedia.org/T235260 (Dzahn)
[23:38:02] Analytics, Analytics-Kanban, LDAP-Access-Requests, Operations, and 2 others: Analytics Access for Grant (groups cn=wmf and analytics-privatedata-users) - https://phabricator.wikimedia.org/T235260 (Dzahn) a:herron→colewhite L3 has been signed. This is unblocked.
[23:42:20] musikanimal: looked at all pageview spikes you passed along and made a handy jupyter notebook to plot details about suspected spikes, pretty simple but if you want to know where it is or how to use it let me know and i will pass it along