[05:20:17] Analytics, Event-Platform, User-Elukey: Create EventStream's equivalent to irc.wikimedia.org's #central channel - https://phabricator.wikimedia.org/T240182 (Masumrezarock100) Just to note, @natuur12 and @Steinsplitter are Commons admins, they likely use this channel to track vandals, LTAs etc. RxyBot...
[05:31:17] lexnasser: will check!
[08:02:14] Hi team
[08:05:22] Quarry, Cloud-Services, Commons: S4 replication lab on Wikimedia Clouds is over 70 hours - https://phabricator.wikimedia.org/T243488 (Mike_Peel)
[10:03:41] Quarry, Cloud-Services, Commons: S4 replication lab on Wikimedia Clouds is over 70 hours - https://phabricator.wikimedia.org/T243488 (Marostegui) That's due to a compression going on on the sanitarium host as part of {T232446} I think it will be done in around 24h or so. There are also some tables b...
[10:10:07] joal: just so you know, you are not alone! <3
[10:10:37] <3 fdans :) https://www.youtube.com/watch?v=pAyKJAtDNCw
[10:13:13] hehe
[11:54:45] Analytics, Event-Platform, User-Elukey: Create EventStream's equivalent to irc.wikimedia.org's #central channel - https://phabricator.wikimedia.org/T240182 (Steinsplitter) >>! In T240182#5825579, @Masumrezarock100 wrote: > Just to note, @natuur12 and @Steinsplitter are Commons admins, they likely use...
[11:58:26] Analytics, Datasets-General-or-Unknown, good first task: Add checksums pageviews dataset - https://phabricator.wikimedia.org/T199461 (rajeshkumargp) Can I get IRC chat contacts for this pageview project ? or Whom do I need to contact to get details on PageView Project ?
[12:47:17] Analytics, Datasets-General-or-Unknown, good first task: Add checksums pageviews dataset - https://phabricator.wikimedia.org/T199461 (Reedy) >>! In T199461#5814897, @Nuria wrote: > @rajeshkumargp our code is in scala/java and oozie: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/pa...
[14:16:15] Analytics, Analytics-Kanban, serviceops: Clarify multi-service instance concepts in helm charts and enable canary releases - https://phabricator.wikimedia.org/T242861 (Ottomata) > K cool, let's figure out a different name. I like service.name best, just don't want to confuse it with k8s Service. Sho...
[14:28:12] (PS2) Joal: Update mediwiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/566609 (https://phabricator.wikimedia.org/T243427)
[14:32:30] (CR) jerkins-bot: [V: -1] Update mediwiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/566609 (https://phabricator.wikimedia.org/T243427) (owner: Joal)
[14:36:24] o/
[14:40:52] \\o
[15:03:46] (PS3) Joal: Update mediwiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/566609 (https://phabricator.wikimedia.org/T243427)
[15:04:04] Gone for kids, back for standup
[15:08:28] map opinionated folks would like this one:
[15:08:29] https://xkcd.com/2256/
[15:12:35] heh, I told fdans already that would be our new projection
[15:12:45] anything else is too non-south-america-centric
[15:15:18] (PS1) Milimetric: Add imagelinks to sqoopable tables [analytics/refinery] - https://gerrit.wikimedia.org/r/566757
[15:20:17] joal: ^ what do you think about merging, deploying, and sqooping imagelinks to answer our question?
[15:20:18] ^
[15:20:39] and do you have time to talk about the silly story I made, I see that nuria thought it was too silly :)
[15:29:44] milimetric: NO SILLIES
[15:30:10] xD
[15:30:32] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog: Produce an instrumentation event stream using new EPC and EventGate from client side browsers - https://phabricator.wikimedia.org/T241241 (Ottomata)
[15:30:35] Analytics, Event-Platform, Multimedia, Tool-Pageviews: Mediaviewer views should be reworked to be an eventlogging event - https://phabricator.wikimedia.org/T239630 (Ottomata)
[15:31:22] (PS1) Fdans: (wip) Change build configuration to put dist assets in subdirectory [analytics/wikistats2] - https://gerrit.wikimedia.org/r/566768
[15:31:34] milimetric ^
[15:32:20] looking
[15:35:25] Quarry, Commons, Data-Services: S4 replication lab on Wikimedia Clouds is over 70 hours - https://phabricator.wikimedia.org/T243488 (JJMC89)
[15:43:16] Analytics, Analytics-Cluster: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (Ottomata)
[15:43:17] milimetric: it is ALL GOOD
[15:48:18] Analytics, Datasets-General-or-Unknown, good first task: Add checksums pageviews dataset - https://phabricator.wikimedia.org/T199461 (Nuria) >Should we be removing #good first task I think so, it is really not well suited for a community contribution as it requires thorough testing on hadoop cluster
[15:56:30] Analytics, Datasets-General-or-Unknown: Add checksums pageviews dataset - https://phabricator.wikimedia.org/T199461 (Reedy)
[16:10:44] Analytics, Analytics-Cluster: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (Ottomata)
[16:25:09] (PS3) Nuria: [WIP] Classification of actors for bot detection [analytics/refinery] - https://gerrit.wikimedia.org/r/562368 (https://phabricator.wikimedia.org/T238361)
[16:25:48] joal, milimetric , mforns : changed naming and structure of 2nd patch for bots, do please take a look: https://gerrit.wikimedia.org/r/562368
[16:32:21] Analytics, Analytics-EventLogging, Front-end-Standards-Group, MediaWiki-extensions-WikimediaEvents, and 2 others: Provide a reusable getEditCountBucket function for analytics purposes - https://phabricator.wikimedia.org/T210106 (phuedx) The Readers Web side of the conversation has been revived si...
[17:00:35] LMAO this kerberos stuff is going to give me a heart attack. I just spent so long trying to debug why hive wasn't working for me in a notebook on SWAP and it's because I kinit-ed in SSH, not a Jupyter Terminal and somehow those are different sets of credential caches!
[17:00:41] ping joal , ottomata
[17:01:02] bearloga: I CAN RELATE
[17:01:38] nuria: I'm in SHOCK right now
[17:01:55] nuria: <3
[17:02:57] klist kept telling me two different things based on where I ran it and ON A WHIM I decided to kinit AGAIN but in jupyter terminal and WOOOOW
[17:03:23] that doesn't seem right!!!
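(Editor's note on the 17:00-17:03 Kerberos exchange: one plausible reason two terminals can report different tickets is that each session resolves a different credential cache. The sketch below only illustrates that idea. It assumes the cache is chosen from the KRB5CCNAME environment variable with a per-user file fallback; the fallback template, the helper name resolve_ccache and both example environments are hypothetical, and the real default depends on the host's krb5.conf. Nothing in the log confirms this was the actual cause on SWAP.)

```python
def resolve_ccache(env, uid, default_template="FILE:/tmp/krb5cc_{uid}"):
    """Return the credential cache name a session would likely use.

    Assumption: KRB5CCNAME wins when set; otherwise a per-uid file
    default applies. The default template here is illustrative only.
    """
    name = env.get("KRB5CCNAME")
    if name:
        return name
    return default_template.format(uid=uid)


# Hypothetical environments for the two sessions described in the chat.
ssh_env = {"KRB5CCNAME": "FILE:/tmp/krb5cc_1234_ssh"}  # set by the login session
notebook_env = {}                                      # variable not inherited

ssh_cache = resolve_ccache(ssh_env, 1234)
notebook_cache = resolve_ccache(notebook_env, 1234)

# If the two names differ, a kinit in one session is invisible to the other.
same = ssh_cache == notebook_cache
print(ssh_cache, notebook_cache, same)
```

(Passing os.environ and os.getuid() from inside each terminal, then comparing the two results, would be one way to check whether the sessions really point at different caches.)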
[17:04:12] okay, deep breaths
[17:07:32] (CR) Joal: [C: -1] "One change on table weight, rest looks good" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/566757 (owner: Milimetric)
[17:13:15] (PS2) Milimetric: Add imagelinks to sqoopable tables [analytics/refinery] - https://gerrit.wikimedia.org/r/566757
[17:13:48] (CR) Milimetric: Add imagelinks to sqoopable tables (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/566757 (owner: Milimetric)
[17:16:10] (CR) Joal: [V: +2 C: +2] "LGTM - Merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/566757 (owner: Milimetric)
[17:46:39] milimetric: you need to use zoom (in invite)
[17:46:43] ugh
[17:46:45] thx joal
[17:48:21] !log launching a sqoop for imagelinks (will be slow because tuning sess)
[17:48:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:48:40] milimetric: I was about to do it as well :)
[17:48:55] joal: good I logged :) I'll do it no worries
[17:49:03] ok milimetric :)
[17:49:03] joal: did you have a chance to read my corny story?
[17:49:05] what do you think?
[17:49:56] I have - I like it very much!
[17:50:30] milimetric: we'll talk possibly tomorrow? I'll not be too late tonight (hopefully)
[17:50:48] joal: anytime, no rush
[18:44:23] (PS1) Mforns: Change format of data_quality_stats data to parquet [analytics/refinery] - https://gerrit.wikimedia.org/r/566834 (https://phabricator.wikimedia.org/T241375)
[18:46:29] (PS1) Mforns: Change format of data_quality_stats to parquet [analytics/refinery/source] - https://gerrit.wikimedia.org/r/566836 (https://phabricator.wikimedia.org/T241375)
[18:49:19] (PS2) Mforns: Change format of data_quality_stats data to parquet [analytics/refinery] - https://gerrit.wikimedia.org/r/566834 (https://phabricator.wikimedia.org/T241375)
[19:05:55] milimetric, nuria, mforns - If you have a few minutes: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history_dumps
[19:06:30] will look jo
[19:06:39] (just got another meeting and to finish this order :))
[19:06:47] \o/
[19:07:22] (CR) Joal: "Confirmed working on the cluster" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/566609 (https://phabricator.wikimedia.org/T243427) (owner: Joal)
[19:07:51] joal, looks great! thanks a lot for doing this :]
[19:08:14] joal, I think the guava fix works! no errors so far :]
[19:08:41] * joal growls of happiness :)
[19:21:03] mforns: TEARS come to my eyes
[19:29:12] Analytics, Event-Platform, Wikimedia-Extension-setup, Patch-For-Review, Wikimedia-extension-review-queue: Deploy EventStreamConfig extension - https://phabricator.wikimedia.org/T242122 (Jdforrester-WMF) >>! In T242122#5823157, @Ottomata wrote: > Bump! One of our (O)KRs/goals this quarter is...
[19:29:51] mforns: the alarms we are seeing are from your testing correct?
[19:30:46] Gone for dinner team
[19:36:48] Analytics, Event-Platform, Wikimedia-Extension-setup, Patch-For-Review, Wikimedia-extension-review-queue: Deploy EventStreamConfig extension - https://phabricator.wikimedia.org/T242122 (Ottomata)
[19:41:16] Analytics, Event-Platform, Security Readiness Reviews: Security Review For EventStreamConfig extension - https://phabricator.wikimedia.org/T242124 (Ottomata)
[19:43:04] Analytics, Event-Platform, Security Readiness Reviews: Security Review For EventStreamConfig extension - https://phabricator.wikimedia.org/T242124 (Ottomata) @sbassett @Reedy is there anything I can do to help move this along?
[19:44:14] Analytics, Analytics-Kanban, Event-Platform, Security Readiness Reviews: Security Review For EventStreamConfig extension - https://phabricator.wikimedia.org/T242124 (Ottomata)
[19:45:04] (CR) Mforns: "This depends on https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/566836/" [analytics/refinery] - https://gerrit.wikimedia.org/r/566834 (https://phabricator.wikimedia.org/T241375) (owner: Mforns)
[19:46:09] Analytics, Analytics-Kanban: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (mforns)
[20:04:59] Analytics, Research, Privacy, Security: Release data from a public health related research conducted by WMF and formal collaborators - https://phabricator.wikimedia.org/T242844 (Miriam) Hi @Nuria ! please let us know if you have any updates on this. Thanks!
[20:31:15] (CR) Joal: "We could go for actor_24h_rollup_hourly to be fully precise, but it's already good as it is (didn't check the content, only names)." [analytics/refinery] - https://gerrit.wikimedia.org/r/562368 (https://phabricator.wikimedia.org/T238361) (owner: Nuria)
[20:53:57] Analytics, Analytics-Kanban, Event-Platform, Security Readiness Reviews: Security Review For EventStreamConfig extension - https://phabricator.wikimedia.org/T242124 (sbassett) Hey @Ottomata - this is still in our backlog, so it probably won't be reviewed until after All-hands. I did have a quick...
[20:54:20] Analytics, Analytics-Kanban, Event-Platform, Security Readiness Reviews: Security Review For EventStreamConfig extension - https://phabricator.wikimedia.org/T242124 (Ottomata) Great, thanks!
[21:34:34] Analytics, Product-Analytics: Superset aggregation across edit tags uses all tags - https://phabricator.wikimedia.org/T243552 (nettrom_WMF)
[21:47:10] hi. I am looking to confirm a query (and the source) and I can't find this elsewhere so forgive me, https://quarry.wmflabs.org/query/41388
[21:47:36] I am looking for the number of new account registrations on tr.wikipedia.org (this is not my query, just to be clear)
[21:48:01] does this seem fine? anything else I am missing here? like for example, how do we categorize new registrations? just the number or there is something finer
[21:58:13] sukhe: a key thing to know about user registration is that it's when the account was created on a particular wiki. if someone is logged in to their account globally, a new account is auto-created for them on every wiki they visit. for actual new account registrations on a wiki, you would need to use the logging table to exclude autocreated users and focus on create users (see https://www.mediawiki.org/wiki/Manual:User_creation)
[21:58:36] sukhe: see https://www.mediawiki.org/wiki/Manual:Logging_table for more details on logging table (not sure how much of it is available on the public databases)
[22:02:16] ah! thank you bearloga
[22:02:30] that number did look a lot higher than I thought it would be
[22:57:07] Analytics, Research, Privacy, Security: Release data from a public health related research conducted by WMF and formal collaborators - https://phabricator.wikimedia.org/T242844 (Nuria) @Miriam We have met with @JFishback_WMF to ensure we work on the risk assessment together going forward so as to...
[23:05:02] Analytics, Research, Privacy, Security: Release data from a public health related research conducted by WMF and formal collaborators - https://phabricator.wikimedia.org/T242844 (Miriam) Hi @Nuria thank you so much for this! Sorry for not pulling in you earlier, it's good to know as best practice...
[23:12:40] Analytics, Research, Privacy, Security: Release data from a public health related research conducted by WMF and formal collaborators - https://phabricator.wikimedia.org/T242844 (Nuria) Given that the data is going to be released via a third party site the files need to be uploaded there, maybe th...
[23:21:18] Analytics, Research, Privacy, Security: Release data from a public health related research conducted by WMF and formal collaborators - https://phabricator.wikimedia.org/T242844 (Miriam) perfect, thanks! Is this something I can do, or should I ask someone from Analytics to help?
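(Editor's note on the 21:58 answer about new account registrations: the suggestion to use the logging table and leave out auto-created accounts can be sketched roughly as below. This is a toy, in-memory stand-in rather than the Quarry query discussed in the chat. The rows are made up, only a few logging columns are modelled, and counting just log_action = 'create' is one reading of "focus on create users"; whether other newusers actions such as create2 or byemail should count is a judgment call.)

```python
import sqlite3

# Toy stand-in for MediaWiki's logging table (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logging ("
    "log_id INTEGER PRIMARY KEY, log_type TEXT, "
    "log_action TEXT, log_timestamp TEXT)"
)

# Made-up rows; timestamps use the YYYYMMDDHHMMSS layout.
rows = [
    (1, "newusers", "create", "20200101120000"),
    (2, "newusers", "autocreate", "20200101130000"),  # global login side effect
    (3, "newusers", "create", "20200101140000"),
    (4, "newusers", "create2", "20200102090000"),     # created by another user
    (5, "newusers", "create", "20200102100000"),
    (6, "delete", "delete", "20200102110000"),        # unrelated log entry
]
conn.executemany("INSERT INTO logging VALUES (?, ?, ?, ?)", rows)

# Count self-registrations per day, ignoring auto-created accounts.
QUERY = """
SELECT substr(log_timestamp, 1, 8) AS day, COUNT(*) AS registrations
FROM logging
WHERE log_type = 'newusers' AND log_action = 'create'
GROUP BY day
ORDER BY day
"""
result = conn.execute(QUERY).fetchall()
print(result)  # [('20200101', 2), ('20200102', 1)]
```

(A count taken from user registration dates alone would have included the autocreate row as well, which is consistent with the number looking higher than expected.)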