[00:13:51] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3393508 (kaldari) Resolved>Open @Ottomata: Something seems to be wrong. It looks like it is rec...
[05:03:14] Analytics-Kanban, Patch-For-Review, User-Elukey: Make non-nullable columns in EL database nullable - https://phabricator.wikimedia.org/T167162#3393806 (Marostegui) Hey, I don't have much to say about what needs to be NULLABLE and what not really, that is for you guys to decide, as you have all the...
[08:12:05] Analytics, Continuous-Integration-Config, Release-Engineering-Team (Kanban): Archive analytics/kraken git repository - https://phabricator.wikimedia.org/T169303#3393987 (hashar)
[08:12:53] Analytics, Continuous-Integration-Config, Release-Engineering-Team (Kanban): Archive analytics/kraken git repository - https://phabricator.wikimedia.org/T169303#3394001 (hashar)
[08:13:53] (CR) Hashar: "recheck" [analytics/wikistats] - https://gerrit.wikimedia.org/r/349217 (owner: Milimetric)
[08:15:52] Analytics, Continuous-Integration-Config, Release-Engineering-Team (Kanban): Archive analytics/kraken git repository - https://phabricator.wikimedia.org/T169303#3394004 (hashar)
[08:23:43] Analytics, Continuous-Integration-Config, Patch-For-Review, Release-Engineering-Team (Kanban): Archive analytics/kraken git repository - https://phabricator.wikimedia.org/T169303#3394011 (hashar) Open>Resolved
[09:13:10] Analytics-Kanban, Patch-For-Review, User-Elukey: Make non-nullable columns in EL database nullable - https://phabricator.wikimedia.org/T167162#3394148 (elukey) I am currently running the alter tables without any table having #rows > 250M. I'll leave them running until Monday since it seems that 2...
[09:17:54] (PS2) Elukey: Remove logrotate and syslog configuration [analytics/kafkatee] - https://gerrit.wikimedia.org/r/354223 (https://phabricator.wikimedia.org/T151748)
[09:18:04] (CR) Elukey: [V: 2 C: 2] Remove logrotate and syslog configuration [analytics/kafkatee] - https://gerrit.wikimedia.org/r/354223 (https://phabricator.wikimedia.org/T151748) (owner: Elukey)
[09:29:24] (PS1) Elukey: Revert "Remove logrotate and syslog configuration" [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362354
[09:29:34] (CR) Elukey: [V: 2 C: 2] Revert "Remove logrotate and syslog configuration" [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362354 (owner: Elukey)
[09:29:46] sigh --^
[09:31:32] oh no it was good
[09:31:35] what the hell
[09:32:40] going to kill the debian branch
[09:32:42] too confusing
[09:33:26] (PS1) Elukey: Revert "Revert "Remove logrotate and syslog configuration"" [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362356
[09:33:34] (CR) Elukey: [V: 2 C: 2] Revert "Revert "Remove logrotate and syslog configuration"" [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362356 (owner: Elukey)
[09:38:43] (PS1) Elukey: Release version 0.1.6-1 [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362357
[09:39:06] (CR) Elukey: [V: 2 C: 2] Release version 0.1.6-1 [analytics/kafkatee] - https://gerrit.wikimedia.org/r/362357 (owner: Elukey)
[09:56:51] all right, new kafkatee uploaded finally to reprepro
[10:04:55] joal: re: https://gerrit.wikimedia.org/r/#/c/362148 - one thing that I always forget/get-confused is that cron does not send emails when the return code is non-zero, but only when there is output from stdout/err
[10:26:49] hm elukey
[11:33:38] * elukey lunch!
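Note on the 10:04:55 message: since cron mails on output rather than on exit status, a job that logs everything to a file can fail silently. A minimal sketch of one way around that, assuming a hypothetical helper named `run_for_cron` (not taken from the patch under review, and not necessarily how the refinery job handles it): keep the full output in a log file and echo it only when the command exits non-zero.

```python
import subprocess
import sys


def run_for_cron(cmd, log_path):
    """Run cmd, append its output to log_path, echo it only on failure.

    Hypothetical helper for illustration: cron mails whatever the job
    writes to stdout/stderr, so a successful run stays silent.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stdout
        text=True,
    )
    # Always keep the full output in the log file.
    with open(log_path, "a") as log:
        log.write(result.stdout)
    if result.returncode != 0:
        # Writing anything here is what gives cron something to mail.
        sys.stderr.write(
            "command failed with exit code %d: %s\n"
            % (result.returncode, " ".join(cmd))
        )
        sys.stderr.write(result.stdout)
    return result.returncode
```

The trade-off against a plain `tee` is that successful runs produce no mail at all, while failures still carry the full output.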
[11:56:21] fdans: I'm around early, and I didn't push
[11:56:48] I realized that we needed to figure out the routing component, that it made no sense to hack around it
[11:57:02] yeah totally
[11:57:06] because it's complicated and doesn't make sense
[11:57:12] milimetric: talk in 15? finishing lunch
[11:57:15] so what I'm thinking is that the wiki selector changes the address
[11:57:27] and the yeah, np
[11:57:42] you don't have to reply if you're at lunch or on break man
[11:59:24] ;)
[12:04:11] fdans: I'd be happy to disturb you while you are eating!
[12:04:14] :D :D
[12:09:17] helloooo
[12:16:29] heyyy mforns
[12:16:38] milimetric: going to the cave
[12:16:39] hi fdans !
[12:17:10] mforns: Riccardo is happy with the fix, he added some comments but I believe we found the final solution :)
[12:17:12] elukey: how very unitalian of you luca
[12:17:30] elukey, saw it! \o/
[12:17:34] \o/
[12:17:55] elukey, I'm just commenting in the alter table task, and will start applying Riccardo's suggestions if you're ok
[12:18:46] Analytics-Kanban, Patch-For-Review, User-Elukey: Make non-nullable columns in EL database nullable - https://phabricator.wikimedia.org/T167162#3394724 (mforns) Thanks @Marostegui for the note on alter table timing! Certainly it wouldn't be possible :] Awesome @elukey, let's see how fast those tables...
[12:21:59] mforns: +1! Do you want to pair ?
[12:23:05] (PS1) Joal: Upgrade script dropping druid deep-storage data [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614)
[12:23:15] elukey: --^ Please :)
[12:24:36] tiny change! :D
[12:27:19] elukey, yes pair!
[12:29:59] joal: will review the code in a bit ok?
[12:30:18] elukey: no bother - another one is for you in puppet ;) :_P
[12:31:54] elukey, when you're ready, https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave-2 ?
[12:32:59] joal: https://gerrit.wikimedia.org/r/#/c/362148/3/modules/role/manifests/analytics_cluster/refinery/job/data_drop.pp - => would need to be aligned (very minor nit for the linter)
[12:33:30] k elukey
[12:34:17] joal: does the script emit any error in stdout/err when it fails?
[12:34:29] because I am afraid that if not we'll not receive any email
[12:35:16] elukey: It does, but since we redirect stdout and stderr to files, probably not
[12:35:32] elukey: use tee instead of complete redirect?
[12:39:46] elukey: Provided a patch with tee
[13:10:11] Analytics, Analytics-Dashiki, Patch-For-Review: Create dashboard for upload wizard - https://phabricator.wikimedia.org/T159233#3394963 (matthiasmullie) declined>Open I'm not aware of any patches, open or abandoned? The data patch was merged awhile ago. But this is still something that we're i...
[14:21:59] afk 5 minutes because I'm grumpy and I want an ice cream
[14:34:32] now I want an icecream too
[14:34:37] * elukey blames fdans
[14:35:28] * fdans smirks at luca's misery while chomping his maxibon
[14:55:37] (CR) Joal: "Webrequest create modification for tag was submitted as part of split job (https://gerrit.wikimedia.org/r/#/c/357814/)" [analytics/refinery] - https://gerrit.wikimedia.org/r/362310 (https://phabricator.wikimedia.org/T164021) (owner: Nuria)
[14:56:04] (CR) Nuria: Upgrade script dropping druid deep-storage data (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[14:59:40] (CR) Nuria: ">Webrequest create modification for tag was submitted as part of split job" [analytics/refinery] - https://gerrit.wikimedia.org/r/362310 (https://phabricator.wikimedia.org/T164021) (owner: Nuria)
[15:01:00] (CR) Joal: "I don't mind, whatever you think suits better." [analytics/refinery] - https://gerrit.wikimedia.org/r/362310 (https://phabricator.wikimedia.org/T164021) (owner: Nuria)
[15:49:20] (CR) Joal: "Inline" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[15:51:47] Analytics, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3395413 (elukey) >>! In T168927#3392189, @RobH wrote: > This system is out of warranty, and will require onsite spare disks to be used as replacement. Yes please, do we need approvals...
[15:52:53] Analytics, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3395415 (RobH) >>! In T168927#3395413, @elukey wrote: >>>! In T168927#3392189, @RobH wrote: >> This system is out of warranty, and will require onsite spare disks to be used as replace...
[15:56:11] (PS2) Joal: Upgrade script dropping druid deep-storage data [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614)
[16:07:24] Analytics-Tech-community-metrics: Git's "Last Attracted Developers" lists established developers and developers without a First Commit Date - https://phabricator.wikimedia.org/T161309#3395429 (Aklapper) Open>Resolved Cannot reproduce this anymore on https://wikimedia.biterg.io/app/kibana#/dashboard/C...
[16:08:52] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017): Identify Wikimedia's most important/used info panels in korma.wmflabs.org - https://phabricator.wikimedia.org/T132421#3395450 (Aklapper)
[16:12:34] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017): Identify Wikimedia's most important/used info panels in korma.wmflabs.org - https://phabricator.wikimedia.org/T132421#3395454 (Aklapper) stalled>Resolved This task has served its purpose: * All items marked as ✚ above in the ta...
[16:18:31] Analytics-Tech-community-metrics: Provide equivalent of "SCR: Distribution of open changesets (by date of submission)" in Kibana - https://phabricator.wikimedia.org/T151556#3395463 (Aklapper) Open>declined I don't see a need for this specific metric, hence declining. https://wikimedia.biterg.io/app/k...
[16:33:40] (CR) Nuria: [C: 2] "I think this can be merged now. Thanks for upgrading the script." [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[16:33:48] (CR) Nuria: [V: 2 C: 2] Upgrade script dropping druid deep-storage data [analytics/refinery] - https://gerrit.wikimedia.org/r/362396 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[16:45:27] * elukey afk!
[16:45:32] have a good weekend!
[16:51:00] gone for tonight as well - Have a good weekend a-team :)
[16:51:52] nite jo, have a good one
[16:51:56] see you... Wednesday!
[16:52:02] BYE JOSEPH
[16:52:15] Ah ! see you lads, good luck with state-management
[16:57:11] Analytics-Tech-community-metrics, Developer-Relations: Go through default Kibana widgets; decide which ones are not relevant for us and remove them - https://phabricator.wikimedia.org/T147001#3395563 (Aklapper) p:Low>Lowest No urgency and not planning to work on this soon, hence moving to #Develo...
[16:58:11] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Find out (and fix) why we have a higher number of identity entries than before switching to new Bitergia DB scheme - https://phabricator.wikimedia.org/T168217#3395568 (Aklapper)
[17:42:41] o/
[17:43:00] is there a way I can query something to get standardized metrics over the course of the last few months?
[17:43:23] I need raw numbers for newly_registered_users, new_editors, surviving_new_editors, and productive_new_editors.
[17:43:35] I need the raw numbers because I'm going to use them to do a statistical power analysis.
[17:48:13] Hmm.. It looks like I might be able to lift them from the analytics dashboard.
[17:48:13] https://analytics.wikimedia.org/dashboards/standard-metrics/
[17:48:26] Hover gives me the raw numbers.
[17:48:49] Arg... rounded to ##.#k
[18:01:07] Uhoh. looks like data stops at the end of 2016
[18:23:24] halfak: yes, those were computed as a one-off, I think I ran them against earlier versions of mediawiki_history
[18:23:33] Gotcha.
[18:23:34] I'll point you to the queries, one sec
[18:23:39] That would be great
[18:23:52] * halfak is running the old queries against analytics store now and it's SLOOOW
[18:24:12] halfak: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/mediawiki/history/metrics
[18:24:36] so like surviving new editors would be: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/history/metrics/monthly_surviving_new_editors.hql
[18:24:53] and you can run those queries with the parameters as indicated in the comments
[18:25:12] Thanks
[18:25:20] except I think there's a new snapshot available, like 2017-04 or 2017-05
[18:25:34] snapshot output?
[18:25:36] you can show partitions mediawiki_history
[18:25:46] Oh I see.
[18:25:49] snapshot here means "when the data was imported from mediawiki dbs
[18:25:53] "
[18:26:15] and you can set the wiki_db param if you're interested in just one wiki
[18:26:47] lemme know if you need more help or what's under Usage throws errors (that's a pet peeve of mine)
[18:29:24] Hi halfak
[18:29:48] halfak: I'd say easiest is to use wmf.mediawiki_metrics
[18:30:35] oh! sorry, that's updated! I forgot :(
[18:30:55] duh, that's what the oozie job does :)
[18:31:32] :)
[18:31:39] Well, one of them :)
[18:32:34] halfak: for instance: select distinct snapshot, metric from wmf.mediawiki_metrics where wiki_db = 'enwiki';
[18:32:59] halfak: Will tell you about available snapshots and metrics
[18:34:21] halfak: another one: SELECT * from wmf.mediawiki_metrics where snapshot = '2017-05' and metric = 'daily_unique_editors' and wiki_db = 'enwiki' limit 20;
[18:34:53] halfak: biggest issue of the 2017-05 (latest) snapshot is that, being from labs, it doesn't have all wikis - misses most middle-size ones
[18:35:13] halfak: for a full snapshot (not so old), you can use: 2016-12_private
[18:35:57] halfak: I think that's all we have :)
[18:38:17] Gotcha. That's great. Scoping it out now
[18:38:49] And actually halfak thinking of that, you can also get it from files I think - let me check
[18:39:53] halfak: ls /mnt/hdfs/wmf/data/wmf/mediawiki/metrics/snapshot\=2016-12_private/metric\=monthly_surviving_new_editors/wiki_db\=enwiki
[18:40:17] halfak: head /mnt/hdfs/wmf/data/wmf/mediawiki/metrics/snapshot\=2016-12_private/metric\=monthly_surviving_new_editors/wiki_db\=enwiki/000000_0
[18:40:23] halfak: Might actually be easier :)
[18:40:56] Hmm I need more recent data
[18:41:05] Checking on that 2017-05 snapshot.
[18:41:13] Should I expect all days in april to be represented?
[18:41:13] halfak: 2017-05 is the best we have, but no mid-size wikis :(
[18:41:23] That's cool. All I need is enwiki for right now
[18:41:31] go for 2017-05
[18:42:03] halfak: tail /mnt/hdfs/wmf/data/wmf/mediawiki/metrics/snapshot\=2017-05/metric\=monthly_surviving_new_editors/wiki_db\=enwiki/000000_0
[18:42:03] When I get "monthly_new_editors" for a specific day, is it going back 30 days and doing a rolling count?
[18:42:57] halfak: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/mediawiki/history/metrics
[18:44:01] Right. This is sort of hard to read
[18:45:09] halfak: monthly_new_editor: Get event_user_id over a month having one revision created in that month that has less than 24 hours from the user creation
[18:45:32] halfak: Might not be easier to read ... Ok ...
[18:45:38] batcave halfak ?
[18:45:48] So I'm really asking if this is a daily rolling metric or if it is only generated once per month
[18:45:55] Because the dt field seems like it could do either.
[18:45:56] once per month
[18:46:05] no rolling
[18:46:10] no metric we have is rolling
[18:46:11] Gotcha. That's all I need. Thanks :)
[18:46:16] no prob :)
[18:46:28] Awesome. Rolling is dumb. I spent a long time arguing against that :)
[18:46:33] halfak: in like, 6 or 7 days, you'll have a new snapshot (normally)
[18:49:26] * joal gets back to family life :)
[18:49:47] Thanks for your help :)
[19:25:01] mforns: I FINALLY had time to push the small patch for tag UDF, please take a look, i think using a comparator is a more optimal way to define sorting, let me know if you disagree: https://gerrit.wikimedia.org/r/#/c/353287/
[19:36:38] (PS21) Nuria: UDF to tag requests [analytics/refinery/source] - https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021)
[19:37:57] hey joal. do you have a few min? :)
[19:38:55] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3395971 (Nuria) Looks related to: https://gerrit.wikimedia.org/r/#/c/360698/
[19:41:38] oww. i see the comment about family life, joal. enjoy it. :)
[19:57:58] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3395986 (kaldari) Also, unlike the regular EventLogging tables, the tables generated from EventBus have...
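Note on the 18:45:09 definition of monthly_new_editor: one possible reading is "distinct users with at least one revision in the month made less than 24 hours after account creation", counted once per calendar month with no rolling window. A rough sketch of that reading, assuming a made-up record layout of `(user_id, user_created, revision_time)` tuples rather than the actual mediawiki_history schema; the HQL in the linked repository remains the authoritative definition.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_new_editors(revisions):
    """Count distinct new editors per calendar month.

    revisions: iterable of (user_id, user_created, revision_time) tuples
    holding datetime values. This layout is an assumption made for the
    example, not the mediawiki_history schema.
    """
    editors_by_month = defaultdict(set)
    for user_id, user_created, revision_time in revisions:
        age = revision_time - user_created
        # Keep revisions made less than 24 hours after account creation.
        if timedelta(0) <= age < timedelta(hours=24):
            month = revision_time.strftime("%Y-%m")
            editors_by_month[month].add(user_id)
    # One value per calendar month: no rolling 30-day window.
    return {
        month: len(users)
        for month, users in sorted(editors_by_month.items())
    }


rows = [
    (1, datetime(2017, 4, 1, 10), datetime(2017, 4, 1, 12)),  # within 24h
    (1, datetime(2017, 4, 1, 10), datetime(2017, 4, 1, 13)),  # same user again
    (2, datetime(2017, 4, 1, 10), datetime(2017, 4, 3, 10)),  # too late
]
print(monthly_new_editors(rows))
```

Here user 1 is counted once for April despite two qualifying revisions, and user 2 is not counted because the only revision came two days after registration.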
[20:16:58] lzia: can i help you, i am here for a bit
[20:21:09] nuria_: in a meeting. I'll message in an hour or so. but I mainly wanted to see if Joseph will be around for an interview while Andrew is away.
[20:42:31] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3396093 (kaldari) @Ottomata, @Nuria: It looks like both `page-create` and (some seemingly random subset...
[20:46:56] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3396097 (kaldari) @Ottomata, @Nuria: Uh oh, it looks like `page-create` events are also going into the...
[20:51:04] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3396102 (Nuria) I see, need to look at this in more detail to see if issue is with insertion or with ev...
[21:33:30] Analytics-Kanban, Analytics-Wikistats: Manage application state with vuex - https://phabricator.wikimedia.org/T169371#3396211 (Milimetric)
[21:37:34] ok yall, o/ have a great weekend/holiday