[00:39:46] Analytics, Analytics-Data-Quality, Contributors-Analysis, Product-Analytics: Resume refinement of edit events in Data Lake - https://phabricator.wikimedia.org/T202348 (Milimetric) We have to run events through refine anyway to see what's wrong, since nobody remembers, but let's do this: * take t...
[09:59:22] (PS1) GoranSMilovanovic: Engine fix + Dashboard ergonomy improved [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/455123
[10:00:00] (CR) GoranSMilovanovic: [V: 2 C: 2] Engine fix + Dashboard ergonomy improved [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/455123 (owner: GoranSMilovanovic)
[10:59:12] https://archiva-new.wikimedia.org/ \o/
[10:59:18] joal: ---^
[10:59:33] \o/ ! Or maybe /o\ ;)
[10:59:56] ahhaha
[11:00:14] so I logged in as Elukey and got the right ops permissions, ldap works fine
[11:00:25] trying now
[11:00:41] you shouldn't be able to log in in theory
[11:00:52] or maybe yes but getting only a default view
[11:01:33] elukey: I can log, but empty stuff
[11:03:11] super
[11:04:18] now the hard part will be to decide what repo to set etc..
[11:04:28] IIUC you are not in favor of changing the status quo?
[11:05:03] elukey: I don't see a lot of value of splitting mirror
[11:05:24] elukey: I'm happy to change mind, but need to be convinced :)
[11:09:21] I don't have any strong opinion, I think that the main point was clarity, since mirror is kinda of a big dependency black hole
[11:09:58] but it shouldn't be too hard, if we'll ever want, to change the policy
[11:12:37] k
[11:16:02] anyhow, lunch!
[11:16:06] * elukey lunch!
[12:00:47] rolling restart of aqs in progress
[12:24:12] moritzm: aqs should be up to date now
[12:28:44] thanks
[12:35:05] has turnilo also been tested? (used nodejs on thorium)
[12:56:16] we are in the process of moving turnilo to analytics-tools1002
[12:56:20] so it will be soon update
[12:56:28] *updated (and removed from thorium)
[13:12:35] Analytics, Operations, ops-eqiad, Patch-For-Review: rack/setup/install analyticsmaster100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T201939 (elukey) @Cmjohnson hi! A couple of (probably stupid) questions: * are the final node names analytics-master100[12] or analyticsmaster100[12]? I s...
[13:26:08] Analytics, Wikidata, wikidata-tech-focus, User-Jonas: [Trailblaze] Use apache mahout item recommender for property suggestions - https://phabricator.wikimedia.org/T201168 (Addshore)
[13:26:19] Hi A team!
[13:26:26] You don't run mahout by any chance do you? :P
[13:26:56] looking at puppet you might :D
[13:30:51] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade Archiva (meitnerium) to Debian Stretch - https://phabricator.wikimedia.org/T192639 (elukey) >>! In T192639#4527605, @Smalyshev wrote: > Sounds good, though about the authentication, I have a concern. In order to deploy to archiva...
[13:32:56] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade Archiva (meitnerium) to Debian Stretch - https://phabricator.wikimedia.org/T192639 (elukey) Current status is: * archiva-new.wikimedia.org has been created (but it is empty for the moment). Tested that people in the `ops` LDAP g...
[13:33:32] addshore: not that I know!
[13:33:42] addshore: we don't use Mahout - Maybe the search-people do? If for ML algorithm, I suggest you try Spark MLLib
[13:34:00] okay, I saw it in "analytics" while grepping through puppet :)
[13:34:10] addshore: elukey's formulation is way better - I don't think we use mahout :)
[13:34:34] class { '::cdh::mahout': }
[13:34:52] I just spotted it in https://phabricator.wikimedia.org/T201168 so figured I would ask :)
[13:34:56] addshore: possibly installed as part of CDH
[13:35:02] whats is cdh? :D
[13:35:19] addshore: cloudera packaging of hadoop & Co
[13:35:25] gotcha
[13:35:58] addshore: I think MLLib has a recommender too - And spark has a lot more traction than mahout nowadays - Just saying
[13:36:02] Will comment on task
[13:36:05] cool, okay, so it is installed on stat1005 for example
[13:36:11] joal: yes, a comment on the task would be great!
[13:36:19] This is something that we might start looking into next month
[13:36:28] so any info / tips / brain dump would be appreciated
[13:36:56] Gotcha - Will keep that in mind
[13:37:02] thanks for the ping addshore ;)
[13:37:50] Analytics, Wikidata, wikidata-tech-focus, User-Jonas: [Trailblaze] Use apache mahout item recommender for property suggestions - https://phabricator.wikimedia.org/T201168 (Addshore) I grepped through puppet and found a reference to mahout already. Apparently it is installed as part of CDH (cloude...
[13:38:43] its feeling like it is getting close to the time we are going to want to load all of wikidata into hadoop :P
[13:38:56] addshore: I'm on it ;)
[13:39:10] joal: if you have time, can you log in now in archiva-new?
[13:39:16] you should have more powa
[13:39:35] elukey: +Upload !
[13:39:41] \o/
[13:39:56] goood, it seems working then
[13:40:47] disconnecting for ~1/2h, back then
[13:40:51] Analytics, Wikidata, wikidata-tech-focus, User-Jonas: [Trailblaze] Use apache mahout item recommender for property suggestions - https://phabricator.wikimedia.org/T201168 (JAllemandou) Hi @Jonas - A quick comment as per a quick chat with @Addshore on IRC. If you want to implement recommandation b...
[13:42:47] Just got an update and we may not be doing it any time soon, but it is definitely being thought about in the backs of peoples heads
[13:46:57] Analytics-Kanban: Set a timeout for regex parsing in the Eventlogging processors - https://phabricator.wikimedia.org/T200760 (elukey) a:elukey>None
[13:56:24] Analytics, Wikidata, wikidata-tech-focus, User-Jonas: [Trailblaze] Use apache mahout item recommender for property suggestions - https://phabricator.wikimedia.org/T201168 (Jonas) Thanks @JAllemandou for your input! It seems I am very outdated with my knowledge about Hadoop :) I am looking forward...
[14:01:11] Analytics, Wikidata, wikidata-tech-focus, User-Jonas: [Trailblaze] Create recommendation system prototype for property suggestions - https://phabricator.wikimedia.org/T201168 (Jonas)
[14:49:05] Analytics-Tech-community-metrics, Code-Health, Release-Engineering-Team (Kanban): Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. - https://phabricator.wikimedia.org/T190891 (Aklapper)
[15:03:09] Analytics, Analytics-EventLogging, Analytics-Kanban, Research: 20K events by a single user in the span of 20 mins - https://phabricator.wikimedia.org/T202539 (Nuria)
[15:38:21] Analytics, Discovery-Analysis, Product-Analytics, Reading-analysis, Patch-For-Review: Productionize per-country daily & monthly active app user stats - https://phabricator.wikimedia.org/T186828 (JAllemandou) A comment on something that struck me today: for unique-devices computation using las...
[15:50:14] * elukey off! Have a good weekend folks!
[15:52:52] Analytics: Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (Tbayer)
[15:53:50] Analytics: Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (Tbayer)
[16:01:16] Analytics, Page-Issue-Warnings, Readers-Web-Backlog (Tracking): Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (ovasileva)
[16:01:41] Analytics, Discovery-Search (Current work), Patch-For-Review: Use kafka for communication from analytics cluster to elasticsearch - https://phabricator.wikimedia.org/T198490 (debt)
[16:33:06] Analytics-Tech-community-metrics, Code-Health, Release-Engineering-Team (Kanban): Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. - https://phabricator.wikimedia.org/T190891 (Aklapper) I made a bunch of updates today on https://www.mediawiki.org/w/...
[17:41:02] Analytics, Analytics-EventLogging, Analytics-Kanban, Research: 20K events by a single user in the span of 20 mins - https://phabricator.wikimedia.org/T202539 (Nuria) Will be closing ticket as this is lawful traffic if not human.
[17:41:09] Analytics, Analytics-EventLogging, Analytics-Kanban, Research: 20K events by a single user in the span of 20 mins - https://phabricator.wikimedia.org/T202539 (Nuria) Open>Resolved
[18:54:40] Analytics, Discovery-Search (Current work), Patch-For-Review: Deploy mjolnir msearch daemon to the elasticsearch clusters - https://phabricator.wikimedia.org/T200740 (EBernhardson) I've run some data sizing tests for the two processes that use msearch: query normalization and feature collection. The...
[19:07:48] (CR) Nuria: [C: 1] Adds empty dir removal to hive partition dropping jobs (#2) [analytics/refinery] - https://gerrit.wikimedia.org/r/453010 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[19:11:46] (CR) Nuria: [C: 1] "Have we tested this last version of the script running into the 3 dir limit?" [analytics/refinery] - https://gerrit.wikimedia.org/r/453010 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[19:45:33] (CR) Fdans: "@Nuria yes, tested" [analytics/refinery] - https://gerrit.wikimedia.org/r/453010 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[19:52:43] Quarry: Do the big Quarry migration - https://phabricator.wikimedia.org/T202588 (zhuyifei1999) The current NFS implementation of resultsets is flawed. The runners have user 'quarry' with UID of 998, but main has it as 997. Currently it's not too bad, just that puppet keeps changing the ownership of `/data/pr...
[21:03:28] (CR) Framawiki: [C: 2] Implement user prefs and browser notifications [analytics/quarry/web] - https://gerrit.wikimedia.org/r/427952 (https://phabricator.wikimedia.org/T124625) (owner: Framawiki)
[21:03:52] (Merged) jenkins-bot: Implement user prefs and browser notifications [analytics/quarry/web] - https://gerrit.wikimedia.org/r/427952 (https://phabricator.wikimedia.org/T124625) (owner: Framawiki)
[21:11:22] (PS1) Framawiki: Use json.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237
[21:11:38] (CR) jerkins-bot: [V: -1] Use json.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237 (owner: Framawiki)
[21:11:52] (PS2) Framawiki: Use flask.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237
[21:12:07] (CR) jerkins-bot: [V: -1] Use flask.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237 (owner: Framawiki)
[21:14:00] (CR) Nuria: [V: 2 C: 2] "Merging, let's please monitor the change" [analytics/refinery] - https://gerrit.wikimedia.org/r/453010 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[21:14:30] (CR) Nuria: [V: 2 C: 2] "@joal, can you take 1 last look?" [analytics/refinery] - https://gerrit.wikimedia.org/r/453010 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[21:14:46] (PS3) Framawiki: Use flask.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237
[21:15:02] (CR) jerkins-bot: [V: -1] Use flask.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237 (owner: Framawiki)
[21:16:43] (PS4) Framawiki: Use flask.jsonify instead of json.dumps [analytics/quarry/web] - https://gerrit.wikimedia.org/r/455237
[21:24:47] Quarry, Patch-For-Review: Show desktop notification when a query is done - https://phabricator.wikimedia.org/T124625 (Framawiki) Open>Resolved