[09:49:42] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2666439 (Pginer-WMF) @jmatazzoni I hav...
[13:53:32] o/ Amir1
[13:53:40] Seems like there was a lot going on this weekend
[13:53:52] hey halfak
[13:54:30] yeah, got it worked out. Don't worry. On the bright side the labs issue is probably fixed now too (not so sure but let's wait and see)
[13:55:08] Amir1, the one that keeps making ores-web-03 go down?
[13:58:07] yup
[14:01:41] Cool. You think it was logging?
[14:03:11] halfak: yeah
[14:03:32] Interesting. I wonder why that suddenly started being a problem
[14:04:09] I'm guessing it ran out of space somehow
[14:05:39] Like, maybe it keeps a buffer of logs in memory?
[14:08:52] I guess it could be. That might be its own problem. Also, it seems that it is only the precache utility that suffers.
[14:09:15] Right now, the precached utility is using 2.0 GB on ores-web-03
[14:09:21] Giving it substantial memory pressure.
[14:14:52] yeah, I think we can get that fixed easily
[14:15:24] Yeah. I'm considering how to figure out what the issue is right now.
[14:15:31] 'cause there hasn't been a code change
[14:15:34] Hmmm
[15:23:25] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2667479 (jmatazzoni) Getting to Pau's...
[15:43:49] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2667536 (jmatazzoni) > 3. Why do we th...
[15:47:54] halfak: I'll join five minutes late :( in quarterly review meeting of WMDE
[15:49:08] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2667550 (jmatazzoni) > 2. Why do we av...
[15:54:20] Amir1, sure. Let's plan to start a whole 15 minutes late
[15:54:43] It just finished
[15:55:08] halfak: I'll be there in time :)
[15:58:55] Amir1, DarTar is going to be 15 mins late anyway
[15:59:24] * halfak is just finishing up a big cleanup of editquality, so he doesn't mind.
[15:59:41] https://meta.wikimedia.org/wiki/Research_talk:Automated_classification_of_edit_quality/Work_log/2016-09-26
[16:02:13] halfak, I am working on T146284
[16:02:13] T146284: Generate a monthly pageviews dataset - https://phabricator.wikimedia.org/T146284
[16:02:36] getting the page titles from the revids of the wikiclass datasets
[16:02:52] so I can use the mwviews library
[16:02:58] GhassanMas, nice. What strategy are you thinking of using for aggregating views?
[16:03:08] Why do you need the wikiclass datasets?
[16:03:34] cool, I'll grab some coffee in the mean time
[16:03:45] * Amir1 drinks coffee too much but never enough
[16:04:07] But for which pages do you want the monthly views dataset?
[16:04:27] Revision-Scoring-As-A-Service, rsaas-editquality: Update editquality for revscoring 1.3.0 - https://phabricator.wikimedia.org/T146658#2667624 (Halfak)
[16:04:35] Revision-Scoring-As-A-Service, rsaas-editquality: Update editquality for revscoring 1.3.0 - https://phabricator.wikimedia.org/T146658#2667650 (Halfak) a:Halfak
[16:07:13] Revision-Scoring-As-A-Service, rsaas-articlequality: Generate monthly article quality dataset - https://phabricator.wikimedia.org/T145655#2637137 (Halfak) English is done. See https://datasets.wikimedia.org/public-datasets/enwiki/article_quality/enwiki-20160801.wp10.monthly.tsv.bz2
[16:12:55] Revision-Scoring-As-A-Service, rsaas-articlequality: Generate monthly article quality dataset - https://phabricator.wikimedia.org/T145655#2667680 (Halfak) I just started up the frwiki extractor
[16:56:19] halfak, so for the whole 5M+ articles as I can see
[16:56:25] but what about the month
[16:56:28] for which one ?
[16:56:36] GhassanMas, all the months we can
[16:56:53] alright
[16:57:12] but we represent the months as columns ?
[16:57:43] each month has its corresponding views
[16:58:05] I think we'll have 5M rows for each month
[16:58:12] And month will be a value
[16:59:26] what about having the views as a value and we would have columns as a number of months
[17:00:21] I remember once you said the views thing started only in 2007 is that right?
[17:00:30] GhassanMas, it's easier, I think, when the month is a value rather than a column name
[17:00:34] Yes
[17:01:31] by that do you mean that we would have a data set for each month ?
[17:02:00] or a data set of 5M rows * number of months
[17:05:25] Eventually a dataset of 5M rows * number of months
[17:05:43] But we might create that one month at a time and then concat them together
[17:10:11] halfak, So I will start with Aug-2016 to see how things would go
[17:10:42] Sounds good :)
[17:10:52] I got a list of pages: wp10-scores-enwiki-20160820.tsv
[17:11:03] wiki-ai/revscoring#827 (tune_json - 0f77027 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/162842667
[17:44:30] Revision-Scoring-As-A-Service-Backlog, ORES: Quiet re. result.get in tasks - https://phabricator.wikimedia.org/T146680#2668134 (Halfak)
[17:45:17] Revision-Scoring-As-A-Service-Backlog, ORES: Quiet TimeoutError in celery logging - https://phabricator.wikimedia.org/T146681#2668149 (Halfak)
[17:45:36] Revision-Scoring-As-A-Service-Backlog, ORES: Quiet result.get Warning in tasks - https://phabricator.wikimedia.org/T146680#2668168 (Halfak)
[19:10:18] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES, User-Ladsgroup: Embed machine readable ores scores as data on pages where ORES scores things - https://phabricator.wikimedia.org/T143611#2668408 (Catrope) >>! In T143611#2665760, @Ladsgroup wrote: > Okay, I checked and it's super easy to add i...
[19:17:33] Revision-Scoring-As-A-Service-Backlog, MediaWiki-extensions-ORES: Introduce ORES rvprop - https://phabricator.wikimedia.org/T143614#2668433 (Anomie) >>! In T143614#2665385, @Ladsgroup wrote: >>>! In T143614#2664820, @Anomie wrote: >> Although I note FetchScoreJob only does one revision at a time. > We...
[19:49:23] (CR) Catrope: "The ones I deleted either didn't do anything, or weren't needed once I rearranged the query. I'll go through them inline and explain." [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[19:55:42] (CR) Catrope: Refactor and simplify changeslist/contribs queries a bit (7 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[20:40:14] o/
[21:21:08] ITS WORKING
[21:21:34] Ctrl-F for "halfak: It's working" in the IRC logs and you'll find my good days
[22:18:15] ores-compute-01 is starting on a big compute job to regenerate all of the models using the new JSON-based patterns.
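The dataset layout agreed on earlier in the log (month stored as a value rather than as a column name, built one month at a time and then concatenated) can be sketched roughly as follows. This is only an illustration under stated assumptions: the page titles, view counts, column names, and helper functions are all hypothetical placeholders, and real counts would come from a pageview source such as the mwviews library mentioned above.

```python
import csv
import io

# Hypothetical, made-up placeholder counts keyed by month and page title.
# These are NOT real pageview numbers; they only illustrate the layout.
MONTHLY_VIEWS = {
    "2016-07": {"Example page A": 1200, "Example page B": 800},
    "2016-08": {"Example page A": 1500, "Example page B": 950},
}


def month_rows(month, views_by_title):
    """Yield one (page_title, month, views) row per page for a single month."""
    for title, views in sorted(views_by_title.items()):
        yield (title, month, views)


def build_long_dataset(monthly_views):
    """Build each month separately, then concatenate into one long table."""
    rows = []
    for month in sorted(monthly_views):
        rows.extend(month_rows(month, monthly_views[month]))
    return rows


def to_tsv(rows):
    """Serialize the rows as TSV with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["page_title", "month", "views"])
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_tsv(build_long_dataset(MONTHLY_VIEWS)), end="")
```

With this shape, adding another month only appends rows, whereas a months-as-columns layout would change the schema every time a month is added.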
[22:33:08] Revision-Scoring-As-A-Service-Backlog, Data-release, Research-and-Data, rsaas-articlequality: Ask Figshare to remove file upload limit for Article Quality Score dataset - https://phabricator.wikimedia.org/T146708#2669095 (DarTar)
[22:35:24] Revision-Scoring-As-A-Service-Backlog, Data-release, Research-and-Data, rsaas-articlequality: Mini blogpost for Article Quality Score dataset - https://phabricator.wikimedia.org/T146709#2669118 (DarTar)
[22:43:46] Revision-Scoring-As-A-Service-Backlog, Data-release, Research-and-Data, rsaas-articlequality: Mini blogpost for Article Quality Score dataset - https://phabricator.wikimedia.org/T146709#2669138 (Ladsgroup) @MelodyKramer talked to me a little about it. I think we should keep in touch.
[22:47:36] \o/
[22:50:32] :(
[22:50:42] (Persian keyboard, sorry)
[22:50:43] :)
[22:54:43] lol
[22:54:54] Persian keyboards make Amir sad ;)
[23:10:24] Revision-Scoring-As-A-Service-Backlog, rsaas-articlequality: [Explore] Spam and Vandalism new page creation - https://phabricator.wikimedia.org/T135644#2669236 (Halfak) I replicated some of the work here: https://meta.wikimedia.org/wiki/Research_talk:Automated_classification_of_new_article_quality/Work_...
[23:10:42] Revision-Scoring-As-A-Service-Backlog, rsaas-articlequality: Generate spam and vandalism new page creation dataset - https://phabricator.wikimedia.org/T135644#2669237 (Halfak)
[23:11:08] Revision-Scoring-As-A-Service, rsaas-articlequality: Generate spam and vandalism new page creation dataset - https://phabricator.wikimedia.org/T135644#2305525 (Halfak)
[23:24:33] Revision-Scoring-As-A-Service, rsaas-articlequality: [Discuss] Hosting the monthly article quality dataset on labsDB - https://phabricator.wikimedia.org/T146718#2669300 (Halfak)
[23:25:04] Revision-Scoring-As-A-Service, rsaas-articlequality: [Discuss] Hosting the monthly article quality dataset on labsDB - https://phabricator.wikimedia.org/T146718#2669332 (Halfak) You can download the compressed dataset from https://datasets.wikimedia.org/public-datasets/enwiki/article_quality/enwiki-2016...
[23:25:23] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2669333 (jmatazzoni) In today's Collab...
[23:29:41] Revision-Scoring-As-A-Service, Operations: halfak should get emails when ores.wikimedia.org goes down - https://phabricator.wikimedia.org/T146720#2669336 (Halfak)
[23:30:33] Revision-Scoring-As-A-Service, Icinga, Operations: halfak should get emails when ores.wikimedia.org goes down - https://phabricator.wikimedia.org/T146720#2669349 (Halfak)
[23:35:37] Revision-Scoring-As-A-Service-Backlog, MediaWiki-extensions-Wikilabels: [Discuss] Implement Wikilabels backend in MediaWiki? - https://phabricator.wikimedia.org/T146406#2669375 (Halfak)
[23:36:22] Revision-Scoring-As-A-Service-Backlog, MediaWiki-extensions-Wikilabels: [Discuss] Implement Wikilabels backend in MediaWiki? - https://phabricator.wikimedia.org/T146406#2669376 (Halfak)
[23:42:18] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2669384 (jmatazzoni) **SUGGESTION #2**...
[23:42:40] Revision-Scoring-As-A-Service: Create a tools project for hosting ORES datasets (in a labsDB database) - https://phabricator.wikimedia.org/T146722#2669386 (Halfak)
[23:42:53] Revision-Scoring-As-A-Service, ORES: Create a tools project for hosting ORES datasets (in a labsDB database) - https://phabricator.wikimedia.org/T146722#2669399 (Halfak)
[23:43:10] Revision-Scoring-As-A-Service, ORES: Create a tools project for hosting ORES datasets (in a labsDB database) - https://phabricator.wikimedia.org/T146722#2669386 (Halfak) I suggest we name it "revscoring" or "ores" -- something like that.