[00:39:14] wikimedia/revscoring#1575 (extractor-deleted - b44c7c6 : Marius Hoch): The build passed. https://travis-ci.org/wikimedia/revscoring/builds/483174874
[10:14:40] o/ (I forgot, I have been around for a couple of hours already)
[10:21:26] wikimedia/editquality#452 (eswikiquote2 - 64c5dd7 : Amir Sarabadani): The build passed. https://travis-ci.org/wikimedia/editquality/builds/483330789
[14:56:53] halfak: Hey… after another change to revscoring, I got the draftquality thing (well, it might still need some tweaking, but I'm currently waiting for it to tune)
[14:57:21] do you have anything in mind which I could pick up next?
[15:07:39] hoo, yeah. If this works, let's try words-to-watch on articlequality and editquality :)
[15:10:34] SPAM IS COMING
[15:10:36] halfak: it seems to have some (although very little) positive effect (still waiting for the tune results/new parameters)
[15:11:27] Gotcha. That's to be expected. I'm more interested in making sure ORES can catch an obvious "words to watch" than making the overall fitness that much better.
[15:11:33] haha, the job stuck
[15:11:35] We'll see how it plays out on recall.
[15:11:38] :P Amir1
[15:11:58] ORES, Scoring-platform-team (Current): drafttopic score JSONSchema should specify array of strings for prediction type - https://phabricator.wikimedia.org/T205343 (Ladsgroup) Open→Resolved
[15:12:00] Scoring-platform-team (Current), revscoring, artificial-intelligence: FeatureScalar appears in features list rather than something meaningful - https://phabricator.wikimedia.org/T209869 (Ladsgroup) Open→Resolved
[15:12:02] Scoring-platform-team, Growth-Team (Current Sprint): Enable ORES filters on RC for Italian Wikipedia - https://phabricator.wikimedia.org/T211032 (Ladsgroup)
[15:12:04] ORES, Scoring-platform-team (Current): ORES Python client drowning in TIME_WAIT sockets - https://phabricator.wikimedia.org/T213582 (Ladsgroup) Open→Resolved
[15:12:06] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Train/test damaging & goodfaith models for Italian Wikipedia (itwiki) - https://phabricator.wikimedia.org/T208779 (Ladsgroup) Open→Resolved
[15:12:08] Scoring-platform-team (Current), articlequality-modeling, draftquality-modeling, drafttopic-modeling, and 3 others: Rebuild models for new revscoring (2.3.0) - https://phabricator.wikimedia.org/T212530 (Ladsgroup) Open→Resolved
[15:12:10] ORES, Scoring-platform-team (Current): Add required arg to score_revisions for user-agent - https://phabricator.wikimedia.org/T206005 (Ladsgroup) Open→Resolved
[15:12:12] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, artificial-intelligence: Build article quality model for Galician Wikipedia - https://phabricator.wikimedia.org/T201146 (Ladsgroup) Open→Resolved
[15:12:14] Scoring-platform-team (Current), revscoring, User-Ladsgroup, artificial-intelligence: Rewrite scoring libraries to replace pywikibase with mwbase - https://phabricator.wikimedia.org/T194758 (Ladsgroup) Open→Resolved
[15:12:29] Scoring-platform-team (Current), Bad-Words-Detection-System, revscoring, Patch-For-Review, and 2 others: Add language support for galician - https://phabricator.wikimedia.org/T201142 (Ladsgroup) Open→Resolved
[15:12:31] Scoring-platform-team (Current), editquality-modeling, User-Ladsgroup, artificial-intelligence: Deploy edit quality models for dewiki - https://phabricator.wikimedia.org/T130257 (Ladsgroup) Open→Resolved
[15:12:33] Scoring-platform-team (Current), editquality-modeling, Epic, artificial-intelligence: [Epic] Edit quality models (damaging/goodfaith) - https://phabricator.wikimedia.org/T130213 (Ladsgroup)
[15:12:35] ORES, Scoring-platform-team (Current), Patch-For-Review: ORES workers using dramatically higher CPU, increasing linearly with time - https://phabricator.wikimedia.org/T206654 (Ladsgroup) Open→Resolved
[15:12:53] ORES, Scoring-platform-team, User-Ladsgroup: Run a test failover in labs before migrating prod to sentinel - https://phabricator.wikimedia.org/T210605 (Ladsgroup) a:Ladsgroup→None
[15:13:20] ORES, Scoring-platform-team, User-Ladsgroup: Implement sentinel for ORES production Redis - https://phabricator.wikimedia.org/T122676 (Ladsgroup)
[15:25:39] Amir1, I just saw that awight emailed Jon Robson about UI development. Could you sync with those two. I figure we'll need your help with getting the Jade UI together this quarter.
[15:25:49] Missed a "?" in there.
[15:26:06] ORES, Scoring-platform-team (Current), Performance, User-Ladsgroup: Make celery queues transient - https://phabricator.wikimedia.org/T210584 (Ladsgroup) I hereby suggest closing this as declined.
[15:28:03] halfak: I don't know how much I can help. My frontend knowledge of mediawiki is pretty limited. We can build templates and write CSS for it but that's it. In mediawiki, we can't use fancy things like VueJS or react. We can use OOUI but it'll be hard to some degree
[15:31:22] In general I would like a designer to handle the design first. Why are we expected to deliver something when we don't have anyone with the required skills?
[15:32:14] It's like delivering software in a team that doesn't have a software engineer
[15:42:20] Amir1, I'm working on that. I want you and Adam to be ready to engineer the front-end.
[16:08:23] Technical Advice IRC meeting starting in 0 minutes in channel #wikimedia-tech, hosts: @Thiemo_WMDE & @nuria - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[16:55:32] I can't be at the staff meeting.
[16:55:52] Already updated the current work
[18:42:09] relocating
[20:19:07] Scoring-platform-team, artificial-intelligence: Implement NSFW image classifier using Open NSFW - https://phabricator.wikimedia.org/T214201 (Legoktm)
[20:19:46] Scoring-platform-team, artificial-intelligence: Implement NSFW image classifier using Open NSFW - https://phabricator.wikimedia.org/T214201 (Legoktm) >>! In T214201#4903646, @MusikAnimal wrote: > Feel free to make this task public :) Thanks, and done.
[20:20:03] O_O
[20:20:10] fuck that :p
[20:22:15] srrodlund is mentioning that https://www.mediawiki.org/wiki/Project:Calendar/How_to_schedule_an_event#Tech_talks could be a good venue for us
[20:22:23] halfak: harej: e.g. for a Jade talk ^
[20:31:07] It would be really awesome if we could have a commitment for a tech talk (maybe for March) :-)
[20:57:53] halfak: "2019-01-23 20:45:24,511 DEBUG:revscoring.utilities.tune -- Running cross-validation for LogisticRegression with timeout of 5400.0 seconds" (on draftquality)
[20:58:11] Do you think it makes sense to raise the timeout further (already bumped it from 60m to 90m)
[20:58:19] or just ignore these?
[20:58:25] Oh that's just saying it is starting the job
[20:58:32] It's not saying that the job timed out.
[20:58:45] Sorry, copied the wrong line
[20:58:46] So safe to ignore.
[20:58:48] Oh!
[20:59:05] "2019-01-23 20:45:18,321 WARNING:revscoring.utilities.tune -- Could not cross-validate estimator GradientBoosting" … "RuntimeError: Execution timed out at 5400 seconds."
[20:59:17] Jade, Scoring-platform-team (Current), DBA, Operations, and 3 others: Introduce a new namespace for collaborative judgements about wiki entities - https://phabricator.wikimedia.org/T200297 (Milimetric) This wikitext-in-JSON thing seems really complicated. I read through both comments above and w...
[21:01:01] halfak: ^
[21:01:26] hoo, I might not worry about it unless they are all timing out.
[21:01:39] Some combinations of parameters will just take an insane amount of time to train.
[21:02:02] Not all of them (not even close)
[21:02:08] but still a handful
[21:02:12] also I have
[21:02:13] np.exp(prob, prob)
[21:02:30] /srv/home/hoo/venv/3.5/lib/python3.5/site-packages/sklearn/linear_model/base.py:340: RuntimeWarning: overflow encountered in exp np.exp(prob, prob)
[21:02:45] I'm not sure what combination, but I'm guessing that if you turn "learning_rate" way down, then it'll take much longer to train.
[21:02:59] Oh yeah. That seems to be an internal error. I think we're safe to ignore that.
[21:03:18] Especially because the linear models are just there to give us a sense for how much better we can do with more advanced models.
[21:03:30] "internal" == in scikit-learn.
[21:03:42] Good questions. :)
[21:05:09] Good to know… let's see how much longer the tuning takes, it's been running for ~5h now on stat1007
[21:05:23] (although I had to reduce the parallel threads because it ran out of RAM)
[21:18:24] ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Change default serializer of celery from pickle to json - https://phabricator.wikimedia.org/T206333 (Halfak) Merged the revscoring change, but I moved this back to "Active" because I guess we'll need to test in ORES.
[21:18:55] hoo, oh yeah the RAM is an issue. Sorry I forgot about that.
[21:19:02] Good on you for working it out :)
[21:19:16] I think that it might run overnight.
[21:19:29] meh… good thing I have it in a screen
[21:19:33] I can't remember how long it took me to tune it last time but it was a long time.
[21:19:34] :)
[21:47:59] File "/usr/lib/python3.5/multiprocessing/connection.py", line 251, in recv
[21:47:59] return ForkingPickler.loads(buf.getbuffer())
[21:47:59] TypeError: __init__() missing 2 required positional arguments: 'info' and 'content'
[21:48:48] Weird. I don't know what that is. Context?
[21:49:09] revscoring extract articlequality.feature_lists.enwiki.wp10 …
[21:51:07] I sometimes run into a problem with the code using the "articlequality" module that is sitting in the CWD.
[21:51:25] Try doing a "python setup.py install" of articlequality and then trying the extract again.
[21:52:16] didn't work :S
[21:54:01] Can you give me the full trace in a paste?
[21:55:45] https://phabricator.wikimedia.org/P8030
[21:55:50] halfak: ^
[21:56:44] hoo, what happens when you open a python repl and do "from articlequality.feature_lists.enwiki import wp10"?
[21:57:14] Getting the deprecation warning only
[22:03:59] interesting… it works locally
[22:22:45] halfak: https://github.com/wikimedia/revscoring/pull/423 not it, but also occurring to me
[22:22:53] but I can reproduce it locally now
[22:23:01] will dig deeper in a bit (afk for now)
[22:23:40] hoo, interesting. That must be a new issue because I have not seen it before. Does "and" work like that?
[22:24:20] Aha! & and "and" have different meanings.
[22:24:28] "and" is union and & is intersection.
[22:25:01] Oh wait. "and" isn't union. It's something else.
[22:25:03] Weird.
[22:25:29] It looks like it just returns whatever is on the right side of the "and" assuming that the left side evaluates to true.
[22:28:03] wikimedia/revscoring#1580 (fix-unsupported-op - a4a5369 : Marius Hoch): The build passed. https://travis-ci.org/wikimedia/revscoring/builds/483636418
[22:28:18] Heh. looks like we don't test that behavior
[22:28:24] Damn APIs are hard to mock up.
[23:02:02] halfak: Damn, it was my commit b44c7c6ceedad1e7e62753d642360b61ac530f5b
[23:02:24] Woops. Which repo?
[23:02:28] revscoring
[23:02:43] Seems the MW API is not serving anything if you also want deletedrevisions but are not allowed to
[23:02:54] (and articlequality didn't --login)
[23:03:27] e.g. look at https://en.wikipedia.org//w/api.php?action=query&revids=718263554&format=json&prop=revisions|deletedrevisions&rvprop=userid|timestamp|user|ids|size|contentmodel|content|comment logged out
[23:04:21] "Permission denied"
[23:05:32] So either we find out whether the user is allowed (how?) or we do two queries
[23:05:45] Which is what fetch_text apparently does
[23:05:58] (_get_live_text / _get_deleted_text)
[23:07:30] halfak: ^
[23:07:47] I'll call it a day in a bit
[23:09:21] Hmm.
[23:09:23] * halfak thinking
[23:11:25] I do like the two query pattern.
[23:11:34] Even though one query is way more efficient.
[23:11:51] We could ask the user for a "--get-deleted-revs" flag or something like that.
[23:11:57] Rather than just "--login"
[23:12:11] Because then erroring on "you don't have deletedrevs perms" would make more sense.
[23:12:33] I'd support either change.
[23:12:46] It's a painful work-around either way. :|
[23:12:51] Ok, I'll have a look at what's nicer tomorrow
[23:12:53] Indeed :/
[23:12:54] kk
[23:12:58] Shall we revert the change for now?
[23:13:14] yeah. I suppose so.
[23:14:46] https://github.com/wikimedia/revscoring/pull/424
[23:15:16] Merged.
[23:15:47] thanks :)
[23:16:19] See you tomorrow :)
[23:16:30] ORES, Scoring-platform-team (Current), Analytics: Backfill ORES Hadoop scores with historical data - https://phabricator.wikimedia.org/T209737 (awight) Important change of plans—We're discussing backfilling, and it might be best to allow mismatched model versions in the dumps for now. In other words...
[23:16:42] wikimedia/revscoring#1582 (revert-api-deletedrev - 83ada5f : Marius Hoch): The build failed. https://travis-ci.org/wikimedia/revscoring/builds/483655407
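
Note on the "and" vs & exchange (22:24 to 22:25): a minimal sketch of the difference, using plain Python sets as a stand-in. This is an assumption for illustration only; the log does not show which objects were involved in revscoring pull/423, and the names left and right are hypothetical.

```python
# Illustration only: plain Python sets, not revscoring objects.
left = {"a", "b"}
right = {"b", "c"}

# `and` is short-circuit boolean logic. If the left operand is truthy,
# the right operand is returned unchanged (no set arithmetic happens).
via_and = left and right

# `&` calls __and__, which for sets computes the intersection.
via_amp = left & right

# If the left operand is falsy (an empty set), `and` returns it as-is
# and never looks at the right operand.
empty_and = set() and right

print(via_and is right)  # the very same object as `right`
print(via_amp)           # only the shared element
print(empty_and)         # the empty left operand
```

So code that combines two collections with "and" silently keeps only the right-hand one whenever the left is non-empty, which matches the behavior described at 22:25:29.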