[00:55:12] Jade, Scoring-platform-team (Current): Rename the "Judgment" namespace to [new label] - https://phabricator.wikimedia.org/T212178 (awight) p:Triage→Normal
[00:56:06] Jade, Scoring-platform-team (Current): Rename the "Judgment" namespace to [new label] - https://phabricator.wikimedia.org/T212178 (awight)
[00:56:16] Jade, Scoring-platform-team: Explore alternative names for Jade data - https://phabricator.wikimedia.org/T200365 (awight)
[00:56:19] Jade, Scoring-platform-team (Current): Rename the "Judgment" namespace to [new label] - https://phabricator.wikimedia.org/T212178 (awight)
[00:59:06] Jade, Scoring-platform-team, Gerrit: Clone gerrit repo mediawiki/extensions/JADE to mediawiki/extensions/Jade - https://phabricator.wikimedia.org/T212180 (awight)
[01:01:32] Jade, Scoring-platform-team, Continuous-Integration-Config: Rename JADE->Jade in continuous integration - https://phabricator.wikimedia.org/T212181 (awight)
[01:02:29] Jade, Scoring-platform-team, MediaWiki-Configuration: Rename JADE->Jade in beta cluster configuration - https://phabricator.wikimedia.org/T212182 (awight)
[01:02:45] Jade, Scoring-platform-team, Gerrit, Patch-For-Review: Rename "JADE" extension to "Jade" - https://phabricator.wikimedia.org/T211046 (awight)
[01:03:40] Jade, Scoring-platform-team, MediaWiki-Configuration: Rename JADE->Jade in beta cluster configuration - https://phabricator.wikimedia.org/T212182 (awight)
[01:03:42] Jade, Scoring-platform-team, Continuous-Integration-Config: Rename JADE->Jade in continuous integration - https://phabricator.wikimedia.org/T212181 (awight)
[01:03:45] Jade, Scoring-platform-team, Gerrit: Clone gerrit repo mediawiki/extensions/JADE to mediawiki/extensions/Jade - https://phabricator.wikimedia.org/T212180 (awight)
[01:11:50] Jade, Scoring-platform-team, Patch-For-Review: Rename "JADE" extension to "Jade" - https://phabricator.wikimedia.org/T211046
(greg)
[01:41:33] Jade, Scoring-platform-team, Patch-For-Review: Rename "JADE" extension to "Jade" - https://phabricator.wikimedia.org/T211046 (awight) >>! In T211046#4814541, @greg wrote: > Can this task be broken up into the sub-parts instead of a bigger multipart task? Eg: one for Gerrit, one for Phabricator, one f...
[02:38:56] (CR) Legoktm: "I don't know if there's a recommended way to do i18n plus mustache, I can't recall seeing anyone else do it." (1 comment) [extensions/JADE] - https://gerrit.wikimedia.org/r/479578 (https://phabricator.wikimedia.org/T211346) (owner: Awight)
[04:15:36] (PS2) Awight: Kludge to localize string constants in template [extensions/JADE] - https://gerrit.wikimedia.org/r/479578 (https://phabricator.wikimedia.org/T211346)
[04:15:44] (CR) Awight: Kludge to localize string constants in template (1 comment) [extensions/JADE] - https://gerrit.wikimedia.org/r/479578 (https://phabricator.wikimedia.org/T211346) (owner: Awight)
[09:23:47] Scoring-platform-team, DBA, MediaWiki-Database, Blocked-on-schema-change, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (Marostegui)
[13:00:32] Hey everyone! I've got a few more questions on ORES metrics. Would now be a good time to ask them?
[13:45:01] thresholdT: hey sure
[13:47:39] hi Amir! Sorry for double-asking (by mail too) but I'll just go ahead on here now. So let's say I call the API with https://ores.wikimedia.org/v3/scores/enwiki/?models=damaging&model_info=statistics . 1) Do I understand it correctly that counts -> labels -> F & T are just the number of edits predicted as false & true on a sample size of n=19428 edits?
[14:28:26] PROBLEM - ORES worker production on ores.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 6539 bytes in 5.049 second response time
[14:29:16] thresholdT: hey, please ping me.
I don't get notified :D I will answer you in email ASAP
[14:29:37] I need to dig into code
[14:33:56] RECOVERY - ORES worker production on ores.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 972 bytes in 1.088 second response time
[14:41:14] i.
[14:41:15] *o/
[14:42:05] thresholdT, you understand the "counts" field right.
[15:26:55] fyi, network maintenance in 30mins. (https://phabricator.wikimedia.org/T210447). ores2001 will not be happy, but the service should be fine
[15:29:36] Thanks akosiaris
[15:59:50] Amir1, I'm considering cutting a new version of revscoring and rebuilding all of the models today. What do you think about trying to get all that deployed this week?
[16:00:31] halfak: it makes sense but you need to rework the features in wikidata
[16:00:35] it's lots of work
[16:00:44] I don't know what you are talking about.
[16:00:51] we can also wait until we get datasource json-serializable
[16:00:54] You brought this up before but I didn't see an explanation for what needs to be done.
[16:01:05] let me find it
[16:01:13] Amir1, we're blocked on deploying models for 4 wikis now.
[16:01:20] We could still deploy with old revscoring, I guess.
[16:02:11] yes, if we deploy with old revscoring it's fine
[16:02:20] but with the new one, you need to update this: https://github.com/wikimedia/editquality/blob/master/editquality/feature_lists/wikibase.py
[16:02:23] and some other things
[16:02:41] Why does it need to be updated?
[16:02:51] All of the old features should still work.
[16:03:49] https://github.com/wikimedia/revscoring/pull/406/files
[16:03:57] if you say so, I'm not sure if there is b/c
[16:04:22] See line 71 in the right side of the diff
[16:04:50] good
[16:05:04] :) Maybe some day we can trim those away, but today is not that day
[16:05:31] Also, I caught our TWN vandal with my new iteration on the damage detection model.
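For the model_info question asked at 13:47, here is a minimal Python sketch of how that request URL is put together and how the "counts" -> "labels" block might be read back out. The host, path, and query parameters come from the URL quoted in the chat. The helper names and the nesting of the JSON response (wiki -> "models" -> model -> "statistics" -> "counts" -> "labels") are assumptions made for illustration and are not confirmed anywhere in this log.

```python
from urllib.parse import urlencode

# Host and path copied from the URL quoted in the chat.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def model_info_url(wiki: str, model: str, info: str = "statistics") -> str:
    """Build a model_info request URL like the one in the 13:47 message."""
    query = urlencode({"models": model, "model_info": info})
    return f"{ORES_BASE}/{wiki}/?{query}"


def label_counts(response: dict, wiki: str, model: str) -> dict:
    """Return the counts -> labels mapping from a decoded JSON response.

    Assumption: the response nests as
    wiki -> "models" -> model -> "statistics" -> "counts" -> "labels".
    """
    counts = response[wiki]["models"][model]["statistics"]["counts"]
    return counts["labels"]


if __name__ == "__main__":
    print(model_info_url("enwiki", "damaging"))
```

The sketch deliberately makes no network request; it only shows where the two query parameters go and which keys the question is about.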
[16:13:14] Turns out unicode ranges and some knowledge of translatewiki can go pretty far :)
[16:14:46] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:42:42] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[17:23:57] (PS1) Paladox: Disable failing test that prevents merges across WMF-deployed repos [extensions/ORES] (REL1_31) - https://gerrit.wikimedia.org/r/480544 (https://phabricator.wikimedia.org/T198201)
[17:24:36] (CR) Paladox: "This should fix the tests for PageTriage which kept failing with:" [extensions/ORES] (REL1_31) - https://gerrit.wikimedia.org/r/480544 (https://phabricator.wikimedia.org/T198201) (owner: Paladox)
[17:32:48] halfak or Amir1 hi, around?
[17:33:03] In meeting but it's almost done
[17:33:06] What's up?
[17:33:36] halfak wondering if you could +2 https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/ORES/+/480544/ please? :)
[17:33:41] (CR) Ladsgroup: [C: +2] Disable failing test that prevents merges across WMF-deployed repos [extensions/ORES] (REL1_31) - https://gerrit.wikimedia.org/r/480544 (https://phabricator.wikimedia.org/T198201) (owner: Paladox)
[17:33:43] ah
[17:33:46] Amir1 got it :)
[17:33:48] thanks!
[17:34:17] nice
[17:34:22] Thanks Amir1
[17:34:28] :)
[17:37:11] (Merged) jenkins-bot: Disable failing test that prevents merges across WMF-deployed repos [extensions/ORES] (REL1_31) - https://gerrit.wikimedia.org/r/480544 (https://phabricator.wikimedia.org/T198201) (owner: Paladox)
[17:39:10] (CR) jenkins-bot: Disable failing test that prevents merges across WMF-deployed repos [extensions/ORES] (REL1_31) - https://gerrit.wikimedia.org/r/480544 (https://phabricator.wikimedia.org/T198201) (owner: Paladox)
[17:49:43] Scoring-platform-team, Release Pipeline, Wikibase-Containers, Wikidata, and 3 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (thcipriani)
[18:09:43] have fun: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/480554
[18:09:48] :D
[18:13:28] * halfak looks for awight
[18:15:44] maybe I'll do lunch instead.
[18:49:31] o/ awight
[18:51:23] hey
[18:55:34] srrodlund, awight: I propose we skip docs today.
[18:56:42] That would work for me--note that we won't have another until after the vacation
[18:59:29] No worries here. BTW, I saw that HareJ already noted he wasn't going to attend.
[19:00:26] srrodlund: ^ ?
[19:02:33] Related, I'm planning to do a rush job on the feature injection documentation so that we can call that "done" :)
[19:04:33] wikimedia/editquality#427 (translatewiki_fixed - e73f914 : halfak): The build was fixed. https://travis-ci.org/wikimedia/editquality/builds/469661533
[19:06:31] Goddamn finally.
[19:06:34] awight, ^
[19:06:40] merge plz <3
[19:06:55] I'm going to rebuild all the models with new revscoring code :)
[19:07:05] Unblocking a bunch of things we worked on this quarter :D
[19:08:34] awight, I see your note, but all checks have passed.
[19:08:39] Maybe you need to refresh?
[19:08:46] gotcha
[19:09:03] merged \o/
[19:10:09] Thank you!
[19:10:33] Is there documentation about calculating pop-rate, btw?
[19:11:18] I think you were saying it's not an easy candidate for automation, because the human has to reason about how data sets are collected?
[19:20:30] awight, good Q. I think we have a few different patterns we'll want to follow.
[19:20:43] I think with a bit of config we can approximate pop-rates.
[19:21:02] I want to keep that out of my brain this week though.
[19:21:13] :) making a task though...
[19:21:18] Instead, I want to focus on outstanding goals that we can finish off.
[19:21:26] +1
[19:21:32] Cool! I'll drop some notes in it before I flush this idea.
[19:21:39] (from my brain)
[19:21:54] kk we have a task for pop rates, T173252
[19:21:54] T173252: Automation and intermediate storage for population rates - https://phabricator.wikimedia.org/T173252
[19:22:21] i.e. I no longer feel an anxiety to put anything else about that on paper for now.
[19:27:09] Scoring-platform-team, revscoring, artificial-intelligence: Automation and intermediate storage for population rates - https://phabricator.wikimedia.org/T173252 (Halfak) This is complicated but it should work OK in most cases. 1. in the simple case where we don't do stratified sampling of "needs_re...
[19:27:22] OK note added.
[19:27:26] * halfak flushes brain toilet
[19:28:26] I just started a complete model rebuild for editquality with revscoring 2.3.0 on stat1007
[19:28:36] Oh I should start the other model repos too.
[19:43:36] OK. I have started articlequality now too.
[19:43:55] We'll get some new sample data for trwiki and frwiki because those datasets come from XML dumps.
[19:46:38] HareJ, do you have a good doc for the Jade MVP?
[19:47:06] I'm thinking about saying we'll get the basic integrations in place next quarter as a goal. That sound crazy?
[19:47:08] awight, ^
[19:53:15] Let's do it.
[19:53:47] Deployment will have to include some level of integration, so it seems safe to goal-ify.
[19:54:12] I think I want to boldly say that we're going to focus on that, strategy, annual planning and *nothing else*.
[19:54:38] Save maybe some planned ORES maintenance/robustness work.
[19:55:28] I like it
[20:00:55] Cool. Tomorrow at staff, we can fill in some details.
[20:01:17] I'll consider that done for now. I'm going to deal with an email wave and then take on the Feature Injection documentation work :)
[20:38:56] Point of interest--I found the capacities of WMF's HDFS cluster:
[20:38:56] Present Capacity: 3180826602530646 (2.83 PB)
[20:38:57] DFS Remaining: 1218502361233272 (1.08 PB)
[20:38:57] DFS Used: 1962324241297374 (1.74 PB)
[20:44:01] halfak: I’m not sure what document you’re asking for. I did write up a summary of my Jade strategy for Daisy and Prateek; would you find that interesting?
[20:55:37] HareJ, was hoping for a definition of MVP and maybe a discussion of the various integrations we're targeting.
[20:55:52] I'm not assuming you have this -- more wondering if you have anything in this direction.
[21:06:38] halfak: closest thing I can think of that exists is https://phabricator.wikimedia.org/T210535 + its subtasks. by the completion of those tasks, we'll have something that will be ready for user testing
[21:08:50] adamwight/revscoring#10 (cache_tool - 400fbc1 : Adam Wight): The build passed. https://travis-ci.org/adamwight/revscoring/builds/469716696
[21:09:20] adamwight/revscoring#11 (master - c0e4735 : halfak): The build passed. https://travis-ci.org/adamwight/revscoring/builds/469716915
[21:09:54] the email I sent to Daisy and Prateek also includes a discussion of the use cases we are targeting
[21:13:39] HareJ, cool. I think we should work with that tomorrow to put all of the things we want into a Q3 basket. :)
[21:14:47] Also, my question from the other day: do you know where Q2 goals were written down + do we have a draft yet for Q3 goals?
[21:16:08] No draft yet for Q3.
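The HDFS figures posted at 20:38 can be sanity-checked with a few lines of Python. This is a rough sketch: the byte counts and the printed labels are copied from the messages above, while the reading of "PB" as binary petabytes (2**50 bytes) and the helper names are assumptions made here.

```python
# Byte counts and printed labels copied from the 20:38 messages.
REPORTED = {
    "Present Capacity": (3180826602530646, "2.83 PB"),
    "DFS Remaining": (1218502361233272, "1.08 PB"),
    "DFS Used": (1962324241297374, "1.74 PB"),
}

# Assumption: the report's "PB" means binary petabytes (2**50 bytes).
BYTES_PER_PB = 2 ** 50


def format_pb(n_bytes: int) -> str:
    """Render a raw byte count the way the report labels it."""
    return f"{n_bytes / BYTES_PER_PB:.2f} PB"


def used_fraction(used: int, capacity: int) -> float:
    """Share of the present capacity that is already used."""
    return used / capacity


if __name__ == "__main__":
    for name, (n_bytes, label) in REPORTED.items():
        print(f"{name}: computed {format_pb(n_bytes)}, reported {label}")
    capacity = REPORTED["Present Capacity"][0]
    used = REPORTED["DFS Used"][0]
    remaining = REPORTED["DFS Remaining"][0]
    # Used plus remaining should account for the whole present capacity.
    print("used + remaining == capacity:", used + remaining == capacity)
    print(f"used fraction: {used_fraction(used, capacity):.1%}")
```

Dividing by 10**15 instead gives values that do not match the printed labels, which is why the binary unit is assumed here.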
[21:16:23] I can link Q2, https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q2#Program_5._Scoring_Platform_(ORES)
[21:16:30] https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC5:_Scoring_Platform/Goals
[21:17:00] Interesting that we have a program that's just the name of our team.
[21:17:12] * awight looks at Q2 2017 for some reason
[21:17:39] As we begin thinking about medium-term planning, we might want to think about how to make our programmatic work explicitly... programmy.
[21:17:41] it was hard to tell because Jade goals haven't changed :D
[21:20:07] HareJ, it's pretty common.
[21:20:18] But I agree that CDP's are where it's at.
[21:20:38] it == extra resources and extra reporting load
[21:20:42] Heck, even single-department programs that are structured around a programmatic objective
[21:27:04] Not sure what you see in a name specifically, but I can imagine a branding reason.
[21:38:47] HareJ: What are some ideas for more programmy programs? Like, "newcomer retention"?
[21:40:14] Or "curator productivity" or something like that. Of course, more important than the name is that the program is oriented around some kind of goal/outcome.
[21:40:48] Scaling the wikis. Interrogating systemic inequalities.
[21:40:56] Reticulating splines.
[21:43:38] halfak: Have you read this yes? https://www.mediawiki.org/wiki/Wikimedia_Audiences/2018_Product_points_of_view/Augmentation
[21:43:42] *yet
[21:44:03] Sort of. I read a couple of iterations of it.
[21:44:06] kk
[21:44:28] It's always a good sign when other teams write our grant proposals ;-)
[21:44:36] Right :)
[21:45:05] I think I just witnessed a CPU ghost.
[21:45:21] Magic smoke release?
[21:45:27] Run script. Crashes. Implement try-catch with logging to see what crashed. Script now doesn't crash.
[21:45:38] No logging.
[21:45:40] WTF
[21:46:09] Maybe logging is sent to /dev/null? Can you log at the same level outside the try-catch...
[21:46:19] yeah.
Hmmm
[22:29:26] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) I'd love some advice on how to proceed with this. The new stream's structure will be slightly different than mediawiki.revision-score, the biggest change is that we want...
[22:42:09] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) Refinery's json_refine_job seems to fulfill the function of Connect, e.g. https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/387838/7/modules/role/manifests/analytics_...
[22:49:52] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Nuria) @awight kafka topics (sometimes called streams) are set by schema and each schema is propagated to a different table in hive. Hopefully this makes sense. This means that ev...
[22:51:58] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Ottomata) @awight, I think it would be better to transform the data in event.mediawiki_revision_score to your format in Hadoop. You can do this via Hive or Spark, possibly even r...
[22:54:50] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Ottomata) @nuria, we talked about making a new stream for each model a lot in {T197000} but ultimately decided against it. @awight should split the existing data in hive into a n...
[23:01:03] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) >>! In T209732#4832862, @Ottomata wrote: > @awight should split the existing data in hive into a new Hive tables by model. Thanks for the suggestions! Would this be a Re...
[23:02:15] https://www.mediawiki.org/wiki/ORES/Feature_injection
[23:03:03] Got something together. It's OK. I want to add more case studies.
[23:03:19] I've got to run and do some evening chores. Have a good one, folks. :)
[23:04:12] Be ready to think about goals tomorrow. I want to hear about what you really aspire to. Then we can crush your dreams so they fit nicely in a quarterly cycle ;)
[23:08:46] halfak: Right on, I'll review the docs
[23:10:19] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Ottomata) I think it depends on what you want to do. Hm, actually, this is not a 'Refine' job. (We use the term 'refine' to mean a 1 to 1 dataset job. Take one dataset in, enhan...
[23:20:46] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) >>! In T209732#4832890, @Ottomata wrote: > The tricky part with ORES models is that they don't share a common schema. Good point, I guess the key-value array for probabil...
[23:25:16] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:26:08] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Ottomata) Ah ok, cool! Good to know @JAllemandou is involved there. I think he can help you figure out how this job/query would look. That partitioning scheme makes sense. If...
[23:30:42] ORES, Scoring-platform-team, Analytics: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (Nuria) I see, rather than a new stream of data this is a pivot of data we already have so spark job makes sense.
[23:32:49] ORES, Scoring-platform-team, Analytics, Dumps-Generation, and 3 others: [Epic] Make ORES scores available in Hadoop and as a dump - https://phabricator.wikimedia.org/T209611 (awight) I'm reducing the scope of this task to just one pilot integration, for wikidata.
[23:33:17] ORES, Scoring-platform-team, Analytics, Dumps-Generation, and 3 others: [Epic] Make ORES scores for wikidata available as a dump - https://phabricator.wikimedia.org/T209611 (awight)
[23:35:28] ORES, Scoring-platform-team: Precache should include bot edits to wikidata - https://phabricator.wikimedia.org/T212264 (awight) p:Triage→Normal