[07:41:34] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores-extension grafana alert on icinga1001 is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores-extension is alerting: Service hits for obtaining thresholds alert. [07:43:50] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores-extension grafana alert on icinga1001 is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores-extension is not alerting. [07:57:47] 10Scoring-platform-team, 10DBA, 10MediaWiki-Database, 10Blocked-on-schema-change, 10User-Ladsgroup: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (10Marostegui) a:03Marostegui I have taken a look at the status of the `tmp_1 or tmp_2 or tmp_3` status. s1 : Has `KEY... [08:10:58] 10Scoring-platform-team, 10DBA, 10MediaWiki-Database, 10Blocked-on-schema-change, 10User-Ladsgroup: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (10Marostegui) [08:18:24] 10Scoring-platform-team, 10DBA, 10MediaWiki-Database, 10Blocked-on-schema-change, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (10Marostegui) I have deployed this index on db1096:3316 (s6) and I will leave it like that for some time to see if there is any... [08:19:15] 10Scoring-platform-team, 10DBA, 10MediaWiki-Database, 10Blocked-on-schema-change, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (10Marostegui) s6 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore2001 [] dbstore1002 [] dbstore1001 [] db209... 
[08:19:41] 10Scoring-platform-team, 10DBA, 10MediaWiki-Database, 10Blocked-on-schema-change, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (10Marostegui) [10:14:42] 10Scoring-platform-team, 10Wikibase-Containers, 10Wikidata, 10Wikilabels, 10Release Pipeline (Blubber): Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (10hashar) I am not that familiar with #blubber , I would guess that when one does: ` python: version: pyt... [12:26:05] Amir1: btw, I have a simple puppet patch for you. I expect it to literally be a noop uwsgi wise, but still https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/426017/ [12:26:59] akosiaris: sure thing. Does this need fixing on ores too? [12:27:23] oh btw. I'm almost done with blubberizing wikilabels :D [12:27:45] nice [12:28:28] no, ORES does not pass anymore either --die-on-term (has been moved to become a default in puppet) nor --autoload [12:29:08] ok. I'll merge, lemme know if I break anything and I'll revert [12:29:30] Sure [12:29:32] Thanks [12:30:51] akosiaris: do you think we can move wikilabels to prod if the blubber stuff is done? [12:31:01] just a wild idea TBH [12:31:55] Amir1: that's more like up to you, isn't it? If your team feels it's ready to be in production and it's worthwhile to be in production, sure [12:32:14] as long as it's a decent service and maintained, I don't see why SRE would object [12:33:37] I'm not sure. The reason I want this to be in production is *only* to have one service shipped to kubernetes before starting ores [12:34:11] that does not sound like a really good reason [12:34:17] :D [12:35:00] I know :P [13:21:08] 10ORES, 10Scoring-platform-team: Support CIDR range whitelist for ORES throttling - https://phabricator.wikimedia.org/T210103 (10Ladsgroup) This contradicts with the mental design I had in mind for one reason. PoolCounter is a generic lock manager and can be used in other contexts too. 
So anything related to ha... [14:30:01] o/ [14:33:32] wikimedia/wikilabels#438 (blubber - c558a10 : Amir Sarabadani): The build was fixed. https://travis-ci.org/wikimedia/wikilabels/builds/459773066 [14:41:03] 10Scoring-platform-team: Design how we'll train models which depend on private data - https://phabricator.wikimedia.org/T168908 (10Ladsgroup) p:05Triage>03Low [14:42:07] wikimedia/ores#1154 (score_request_json - de0ea97 : Amir Sarabadani): The build passed. https://travis-ci.org/wikimedia/ores/builds/459777047 [14:42:22] \o/ [14:43:35] halfak: is it fine if I spend some time triaging tasks on the backlog board? [14:43:46] Yes certainly :) [14:43:50] also this is for review: https://github.com/wikimedia/ores/pull/293 :D [14:44:39] 10ORES, 10Scoring-platform-team: [Investigate] ORES worker threads shouldn't use Redis connection pool - https://phabricator.wikimedia.org/T174403 (10Ladsgroup) p:05Triage>03Normal [14:45:14] 10Scoring-platform-team, 10User-Zppix: Wiki-ai Travis-CI Image upgrade - https://phabricator.wikimedia.org/T183214 (10Ladsgroup) p:05Triage>03Normal [14:48:44] Amir1, I don't understand what this PR is doing. [14:48:48] It's still going to get pickled. 
[14:50:06] yup but when we move to using json it won't error out because it doesn't send the object and sends a dictionary [14:50:30] this is one step towards using json, it's not all [14:51:54] You don't re-inflate it on the other side :P [14:52:19] I do, I already did [14:52:31] https://github.com/wikimedia/ores/pull/293/files#diff-66e23a96b1f807238a43d2a27c9099ffR60 [14:53:58] halfak: the reason is that I did them separately is that if we deploy both changes together it will cause an error (some nodes using the old thing, some nodes using the new thing) [14:57:42] 10ORES, 10Scoring-platform-team, 10Graphite, 10goodfirstbug: Look at additional uWSGI metrics for potential use in the ORES dashboard - https://phabricator.wikimedia.org/T182915 (10Ladsgroup) We already have busy workers: https://grafana.wikimedia.org/dashboard/db/ores?refresh=1m&panelId=13&fullscreen&orgI... [14:57:52] 10ORES, 10Scoring-platform-team, 10Graphite, 10goodfirstbug: Look at additional uWSGI metrics for potential use in the ORES dashboard - https://phabricator.wikimedia.org/T182915 (10Ladsgroup) p:05Triage>03Low [16:19:33] Amir1, I see. That makes sense. [16:20:25] halfak: hiyaaaa [16:20:32] o/ ottomata [16:20:44] care to join -services for a minute, got a quick q for ya [16:32:49] 10ORES, 10Scoring-platform-team, 10Scap: scap deploy --service-restart doesn't affect ORES celery - https://phabricator.wikimedia.org/T182912 (10thcipriani) I noticed that `ores/deploy` has the configuration `service_name: uwsgi-ores` set. `--service-restart` will only restart services listed there. You coul... [16:56:57] 10ORES, 10Scoring-platform-team, 10TestMe: Celery task pool doesn't degrade nicely. - https://phabricator.wikimedia.org/T175875 (10Ladsgroup) p:05Triage>03Normal This should be rechecked because of upgrading to celery4, it has more robust way of handling tasks. 
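Note on the 14:50-14:53 exchange above: the two-step rollout (first send a plain dictionary while the serializer is still pickle, switch the serializer to JSON later) can be sketched as below. This is a minimal illustration under assumptions, not the code in the linked pull request: the `ScoreRequest` class, its fields, and the `to_json`/`from_json` names are hypothetical stand-ins, and the standard-library `pickle` and `json` modules stand in for the Celery serializers.

```python
import json
import pickle


class ScoreRequest:
    """Hypothetical stand-in for the request object handed to a worker task."""

    def __init__(self, context_name, rev_ids, model_names):
        self.context_name = context_name
        self.rev_ids = list(rev_ids)
        self.model_names = list(model_names)

    def to_json(self):
        # Reduce the object to a dict of JSON-native types before sending.
        return {
            "context_name": self.context_name,
            "rev_ids": self.rev_ids,
            "model_names": self.model_names,
        }

    @classmethod
    def from_json(cls, doc):
        # "Re-inflate" the object on the receiving side.
        return cls(doc["context_name"], doc["rev_ids"], doc["model_names"])


request = ScoreRequest("enwiki", [123, 456], ["damaging", "goodfaith"])
payload = request.to_json()

# Step 1: serializer is still pickle. A plain dict pickles without trouble.
via_pickle = ScoreRequest.from_json(pickle.loads(pickle.dumps(payload)))

# Step 2: serializer is switched to JSON. The same dict survives unchanged.
via_json = ScoreRequest.from_json(json.loads(json.dumps(payload)))

# Sending the object itself would fail once JSON is the serializer.
try:
    json.dumps(request)
    object_is_json_safe = True
except TypeError:
    object_is_json_safe = False
```

Because the dictionary payload works under both serializers, nodes running the first change keep working whichever serializer is active, which is the reason given above for not deploying both changes together.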
[16:57:19] 10ORES, 10Scoring-platform-team, 10TestMe: Tasks should not be marked permanently revoked due to high system load - https://phabricator.wikimedia.org/T175860 (10Ladsgroup) p:05Triage>03Normal This should be rechecked because of upgrading to celery4, it has more robust way of handling tasks. [16:59:41] 10ORES, 10Scoring-platform-team: Celery task result is cached for an unnecessarily long time - https://phabricator.wikimedia.org/T179683 (10Ladsgroup) p:05Triage>03Lowest We don't have storage issues for redis. [17:15:39] 10Scoring-platform-team (Current): Create project page about Newcomer quality - https://phabricator.wikimedia.org/T210211 (10notconfusing) Did a lot of work on this over the weekend. [17:19:15] 10ORES, 10Scoring-platform-team, 10Patch-For-Review: Refactor ORES puppet for Kubernetes - https://phabricator.wikimedia.org/T182332 (10Ladsgroup) p:05Triage>03Normal [17:19:36] 10ORES, 10Scoring-platform-team, 10Continuous-Integration-Config: Migrate ORES CI to Stretch - https://phabricator.wikimedia.org/T186239 (10Ladsgroup) p:05Triage>03Low [17:21:35] 10ORES, 10Scoring-platform-team, 10Continuous-Integration-Config: Daily build integration test to prove that ORES makefiles are sane - https://phabricator.wikimedia.org/T192606 (10Ladsgroup) p:05Triage>03Lowest [17:21:47] 10Scoring-platform-team: Template makefiles in articlequality, draftquality, and drafttopic - https://phabricator.wikimedia.org/T193424 (10Ladsgroup) p:05Triage>03Low [17:29:44] 10ORES, 10Scoring-platform-team: Use ORES "promote" check to actually check our services - https://phabricator.wikimedia.org/T188341 (10Ladsgroup) 05Open>03declined I decline this as we are moving to k8s and helm will handle this. It's not important enough to fix it right away. 
[17:34:18] 10ORES, 10Scoring-platform-team, 10Multi-Content-Revisions, 10Epic: MCR support in ORES - https://phabricator.wikimedia.org/T195779 (10Ladsgroup) p:05Triage>03Low [17:42:24] 10ORES, 10Scoring-platform-team (Current), 10User-Ladsgroup: Change default serializer of celery from pickle to json - https://phabricator.wikimedia.org/T206333 (10Ladsgroup) [17:42:51] 10ORES, 10Scoring-platform-team, 10Operations: [Epic] Deploy ORES in kubernetes cluster - https://phabricator.wikimedia.org/T182331 (10Ladsgroup) [17:43:41] 10ORES, 10Scoring-platform-team: Jinja error in ORES - https://phabricator.wikimedia.org/T183949 (10Ladsgroup) p:05Triage>03Lowest [17:44:28] 10Scoring-platform-team, 10revscoring, 10artificial-intelligence: Write model-centric integration tests for revscoring - https://phabricator.wikimedia.org/T187819 (10Ladsgroup) p:05Triage>03Low [17:45:02] 10Scoring-platform-team, 10articlequality-modeling, 10artificial-intelligence: articlequality scores a very short article really high in Persian Wikipedia - https://phabricator.wikimedia.org/T202573 (10Ladsgroup) p:05Triage>03Low [17:46:24] 10ORES, 10Scoring-platform-team: Code generation should assert configuration basic sanity - https://phabricator.wikimedia.org/T192118 (10Ladsgroup) p:05Triage>03Low [17:46:59] 10ORES, 10Scoring-platform-team: Use docker for ORES travis-ci tests - https://phabricator.wikimedia.org/T195073 (10Ladsgroup) p:05Triage>03Low [17:51:19] 10ORES, 10Scoring-platform-team: Exception killing threads in ORES celery workers - https://phabricator.wikimedia.org/T182862 (10Ladsgroup) p:05Triage>03Normal [17:54:26] notconfusing: Are you coming to this next meeting? It's very apropos [17:54:45] I'm going to deploy this change on beta and maybe prod (if I'm around by then) [17:54:57] notconfusing: I just sent you an invite [17:54:58] notconfusing, o/ [17:55:30] We should move https://www.mediawiki.org/wiki/ORES/Newcomerquality#Proposed_Initial_Experiment:_TeaHouse_invites to Meta, I think. 
I see now what you are documenting. Strong norms for documenting research projects on meta. We should document the model itself on mediawiki.org [17:55:54] 10ORES, 10Scoring-platform-team: Update deprecated cv_train usage - https://phabricator.wikimedia.org/T192412 (10Ladsgroup) p:05Triage>03Normal [17:56:42] notconfusing, could we get a sample of people who would be invited by the AI-powered HostBot? That might be a good way to put people at ease about running the experiment. [17:57:10] that is the *very* thing I'm putting the finishing touches on! [17:57:15] 10ORES, 10Scoring-platform-team: Use docker for ORES travis-ci tests - https://phabricator.wikimedia.org/T195073 (10Ladsgroup) [17:57:18] 10ORES, 10Scoring-platform-team: Make Celery `result_backend` and `broker_url` configurable by environment variable - https://phabricator.wikimedia.org/T195074 (10Ladsgroup) 05Open>03declined This is not needed as we have docker tests coming from a dedicated yaml file. Right? Reopen if that's not the case. [17:57:22] * harej will be preoccupied by a 2.5 hour meeting for the next... 2.5 hours [17:58:48] @half, i'm thinking about making a page that shows, for a given day, 1) all the registered users and some stats 2) the output of HostBot's invite-selection query 3) the output of what HostBot-AI would invite (and maybe the probability scores associated) [17:59:21] halfak: ^ [17:59:23] (03PS1) 10Ladsgroup: Bump ORES to HEAD [services/ores/deploy] - 10https://gerrit.wikimedia.org/r/475800 [17:59:25] notconfusing: Final plug for this meeting (in 2 minutes), the Research team often has useful feedback for us and would be very interested in your experiment. [17:59:41] (03CR) 10Ladsgroup: [V: 032 C: 032] Bump ORES to HEAD [services/ores/deploy] - 10https://gerrit.wikimedia.org/r/475800 (owner: 10Ladsgroup) [18:00:00] ok i'll join [18:00:05] awight: link? 
[18:00:12] U should have an invite [18:00:22] +1 notconfusing :) [18:00:28] Sounds great [18:28:14] 10Scoring-platform-team (Current): Create project page about Newcomer quality - https://phabricator.wikimedia.org/T210211 (10notconfusing) but need to move the experiment page to meta [18:37:18] ! [18:38:54] 10ORES, 10Scoring-platform-team, 10Analytics, 10Dumps-Generation, and 3 others: [Epic] Make ORES scores available in Hadoop and as a dump - https://phabricator.wikimedia.org/T209611 (10awight) @bmansurov This might be interesting to you. Please let us know if the design will be compatible with your articl... [18:40:27] notconfusing: Thanks for joining! Would you like to be on that invite going forward? [18:42:10] 10ORES, 10Scoring-platform-team, 10Analytics, 10Analytics-Kanban, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (10Ottomata) Woot, we have a Hive event.mediawiki_revision_score table! :) [18:43:27] harej: Do you have a minute to look at the indexing suggestions from TechCom? [18:43:30] I did find it informative [18:43:36] sure, it's good to be in there [18:44:11] 10ORES, 10Scoring-platform-team, 10Analytics, 10Dumps-Generation, and 3 others: [Epic] Make ORES scores available in Hadoop and as a dump - https://phabricator.wikimedia.org/T209611 (10Ottomata) Hey heyyy! We deployed changes for T197000 today. I also re-enabled Hive refinement of this data, so we now ha... [18:44:57] notconfusing: Cool. Yeah, I've really enjoyed getting the fresh eyes. [18:47:30] 10ORES, 10Scoring-platform-team, 10Analytics, 10Analytics-Kanban, and 3 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (10Pchelolo) 05Open>03Resolved Ok. The events with the new schema are being emitted, there are no rejection... [18:52:03] wikimedia/ores#1157 (json_again - 85e71d7 : Amir Sarabadani): The build failed. 
https://travis-ci.org/wikimedia/ores/builds/459890657 [19:01:40] awight: halfak|Lunch: One quick thing: https://github.com/wikimedia/ores/pull/294 [19:02:11] Amir1: got it [19:02:39] Thanks [19:03:24] 10ORES, 10Scoring-platform-team, 10Analytics, 10Dumps-Generation, and 3 others: [Epic] Make ORES scores available in Hadoop and as a dump - https://phabricator.wikimedia.org/T209611 (10bmansurov) @awight thanks for the ping. I'll keep an eye on the task. [19:05:13] 10Scoring-platform-team, 10revscoring, 10artificial-intelligence: Include training performance metrics in model_info - https://phabricator.wikimedia.org/T197013 (10Ladsgroup) p:05Triage>03Lowest [19:05:57] 10ORES, 10Scoring-platform-team: Sort classes when printing model_info - https://phabricator.wikimedia.org/T194221 (10Ladsgroup) p:05Triage>03Lowest [19:06:36] 10Scoring-platform-team: Set up a docker-compose containing ORES services required to run tests - https://phabricator.wikimedia.org/T195077 (10Ladsgroup) Isn't it done? [19:09:43] 10ORES, 10Scoring-platform-team: Use docker for ORES travis-ci tests - https://phabricator.wikimedia.org/T195073 (10awight) [19:09:45] 10Scoring-platform-team: Set up a docker-compose containing ORES services required to run tests - https://phabricator.wikimedia.org/T195077 (10awight) 05Open>03Resolved a:03awight >>! In T195077#4775097, @Ladsgroup wrote: > Isn't it done? Thanks for the nudge! 
[19:09:59] 10Scoring-platform-team: Set up a docker-compose containing ORES services required to run tests - https://phabricator.wikimedia.org/T195077 (10awight) a:05awight>03Oriolsoriano [19:11:33] 10ORES, 10Scoring-platform-team, 10articlequality-modeling, 10artificial-intelligence: articlequality makefile has hardcoded path - https://phabricator.wikimedia.org/T192324 (10Ladsgroup) p:05Triage>03Low [19:16:28] 10Scoring-platform-team, 10Release-Engineering-Team, 10Performance: Investigate PEP 503 repo for production deployment of python wheels - https://phabricator.wikimedia.org/T192478 (10Ladsgroup) By using [[https://wikitech.wikimedia.org/w/index.php?title=SSDD|SSDD]], this would be obsolete. Because the contai... [19:16:28] 10[1] 04https://meta.wikimedia.org/wiki/https://wikitech.wikimedia.org/w/index.php%3Ftitle%3DSSDD [19:17:17] 10Scoring-platform-team, 10drafttopic-modeling: Rewrite draft topic scripts to fetch linked pages and prepare training data - https://phabricator.wikimedia.org/T193594 (10Ladsgroup) p:05Triage>03Low [19:17:57] 10Scoring-platform-team, 10revscoring, 10artificial-intelligence: Build w_cache data in an intermediate file to allow reuse - https://phabricator.wikimedia.org/T191473 (10Ladsgroup) p:05Triage>03Lowest [19:26:19] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) >>! In T209731#4757923, @JAllemandou wrote: >>>! In T209731#4754979, @Nuria wrote: >> It is worth looking at already existing event data, if we want to reuse... [19:29:44] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10JAllemandou) I support the idea of using model name and version as partitions. Wiki_db would possibly be another good fit if requests will most often be on singular... 
[19:31:09] 10Scoring-platform-team, 10Release-Engineering-Team, 10Performance: Investigate PEP 503 repo for production deployment of python wheels - https://phabricator.wikimedia.org/T192478 (10Imarlier) @Ladsgroup I don't think that's correct. There still needs to be a way to install dependencies into containers at s... [19:36:36] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10Nuria) Rather than backfilling which implies you are "filling a hole of data" this is a model recalculation completely. As in: you are recalculating scores due to e... [19:41:19] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) >>! In T209731#4775194, @JAllemandou wrote: > I support the idea of using model name and version as partitions. Wiki_db would possibly be another good fit if... [19:47:56] halfak: Are you watching InSight? [19:48:08] Our 1:1 is at a crazy time :p [19:50:58] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10Nuria) @awight: Can you explain a bit the consumer use cases? Data in hive of this nature is mostly consumed by automated processes that create derived datasets. Do... [19:56:42] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) >>! In T209731#4775228, @Nuria wrote: > Rather than backfilling which implies you are "filling a hole of data" this is a model recalculation completely. As... [20:02:04] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) >>! In T209731#4775303, @Nuria wrote: > @awight: Can you explain a bit the consumer use cases? 
Data in hive of this nature is mostly consumed by automated p... [20:07:46] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10JAllemandou) > Will the order of partitions make a difference? For example, if consumers are more likely to get multiple models of scores for a single wiki, vs. mul... [20:10:08] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10Nuria) >Some processes such as the recommendation API are already running on Hive, and for example might benefit from the new scores table, by finding the top 1% qua... [21:14:17] I have to run out for ~1hr [21:14:51] awight: when you get back, can you link me to the indexing suggestions? [21:20:17] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) >>! In T209731#4775407, @Nuria wrote: >>Once we've piped a mediawiki_revision_score event into our new table, we'll never need to read that event again. Does... [21:20:20] harej: will do! [22:01:40] harej: thanks, and sorry I couldn't get that out before leaving. [22:02:43] harej: https://phabricator.wikimedia.org/T200297#4767231 [22:04:01] The first point is straightforward [22:04:17] The second one is more interesting, and depends on the use cases we're planning to support. [22:04:38] AFAICT, these are the last blockers to getting Jade deployed. [22:06:36] the latter one, is the idea that we can make Jade entries visible based on what the preferred judgment happens to be? [22:07:30] yeah, to filter and highlight [22:08:26] The open questions are, which schemas are we going to include, and how will we add new schemas to this index in the future. 
[22:09:05] It would be consistent with ORES usage if we include goodfaith and damaging [22:09:30] If we're limiting ourselves to those two in the medium-term, we can put the data right into the link table. [22:11:11] However, if we want to leave the ability to add schemas later, we would probably include the data into yet another table, and join to that. This is flexible and extensible, but would be more expensive to maintain in terms of code written, and the resulting query is more complex, so it's not a clear win either way. [22:11:51] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10Nuria) Process above seems a bit error prone (as you do not want to hit your live pipeline to recalculate scores for events 15 years behind), those two calculations... [22:12:53] Do you see a use case for including content quality judgment values? [22:13:21] Or draft topic? [22:14:12] Is the idea that the more we add to the link table, the more complex and expensive the queries get? [22:14:32] No, extra data in the link data is basically free. [22:14:53] However, an alter to the link table is insanely expensive, like we might have to wait a year to get a DBA to do it. [22:15:14] It will require manually applying the alter to various servers, down time, etc. [22:15:55] Adding the data in new, separate tables is cheap from the DB perspective, but the join queries become more complex due to the several additional tables. [22:30:18] When would we actually alter the link table? Hopefully never [22:31:42] We would alter it only if we * are keeping judgment content in the link table, and * want to add more schemas. [22:43:01] awight: in theory, once we deploy additional models/schemas, we'd want them to be equal citizens with other schemas. but having to wait a year for a DBA to manually apply a change to the link table sounds painful. Is there no way around this? 
[22:43:26] Well, there's the additional table approach [22:44:49] That would look like, one link table as we have now, which is indexed by target revision ID and has a primary key. Then, we have another table for each schema, indexed by primary key to the judgment link table. [22:45:49] I agree that we want everything to be an equal citizen but if there's no use case which requires the extra data to be efficiently joined in MediaWiki indexes, then we can simplify. [22:46:41] For example, searching through judgment text for "{{WP:OR}}" is best handled by using the search engine. [22:46:41] 10[2] 04https://meta.wikimedia.org/wiki/Template:WP:OR [22:48:37] Oh that's horrible, I just learned that Isaac Asimov died of AIDS and family+doctors covered it up. [22:50:30] "In 1977, Asimov suffered a heart attack. In December 1983, he had triple bypass surgery, during which he contracted HIV from a blood transfusion.[72] When his HIV status was understood, his physicians warned that if he publicized it, the anti-AIDS prejudice would likely extend to his family members. He died in New York City on April 6, 1992 and was cremated." [22:50:30] Buffer 72 is empty. [22:50:48] What a different time. I hope AIDS isn't as stigmatized as it was then. [22:50:59] Isn't as stigmatized today, that is. [22:52:23] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) Thanks for all the help! @JAllemandou Can you confirm that we should partition on model_version? Will that make it possible to efficiently purge all data f... [22:52:30] for real [22:53:04] Incredible that people were worried about catching a case of anti-AIDS prejudice [22:55:03] harej: I'm leaning towards sticking damaging and goodfaith directly into the link table. I haven't come up with a good use case for the content quality data yet. We can still add future schemas as additional tables if necessary. 
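Note on the design discussed from 22:09 to 22:55: the trade-off between keeping damaging/goodfaith values in the link table and adding one table per later schema can be sketched with an in-memory SQLite database. This is only an illustration under assumptions: the table and column names (`judgment_link`, `judgment_contentquality`, `link_id`, `target_rev_id`, `quality`) are hypothetical and are not the actual Jade schema, and SQLite stands in for the production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Link table: indexed by target revision ID, with the two
-- medium-term schemas stored inline (NULL means "not judged").
CREATE TABLE judgment_link (
    link_id       INTEGER PRIMARY KEY,
    target_rev_id INTEGER NOT NULL,
    damaging      INTEGER,
    goodfaith     INTEGER
);
CREATE INDEX judgment_link_rev ON judgment_link (target_rev_id);

-- A later schema gets its own table keyed on the link table's
-- primary key, so the link table never needs an ALTER.
CREATE TABLE judgment_contentquality (
    link_id INTEGER PRIMARY KEY REFERENCES judgment_link (link_id),
    quality TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO judgment_link VALUES (?, ?, ?, ?)",
    [(1, 1001, 1, 0), (2, 1002, 0, 1), (3, 1003, None, None)],
)
conn.executemany(
    "INSERT INTO judgment_contentquality VALUES (?, ?)",
    [(3, "B")],
)

# Inline columns: filtering on the judgment value needs no join.
damaging_revs = [
    row[0]
    for row in conn.execute(
        "SELECT target_rev_id FROM judgment_link "
        "WHERE damaging = 1 ORDER BY target_rev_id"
    )
]

# Separate table: the same kind of filter costs one extra join per schema.
quality_revs = conn.execute(
    "SELECT l.target_rev_id, q.quality "
    "FROM judgment_link AS l "
    "JOIN judgment_contentquality AS q ON q.link_id = l.link_id "
    "ORDER BY l.target_rev_id"
).fetchall()
```

The first query shows why inline data is "basically free" to read, and the second shows the extra join that each additional schema table adds to a query.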
[22:55:19] I think that will work fine [22:55:47] That will allow us to get a balance between query complexity and having to ALTER the links table [22:56:03] "Show me all the recent changes with a content quality judgment" is reasonable, but I'm not sure if the judgment value matters? [22:56:48] FWIW, there's no RC filter for ORES article quality. [22:57:02] * awight quickly confirms [22:57:22] confirmed [22:58:02] That seems like a good, crude proxy for whether there would be interest in Jade contentquality showing up in RC views. [22:58:21] My thinking at this point is that the existence of a judgment (has the edit been judged) is more important than the actual value, for screening purposes. [22:58:41] That's my instinct, too. [22:58:53] Should we push back on the idea that actual value must be available? [22:59:51] "show me watchlist entries where Jade damaging judgment != ORES damaging prediction" does seem useful [23:00:03] sort of... [23:00:20] it's interesting to model developers, but not really to anyone else [23:00:45] hmm [23:00:46] And I think we can extract that information through other means. [23:00:56] are there any use cases for the actual judgment value, then? [23:01:03] perhaps we should document those before coding for this [23:02:50] I think the question more specifically is, "do we need to surface the judgment value in RC or watchlists" [23:02:58] and I don't see the fit [23:04:14] Who is saying we need to show the value there? I think it came up in the TechCom meeting but I'm not sure a particular reason was ever stated. [23:04:24] "show me all judgments of my contributions in which the judgment is 'damaging'"? [23:04:35] I can dig up the lines from IRC [23:04:46] The reason given was roughly, "people are going to ask for it" [23:06:23] I don't think that's a strong enough reason, even if it is true. [23:06:41] Given the significant costs that seem to be there. 
[23:07:21] well, adding the damaging/goodfaith data is pretty quick and is cheap to operate [23:09:03] (chat logs: https://wm-bot.wmflabs.org/logs/%23wikimedia-office/20181122.txt ) [23:12:35] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10Nuria) >Will that make it possible to efficiently purge all data from an old model version? Yes, it would. Purging means you are going to drop the "whole" "model ver... [23:13:02] harej: oops, I've been pinging you in #wikimedia-office [23:13:22] I just saw. Based on that I think it might be fine in the future, depending on if people ask for it? [23:13:56] I think there's a chance, but remember it's a PITA to add to the schema later [23:14:36] What if we add it to the link table for those that want the data, but don't design any UIs around it? [23:14:52] That's a good way to go [23:15:07] It'll be something we make available to gadget designers and the like. [23:15:34] As a feature it's not a priority to me, but as you say, it's easier to add it now than add it later. For future Jade schemas we can create additional tables. [23:16:08] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) Wonderful, that answers all the questions I have from our perspective. I'll update the task description… [23:17:31] 10ORES, 10Scoring-platform-team, 10Analytics: Choose HDFS paths and partitioning for ORES scores - https://phabricator.wikimedia.org/T209731 (10awight) 05Open>03Resolved [23:17:33] 10ORES, 10Scoring-platform-team, 10Analytics, 10Dumps-Generation, and 3 others: [Epic] Make ORES scores available in Hadoop and as a dump - https://phabricator.wikimedia.org/T209611 (10awight) [23:17:55] harej: kk, or we can ask for the "alter" if there's enough runway. [23:34:14] OK done with 1:1 marathon. I'm out for the day. [23:34:20] have a good one, folks. 
:) [23:34:25] o/ [23:45:35] 10Scoring-platform-team (Current): Create sample Newcomer quality predictions for TeaHouse hosts to sanity check - https://phabricator.wikimedia.org/T209607 (10notconfusing) DONE: https://meta.wikimedia.org/wiki/Research:ORES-powered_TeaHouse_Invites/Comparison [23:50:46] 10Scoring-platform-team (Current): Evaluate Newcomer Model - https://phabricator.wikimedia.org/T208364 (10notconfusing) Final metric used was precision at k=300 (300 recommendations needed per day). Both LR and gradient boosting have >97% p@k=300, less than 2% sensitive to test/train split. {F27316481}
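Note on the metric named in the last message: precision at k is the fraction of the k highest-scored candidates that are true positives. The sketch below shows the generic definition on made-up toy data; the function name, the `(probability, is_positive)` input shape, and the sample values are assumptions for illustration and are not taken from the Newcomer quality evaluation.

```python
def precision_at_k(scored, k):
    """Return the share of true positives among the k highest-scored items.

    `scored` is a list of (probability, is_positive) pairs. The result is
    divided by k, so a list shorter than k counts the missing slots as misses.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Rank candidates by predicted probability, highest first.
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    hits = sum(1 for _, is_positive in top if is_positive)
    return hits / k


# Toy data (invented): five candidates with model scores and true labels.
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.2, False)]

p_at_2 = precision_at_k(toy, 2)
p_at_3 = precision_at_k(toy, 3)
```

With 300 recommendations needed per day, k=300 measures how many of the daily invitees the model ranks highest would actually be good picks, which is why it suits this use better than an overall accuracy figure.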