[00:12:05] Scoring-platform-team, ORES, Reading-Infrastructure-Team-Backlog, Trending-Service, Services (designing): Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3488161 (Jdlrobson) [07:33:25] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3488568 (awight) I'm doing another iteration of this experiment, addressing the critiques that came up: * Omit approvals where mor... [09:10:14] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3488726 (awight) There are many more approval logs than I had realized at first. log_params was only serialized beginning in Dece... [09:51:48] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3488756 (awight) This script gives us 310k rows in the desired format, but this form will only work on "stat" machines. Needs to... [09:59:28] Scoring-platform-team-Backlog, Wikilabels, editquality-modeling, Spanish-Sites, artificial-intelligence: Complete eswiki edit quality campaign - https://phabricator.wikimedia.org/T131963#3488768 (MarcoAurelio) 51% and [[ https://es.wikipedia.org/w/index.php?title=Wikipedia:Caf%C3%A9/Archivo/T... [10:20:59] (PS1) Hashar: build: bump grunt and drop grunt-cli [extensions/ORES] - https://gerrit.wikimedia.org/r/369253 [10:31:20] Scoring-platform-team-Backlog, Wikilabels, editquality-modeling, Spanish-Sites, artificial-intelligence: Complete eswiki edit quality campaign - https://phabricator.wikimedia.org/T131963#3488797 (Ladsgroup) Great!
Thank you :) [11:58:44] Scoring-platform-team, translatewiki.net: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3488931 (Amire80) [11:59:01] Scoring-platform-team, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3488945 (Amire80) [12:04:20] Scoring-platform-team, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3488958 (Ladsgroup) How I can delete a message from translatewiki? Do I have th... [12:07:17] Scoring-platform-team, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3488965 (Amire80) As I wrote, if it's correctly deleted from the code, then it... [14:40:10] * halfak digs into travel budget stuff. [14:40:22] Amir1, I might be coming to wikidatacon after all. [14:40:33] I know you said it was unlikely you could go, but i thought I'd say :) [14:40:52] halfak: oh, I meant the other way around [14:40:59] it's not 100% but very very likely [14:41:04] Oh! great :) [14:41:15] because even using my money I will go there [14:41:24] If I can work this out soon enough, we can probably also get awight and have a bit of an offsite. :D [14:41:35] AI PARTY HAUS [14:41:52] and probably by that time, I'm already living in Germany (around 70%-90%) [14:43:12] Coool!
PARTY HAUS @ AMIR'S [14:45:34] YEAH [16:04:52] Scoring-platform-team, ORES, Scap, Release-Engineering-Team (Watching / External): Simplify git-fat support for pulling from both production and labs - https://phabricator.wikimedia.org/T171758#3489639 (Halfak) Gotcha. Thanks. I think we're interested in putting some energy behind this if you c... [16:07:19] Scoring-platform-team, ORES, Wikimedia-Incident: Blog post about regex-pocalypse - https://phabricator.wikimedia.org/T172200#3489647 (Halfak) [16:08:12] o/ [16:08:34] \o [16:11:46] anything need merging or looked at while im here? halfak [16:11:59] Hmm... nothing right now, no. [16:12:15] would be good if I could be reviewing your PRs though ;) [16:12:35] True, ill be looking today for stuff to work on [16:12:41] i've been preoccupied with my own projects [16:15:59] halfak i thought someone was already doing T172049 [16:15:59] T172049: Get signal from adding/removing images - https://phabricator.wikimedia.org/T172049 [16:16:28] Doesn't look like it. [16:17:12] thats a little too complex but i saw it and thought i say something [16:18:07] what is T172200 about halfak maybe i can assist (you?) with that [16:18:08] I'd tag that one "easy" [16:18:08] T172200: Blog post about regex-pocalypse - https://phabricator.wikimedia.org/T172200 [16:18:22] That's very complicated [16:18:23] halfak im not to smart with revscoring yet [16:18:23] :P [16:19:11] its a blog post surely its not to complex :P [16:24:28] How much do you know about python's global interpreter lock?
[16:24:36] And recursive backtracking in regular expressions [16:25:11] a little bit [16:25:16] not a whole bunch [16:32:42] Scoring-platform-team, ORES, Wikimedia-Incident: Blog post about the ORES regex-pocalypse - https://phabricator.wikimedia.org/T172200#3489726 (Halfak) [16:45:28] halfak i can help proofread and help make sure the post makes sense before publishment [16:45:38] (thats probably not even a word) [16:48:51] The word you're looking for is 'publication' [16:59:06] Zppix, OK that'd be helpful. [16:59:13] Will let you know when I can actually finish the draft :/ [17:00:43] Amir1, just found out that I'm again, not going to Wikidatacon >:( [17:00:53] They just cancelled the management thing that was going to be there. [17:01:12] Hmm... Maybe I can turn this into an offsite for me and awight to join you in Berlin though. [17:01:28] pl [17:01:28] ok* [17:01:50] * halfak grabs lunch and joins his 4th meeting of the day. [17:02:37] you could go anyway though couldnt you halfak? [17:03:05] Too expensive. :/ Needs a good justification [17:08:14] halfak: meeting? [17:09:59] o/ codezee [17:10:05] Zppix: o/ [17:10:21] halfak: awe are waiting for you [17:14:26] o/ Adam (awight) [17:15:04] bonjour Zppix ! [17:16:58] halfak Amir1 O_o wikidatacon. Unfortunately, I’m moving around that time thus will not be able to raise tha roof of das haus [17:20:16] halfak: not sure what's going on there. Sumit cannot hear me, and we are only two at the meeting you scheduled. Maybe you're waiting for us somewhere else? [17:21:41] fajne_: ono! I don’t see anything on his calendar, which means a robot did not drag him in front of the videochat rig :-/ [17:22:36] awight: https://usercontent.irccloud-cdn.com/file/7ZHg3tgh/image.png [17:23:22] something fishy? [17:23:36] ooh sorry. I did see that but thought it was unrelated! [17:24:09] but apparently aaron has joined some meeting: "* halfak grabs lunch and joins his 4th meeting of the day." [17:24:33] ok, Sumit.
I'm leaving you alone. You cannot hear me anyway. [17:24:37] me too [17:25:33] lol that does smack of foul play. [17:26:23] or aarons acc is taken over by malware sending invites on his behalf :D [17:27:20] .The “management” version of G-suite [17:29:42] awight: )))) aha, like, Aaron, you have this unnecessary 4th meeting today, i am cancelling it. go to lunch. [17:30:07] mmm? I have no meetings today. It’s weird [17:30:07] :P [17:30:10] but yes, run away! [17:30:28] I feel like my schedule today must be the eye of a storm. [17:30:42] but at least In know this story behind ACTRIAL now (https://wikistrategies.net/actrial/ ) [17:30:43] * awight shelters in place :p [17:31:19] Thanks! I’ve been hoping to learn the context myself. [17:32:45] fajne: looks like some kind of volunteer feedback [17:33:20] rather, a reporter's work. [17:35:25] what follows from this article is that 500 volunteers are more likely to create a fruitful communication between each other than a WMF and a W community. [17:44:25] fajne, codezee: sorry about the ACTRIAL stuff. [17:44:41] I'll explain in PM [17:50:44] So, generally, I think that we can help by focusing on building useful predictions for new page patrollers. [17:50:53] Right now, we have the "draft quality" model. [17:51:19] And that is useful, but I think we can also look into building models that help identify statements of notability. [17:51:24] * awight ruminates that we might need a private group channel [17:51:34] https://en.wikipedia.org/wiki/Wikipedia:Criteria_for_speedy_deletion#A7._No_indication_of_importance_.28people.2C_animals.2C_organizations.2C_web_content.2C_events.29 [17:51:46] This is the most common reason that new pages get deleted. [17:52:04] I think we can also help by trying to route reviewers to the new page creations that are within their content space. [17:52:19] E.g. we could train a model to predict which WikiProjects will tag an article. 
[17:52:34] And then route those new page creations to WikiProject specific backlogs. [18:00:25] halfak: how would "useful predictions" for new page patrollers be different from draftquality, which already tries to address the issues mentioned in WP:CSD I presume? [18:00:25] (I am always interested in building out tools to help WikiProjects do the work. *Availability* is another thing, but definitely interested.) [18:01:13] codezee, did you read the messages that I just sent? [18:01:51] harej, could be a "if you build something interesting and useful, they will come" situation [18:01:53] halfak: oh, i see [18:01:54] Even if not, it' [18:02:00] ll be useful for other reasons :) [18:02:14] At least research and analysis if not actually -- you know -- use. [18:02:42] I think I'd like to put priority on the WikiProject topic predictor model next quarter. [18:09:19] codezee, harej, fajne: so here's my vision of this. The first thing that happens with a new article is https://meta.wikimedia.org/wiki/Grants:IdeaLab/Fast_and_slow_new_article_review [18:09:31] Which roughly describes the use-case of the draftquality model. [18:10:09] * awight raises hackles at eager deletionism [18:10:12] Then we'd apply the "notability assertion detector" model to articles that enter the "slow" queue to aid the new article creator. [18:10:42] Then we'd route all "slow" (not obviously awful) articles to WikiProjects depending on the apparent topic of the content. [18:11:08] It’d also be real nice to just kick new articles into a user draft space rather than deleting. [18:11:20] awight, that's doomed too. [18:11:32] * awight stops shuffling to find a link [18:11:52] https://www-users.cs.umn.edu/~halfak/publications/Accept_Decline_Postpone/schneider14accept.pdf [18:12:11] we *need* readers to see new article drafts. [18:12:19] Lots of early contribs come from anons. [18:12:25] They look like drive-by contributions [18:13:16] Woops! I need to pedal to the university. 
[18:13:17] o/ [18:13:22] back in ~45 [18:13:30] harej, codezee, fajne ^ [18:15:04] * Zppix wonders how much exercise and weight loss is gained by halfak pedalling constantly to the university [18:15:12] Neat, I made it through the abstract and will bookmark for a commute read :-) [18:15:45] Zppix: It’s mostly to avoid meetings :p [18:17:39] awight: interesting use of commute time :P [18:18:53] Happily, it’s nearly impossible to drive to WMF, so I get lots of time on the train... [18:19:56] In the words of “Repo Man”’s Miller, I do my best thinking on the bus ;-) [18:35:56] awight there's a phab migration tommror :). iridium -> phab1001 (new server and new os) [18:36:55] I will reserve judgment about whether updating Phabricator ever brings about joy and happiness. [18:37:19] I love how the “add task” button is now a… star. It’s a star. [18:37:28] * awight short-circuits [18:37:54] awight new os not new softwear :) [18:38:13] lol thanks for the reality check [18:38:29] * awight mutters something about landmines [18:38:52] lol [18:40:19] dang, this toolforge user db is being aggressively pruned [18:40:24] https://quarry.wmflabs.org/query/20647 [18:40:43] * awight resolves to use create temporary table twice as much, until it’s accepted by the community [18:40:56] oh. haha I did it to myself: temporary table. [18:40:58] lol [18:41:09] double facepalm [18:41:34] awight the move change feature from ui is live at http://gerrit-new.wmflabs.org/r/c/71 now :) [18:41:46] Administrator [18:41:46] Change destination moved from master to mass [18:41:50] Administrator [18:41:51] Change destination moved from mass to master [18:41:52] heh [18:43:00] Congrats, that’s a nice accomplishment! [18:43:53] yep :) [18:44:18] awight /me will be using patch set descriptions when we upgrade to gerrit 2.14 on gerrit.wikimedia.org :) [18:44:50] What’s that? comments per patch set iteration? 
[18:45:13] if i were to strip everything but the first letter of the most recent human chatters in this channel from the last hour teachers would go crazy with the amount of AP tests they would have to give out :P [18:45:30] awight you can set patchset descriptions ie [18:45:33] rebased [18:45:37] or fixed tests [18:45:40] or any thing :) [18:45:55] http://gerrit-new.wmflabs.org/r/?polygerrit=1/c/71 [18:45:56] see [18:45:57] that [18:46:06] wow, add me to the list of people thrilled to hear that has earned first-class citizenship. [18:46:07] see test 1 2 3 [18:47:04] Zppix: yes but it would boost our grade-point average by up to 25%! [18:48:03] awight i think you mispelt down xD [18:48:24] lol [18:48:31] paladox: Where does the patch set description get surfaced, though? [18:48:45] in the changescreen [18:48:47] Seems like it should appear in the patch set list to be useful [18:48:49] only works on polygerrit though [18:49:11] I think that’s it [18:49:26] I needed to “click to use new UI” hehe [18:49:29] awight see https://phabricator.wikimedia.org/F8926031 [18:50:36] there it is! [18:51:27] Scoring-platform-team-Backlog, Research Ideas, Research-Backlog, Wikimedia-Hackathon-2017, and 2 others: General image classifier for commons - https://phabricator.wikimedia.org/T155538#3490277 (Strainu) I'm very interested in seeing some of the work you're doing. I've personally invested some ti... [18:59:32] I think Quarry is OOM’ing on large result sets. [19:22:31] awight its query killer i bet [19:23:40] I would have thought so as well, but I stripped the real work out and put everything into a table for the last select statement to pick up (and do joins). It only takes a few seconds from a toollabs console... [19:28:33] awight theres also others using quarry aswell though :P [19:28:42] so it takes more time for cpu to process all the requests [19:29:24] It’s pretty mysterious…. there might be logging or diagnostics?
For now, I’m just working around it by using the console. [19:31:37] anyone mind if i close T166472 as declined i mean its redundent to choose where to oauth from cause its all the same really [19:31:37] T166472: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472 [19:32:55] Zppix: I agree with tgr|away_ ’s last comment, unfamiliarity has a real psychological effect on users. [19:34:49] awight but how likely are they to notice a difference... i mean they are going to be focused on the popup oauth window not everything behind it [19:35:48] Since i18n is involved, I see that as a very big deal [19:35:53] lemme see if I can demonstrate... [19:37:41] awight, do you have two columns with the same name? [19:37:52] I don’t think so [19:38:15] * awight tries to sync with halfak’s invisible screen [19:38:18] awight the wiki we use would support the langs for l18n [19:38:32] * Zppix plugs in the invisable hdmi cable for awight [19:38:55] Zppix: Check this out, https://www.mediawiki.org/wiki/?uselang=zh [19:38:56] quarry is known to barf on queries that name two columns the same. [19:39:14] Zppix: those are a lot of characters I wouldn’t recognize if I didn’t do the Latin alphabet [19:39:29] halfak: Can you… sorry but, paste me the URL?> [19:39:53] Sorry just came in and saw you talking about hitting the query killer but it works on the console and took a guess :) [19:39:56] awight thats with contenttranslation im talking about oauth l18n which in theory is different [19:40:42] Zppix: cool, you might be right. If you can get a screenshot of what the oauth process would look like, that might be the evidence we need to close the task. [19:40:46] d'oh [19:40:51] awight i will asap [19:41:01] :D thanks for taking a look [19:41:31] halfak: it’s a crazy scene. I did this whole user database, temporary table dance [19:41:37] and it’s still not raining. 
[19:42:14] Hit by lots of surprises: can’t set permissions myself, I’d have to file a ticket, so I worked around by running all the inserts and updates from the console, leaving just the glorious select for Quarry [19:42:16] that still didn’t work [19:42:32] Here’s the select itself: https://quarry.wmflabs.org/query/20647 [19:43:27] The whole query is “fun”, I’ll push that in a minute for the curious adventurer. [19:44:22] Query status: Complete [19:44:23] :P [19:45:31] baahaha [19:45:32] ty [19:45:39] \o/ [19:45:56] * awight throws out tsv2json nonsense [19:46:01] Noooo :P [19:46:11] Oh wait. Maybe yessss [19:46:15] depends :D [19:47:46] brb [20:02:56] interesting, i am comparing enwiki's AfC with ruwiki's Incubator, and not in favor of AfC so far [20:04:20] Yeah. It's not great. I think the idea is a good one, it just misses something really important about the way that wikis function. [20:04:40] in the Incubator, a newbie is given a month for developing his stub when no one can delete it and even edit without an invitation [20:06:14] after the newbie decides he's done, the article undergoes a review, after which it transplants into mainspace [20:07:33] Oh goodness. That seems like a very bad idea [20:08:09] https://www.irccloud.com/pastebin/cWGbDVin/ [20:08:13] Why’s that? Cos of your findings that draft space inhibits new editors? [20:08:44] Because it breaks down why wikis work in the first place. [20:08:50] Open collaboration and all that [20:08:59] Though I do like the deletion delay, I feel like it should be more like an hour. [20:09:46] it's still open [20:09:57] * awight determines to read Schneider14 so I can jump into this debate [20:10:12] they just give you time to finish your piece [20:10:14] fajne: It sounded like it was only open by invite, though? 
[20:10:23] and it's only for newbies [20:12:46] hm, no, an invitation here is how i tries to translate "it is not recommended to intervene with someone's process of writing an article unless it is asked for" [20:13:47] Does that mean you can still go in to fix grammar, add citations, and rewrite awkward sentences? [20:14:02] Also, part of open (that I argue for in that paper) is *visibility* [20:14:16] When we hide drafts from the world, they don't get any drive-by editors [20:15:45] are drive-by editors more helpful than not? [20:25:35] fajne, 95% helpful :) [20:25:44] Especially on new pages. [20:25:54] I think vandals target already-popular pages. [20:27:15] Just spot-checking some autolabeling, this was the only edit reverted_for_damage=true: https://fi.wikipedia.org/w/index.php?diff=10979620 [20:27:19] strange… [20:27:51] N=10, so this isn’t a complain that they should be more common, only that it seems to be mislabeled. [20:28:30] Scoring-platform-team-Backlog, Research Ideas, Research-Backlog, Wikimedia-Hackathon-2017, and 2 others: General image classifier for commons - https://phabricator.wikimedia.org/T155538#3490759 (Basvb) >>! In T155538#3490277, @Strainu wrote: > I'm very interested in seeing some of the work you're... [20:28:58] nvm, I’ll look for something more interesting. [20:30:21] some ruwiki stats on new articles: newly registered and unregistered users are equally harmful/helpful; 1/3 of their input stays, 2/3 gets deleted. Their joint input that stays constitute about 15% of the articles survived the review [20:37:14] hey awight how would i make it so the lang i use isnt english for oauth [20:37:53] Zppix: Good question—some of our users might run into the same problem fwiw, since they won’t necessarily be logged in.
[20:38:08] Generally you can use the ?uselang=zh URL param [20:38:21] (zh is an example, obviously) [20:52:13] awight if there arent logged in they would be redirected to the standard login page [20:52:28] which even if they arent english speakers its the same setup in all langs [20:52:51] * awight hunts for an oauth app to replicate findings [20:53:04] is wikilabels down? [20:53:10] oh nevermind [20:53:15] its labels not wikilabels. [20:54:08] it doesnt work [20:54:11] or i did it wrong [21:00:22] I’m having trouble getting a translated oauth page at all. In private mode on Firefox, with my content language set to es, I see English at quarry.wmflabs.org (my sample oauth consumer), Spanish at wikipedia.org, but English at https://meta.wikimedia.org/w/index.php?title=Special:UserLogin&returnto=Special%3AOAuth%2Fauthorize&returntoquery=oauth_token%3Da50d7d97352ccbbe0444c4d3f081c842%26oauth_consumer_key%3D0adfc217578f02b52d57a570027405c0 [21:01:35] Although I’m enjoying the opportunity to dive into wiki oauth, there must already be a best practice around where to send users for a 3-leg login. [21:02:24] That’s worth writing a subtask for, if neither of us can find an answer in the docs, yet: https://www.mediawiki.org/wiki/Extension:OAuth [21:05:54] i wonder if theres a special:mylang varient for oauth [21:07:35] \o/ sorry was away for coffee [21:09:02] awight, https://fi.wikipedia.org/w/index.php?title=Merkityt_versiot_-kokeilu/Testisivu_3&diff=next&oldid=10979748 [21:09:35] Zppix: ooh I think you’re suddenly close to capturing the flag [21:11:03] argh—but we’re already on a special page. [21:15:17] Well, I think something like MyLanguage would be a great medium-term goal, but we can workaround in the near-term by sending to .wiki*.org/wiki/Special:UserLogin, and it also gains us overall user familiarity. 
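The ?uselang= approach mentioned above can be sketched roughly as follows. This is only an illustration using Python's standard urllib.parse; the helper name with_uselang is hypothetical and is not part of Wikilabels or any MediaWiki client, and the login URL is just an example target.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_uselang(url, lang):
    """Return url with its uselang query parameter set to lang (hypothetical helper)."""
    parts = urlparse(url)
    # Keep the existing query parameters, dropping any earlier uselang value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "uselang"]
    query.append(("uselang", lang))
    return urlunparse(parts._replace(query=urlencode(query)))


# Example: ask for the Spanish interface on a login page.
login_url = with_uselang(
    "https://meta.wikimedia.org/w/index.php?title=Special:UserLogin", "es")
```

Note that urlencode percent-encodes reserved characters, so a title such as Special:UserLogin comes back as Special%3AUserLogin in the resulting URL.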
[21:15:30] awight: it's a normal special page, honors uselang and user settings, otherwise just uses contentlang [21:16:14] and the wiki is fairly visible, partly because the dialog does not appear immediately [21:16:43] tgr: I guess I was surprised that browser accept-language headers don’t take precedence over contentlang when nothing else is available to base a guess upon. [21:17:11] awight: you must be new here :) [21:17:17] rofl [21:17:39] almost nothing in MediaWiki does sane language negotiation [21:17:43] You should assume I am, joking aside. [21:17:46] But if you need some credit cards processed I’m your man. [21:18:00] yeah, I noticed that when I tried to hack IETF language tag in [21:18:16] the universal whatever does something approaching sanity but has been disabled on most (all?) wikis [21:19:03] to be fair, browser Accept-Language headers are not all that reliable [21:19:23] um… I think I was confounding content and UI language as well, I would have expected the UI to change but not content. [21:19:40] I’ve been afraid to read about why Universal Language Selector is maligned [21:19:42] content language never changes [21:19:47] +1 [21:20:24] unless you are looking at something made with Translate and using Special:MyLanguage [21:20:28] and good point about accept-language, it just takes one layer of public web kiosk to mess that up. [21:20:29] FWIW, google also ignores Accept-Language [21:20:35] To my intense frustration [21:20:36] It honors [21:20:54] at least does so for me :) [21:21:00] (which I think just uses the interface language logic) [21:21:28] CentralNotice banners enjoy this complexity, too [21:21:32] Not when I'm in Germany. I get German language google. but my accept-language is "accept-language:en-US,en;q=0.8" [21:21:33] are they content or interface... [21:21:52] halfak: aha, yeah huh. 
They have some geo thing that overrides accept-lang [21:22:07] but if you open a private window and set your preferences, you’ll get the translated site [21:22:17] maybe translated but not localized-nationalized [21:22:36] Die Bastarde! [21:22:41] * awight secretly dreams of working with the i18n team one day [21:22:45] ^ german [21:22:53] Not a death threat [21:22:55] lol [21:23:00] It means The Bart, The [21:23:59] anyway, if you want localized UI while authorizing OAuth on a multilang wiki, you'll have to do language negotiation in the app and set uselang, for now [21:24:45] https://www.google.com.mx/?hl=zh&gws_rd=ssl [21:25:13] harej: lol [21:34:11] Scoring-platform-team-Backlog: Progress indicator for editquality autolabel utility - https://phabricator.wikimedia.org/T172225#3491077 (awight) [21:34:20] Scoring-platform-team-Backlog: Progress indicator for editquality autolabel utility - https://phabricator.wikimedia.org/T172225#3491089 (awight) p:Triage>Low [21:35:45] awight, just noticed that https://github.com/wiki-ai/revscoring/pull/338 is still out there. [21:36:10] I'm OK with merging but I thought maybe you'd want to address the last couple of notes. [21:36:25] Up to you [21:36:29] halfak: We were previously mixing 50k approved revisions with 20k labeled stuff. Now it’s 300k approvals. Should I adjust anything in the training steps to deal with the imbalanced samples? [21:36:39] halfak: uh oh, I started developing on that branch again. [21:36:53] Might as well leave it for a big fat review in a day or two. [21:37:35] awight, I'd keep going as you were before and we'll iterate if we get weirdness in the output/ [21:37:54] It'll be nice to have the "thresholds" thing to look for indicators of "weirdness" [21:38:10] I think that we should work on revscoring 2.0 @ wikimania. [21:38:12] Meanwhile, what notes did I not address yet? I think the last patch set might have addressed? [21:38:18] I think that'll be my goal.
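The 21:23:59 suggestion above (do language negotiation in the app, then set uselang) could look something like the sketch below. It is a simplified reading of the Accept-Language header, not how MediaWiki or Wikilabels does it: the function name pick_language is hypothetical, the supported set is assumed to hold lowercase codes, and the "*" wildcard is not handled.

```python
def pick_language(accept_language, supported, default="en"):
    """Pick a supported language code from an Accept-Language header value."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            candidates.append((-quality, position, tag.strip().lower()))
    for _, _, tag in sorted(candidates):
        if tag in supported:
            return tag
        base = tag.split("-")[0]  # fall back from "en-us" to "en"
        if base in supported:
            return base
    return default
```

With the header quoted in the log, pick_language("en-US,en;q=0.8", {"de", "en"}) falls back from "en-us" to "en"; the result could then be passed as the uselang value.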
[21:38:27] https://github.com/wiki-ai/revscoring/pull/338/files#r130016695 [21:38:47] sure—even better would be if we could roll out 2.0 and have it as a springboard for the hackathon [21:39:08] awight, that'll be possible. I'll definitely need to front-load some work [21:39:26] I should get into that codebase too [21:39:29] But this weekend is busy for me. I've got an endurance non-race on Saturday. [21:39:43] earn my pre-tax bart here :) [21:39:54] week *end*? [21:39:54] lol [21:39:58] what gives [21:40:06] I'd love to be in shape but that would require fundamental changes to my lifestyle :/ [21:40:06] what do you mean? [21:40:21] pls don’t do “2.0” anything on the weekend [21:40:41] lol [21:40:42] ha. You know me. I love this stuff. Also, I like to keep my Saturday morning open for "office hours" of the working folks [21:40:47] unless it’s measured in km [21:40:49] lol [21:40:56] I worked with mneisler last weekend and it was awesome. [21:41:07] on what? [21:41:09] Working on the article quality predictions dataset :) [21:41:19] like, spot-checking? [21:41:29] Taking new measurements with it [21:41:35] * halfak gets an example [21:41:43] Anyway, I love this stuff but I also love stuff that is not this [21:41:59] oh cool [21:42:04] that does sound fun [21:42:12] and like something you should pay… yourself to do [21:42:33] lol [21:42:45] I wish they'd boost my travel fund in exchange for weekend work. [21:42:55] I feel like that would be a fair trade. [21:43:08] Paying out of pocket for work travel feels shitty [21:43:38] But I get in debates with finance/c-levels about whether or not presenting papers I wrote while on the clock is "work travel" or "fun" [21:44:04] Damn. Where is my query? [21:44:14] https://quarry.wmflabs.org/query/20169 [21:44:16] Got it. 
[21:44:23] Check this out: ^ [21:44:29] harej, you'll be interested in this too [21:44:39] It measures the quality of articles in a WikiProject over time [21:44:45] And it runs pretty quickly [21:44:56] I am interested in that very thing, thank you. I didn't know you could access ORES from Quarry!!! [21:45:06] * halfak worked really hard on that [21:45:16] My NIOSH-boss *loves* metrics. [21:45:21] I had a task that was almost 1 year old [21:45:33] I have that task's birthday on my calendar ;) [21:45:45] halfak: dude that is shameful of WMF. [21:45:58] Na. Getting this amount of data in MariaDB is *hard* [21:46:05] You should be supported in your career development, which includes delivering papers [21:46:10] What we really need is a *better* public query engine. [21:46:18] Oh wait. Yeah. That does [21:46:26] That's a pile of BS [21:46:27] :) [21:46:39] +1 query engine, I actually love Quarry for all its lack of maintenance. but yeah I meant the upside-down priorities [21:46:52] grrr [21:46:54] Hadoop for Everyone! [21:47:01] * paladox wonders when google is open sourcing there search engine lol [21:47:10] * awight smiles at harej for lightening the mood [21:47:21] paladox, as soon as it is totally impractical for anyone else to run it. [21:47:33] lmao [21:47:43] lol [21:48:11] You'll notice some quirks in this output that Nettrom is looking into right now. [21:48:22] I'm having fun with SSDB, which is like Redis but with hard drive as the primary storage, except I keep DoSing it :( [21:48:22] E.g. sometimes redirects get predicted to be FA-class. [21:48:47] Because people sometimes people don't move the talk page when they move an FA-class article. :\ [21:49:02] lol harej [21:49:21] Weighted sum is weird, I want to see that graph normalized by N [21:49:33] "Please give me the names of all 18.8 million hash tables." "Connection reset by peer." 
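The "weighted sum" versus "normalized by N" point at 21:49:21 can be illustrated with a small sketch. The class weights below are hypothetical; the log does not show the weighting that the Quarry query actually uses.

```python
# Hypothetical weights per assessment class; the real query may differ.
WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_sum(class_counts):
    """Total quality: grows with both article quality and article count."""
    return sum(WEIGHTS[cls] * count for cls, count in class_counts.items())


def mean_quality(class_counts):
    """Weighted sum normalized by N, the number of articles."""
    total = sum(class_counts.values())
    return weighted_sum(class_counts) / total if total else 0.0


big_project = {"Start": 100}   # many low-quality articles
small_project = {"GA": 10}     # few high-quality articles
```

Here the weighted sum ranks big_project higher (100 against 40), while the normalized mean ranks small_project higher (4.0 against 1.0), which is the difference between measuring quality times volume and measuring average quality.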
[21:49:35] though I can see why the area of quality x volume matters too [21:49:41] lol [21:49:53] awight, I want to intersect with importance too [21:50:00] Which you can do with quarry using categories [21:50:06] That'd be cool [21:50:19] Might try to get mneisler to look at that next ^_^ [21:50:42] * halfak schemes about how to get hopeful new data scientists publishing as soon as possible [21:50:51] our public query engine 2.0 should be able to do visualizations, M-d manifolds in N-d space and stuff [21:51:22] awight, https://phabricator.wikimedia.org/T169452 [21:53:00] * awight lobs a cactus of approval [21:53:33] A migration plan with the original (awesome) author’s blessings… that seems like it should get funded too. [21:53:53] * halfak plants money trees in his garden. [21:54:00] Maybe in 2017’s annual plan 8D [21:54:06] oops I meant 2018 [21:54:28] lol my WMF laptop just made an ominous noise I haven’t heard before [21:54:35] lol [21:54:47] They detected your insolence [21:55:07] you have a wmf laptop [21:57:01] paladox: found it in the stairwell :p [21:57:06] lol [21:57:12] paladox: believe it or not, I’m a staff member <3 [21:57:21] lol, /me knows :) [21:57:24] they let anyone in this place [21:57:27] lol [21:57:39] me has a mac and windows pc [21:57:46] my windows pc is so slowwwwwwwwww [21:57:47] paladox is in every channel and Knows much :p [21:57:53] lol :) [21:57:59] i am in most channels but not all [21:58:03] ugh I keep windows in a VM these days [21:58:14] lol [21:58:32] better to be polyscient than omniscient, I would wager [21:58:34] when i got my windows laptop it was £299 [21:58:51] i will never be buying a laptop from very [21:58:59] my mac is so much faster [21:59:04] but it was very very expensive [21:59:18] halfak: I have to explicitly grep out reverted_for_damage: true, right? [21:59:33] That’s an impressive deal. 
[21:59:49] yeah on macs it’s probably best to buy a few years back, unless you have a stipend [21:59:56] paladox: that's how it goes, unfortunately... you get a very good computer, but you have to pay a premium for it [22:00:03] lol [22:00:07] harej i got it for a deal [22:00:09] though [22:00:10] awight, I think you want to match "review_reason": "reverted edit" [22:00:10] no need for early-adopting rubbish like “bold” lack of ports [22:00:19] my dad brought it all the way from the us :) [22:00:26] halfak: ty [22:00:30] Which includes all reverts :) [22:00:54] paladox: sweet—yeah it probably dropped in price by 50% on the boat ride over [22:01:03] lol [22:01:12] he took british airways :) [22:01:21] halfak: gives me ideas about how to eventually disappear the makefile [22:02:04] awight though my dad did something like convert pounds to dollers or the other way [22:02:10] carn't remeber but it was cheaper [22:02:51] extracting my people’s wealth. very well played [22:03:12] gotta love arbitrage [22:03:16] * awight carries on at pick-axing little scraps of copper out of the cave walls to fuel the tech revolution [22:03:36] arbitrage has *got* to share a root with arbitrary [22:03:46] lol [22:03:55] (what’s the name for that? nyetomology?) [22:04:30] * halfak wonders how awight is going to replace the Makefile-o-doom [22:05:13] halfak: I had a look at my code to try to improve the sampling strategy, and cried a little… looks like I need to rewrite/refactor my code, but that’s a good thing [22:05:50] (the code that generates the published quality assessment dataset, that is) [22:05:53] Boy have I been there. [22:06:01] :) Nettrom [22:06:05] halfak: T168455 [22:06:05] T168455: Investigate code generation for model makefile maintenance - https://phabricator.wikimedia.org/T168455 [22:06:13] Do you want to update the extractor in wikiclass or have something separate? [22:06:24] "Sure, I'd be happy to implement this new feature for you." ...a month later... 
"So it turns out I have to pretty much rewrite the entire thing." [22:06:25] I was thinking about trying to use the wikiclass extractor and setting date limits. [22:06:32] + redirect detection etc. [22:06:51] That in the past two months alone I've done two rewrites... [22:06:59] Of two different things, not of the same thing, thankfully [22:08:25] halfak: yeah, the code I have uses old rating detection code, better to switch to wikiclass for that [22:08:46] also not sure if the redirect problem shows up in the sampling or when we’re examining the rating changes, I’ll check both [22:09:26] I also want to update the code that ORES/wikiclass has that can do sampling, to make sure that’s also useable [22:09:43] * halfak tries to 'splode regular expressions for a demonstration and fails. [22:09:44] first I’d like to fix my code in order to generate a new version of the dataset [22:09:49] Nettrom, when you say "sampling", what do you mean? [22:10:02] 5k of each class? [22:10:23] yeah, it grabs samples of all classes except FAs… it should remove all redirects, but I’m not 100% sure it does [22:11:47] the second part of my code is the one that examines article histories to find the right revision, and I don’t think that does redirect detection [22:11:59] but I’ll look into that later, right now I need some coffee [22:12:42] :) [22:12:50] Gotcha. [22:12:57] halfak: You know of existing python to translate e.g. f(obj, “a.b”) and give you back obj[“a”][“b”]? [22:13:24] awight, nope. I wrote a thingie that does that in revscoring 2.0 [22:13:32] Oh and in json2tsv [22:13:50] ha. perverse [22:13:59] I’ll hardcode for now, cos this might be a throwaway. [22:14:02] https://github.com/tapilab/json2tsv/blob/master/json2tsv/tsv2json.py#L89 [22:14:12] & will port to live in revscoring2.0 otherwise [22:16:04] halfak: may i continue working with edit quality instead of switching to draft quality? [22:16:19] fajne, totally! [22:17:37] Why can't I make this backtracking problem happen!? 
[22:20:37] Got it! [22:34:46] awight i had a hard time following what you found out while i was away (you were talking more about mylang) so can you recap please [22:35:46] Zppix: My mini-conclusion was that there’s currently no way to give users a friendly, familiar experience aside from going to Special:UserLogin on their home wiki [22:38:22] awight so we have to send each user to home wiki? [22:38:49] awight that doesn't seem quite right... surely to god oauth has the ability to do what we are wanting [22:38:58] who maintains wiki oauth? [22:39:27] Security, I guess [22:39:32] Used to be csteipp [22:40:46] halfak I'll confirm then; if so I'll contact whoever does and ask to confirm this isn't possible, i'd hate for us to waste time on something we can do in 5 seconds [22:42:52] wait a min awight what if we use meta instead of mediawiki.org for the oauth [22:43:00] wonder if that would make a difference [22:43:12] since meta is the "hub" [22:44:55] Scoring-platform-team-Backlog, Mediawiki-extensions-PropertySuggester, Wikidata, artificial-intelligence: [Spike] Use suggested properties to get signal for completeness - https://phabricator.wikimedia.org/T158430#3491296 (Bugreporter) [22:46:50] I'm talking with bryan davis, he's a member of the ext project on phab [22:48:59] zppix: totally agree with you that there must be an established best practice [22:49:02] ty for investigating! [22:50:36] awight bryan davis says use meta [22:52:47] I had to disconnect to get my properly cap'd nick [22:52:54] freenode is just the bestest [22:53:10] Did you register your nick? If so you can do /ns ghost Zppix [22:54:03] harej yeah but i regain without capping the Z and then it said Zppix was temp unavailable [22:54:06] it was weird [22:54:52] OK I'm out of here for the day. [22:55:03] Have a good one folks! [22:55:30] you too [22:55:36] don’t do 2.0 anything! [22:56:21] :) [22:56:24] o/ [22:57:21] am i the only one trying to decode what adam just said [23:00:07] lol
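Editor's sketch for the question at 22:12:57 about turning f(obj, "a.b") into obj["a"]["b"]. This is a minimal illustration only, not the code from revscoring 2.0 or json2tsv mentioned in the replies; the name `get_path` and the `sep` parameter are hypothetical, and it assumes the nested values are plain dicts (a missing key raises KeyError).

```python
from functools import reduce


def get_path(obj, path, sep="."):
    """Follow a dotted key path through nested dicts.

    Hypothetical helper: get_path(obj, "a.b") returns obj["a"]["b"].
    """
    # Split "a.b" into ["a", "b"] and index one level deeper per key.
    return reduce(lambda current, key: current[key], path.split(sep), obj)


# Made-up sample record, shaped only for illustration.
record = {"a": {"b": 42}, "review_reason": "reverted edit"}
print(get_path(record, "a.b"))
print(get_path(record, "review_reason"))
```

The same helper could be used to pull a field such as "review_reason" out of each record before filtering, which is the kind of hardcoded lookup discussed at 22:13:59.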