[07:42:47] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[08:12:21] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[14:29:46] i/
[14:29:49] *o/
[14:30:20] o/
[14:30:41] o/
[16:29:43] o/
[16:49:38] I'm stepping away a bit for lunch.
[16:49:45] Back in about an hour.
[18:03:46] 10Scoring-platform-team, 10draftquality-modeling, 10artificial-intelligence: Perform basic analysis of enwiki article creation rates - https://phabricator.wikimedia.org/T156494#4187681 (10Halfak) I think this would be a great task to have a newcomer try out at the hackathon. E.g. how many new articles coul...
[18:17:42] Bah! I forgot that I need to prepare slides for the research showcase tomorrow.
[18:17:53] Looks like I'll be focusing on that this afternoon (next 4 hours)
[18:18:50] o/ ragesoss! I plan to talk a bit about your work using the ORES wp10 models.
[18:19:09] Would you be interested in joining me on the call for the showcase tomorrow?
[18:19:23] halfak: yeah, probably! what time?
[18:19:37] 1830 UTC.
[18:19:49] So, 11:30 AM PDT, I think
[18:20:19] also, on my todo list is to figure out what JADE is all about. I followed a mailing list post link and read about the concept, but never found API info for how to actually play with it.
[18:20:20] I just sent you an invite.
[18:20:39] ragesoss, there's some stuff in beta to play with. Not much yet.
[18:20:50] E.g. https://en.wikipedia.beta.wmflabs.org/wiki/JADE:Diff/376901
[18:21:13] TL;DR: We opted for using MediaWiki + ContentHandler to track structured data about human judgements.
[18:22:08] halfak: is there a POST API for this so that 3rd party apps can contribute human judgements?
[18:22:53] ragesoss, yup! same old MW API stuff.
[18:23:06] We'll be working on something that looks like wikibase's structured API for editing items.
[18:23:27] So you wouldn't have to rewrite the entire JSON blob if you just want to record a single judgment.
[18:23:57] where in the MW api is this?
[18:24:18] https://www.mediawiki.org/wiki/API:Edit
[18:27:09] halfak: so, JADE is a namespace and I create a page called `Diff/` that looks like what the json is supposed to look like?
[18:29:19] Right.
[18:29:34] This is sort of what wikibase does.
[18:29:41] But we're gonna be a lot less complex than wikibase :)
[18:29:53] cool. I'll definitely wait for the wikibase-style option, then.
[18:29:56] :)
[18:30:10] We seriously considered just using Wikibase and doing this entirely out of MW.
[18:30:12] Cool :)
[18:30:35] We'll be spending some more focus on that soon -- as soon as we have a product manager who can help us get organized :D
[18:30:39] Soon - July
[18:30:41] *=
[18:31:07] cool beans! hiring a new PM, or shifting one from elsewhere in the org chart?
[18:31:21] A little bit of a shift.
[18:31:24] * awight high-fives around the room
[18:31:26] * halfak looks at harej
[18:32:16] so, wikibase is basically a mashup between mediawiki and NoSQL, right?
[18:32:51] It's hard to call it NoSQL, but I guess that JSON blobs are commonly used in NoSQL options.
[18:33:12] In this case, the MWAPI could be interpreted as a NoSQL interface to a database :)
[18:33:42] ragesoss, I'll plan for you to give a 5 minute overview of what you have been up to with the wp10 model tomorrow. Does that sound OK?
[18:34:00] sure, sounds good.
[18:34:05] Thank you!
[18:35:31] Harej: does that mean you're moving teams, or just getting loaned out, or what? That'll be fun stuff!
[18:37:52] I think harej will be spending some time in cloud and some time with us -- if all works out as planned.
[18:39:44] halfak: Thanks for the CR! I’ll try to enable IRC notifications btw
[18:40:35] Oh yeah. Good call.
[18:40:59] U okay with me stubbornly not making the changes?
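For reference, the action=edit flow halfak describes above (writing a JADE:Diff page as a JSON blob through the standard MediaWiki API) might look roughly like this in Python. The helper name and the payload fields are illustrative only, not the real JADE schema:

```python
import json

# Beta-cluster endpoint from the example link above.
API = "https://en.wikipedia.beta.wmflabs.org/w/api.php"

def build_edit_params(title, judgement, token):
    """Assemble the action=edit POST body that writes a judgement page.
    `judgement` is a dict; its keys here are guesses, not JADE's schema."""
    return {
        "action": "edit",
        "title": title,
        "text": json.dumps(judgement),
        "token": token,
        "format": "json",
    }

# Usage sketch (needs `requests` and a CSRF token from
# action=query&meta=tokens; real deployments would log in first):
#   import requests
#   session = requests.Session()
#   token = session.get(API, params={
#       "action": "query", "meta": "tokens", "format": "json",
#   }).json()["query"]["tokens"]["csrftoken"]
#   session.post(API, data=build_edit_params(
#       "JADE:Diff/376901", {"damaging": False, "goodfaith": True}, token))
```

Until the wikibase-style structured API lands, each write replaces the whole JSON blob, which is exactly the rough edge halfak mentions.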
[18:45:04] 10Scoring-platform-team: GitHub IRC integration is being deprecated - https://phabricator.wikimedia.org/T194070#4187763 (10awight)
[18:45:47] awight, I'm still concerned about the thread count.
[18:45:58] Less so the json vs. yaml.
[18:45:59] 10Scoring-platform-team, 10Release-Engineering-Team: GitHub IRC integration is being deprecated by October 1st, 2018 - https://phabricator.wikimedia.org/T194070#4187776 (10awight) p:05Triage>03Low
[18:46:07] WHY
[18:46:09] Damn it.
[18:46:25] halfak: I think the thread count is appropriate, actually. Lemme dig up evidence
[18:47:03] (cos I could be very wrong)
[18:49:23] http://baddotrobot.com/blog/2013/06/01/optimum-number-of-threads/
[18:49:35] 10Scoring-platform-team (Current), 10JADE: [discuss] JADE schema format (endorsements?) - https://phabricator.wikimedia.org/T193643#4187791 (10Halfak) Edit comments would not tie explicitly to the judgement being discussed. Still, upon review, I think that the "comment" should be shared across judgments -- no...
[18:49:54] for io-bound threads, the punchline is: t = c / (1 - w)
[18:50:00] awight, I'm not worried about our resources. I'm worried about using the API responsibly.
[18:50:11] aha sorry
[18:50:27] The plan is half Cloud Services half Scoring Platform. Pending approval and all that
[18:50:42] :110%:
[18:50:48] Yes, being organized is important :)
[18:51:05] How do you currently decide what to work on?
[18:51:35] Overarching vision + judgment calls.
[18:51:52] Trying to balance tech debt and making critical advancements.
[18:52:55] 10Scoring-platform-team (Current), 10JADE: [discuss] JADE schema format (endorsements?) - https://phabricator.wikimedia.org/T193643#4187803 (10Halfak) FWIW, I think modeling our data based on deliberation (talk pages) is wise. Not falling prey to the problems with talk pages (e.g. structurelessness) is good t...
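The punchline awight quotes from the linked blog post, t = c / (1 - w), is the usual rule of thumb for sizing an IO-bound thread pool: c cores divided by the fraction of time a task is *not* blocked on IO. A tiny sketch, with made-up numbers for illustration:

```python
def optimal_io_threads(cores, wait_fraction):
    """Rule-of-thumb pool size for IO-bound work: t = c / (1 - w),
    where c is the core count and w is the fraction of time each
    task spends blocked on IO (0 <= w < 1)."""
    if not 0 <= wait_fraction < 1:
        raise ValueError("wait_fraction must be in [0, 1)")
    return round(cores / (1 - wait_fraction))

# e.g. 8 cores with tasks blocked on the network 90% of the time
# justifies a pool of about 80 threads -- which is why N=1000
# extraction workers on a mostly-waiting workload isn't as wild
# as it first sounds.
```

As halfak notes, though, this only bounds what *our* hardware can sustain; whether the remote API should receive that much parallelism is a separate question.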
[18:58:29] halfak: This is strange, I don’t see our N=1000 extraction job having any effect, https://grafana.wikimedia.org/dashboard/db/api-summary?orgId=1&from=1525284000000&to=1525471200000
[19:01:14] We need a “non-MediaWiki” filter on these API stats...
[19:05:53] https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2018/Program#How_to_schedule_breakout_sessions
[19:06:06] Think we should schedule something for ORES/JADE?
[19:06:51] Maybe we should schedule a "Have ORES score your stuff" workshop
[19:07:10] Where we show people how to play with the wp10/damaging models.
[19:08:10] heck yeah
[19:08:11] https://etherpad.wikimedia.org/p/Scoring_platform_hackathon_ideas
[19:08:30] Cool. I'm going to work on a task quick for the workshop and get it scheduled.
[19:11:26] halfak: thanks!
[19:11:41] halfak: Hey also, lmk how you think we should resolve the thread count.
[19:11:55] I ran it on May 4, c. 20:00-21:30
[19:12:12] It worked well, completed quickly, and I can’t find any evidence that it hurt the API servers
[19:12:21] N=1000 fwiw
[19:12:52] 10Scoring-platform-team, 10ORES, 10Wikimedia-Hackathon-2018: Using ORES to score your stuff - https://phabricator.wikimedia.org/T194076#4187934 (10Halfak)
[19:12:56] If you think it’s worth reducing just out of caution, I’m okay with postponing any more investigation...
[19:13:01] otherwise, it seems harmless
[19:13:20] 10Scoring-platform-team, 10ORES, 10Wikimedia-Hackathon-2018: Using ORES to score your stuff - https://phabricator.wikimedia.org/T194076#4187946 (10Halfak)
[19:13:53] awight, ask about running that many parallel requests in wikimedia-dev and see if anyone gets mad :)
[19:14:02] If no one gets mad, then I'm cool with it.
[19:14:05] +1 thanks for the channel
[19:14:31] I’m genuinely curious to see where this goes, hopefully nobody thinks I’m trolling :)
[19:16:06] * halfak microwaves the popcorn.
[19:16:54] halfak, awight, amir1: I've put my talk page scraping code up on github https://github.com/ewhit51/talkpage_scraper/blob/master/Talk%20Page%20Scraper.ipynb
[19:17:28] I'd like to make the ipynb a bit clearer for people who've never used mwapi before, but this should be clear enough that you can tell what I'm doing
[19:17:48] \o/ ewhit_ is this a good time to code review, or should we look for other stuff?
[19:18:01] now is fine!
[19:18:06] OK we're on the schedule.
[19:18:07] nice, ty
[19:18:14] We're in the room right after newcomer matching :D
[19:18:15] halfak: {{done}}
[19:18:15] You rule, awight!
[19:18:18] * halfak is sneaky
[19:18:19] aww
[19:18:28] ooh well done
[19:18:40] https://phabricator.wikimedia.org/T194076
[19:19:17] ewhit_: Maybe dump the extremely long text into files and provide links in the .md?
[19:19:35] * awight puts ice on the scrolly wheel
[19:19:43] will do! Sorry, didn't realize that would upload as well. This will take a minute to re-run to get rid of that
[19:20:35] 10Scoring-platform-team, 10ORES, 10Wikimedia-Hackathon-2018: Using ORES to score your stuff - https://phabricator.wikimedia.org/T194076#4187971 (10Halfak)
[19:21:09] 10Scoring-platform-team, 10ORES, 10Wikimedia-Hackathon-2018: Using ORES to score your stuff - https://phabricator.wikimedia.org/T194076#4187934 (10Halfak)
[19:22:42] ewhit_: I like that all the text is in .py comments rather than markdown blocks, that totally works IMO. But you might benefit from markdown where we’re supposed to e.g. click on links or imagine bullet points
[19:24:34] ok, cool, will add. Thanks! I want to fix this up quite a bit, but I'd really like to get an idea of where (if) the mwchatter is before I jump into fixing up the formatting.
[19:25:16] *mwchatter issue, whoops
[19:28:42] halfak: revscoring datasources are rad.
[19:30:01] \o/ :D
[19:32:30] halfak: so, JADE is designed to store a single, canonical human judgement (plus page history) about some diff?
As opposed to, eg, multiple possibly different evaluations by different people?
[19:32:59] ragesoss, the current beta only stores one "true" judgement.
[19:33:12] But we're discussing having the system look more like a wiki straw poll
[19:33:48] With multiple potential judgments, a mechanism for filing "endorsements" (like !votes) and flagging consensus via a "preferred" flag
[19:33:52] as in, you store an array of users who 'voted' each way?
[19:34:08] Right.
[19:34:39] I'm guessing that 99% of the time, there will be one judgment, one endorsement, and that'll be it.
[19:34:56] But that 1% of the time, we'll want to allow people to disagree when ultimately, they're making a subjective call.
[19:35:18] Further, machines that use the data will want to be able to "know" about the disagreement.
[19:35:19] cool. this is relevant to my interests for other stuff, in particular Commons image evaluation tools. I've sort of started an app that will be like hot-or-not for Commons images in its first iteration.
[19:35:32] Yes. I think that is very relevant.
[19:36:36] I'm going to start off with just spinning up my own backend with Rails for it, but if the concept works in terms of user uptake, it'll be worth figuring out how to make it fit with some sort of wikibase-ish thing.
[19:37:01] (Harej: that's the non-facetious answer to your question from facebook ^)
[19:38:15] awight: I've gotten rid of the v long text file, I'm working on formatting now. Thanks!
[19:41:12] +1 ragesoss. We'll be doing something similar with wikilabels eventually.
[19:41:24] JADE should provide you with a useful repository to put a front-end in front of.
[19:41:42] I imagine you'll still want infra for deciding what images to put in front of users. JADE won't help with that directly.
[19:42:09] halfak: no, I'm not worried about that side of it.
[19:42:49] random and/or category trees and/or queries on Depicts (once SDC has that) should be plenty to handle the 'what images' question.
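The straw-poll structure halfak sketches above (multiple judgements, endorsements as !votes, a "preferred" flag for consensus) could be modeled something like this. All field names here are guesses reconstructed from the conversation, not the actual JADE schema:

```python
# Hypothetical shape of one JADE:Diff page under the proposed
# straw-poll model. Field names are illustrative guesses.
judgement_page = {
    "judgements": [
        {
            "data": {"damaging": False, "goodfaith": True},
            "preferred": True,     # flags the current consensus
            "endorsements": [      # the "!votes" for this judgement
                {"user": "ExampleUserA", "comment": "Clearly constructive."},
            ],
        },
        {
            "data": {"damaging": True, "goodfaith": True},
            "preferred": False,
            "endorsements": [
                {"user": "ExampleUserB", "comment": "Borderline, I think."},
            ],
        },
    ],
}

def preferred_judgement(page):
    """Return the judgement flagged as consensus, or None if unset."""
    return next((j for j in page["judgements"] if j["preferred"]), None)

def disagreement(page):
    """True when more than one judgement has endorsements -- the signal
    machine consumers would want in order to 'know' about disputes."""
    backed = [j for j in page["judgements"] if j["endorsements"]]
    return len(backed) > 1
```

In the common case (one judgement, one endorsement) `disagreement` is False; the structure only pays off in the contested 1% of cases.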
[19:43:53] Gotcha :) We're in the same boat with Wikilabels. We do a lot of random sampling with Quarry -- sometimes using categorylinks.
[19:47:44] halfak: I realized that the extractor parallelism might not be the correct number to pay attention to. The number of active requests to the API is somewhat proportional to N, but not 1:1, so I used a napkin to determine that we’re making 20 requests/second, or at most 0.5% of the API load. The total number of requests will be 93k regardless of parallelism.
[19:47:53] Still unsure how to evaluate that rationally, though.
[19:48:42] I think we’re cool?
[19:49:55] Make sure to put something useful in the user-agent and let's call it good.
[19:54:10] OK I have done no slides, but I've got the contract stuff moving for Amir1 and hoo|away.
[19:54:26] Taking a quick break to water the lawn and then I'll be back to work on slides for tomorrow :|
[20:06:02] halAFK: and to CR https://github.com/wiki-ai/drafttopic/pull/21 ?
[20:06:37] * awight eyes at Amir1
[20:07:00] * Amir1 comes to save the day, like Batman
[20:07:05] It was Batman, right?
[20:11:07] awight: I've added more formatting as well, hopefully the links should make what's happening clearer. I'd still like to add info for people who haven't used mwapi before, but this should be pretty clear for mwchatter issue finding purposes, hopefully
[20:11:41] Amir1: That might be ubermensch
[20:13:21] I don't have the moustache :/ The only reason is I can't grow it :(
[20:13:36] awight: this is rather big, can I take some time?
[20:13:51] Amir1: totally, this is kinda bonus stuff.
[20:14:37] Amir1: ya know, I’m just realizing that the end of this PR is an even bigger WIP
[20:15:07] I think what I’ll do is split out easier pieces into separate PRs.
[20:15:23] I just hate how GH does PRs dependent on one another, though.
[20:16:52] halfak: nvm my other CR request
[20:17:31] kk
[20:18:22] ewhit_: Can you check in the CSV as well?
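awight's napkin math above checks out; reproducing it with the numbers from the log:

```python
# Numbers quoted in the discussion above.
total_requests = 93_000   # fixed by the job, regardless of parallelism
our_rate = 20             # observed requests/second during the May 4 run
api_share = 0.005         # "at most 0.5% of the API load"

# At 20 req/s the whole job takes about 77.5 minutes, consistent
# with the roughly 90-minute run (20:00-21:30):
duration_min = total_requests / our_rate / 60

# And being "at most 0.5%" of the load implies the API is serving
# at least ~4000 requests/second overall:
api_rate_floor = our_rate / api_share
```

The point, as the conversation concludes, is that parallelism N mostly changes *when* the requests happen, not how many there are; the sustained rate is what the API servers feel.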
[20:18:37] What I’d like to see ideally is a tiny sample of it, inline in the ipynb... [20:18:44] * awight is too lazy to run code [20:24:21] halfak: user-agent has been tweaked [20:27:13] * awight glowers at travis-ci [20:28:06] FYI: We're going to have some scheduled downtime in Wikilabels next week. [20:28:36] It'll happen during our offsite @ 1500 UTC on Wednesday. [20:34:31] https://etherpad.wikimedia.org/p/scoring_offsite_schedule [20:34:38] ^ Adding some things. [20:35:31] 17 [20:35:50] Hey Platonides! [20:36:05] hey halfak [20:36:08] Were you the person who called the eswiki conversations about PatruBot to our attention? [20:36:09] sorry for the noise [20:36:17] No problems :D [20:36:18] I think so [20:36:28] unless someone else did, too [20:36:38] I'm hoping to give a very light overview of what is happening there during the "Research Showcase" tomorrow. [20:36:51] Any chance you'll be online at 1830 UTC? [20:36:53] when is that? [20:37:08] 1830 UTC is 20:30 CEST [20:37:19] I may be able to be online [20:37:44] I thought it would be cool to have you around in case people had questions or if I got something badly wrong :| [20:38:16] Have you ever "attended" a research showcase before? [20:38:57] no, I think I haven't [20:39:20] We stream it on youtube and maintain a bach-channel conversation in #wikimedia-research. [20:39:29] although perhaps I have been on a similar enough event, though not named that way [20:39:46] If you are available, it would be cool if you hopped into that channel to chime in :) [20:39:53] jem, ^ [20:40:27] * Platonides preemtively joins [20:40:57] would be nice if you could ping me there when you begin [20:41:08] Will do :) [20:41:30] :) [20:41:30] * halfak finally starts on his slide deck [20:43:29] awight: I've added a line from the csv (only 1, b/c they're long). 
It looks like it still needs some parsing to be legible for anyone on something like MTurk, beyond just what mwchatter did
[20:43:56] halfak: kick the PR and I’m happy :)
[20:44:15] The UA is enhanced
[20:44:36] Stupid travis-ci stopped being pedantic about whitespace
[20:45:23] * halfak kicks
[20:51:29] ewhit_: More minutiae: it would be nice if the sample text were printed programmatically so that the person running the notebook can try new things.
[20:51:56] Sample text might look better with “json.dumps(message, indent=4)”
[20:52:40] I actually don’t understand what mwchatter is doing for us here
[20:52:42] awight: I can try to add that, but other than your example, I'm not quite sure what you mean
[20:52:52] that looks like raw mwapi output, right?
[20:52:59] Theoretically, it should make the outputted talk page cleaner
[20:53:27] I think what might be happening is that when it's put into a csv, it gets re-condensed. I'm running it again to double-check that
[20:54:15] ewhit_: oh sorry, I see the parsed stuff now, it’s under the “cosigners” key
[20:54:17] ragesoss, got a CC licensed version of the graph here?: https://wikiedu.org/blog/2016/09/16/visualizing-article-history-with-structural-completeness/
[20:54:27] it’ll be clearer when it’s indented, of course
[20:55:18] By “printed programmatically”, I mean that we still have the “content” variable available in the Py kernel, so you can have a text block that prints the example text, like:
[20:55:23] halfak: it's a screenshot from the site, so it's CC-by-sa by default.
[20:55:37] page_id = 23716981 (or whatever)
[20:55:39] Okay!
[20:55:55] print(json.dumps(content[page_id], indent=4))
[20:56:29] awight: I'll try that, but something in my code broke and I'm trying to fix it rn
[20:56:38] +1
[21:01:34] ragesoss, want to talk about https://upload.wikimedia.org/wikipedia/commons/thumb/7/7c/Wiki-Education-CP-spring_2017-ores-6000.jpg/1024px-Wiki-Education-CP-spring_2017-ores-6000.jpg too?
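Putting awight's suggestion together into a runnable cell, it would look something like the following. The `content` dict here is a made-up stand-in for the notebook's real variable, and the commented mwapi fetch assumes the standard `action=query&prop=revisions` response shape:

```python
import json

# Stand-in for the notebook's `content` variable, keyed by page_id.
# In the real notebook this would come from mwapi, roughly:
#   import mwapi  # pip install mwapi
#   session = mwapi.Session("https://en.wikipedia.org",
#                           user_agent="talkpage_scraper <contact info>")
#   result = session.get(action="query", prop="revisions",
#                        rvprop="content", titles="Talk:Example")
content = {
    23716981: {
        "text": "== Heading ==\nHello! o/",
        "cosigners": ["ExampleUserA", "ExampleUserB"],
    },
}

# Print the sample programmatically, so whoever runs the notebook
# can swap in a different page_id and try new things:
page_id = 23716981  # or whatever
pretty = json.dumps(content[page_id], indent=4)
print(pretty)
```

With `indent=4`, the parsed structure (including the "cosigners" key that mwchatter produced) is readable at a glance instead of being one re-condensed line as in the CSV.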
[21:01:43] I love that graph :)
[21:01:56] yes, will talk about it. I love that one too!
[21:02:40] Headed to a root canal, I probably won’t be back until the design call tonight.
[21:03:24] halfak: the origin of that one came from trying to recreate the similar graph that we got for hand-assessed before-and-after quality ratings for the public policy initiative in 2010-11, but using ORES data instead of our bespoke 26-point rating system.
[21:03:43] Ooh, pass this along… https://mathbabe.org/2018/05/04/speaker-series-mathematics-and-democracy/
[21:04:27] ragesoss, I didn't know about that. Do you think it would be interesting to compare that to the past work re. public policy?
[21:12:35] halfak: finally found it! https://outreach.wikimedia.org/wiki/Public_Policy_Initiative_Learning_Points#/media/File:Article_Quality_of_New_and_Pre-Existing_Articles_Before_and_After_Student_Work.png
[21:13:30] Public policy initiative!
[21:15:34] I can certainly say some things comparing that work — ie, what a huge uphill battle it was to coordinate volunteers to do assessments and collect the data, and how ORES provides compellingly similar data to what a human-created dataset showed based on evaluating articles on the Wikipedia 1.0 quality rubric
[21:27:09] brb meeting
[22:03:52] done with meeting
[22:03:54] * halfak reads
[22:04:27] Oh yeah. I think that's actually a great example. One of the points I'd like to make is that ORES makes a lot of different types of work easier (analysis is critical work because it leads to funding!)
[22:04:44] The other point is that ORES allows people to appropriate prediction models for their own user.
[22:04:46] *use
[22:05:14] I love how you redefined wp10 to mean "structural completeness". It's both accurate and appropriate for your use cases.
[22:05:55] By deciding on your own interpretation of what ORES was useful for and defining terms -- you've used ORES for its intended purpose.
[22:06:08] And in a lot of ways, designing for re-appropriation in ML is very novel.
[22:10:46] * halfak stops ranting.
[22:10:51] ragesoss, ^
[22:15:55] halfak: I've updated the github page for the scraper, hopefully making it clearer, based on feedback from awight. I will say, mwchatter is helpful, but there's quite a bit of cleaning that still needs to be done, even after running it. https://github.com/ewhit51/talkpage_scraper
[22:17:40] ewhit_, thanks for the update. Hopefully we can make mwchatter better in the process of getting your research done :)
[22:18:12] I'm going to head out for a while. I'll be back on in 2.5 hours for a meeting with one of our volunteer designers.
[22:18:16] Have a good evening!
[22:18:34] Hopefully. You too!