[06:57:17] halfak: i remember trying to model article importance like two years ago
[06:57:27] this was my first, and last, foray into ML
[06:57:52] i ended up not being able to get it to work because my two measures of importance were at such odds with each other that I couldn't reasonably balance them
[13:55:13] harej, wat. They seem to work pretty well to me :)
[15:46:12] halfak: congratulations. you're a computer science phd and i'm not. anyways, once you have an API up it'd be great to incorporate them into reports bot as i do the quality predictions
[15:47:11] harej, sorry. Was just confused what you were talking about when you said they "didn't work". I thought maybe you were talking about how they are hard to compute, but maybe you were talking about how it seemed like they didn't carry any signal.
[15:47:27] right
[15:47:31] Right now, I'm not sure we'll have the model up for a little while due to the hard-to-computeness problem.
[15:48:18] it was difficult to weigh the sources in any given way such that useful information could come out of the process. but it is wonderful if you figured it out
[15:50:34] (how is it hard to compute?)
[15:50:43] harej, gotcha. Yeah, Nettrom can say more, but we're getting a reasonable amount of fitness.
[15:51:24] harej, not all inlinks are equal -- we want organic inlink counts. We want to count only links that appear in the text of a page and exclude links that appear in templates.
[15:51:33] ooh, nifty
[15:51:41] but that's more computationally expensive, i imagine
[15:51:51] simple table join vs. actually parsing each dump
[15:51:56] Also, view rates are complex. Right now the pageview API provides daily granularity. We just want to quickly get a "view rate" for a page.
[15:52:19] In either case, the table join can take a couple of seconds when limiting to a wikiproject.
[15:52:37] So we want to have that indexed. Scoring should take less than 1 second if possible.
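[Editor's note: the "organic inlink" and "view rate" features discussed above could be sketched roughly as follows. This is a minimal illustration, not the actual model code; the function names are invented, and the regex-based template stripping is a simplification -- real wikitext would need a proper parser such as mwparserfromhell to handle nesting and parser quirks correctly.]

```python
import re


def count_organic_inlinks(wikitext):
    """Count [[wikilinks]] that appear in the body text of a page,
    excluding links that occur only inside {{template}} transclusions.

    Simplified sketch: strips template spans with a regex (two passes
    to handle one level of nesting), then counts remaining wikilinks.
    """
    stripped = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)
    stripped = re.sub(r"\{\{[^{}]*\}\}", "", stripped)  # one level of nesting
    return len(re.findall(r"\[\[[^\[\]]+\]\]", stripped))


def view_rate(daily_views):
    """Collapse per-day pageview counts (the daily granularity the
    pageview API provides) into a single views-per-day rate."""
    if not daily_views:
        return 0.0
    return sum(daily_views) / len(daily_views)


# Template-supplied links ([[Cherry]] inside the infobox) are excluded:
text = "See [[Apple]] and [[Banana]]. {{Infobox|link=[[Cherry]]}}"
print(count_organic_inlinks(text))  # 2
print(view_rate([10, 20, 30]))      # 20.0
```

The trade-off mentioned in the log applies here: a naive inlink count is a simple table join against the links table, while the organic count requires parsing the wikitext of each page, which is why precomputing and indexing those counts matters for the sub-second scoring target.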
[15:53:46] as a tangential development, my script (now three scripts) that add "cites" statements to Wikidata is running faster than ever
[15:54:22] which is especially important as the citation graph gets bigger and bigger
[15:54:26] we are nearly at 3 million
[16:02:48] citation graph? Are you loading in the citation network datasets?
[16:03:01] That's awesome. I'm excited to see that working :)
[16:03:25] yes, originally from pubmed central but now crossref as well
[16:03:34] Nice
[16:14:54] my goal for wikicite is 3 million. which is 0.3% of my overall goal of 1 billion.
[18:51:45] J-Mo1: gotten a chance to do anything with HostBot, re: the training modules research?
[18:52:55] ragesoss ohai! yes, I did some coding. not ready yet, but getting pretty close. Any news on the "welcome bot" discussion front?
[18:53:49] it's remained quiet... I think there was enough positive response, with no objections, to demonstrate that this can move forward.
[18:55:04] Once we have the code side of it ready, I'll create a new BAG request, link to it from the previous one, and address any new concerns that come up.
[18:55:09] ah, the quorum of whoever shows up
[18:57:21] sounds reasonable, ragesoss. I'll try to get the code running before the end of the week. I'm fairly sure I can hit that goal. I've had to do some refactoring in order to be able to test training module template invites (on testwiki) while still delivering daily teahouse invites (on enwiki).
[18:58:02] ah, the joys of software.
[18:58:27] yeah. fwiw: I'm still concerned that we're going to face steep resistance to this proposal.
[19:10:09] J-Mo1: Your shell access to stat1003 should work again, using the new RSA key you sent.
[19:13:01] thanks mutante, trying it now
[19:14:32] J-Mo1: you gotta change the username from "jtmorgan" to "jmorgan"
[19:19:11] mutante: I'm in! Thank you for your help.
[19:19:25] i see it worked now :) ok, cool. you're welcome
[19:20:03] J-Mo1: i'll call the ticket resolved then.
you may also want to create a new key for labs btw. since DSA is not supported anymore, and we are supposed to have separate keys for prod and labs
[19:20:30] I'll do that mutante. Thanks again!
[19:20:40] yw, thanks, cu