[15:29:30] Hello, world of science.
[15:30:17] hello guillom
[15:30:26] _o/ harej
[15:30:28] Having replaced my bullshit with a proper multiclass classifier, I am now up to 58% accuracy
[15:30:40] harej: yay!
[15:30:43] harej, nice!
[15:30:51] Counter({3: 1378, 2: 238, 1: 7})
[15:30:58] Is that better than your former guesstimate?
[15:31:10] It predicts zero articles top importance, seven articles high importance, 238 articles mid importance, and 1,378 as low importance
[15:31:13] o/ hey folks
[15:31:19] halfak: Welcome back!
[15:31:21] guillom: prior guesstimate was 53-55%
[15:31:35] Thanks guillom
[15:31:46] * halfak goes through email backlog and VE experiment questions.
[15:31:53] I am wondering if the n is too low to be able to properly predict top or high importance. I am going to test it out now with football, where *all* the numbers are big.
[15:32:07] harej, what kind of model are you using?
[15:32:14] What's the N of each class you are training on?
[15:32:25] halfak: There hasn't been a day since you left on vacation where someone didn't say "Where the hell is halfak when you need him" in this channel.
[15:32:32] ha!
[15:32:41] I had 27 notifications and not enough scrollback to find them all.
[15:32:59] halfak: https://www.irccloud.com/pastebin/5YEevluo/
[15:33:00] that's cute
[15:33:02] halfak: :)
[15:33:07] halfak still kayaking?
[15:33:20] halfak: there was a major multi-day labs NFS outage. ORES was untouched throughout; labels went down, though. I brought labels back up today.
[15:33:22] IRCing while kayaking would be hardcore. :D
[15:33:41] White_Cat: What could go wrong?
[15:33:49] YuviPanda, thank you for bringing up labels.
[15:33:57] That pastebin is reality; predictions were 0, 7, 238, and 1378 respectively
[15:34:21] guillom: umm, typos probably
[15:34:28] YuviPanda, do you have some docs I can follow to convert Wikilabels into something that'll bring itself up/something you'll be able to work more easily with?
[15:34:56] halfak: not at the moment - we hadn't quite decided if we wanted to move it to tools or keep it on its own instance...
[15:35:06] White_Cat, canoe :P A canoe is sort of like a water truck. A kayak is more like a personal transportation device.
[15:35:08] and I'm still up to my neck in cleaning out all the damage from last week's outage...
[15:35:25] YuviPanda, I'll follow your lead. :)
[15:35:49] I can imagine. I saw your "NFSNFSNFS" nick switch in the scrollback ;)
[15:36:07] hehehe :)
[15:36:48] halfak: It is a one-vs-rest classification system
[15:36:50] scikit-learn!
[15:37:36] harej, but what model?
[15:37:41] random forest?
[15:38:07] Linear SVC?
[15:38:38] Gotcha. Seems like that should work OK :)
[15:38:47] What kinda predictors do you have?
[15:38:58] Behold! https://github.com/harej/wikiproject_scripts/blob/master/predictor.py
[15:39:17] I would want two measures of pageviews. One for a long-term average and another for the tallest spike.
[15:39:36] harej, I'ma convert you to revscoring :)
[15:39:45] Ohhhh?
[15:40:47] Once the PV service comes online, it would be good if we could stand up a PV stats service that will cache and give you these numbers.
[15:41:19] Right. The plan is to throw out my current page view cruncher in favor of the new API
[15:41:29] halfak IS BACK
[15:41:31] YAAAAAY
[15:41:33] WE MISSED YOU
[15:41:42] o/ Ironholds
[15:41:42] halfak, you missed so many things!
[15:41:43] :)
[15:41:49] What things?!?
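A minimal sketch of the one-vs-rest setup harej describes above, assuming scikit-learn's LinearSVC (halfak's guess at the model) and stand-in data with the two pageview features mentioned; this is not harej's actual predictor.py:

    import numpy as np
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import LinearSVC

    rng = np.random.RandomState(0)

    # Stand-in data: 1,623 articles (the Counter above sums to 1,623) with
    # two hypothetical pageview features: long-term average and tallest
    # spike. Real values would come from a pageview API or dump.
    X = rng.lognormal(mean=3.0, sigma=2.0, size=(1623, 2))

    # Importance labels, skewed like the prediction Counter above:
    # 1=High, 2=Mid, 3=Low (0=Top is absent).
    y = rng.choice([1, 2, 3], size=1623, p=[0.005, 0.145, 0.85])

    # One binary LinearSVC per class; the highest-scoring class wins.
    model = OneVsRestClassifier(LinearSVC())
    model.fit(X, y)
    print(model.predict(X[:5]))  # per-article importance predictions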
[15:41:50] :D
[15:41:53] I turned into a Python user, we hired me an assistant
[15:41:58] lotsa things
[15:42:06] also I was Backup Halfak when you were out JFYI
[15:42:09] (Though for testing purposes, I've written all the page view numbers to file once and have read from that file; not necessarily timely data but fine for my purposes)
[15:42:14] I went around being sensible and kind and freaking people out
[15:42:38] Ironholds, lol woot! CONVERTS
[15:42:41] Who is this assistant?
[15:42:52] Mikhail!
[15:42:59] henceforth "backup oliver"
[15:43:10] except in machine learning where I am "backup mikhail"
[15:43:17] Cool! :)
[15:43:22] How long until he starts?
[15:43:27] 27th July!
[15:43:32] which is good because I will be back by then
[15:43:42] https://github.com/Ironholds/floccus - seriously, I'm writing an actual module
[15:44:00] did someone say machine learning??????
[15:44:26] harej, yeah we're getting an ML nerd
[15:44:29] and by we I mean search
[15:47:26] * harej waves Nettrom
[15:48:01] * Nettrom waves harej
[15:48:19] o/ Nettrom
[15:48:24] harej: I plan to carve out a couple of hours of programming today so WP X can get some suggestions
[15:48:29] hurrah!
[15:48:32] not sure I'll get it working, but there'll be progress
[15:48:36] halfak's back!
[15:48:54] Nettrom: the only thing I really need is for the bot to be able to accept requests from outside user space?
[15:48:59] it works really well otherwise!
[15:49:40] harej: that's basically what I'm doing, sub-classing my existing code, grabbing suggestions from the template or from your config URL, then posting with its own template
[15:50:44] Sounds delightful
[15:51:09] it'll also be Python 3, and on GitHub :)
[15:51:15] WHOOOAAAA
[15:51:28] I'm sure halfak is happy I'm starting to write Python 3 code and ditching 2.7 :D
[15:51:38] I am very happy too. I am the zealous Python 3 enforcer.
[15:51:58] Nettrom, welcome to a decade ago :P
[15:52:14] halfak: I'm old and curmudgeonly, it's fitting :P
[15:52:18] The Python 3 revolution is slow, but it will come just in time to screw over Python 4.
[15:52:19] :)
[15:52:41] * halfak is not aware of plans for Python 4.
[15:53:47] I... need to add python3 uwsgi support to toollabs at some point
[15:54:42] YuviPanda: thanks again for getting scipy installed yesterday, it's working beautifully and I got something checked off my todo list :)
[15:55:10] * harej managed to get around scipy dependency hell by installing the Anaconda stack
[15:55:40] Nettrom: :) feel free to open bugs / poke me for new python3- installs
[15:56:28] YuviPanda: thanks, will do if there's something else I come across :)
[15:56:36] cool :)
[16:05:40] signs you're Jewish:
[16:05:53] you go out for drinks with a friend and realise halfway through the conversation that you're related
[16:06:41] Why would that be a sign of being Jewish and not a sign of a small town?
[16:06:50] Ironholds, ^
[16:07:13] halfak, she's New Jersey born and bred and I'm from the UK?
[16:07:26] but we're both Katzenellenbogen, turns out
[16:07:44] That's a word.
[16:07:54] https://en.wikipedia.org/wiki/Katzenellenbogen
[16:10:33] STAHP SENDING ME EMAIL
[16:10:41] halfak, okay, I will stop
[16:10:43] It's accruing faster than I can clear out the old stuff.
[16:12:08] * guillom has been strategically waiting for halfa.k to work through the initial backlog before sending the emails.
[16:12:53] +1 guillom
[16:13:00] <3 guillom
[16:13:13] halfak: It's completely self-serving :p
[16:13:15] I'm going to wait a week or so :)
[16:13:15] Great way to make sure I don't just say "Cool story" in response to your email.
[16:13:48] "Hey halfak, can you analyze Foo for me? I think it might be Baring." "Cool story. See ya."
[16:13:55] halfak, what ratio of it is from cranky Wikipedians complaining about the VE study?
[16:14:07] Not many, surprisingly.
[16:14:32] For all the trouble I got into the last time I ran this study, the response for this one has been surprisingly good.
[16:14:56] halfak, I think that reflects on "the software does not blow quite so many chunks this time"
[16:14:58] but we're both Katzenellenbogen, turns out < Pardon my lack of culture, but for a moment I thought that was a German word to say something along the lines of "allergic to cats".
[16:15:04] I've got a few methodological questions. Nothing terribly bad. More "why did you choose this strategy" and less "I think this is wrong"
[16:15:14] although Jason Quinn spent an extensive amount of time explaining to me how research worked
[16:15:28] and then asking about my work on the VE, because obviously I could only like it if I had a COI from developing it or researching it
[16:15:48] guillom, it's a clan of German-Polish Jews
[16:15:55] Ironholds, sounds good. Does Jason do courses I could sign up for? ;)
[16:16:00] Ironholds: Yup, I saw the article :)
[16:16:02] many, many famous rabbis and also the only Jewish King of Poland!
[16:16:09] He was King for a day
[16:16:11] Thus "for a moment".
[16:16:37] halfak, I don't think so, but you'll be happy to know that he can explain what you meant to you
[16:17:11] "it's no worse than wikitext for new editors" ACTUALLY means "at no point in an editor's life cycle is it better and there are no possible explanations for why we're seeing a lack of a significant improvement in the 2-week period"
[16:17:32] he's actually really more proud of your research than you are, insofar as he refuses to believe alternate hypotheses could exist
[16:18:04] Ironholds, oh great. Next time I finish the preliminary analysis, I'll have him dictate the write-up to me. That would save a lot of time.
[16:18:22] totally!
[16:18:39] halfak, actually, can I run an alternate hypothesis by you?
[16:18:47] it seems obvious but I wanna make sure I'm not missing something
[16:18:56] Also, it's nice to know that my hypotheses carry so much weight. I hypothesize that world hunger has ended and that the large governments have suddenly started getting it Right(tm).
[16:19:07] Ironholds, totally
[16:19:11] I hypothesise free pizza!
[16:19:23] hangovers make one's life goals smaller and more immediate
[16:19:55] halfak, "one reason why we don't see a substantial improvement in the early time is that a certain proportion of newly registered editors are switching from other accounts or from anonymous editing, meaning there is a degree of re-acclimatising going on"
[16:20:13] +1
[16:20:17] I think that's right.
[16:20:31] It's one I threw at Abbey. I really want to get a better understanding of anons.
[16:20:41] nothing complex or controversial, but it seems very plausible, and while we can't use it to say "okay we have z-test results but NO HYPOTHETICAL" we can actually say "it's no worse, and actually might be a lot better; it's just hindered by the lack of uniform exposure"
[16:20:53] I think that fiddling with VE is mostly done by people who have some experience in Wikitext
[16:21:13] yeah, which is great (everyone except me should eventually switch over) but doesn't tell us how genuinely new people handle it
[16:21:50] +1. I tried a few different strategies for weeding out the not-genuinely-new people, but there didn't seem to be a nice way.
[16:22:02] E.g. looking at editors who edited in the Wikipedia namespace on their first day.
[16:23:23] yeah
[16:23:37] but there's no good way of doing it because what we'd need to be able to do is, well, look at off-wiki stuff
[16:23:41] we cannot yet see inside users' heads
[16:23:58] (and as one of those users: even if you could, you really don't want to. Thar be swamp dragons and brainweasels)
[16:24:06] step 1: Integrate Facebook login
[16:24:36] step 2: unique ID dot surgically implanted in each human that is required for computers to let them internet
[16:24:40] step 3: MJ12
[16:24:50] step 4: ...
[16:24:56] step 5: PROFIT
[16:25:31] Seriously though, we can run some short-term observations by assigning anons a unique cookie and purging it quickly. We've done that before.
[16:25:45] I wish I had more time to work with that old dataset... but we purged it :)
[16:26:45] Ironholds: we already do a variant of step 2?
[16:26:46] :P
[16:27:52] 1) log out of accounts, 2) get yuvi's unique ID implant, 3) trick Wikipedia into thinking I am a newcomer, 4) write a parser function and throw off all the metrics
[16:50:47] o/ J-Mo
[16:51:00] welcome back halfak!
[16:51:30] Thanks dude.
[17:04:21] an inherent problem with trying to fit into pre-existing importance categories:
[17:04:28] the "top" category is always going to be small, even on a large project
[17:05:26] 40 articles of top importance to wikiproject football (n=137,920)
[17:05:56] 19 articles of top importance to wikiproject women scientists (n=1,639)
[17:06:31] football has 84 times the articles but only twice as many top importance articles
[17:24:22] hello halfak. question about Q1 goals, as I'll be filling in for Dario in a meeting today.
[17:24:37] o/ leila
[17:24:40] cool, what's up?
[17:24:44] is it fair to say that the work on productizing revscoring and wiki labels has already started?
[17:24:51] or are you planning to do that in Q1?
[17:26:13] halfak, ^
[17:27:52] It has already started and will continue full-steam into Q1.
[17:28:00] leila, ^
[17:28:32] and about measuring value added (stretch): is it fair to say that you have a way of assessing contribution value, and you want to see if you can characterize the editor population based on the value they've added to Wikipedia over time?
[17:28:40] makes sense, halfak.
[17:29:26] leila, indeed. That sounds accurate. There are many facets of that project, but for a blurb, that'll suffice.
[17:29:48] ooki. great. and one last thing in case that comes up: is the latter project across languages?
[17:29:54] (I understand that the former can be)
[17:30:43] halfak, ^
[17:33:14] leila, yes that's right. The method is not concerned with semantics of language -- just simple syntactic properties. E.g., I think we'll have to work out some tokenization issues for languages that do not use spaces for atomization of "words" -- e.g. Mandarin.
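A quick illustration of the tokenization point: a simple word-character regex (a hypothetical stand-in, not necessarily what revscoring actually uses) atomizes a space-delimited language fine but returns an unsegmented Mandarin phrase as a single "token":

    import re

    # English splits into words at whitespace and punctuation boundaries.
    print(re.findall(r'\w+', "The quick brown fox"))
    # -> ['The', 'quick', 'brown', 'fox']

    # Mandarin has no spaces between words, so the whole phrase
    # ("the quick brown fox") comes back as one token.
    print(re.findall(r'\w+', "敏捷的棕色狐狸"))
    # -> ['敏捷的棕色狐狸']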
[17:33:30] But all western languages will work out of the box.
[17:33:55] I don't expect to have issues with RTL ones either, but we may need to adjust some parameters in the diff algorithm.
[17:39:46] harej, indeed. this will be an issue, but it is reality, so your classifier will need to reflect it well.
[17:40:17] In the revert modeling work, I've re-balanced the classes when using the linear SVC so that I could get more realistic probability estimates.
[17:40:57] And what did you do to rebalance?
[17:42:12] harej, I experimented with downsampling the larger class (not reverted) and oversampling with replacement in the smaller class (reverted) and got similar results.
[17:42:29] I chose to oversample the smaller class.
[17:42:48] That's an interesting idea.
[17:42:55] For it to work on my end, I would actually have to do... sampling
[17:43:01] I'm working with entire WikiProject populations at the moment.
[17:44:11] No problem once you have a data structure that contains and
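A minimal sketch of the rebalancing halfak describes: oversampling the smaller class (reverted) with replacement until it matches the larger class (not reverted). The data and feature dimensions here are stand-ins, not the actual revert-model features:

    import numpy as np

    rng = np.random.RandomState(42)

    # Stand-in data: 950 "not reverted" (0) edits vs. 50 "reverted" (1).
    X = rng.normal(size=(1000, 4))
    y = np.array([0] * 950 + [1] * 50)

    # Draw minority-class rows with replacement until the classes match.
    minority = np.where(y == 1)[0]
    majority = np.where(y == 0)[0]
    resampled = rng.choice(minority, size=len(majority), replace=True)
    keep = np.concatenate([majority, resampled])
    X_balanced, y_balanced = X[keep], y[keep]

    print(np.bincount(y_balanced))  # -> [950 950]

Downsampling the larger class is the mirror-image operation (sample len(minority) rows from majority without replacement); as halfak notes above, both gave similar results in his experiments.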