[15:53:53] Good morning bearloga :)
[15:54:03] & Ironholds & guillom :)
[15:54:09] goood mornings!
[15:54:18] we're doing A/B testing and bad science. You're missing out.
[15:54:24] * halfak runs away
[15:54:32] it's REALLY BAD SCIENCE
[15:54:54] Speaking of which, OMG such a big rant for paper I'm reviewing.
[15:54:56] there are MULTIVARIATE BUCKETS WITH NO BUCKETS FOR PERMUTATIONS
[15:55:06] AB or CD and no BC/AD pairs!
[15:55:09] * Ironholds makes scary fingers
[15:55:14] No
[15:55:16] NoOOOoo
[15:55:30] (in our defence we did not design it)
[15:55:30] good morning halfak Ironholds guillom! :D
[15:55:34] (and I would love a rant :D)
[15:55:57] * halfak remembers all of the times he heard, "We just need to know which is better." during his time on the Growth team and points to all of the long-term value we have gotten from the GOOOD SCIENCE he fought for.
[15:56:00] yes rant plz
[15:56:13] halfak, yeah, I have a mini-retrospective inside my slides
[15:56:46] "We did a thing and engagement went up. We didn't run an experiment, but we're pretty sure that it went up due to the thing we did."
[15:56:50] what worked "the data came through". What didn't work "because nobody asked us before designing the experiment we have learned absolutely nothing of long-term value about our users. We just know this permutation is marginally better for that one. But not why, or which bit of it is responsible for the improvement, that would be SILLY"
[15:57:00] Me: What about seasonal effects. maybe engagement was going up regardless.
[15:57:14] Them: "We ran a regression to make sure there were no seasonal effects."
[15:57:34] * Ironholds cocks head
[15:57:35] Me "A regression of what? What type of regression? Why didn't you do a timeseries analysis!? AHHH!"
[15:57:36] hello bearloga, halfak & Ironholds.
[15:57:41] hey guillom :)
[15:57:49] halfak, I'm not a proper scientist but even I know how to measure seasonality
[15:58:00] grab data. Do timeseries analysis.
Look at size of residuals. Nod knowingly.
[15:58:05] AC: "I still think this paper is valuable and your methodological concerns don't diminish that."
[15:58:26] * Ironholds blink
[15:58:35] Ironholds, critical thing is skepticism. Show me that you thought about how you might be wrong and I'll give you a mile of rope.
[15:58:48] okay, okay, if you say you're giving me a jam sandwich, and I bite into it and it's a ketchup sandwich because at no point did you check it was actually jam
[15:59:01] the fact that it is edible is unimportant
[15:59:05] But it might "seed conversations"
[15:59:07] because your methodological failing means it is NOT A JAM SANDWICH.
[15:59:29] oh, "seed conversations"
[15:59:37] Except that THIS STUDY HAS ALREADY BEEN DONE BEFORE AND THEY DON'T CITE PAST WORK.
[15:59:46] code for "I don't have the impetus to publish this right, so I'm hoping I can half-arse it and other people will tie it up for me"
[15:59:54] halfak: ah, so they're computer scientists
[16:00:11] "it will s/seed conversations/look I don't want to finish this, I'm bored of it, if I publish other people can do follow-ups that cite me/g"
[16:00:12] Not sure. The reviewer is a critical theory-ish person
[16:00:34] Rather the AC is.
[16:00:38] I'm the AC-reviewer.
[16:00:51] halfak: sorry, I'm just used to computer scientists inventing methods that have been invented and published by statisticians 50 years ago
[16:01:10] bearloga, yeah, I once accidentally reinvented permutation testing
[16:01:33] "I don't know how to measure this, but my programmer hat says that if I simulate it millions of times..."
[16:02:32] Ironholds: good job. permutation testing is usually chapter 10 of most intro to stats books. so at least you were re-inventing advanced methodology.
[16:06:11] :)
[16:06:24] Computer science stats == simulate what you need.
[16:06:41] And check to see if it works like you expect.
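(Editorial sketch, not part of the log: the "simulate it many times" idea behind permutation testing mentioned above could look roughly like the following. The function name, the choice of difference in means as the test statistic, the default resample count, and the sample numbers are all assumptions made for illustration; nothing here comes from the participants' own code.)

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Hypothetical helper for illustration only. Returns an estimated
    p-value: the share of random relabelings whose absolute difference
    in means is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the pooled values, then split them back into two
        # groups of the original sizes (a random relabeling).
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Made-up numbers standing in for two experiment buckets.
control = [12, 15, 11, 14, 13, 16, 12, 15]
treatment = [14, 17, 15, 18, 16, 19, 15, 17]
p_value = permutation_test(control, treatment)
```

(A small p-value suggests the observed gap is unlikely under random relabeling alone. It says nothing about why one bucket did better, which is the complaint raised earlier in the log.)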
[16:06:42] I mean, I feel like most basic stats is identical to good testing
[16:06:58] the best way of finding out if you've written utter bullshit is to get a computer to do it MILLIONS OF TIMES
[16:07:03] and if it doesn't fall over once, you've done good
[16:27:06] :)
[16:27:09] +1 Ironholds
[17:05:20] halfak: About that list of references on the Swedish language Wikipedia: it's also very possible that the problem is in my script.
[17:05:36] guillom, but why one language and not another?
[17:05:43] dunno
[17:06:04] Maybe EVERYTHING IS WRONG.
[17:06:20] ha.
[17:06:40] I was thinking that maybe when you ran the analysis, you didn't pick up all of the XML dump files.
[17:06:43] Could that be the case?
[17:07:21] I only did a qualitative sanity check (i.e. "does it make sense that those domains are at the top of the list"), not a quantitative one ("does the number of hits make sense?")
[17:07:43] ugh; it's possible. I'd have to check my .bashrc on stat2, I guess.
[18:07:03] Ironholds: halfak, nobody coming to ops/researcher meeting today :( ?
[18:07:48] ottomata, it wasn't in my calendar!
[18:07:50] Not on my calendar
[18:07:55] :PPPP
[18:08:04] ottomata, ^
[18:09:09] !
[18:09:09] no
[18:09:10] ??
[18:09:22] BAH!
[18:09:24] just bob?
[18:09:24] why
[18:09:25] weird
[18:09:32] so weird
[18:10:07] * halfak continues to learn about curse words in all sorts of interesting languages.
[18:11:25] wait
[18:11:26] oh
[18:11:27] there is one tomorrow
[18:11:28] confusing!
[18:11:31] which one is real
[18:11:32] i guess the other one?
[18:11:55] huh. strange
[18:11:57] ok i guess it is tomorrow
[18:14:43] Ahh yes. That one i'll be traveling for.
[18:26:45] The urban dictionary is really good for English and Spanish slang. Not so good for Portuguese and really useless for Indonesian.
[20:33:43] halfak: David Strohmaier sent an email to wiki-research-l. you may want to have a look at some of their work.
There seems to be some common research work between what you do and what they do
[20:35:57] Indeed. SuggestBot uses a task predictor that we have in the pipeline for ORES.
[20:36:03] Do you know what this Quality Assisted Editor for Wikipedia is?
[20:37:29] leila, ^
[20:37:51] have you seen the UI halfak?
[20:37:55] No
[20:38:01] It seems Google hasn't either.
[20:38:02] digging the link, 1 sec
[20:38:22] Are you talking about this: http://david-strohmaier.com/RankingTool/tutorialVideo.html
[20:38:23] ?
[20:38:43] there is another one which is the actual UI, not the video
[20:38:46] halfak, ^
[20:39:16] That doesn't seem to be an editor, leila
[20:39:28] "Quality Assisted Editor for Wikipedia"
[20:39:40] ... is what David said in his email.
[20:39:47] right, I haven't seen an editor, halfak.
[20:41:23] halfak: there is http://david-strohmaier.com/RankingTool/ that I know of
[20:42:27] I wonder what the purpose of this work is.
[20:44:24] Nettrom, ^
[20:45:18] Nettrom, did you ever publish about SuggestBot's work detection system?
[20:45:48] I'm thinking of the 5 columns that have big X's in them.
[20:46:02] Content, Headings, Images, Links, Sources
[21:43:21] halfak: sorry about delay, had a meeting
[21:43:37] I didn't get to publish on that, unfortunately :(
[21:45:52] (we built it, deployed it, but I didn't get traction on spending time analysing the results)
[22:00:02] * Nettrom gotta run
[22:30:18] halfak: not sure where people are but I'm setting up the room. will be there in 1min
[22:30:27] kk