[23:20:37] dsaez: I don't know why you're up but now that you are. :D question about the penalty term in +1-abs(distance/max(len(sectionsLang1),len(sectionsLang2)))
[23:21:06] how do you exactly compute it? what does len measure? character length?
[23:22:26] hi
[23:22:27] no
[23:22:38] number of sections
[23:23:22] so len(sectionsLang1) is the number of sections in that article (which we don't specify) in Lang 1?
[23:23:46] sectionsEnglish = ['Introduction','Early Years','See Also','External Links']
[23:24:03] len(sectionsEnglish) => 4
[23:24:17] owwwww. in the co-occurrence.
[23:24:28] yep
[23:24:33] got you.
[23:24:47] and the distance will be considering the position
[23:24:51] and what is distance?
[23:25:21] hmm. but distance is a function of article, right? how do you compute it?
[23:25:25] spanish = ['Introducción', 'Referencias']
[23:26:21] then co-occurrence of Introduction with Introduction will be 1 - (1 - 1) / 4 = 1
[23:27:06] Introduction and Referencias will be: 1- abs(1/2)/4 = 0.5
[23:29:42] dsaez: did you extract the sections in each basket considering their position in the article? as in: is [a,b,c] considered differently than [b,a,c], for example?
[23:30:12] yep
[23:31:12] * leila thinks
[23:32:53] dsaez: so in your training, did you handle [a,b,c] differently than [b,a,c] in general? for the purposes of co-occurrence and market basket analysis, the order doesn't matter, does it?
[23:33:16] dsaez: (and feel free to go sleep or do whatever else you want to do. we can talk tomorrow or the day after, too)
[23:33:46] leila, no I'm just doing one approach considering the order...
[23:34:46] got you. I'll respond on the thread, I think it's worth trying not considering the order, too, cuz you're missing some data here that can be useful, right?
[23:35:17] (I will also send you the hyperlink extraction code, dsaez)
[23:35:28] good, thanks
[23:35:36] dsaez: but this is off to a very good direction. the recommendations are quite good.
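A minimal sketch of the penalty term discussed above, assuming `distance` is the difference between the 1-based positions of the two sections in their respective articles and `len` counts sections. The function name `position_penalty` and its parameters are hypothetical, not from the chat. Under this reading the Introduction/Introduction pair scores 1, while Introduction/Referencias comes out as 0.75 rather than the 0.5 quoted in the chat, so the distance or normalizer actually used may be defined differently.

```python
def position_penalty(pos1, pos2, sections_lang1, sections_lang2):
    """Return 1 - |pos1 - pos2| / max(number of sections in either article).

    pos1 and pos2 are assumed to be 1-based positions of the two sections.
    """
    longest = max(len(sections_lang1), len(sections_lang2))
    return 1 - abs(pos1 - pos2) / longest


# Section lists taken from the chat.
sections_english = ['Introduction', 'Early Years', 'See Also', 'External Links']
sections_spanish = ['Introducción', 'Referencias']

# Introduction (position 1) paired with Introducción (position 1).
same_position = position_penalty(1, 1, sections_english, sections_spanish)

# Introduction (position 1) paired with Referencias (position 2).
shifted_position = position_penalty(1, 2, sections_english, sections_spanish)

print(same_position, shifted_position)
```

An order-insensitive variant, as suggested later in the conversation, would amount to dropping this weight and counting every co-occurring pair as 1.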
[23:35:51] dsaez: and as you said, the ones that are bad are badly bad. ;)
[23:36:24] true
[23:37:55] dsaez: and it seems tizianop has some good news for you before you sleep.
[23:38:02] * leila digs tizianop's email.
[23:38:22] yes \o/
[23:39:29] let me see :D
[23:40:04] let this poor Chilean sleep you all
[23:41:25] think about your country-man, Tiziano. :D
[23:41:28] DarTar: ^
[23:41:39] oops
[23:41:46] * DarTar waves at tizianop
[23:42:28] hi DarTar :D
[23:43:00] yo