[08:46:53] I'm deeply troubled by this concept that Wikipedia can be divided into "Democratic pages" and "Republican pages". I hope it doesn't gain traction... https://arxiv.org/abs/2007.08197
[11:16:33] Nemo_bis: interesting. from what I understand, the Democratic pages and Republican pages here refer to articles belonging to different existing categories on Wikipedia, specifically Category:Democratic Party (United States) and Category:Republican Party (United States). one could certainly argue that this is a US-centric approach to capturing the political aspect of articles. I would be interested to learn what you think are
[11:16:34] the fundamental issues in using the category labels to assign articles to different types of content?
[11:20:37] mgerlach: the fundamental issue is that the whole idea of assigning a viewpoint to a page is the opposite of the chief design decision of MediaWiki and UseModWiki before it
[11:21:31] For the paper to make any sense, you'd need e.g. pro-abortion arguments to be on one page, and anti-abortion arguments on another
[11:22:02] And then you'd need people to click [[Barack Obama]] only to hear good things about how great Obama was, while the anti-Obama people would click [[Hussein Obama]] or something
[11:23:28] There is some interesting discussion on MeatBallWiki about such early design decisions, you can see traces of TimStarling in some of these too
[11:23:51] I usually quote this: « I think I just had a revelation (you know, one of those minor ones you get when following the spaghetti of a wiki): despite the claim that used to be above, ViewPoint is nothing like a wiki. -- ChrisPurcell» http://meatballwiki.org/wiki/ViewPoint
[11:25:57] More concretely, if I start from an article [[U.S. officer convicted of crimes in 2017–2020]], because I hate the current administration, I will probably end up clicking dozens of articles and biographies about folks in the Republican party. According to the paper I'm in a "Republican bubble".
It just makes no sense whatsoever.
[11:46:21] Nemo_bis: thanks. I see the problem with identifying the category label of an article with a viewpoint (if I understand you correctly). though the bias they measure is still informative in my opinion: it measures the difference between the directed flows (from i to j and from j to i), indicating that it is more likely to go in one direction than in the other (reciprocity). this type of bias (though only from the network and not
[11:46:21] from the flow) has also been measured between articles on men and women, to surface structural gender bias in content https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewPaper/10585
[11:48:09] mgerlach: why should the flows be balanced?
[11:51:57] (reading more carefully now)
[11:53:23] Nemo_bis: they don't need to be balanced. what I get is that the imbalance in the actual navigation is larger than in a null model of randomly exploring the network
[11:56:54] mgerlach: even assuming the pages conformed to the labels received (which is extremely unlikely to be correct), I don't see why that would be a problem
[11:58:50] I think you were referencing the section RQ2: «The trends referring to the clickstream graph, for all the topics, but cannabis and sociology, show that the dynamic structural bias turns out to be greater than the expected on the baseline models, even on the long run. In other words, users tend to navigate pages belonging to the same partition.»
[11:59:50] Particularly unwarranted is this conclusion: «Longer navigation sessions expose users to higher level of structural bias.»
[12:01:39] They're just measuring the effect of their own classification method here. People in a long session are likely to read articles on connected topics, which tend to be in the same category. However the authors give a certain label to all pages in a given category.
If they had assigned the labels with some method other than categories they could possibly find some information not created by
[12:01:45] themselves, but this way I doubt it.
[12:02:43] In short, I really don't understand anything at all about this paper. I found it on the to-do list for the next research newsletter, so it would be great if someone could review it!
[12:02:59] https://etherpad.wikimedia.org/p/WRN202007
[12:04:49] I wish the authors had focused on something less controversial and easier to understand, say music preferences or cuisine or sports... how likely users are to find out about Oregon cuisine coming from articles on Idaho and vice versa, or how often fans of Wagner can end up reading about Ray Charles and whatever
[12:08:20] and thank you mgerlach for this discussion :)
[12:27:50] Nemo_bis: thanks for your thoughts and the discussion, and also for motivating me to take a deeper look (it was on my reading list) :)
[14:23:39] mgerlach: same! I didn't read closely enough before I had to answer your comments ;)
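
Note on the flow comparison discussed above (from i to j versus from j to i between two labelled groups of pages): a minimal sketch of how such an asymmetry could be computed from click counts. This is not the paper's actual measure. The function names, the normalisation, and the toy counts are all assumptions made for illustration, and the comparison against a random-walk null model is not shown.

```python
from collections import defaultdict

def partition_flows(clicks, labels):
    """Sum click counts by (source label, target label).

    clicks: dict mapping (source_page, target_page) -> click count
    labels: dict mapping page -> partition label
    Both structures are hypothetical, not a real clickstream format.
    """
    flows = defaultdict(int)
    for (src, dst), count in clicks.items():
        flows[(labels[src], labels[dst])] += count
    return dict(flows)

def flow_asymmetry(flows, a, b):
    """Return (a->b minus b->a) / (a->b plus b->a).

    0 means the cross-partition flows are balanced; +1 means all
    cross-partition clicks go from a to b. This normalisation is an
    assumption chosen for the sketch.
    """
    ab = flows.get((a, b), 0)
    ba = flows.get((b, a), 0)
    total = ab + ba
    return (ab - ba) / total if total else 0.0

# Made-up toy data: four pages, two partitions.
labels = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
clicks = {
    ("p1", "p2"): 50,  # A -> A
    ("p1", "p3"): 30,  # A -> B
    ("p3", "p1"): 10,  # B -> A
    ("p3", "p4"): 40,  # B -> B
    ("p4", "p2"): 10,  # B -> A
}

flows = partition_flows(clicks, labels)
asym = flow_asymmetry(flows, "A", "B")
print(flows, asym)
```

Since the labels are just a dict here, swapping in labels from a method other than categories only changes the input, which is the alternative raised at [12:01:39].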