[00:12:29] J-Mo: re research showcase. I talked with OIT now and I /think/ this is one model we can explore:
[00:13:00] J-Mo: we can say that we want to do an experiment (based on the feedback) to open up the sync conversations further and have more people in the virtual room.
[00:13:53] J-Mo: to do that, we ask people to submit their email address and a one line explanation why they want to be in the room and then we decide to add or not.
[00:14:20] J-Mo: this way, we can retain some control over the make-up of the room. We also discussed ways to manage mic/camera, etc.
[00:14:56] J-Mo: I'm happy to talk about this experiment tomorrow in the research showcase and discuss what we'd like to do and the thinking behind it. or you can do it. either way works for me.
[00:57:22] J-Mo: I'm signing out for the day from IRC. we can continue tomorrow and before the showcase.
[00:58:11] J-Mo: nevermind. the showcase is next week. I thought it's tomorrow. too excited I guess. ;)
[16:01:55] o/ isaac
[16:02:00] I'm looking at https://en.wikipedia.org/wiki/Talk:History_of_evolutionary_thought
[16:02:07] It uses some *OLD* WikiProject templates.
[16:02:19] Should I expect them to appear in the output you produced?
[16:02:32] E.g.:
[16:02:34] {{EvolWikiProject|class=FA|importance=top}}
[16:02:34] {{HistSci|class=FA|importance=high}}
[16:03:24] It looks like I see some and not others.
[16:03:48] E.g. I see "evolwikiproject" but not "histsci".
[16:04:51] I could produce a list of all of the templates & template redirects we expect for you to match against if that would be helpful.
[16:59:03] halfak: yeah, i wasn't aware of those old-style templates so having a list of examples would be quite useful. my approach is somewhat ad-hoc: after lowercasing template names, it pattern matches for "wikiproject" or "wp". i also throw out the bannershell and a few other false positives that aren't connected to a wikiproject
[17:00:15] Makes sense. That's what I was guessing. If I make you a list today, do you think you'll have time to do a new run soon?
[17:00:48] there's a follow-up step then that tries to map those potential wikiproject template names to actual wikiprojects that is a mixture of standard formats -- e.g., I check for wikiProject military history but also militaryhistorywikiproject. and then i have an additional whitelist that has things like wpmilhist
[17:02:19] yeah, easy to re-run it. just provide me an additional list of strings that I'll check against. there's probably an opportunity to then further improve the mapping of template -> standard wikiproject name and mid-level category, but that stage can be re-run very quickly so no need to get it right just yet
[17:03:19] I think we should do mid-level category in separate step since the taxonomy will change.
[17:06:57] :thumbs up: -- i'll leave it out for now then
[21:46:21] isaacj, sorry for the delay! See https://phabricator.wikimedia.org/T240273#5733833
[21:54:51] thanks halAFK ! it looks like you went ahead and generated the complete list so i'll remove my wp/wikiproject string matching approach and replace it fully with a lookup against this data
[21:55:27] any reason to retain the actual template name used or you happy for me to just map them all to their standard forms in the output data?
[21:55:50] Hmm. Might as well keep the same capitalization
[21:55:58] We can do some easier post processing if we want to then
[21:58:26] Oh I see. Yeah, let's not map to the canonical name in this pass. That'll be easy to do in the second pass.
[22:04:16] Hmm. On second thought, if you're putting this into figshare, it might be better to map to the canonical names. More valuable to future users.
[22:10:59] https://www.irccloud.com/pastebin/b7MiSFv8/
[22:13:23] halAFK: see above when you get a chance. i'll map to canonical names where i have one like you suggested. we can always clean it up later too with the wikiprojects where we haven't established a canonical name
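
Editor's sketch of the two approaches discussed above: the ad-hoc substring match (lowercase, then look for "wikiproject" or "wp", minus the banner shell) and the lookup that maps a template name to a canonical WikiProject name where one is known. This is a rough illustration, not the actual pipeline. All function names, the exclusion set, the choice to test "wp" as a prefix, and the lookup entries are assumptions; the "histsci" mapping in particular is a guessed placeholder, and the real list is the one linked from the Phabricator task.

```python
# Sketch only: every name and table entry below is an illustrative assumption.

# Wrapper templates to discard (the chat mentions the banner shell; exact names assumed).
EXCLUDED = {"wikiprojectbannershell", "wikiproject banner shell"}

# Hypothetical lookup table: lowercased template name -> canonical WikiProject name.
CANONICAL = {
    "wikiproject military history": "WikiProject Military history",
    "militaryhistorywikiproject": "WikiProject Military history",
    "wpmilhist": "WikiProject Military history",
    "histsci": "WikiProject History of Science",  # assumed mapping, for illustration
}


def normalize(template_name):
    """Lowercase, treat underscores as spaces, and collapse runs of whitespace."""
    return " ".join(template_name.replace("_", " ").lower().split())


def looks_like_wikiproject(template_name):
    """Heuristic pass: contains 'wikiproject' or starts with 'wp', minus exclusions."""
    name = normalize(template_name)
    if name in EXCLUDED:
        return False
    return "wikiproject" in name or name.startswith("wp")


def resolve(template_name, lookup=CANONICAL):
    """Lookup pass: canonical name where one exists, else the name as written."""
    return lookup.get(normalize(template_name)) or template_name


if __name__ == "__main__":
    for t in ["EvolWikiProject", "HistSci", "WPMILHIST", "WikiProject Banner Shell"]:
        print(t, looks_like_wikiproject(t), resolve(t))
```

Under these assumptions the heuristic accepts "EvolWikiProject" but misses "HistSci", which matches the gap noticed at 16:03:48, while the lookup resolves "HistSci" and leaves names without a known canonical form unchanged for a later clean-up pass.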