[00:37:23] Scoring-platform-team, Scap, Patch-For-Review, Release-Engineering-Team (Kanban): Support git-lfs - https://phabricator.wikimedia.org/T180627#4128798 (awight)
[00:51:20] (PS1) Awight: Update assets submodule with word2vec bin [services/ores/deploy] - https://gerrit.wikimedia.org/r/425939
[01:03:33] (PS4) Awight: Point submodules at gerrit [services/ores/deploy] - https://gerrit.wikimedia.org/r/425717 (https://phabricator.wikimedia.org/T180627)
[01:03:35] (PS6) Awight: Add the assets submodule, git-lfs enabled [services/ores/deploy] - https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627)
[01:03:37] (PS2) Awight: Update assets submodule with word2vec bin [services/ores/deploy] - https://gerrit.wikimedia.org/r/425939
[01:14:24] (PS7) Awight: [DNM] Add the assets submodule, git-lfs enabled [services/ores/deploy] - https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627)
[13:42:10] o/
[13:51:22] hi
[14:16:34] i think we may get puppet errors today
[14:16:38] due to the openstack upgrade
[14:18:38] Oh? Did it cause some alerts paladox?
[14:18:49] halfak not yet, but just in case :)
[14:19:00] it will take all day according to the email labs sent
[14:20:27] Scoring-platform-team, editquality-modeling, artificial-intelligence: [research] Why is the japanese 'reverted' model so bad? - https://phabricator.wikimedia.org/T133405#4129739 (Halfak) We just got pinged to re-consider this here: https://www.mediawiki.org/wiki/Topic:Ub6ir6tww9z81960 @Ladsgroup, fr...
[16:29:57] Woah VSP
[16:30:15] Or Vermont
[16:30:35] Oh. Probably not you?
[16:35:38] halfak: Using the previous abstract and current points I've put together a final version of the new abstract. Since abstracts need to go on 16th, it'd be great if you can spare 5min to glance through that for any issues...
[16:35:55] it's on Overleaf
[17:02:10] Thanks. I'll have a look in an hour
[18:04:02] Got out of meeting. Doing some family tech support and then I'm looking at the abstract
[18:33:00] np
[18:56:17] halfak: VSP is me.
[18:56:24] o/ Vermont
[18:56:32] Saw you getting disconnected a lot
[18:56:35] Connection issues?
[18:59:07] Yep.
[18:59:30] Now I have to deal with getting it unbanned from all 56 channels it was in.
[19:01:00] codezee, {{done}}
[19:01:00] You rule, halfak!
[19:01:01] please review
[19:01:07] lol
[19:01:12] halfak: thanks! looking
[19:05:59] I still have a long way to go in learning to write better :)
[19:06:54] Na. I incorporated most of your points. Do you like my version better?
[19:10:38] yes, it's concise, one thing that I always get confused about is usage of past and present, like I think it should be "we developed an automatic topic modeling strategy"
[19:11:04] Ahh yes. It should be consistent across the board.
[19:11:24] So rather than "develop" it should be the present tense of the paper "describe"
[19:15:14] {{adjusted}}
[19:15:15] [1] https://meta.wikimedia.org/wiki/Template:adjusted
[19:15:18] :)
[19:15:21] codezee, ^
[19:18:21] does not exist :/
[19:19:18] since "in this paper..." is occurring twice I'll go ahead and replace the 2nd one with further
[19:21:48] Cool
[19:31:36] halfak: are you aware of past works trying to use categories for topics? if you know you can throw the keywords i'll google
[19:32:57] codezee, https://scholar.google.com/scholar?hl=en&q=wikipedia+category+structure&btnG=&as_sdt=1%2C24&as_sdtp=
[19:33:38] I've got to change locations. Back in ~25 mins
[19:33:51] thanks...i'll dive into those
[20:12:58] https://www.mediawiki.org/wiki/Topic:Ub58fvti46x4e5lf
[20:19:08] i see cscw has a system design category whose description exactly fits the topic of this paper...
[20:21:59] i'll set aside some time to read this, interesting insight into ores tech specs
[20:22:19] :) It felt good to write that out.
[20:22:30] I still feel like I'm behind on ORES system paper essays.
[20:22:36] But it's getting close.
[20:22:45] If I can button this up next week, it'll be a small miracle!
[20:24:40] although i can't add much to the tech specs, feel free to ping me to look it over for any minor errors...
[20:28:30] perhaps interesting, slides for my talk at haystack: https://commons.wikimedia.org/wiki/File:From_Clicks_to_Models_The_Wikimedia_LTR_Pipeline.pdf
[20:36:28] ebernhardson: a question, what kind of data was being shuffled between the JVM and python during label generation?
[20:38:25] :)
[20:52:35] codezee: in that case it's ~40% of search sessions over the last 90 days
[20:52:47] codezee: in general though any time in pyspark you actually load the data into python (as opposed to using the sql dataframe) it's going to suffer a fairly significant overhead in my experience
[20:53:30] not sure these came through, hotel wifi is horrible ...
[20:53:35] codezee: in that case it's ~40% of search sessions over the last 90 days
[20:53:40] codezee: in general though any time in pyspark you actually load the data into python (as opposed to using the sql dataframe) it's going to suffer a fairly significant overhead in my experience
[21:45:33] Heading out. Will be doing a bit of writing review over the weekend. codezee, I think I'll want to take a full pass of the paper on Sunday if you can have it ready.
[21:45:46] Otherwise, see y'all on Monday!