[00:08:42] halfak:
[00:08:49] https://www.irccloud.com/pastebin/HhbORwBt/
[00:31:30] Amir1, \o/
[00:32:30] Amir1, I won't make it with the revids before next week :(
[00:32:41] halfak: Check my latest commit in wb-vandalism
[00:32:54] I have to head out right now and I probably won't be logging in until Tuesday morning.
[00:33:02] and see if you agree on the system
[00:33:14] oh it's okay
[00:33:17] no rush :)
[00:33:40] bon voyage :)
[00:33:50] DictDiffer!
[00:34:22] I wonder if you might add that as a datasource
[00:34:40] So that you only need to load up the differ once for added/removed/changed
[00:35:19] I like the generalized idea of the DictDiffer :)
[00:36:09] yeah
[00:36:24] I was thinking of something more robust
[00:36:38] that was my initial idea
[00:36:48] but it's not robust
[00:36:57] it just works
[00:37:41] I will try to add it as a datasource
[00:45:31] Amir1, I just realized, you don't need to convert .keys() to a set to do efficient set operations
[00:45:40] .keys() will behave like a set :)
[00:45:59] oh, cool
[00:46:03] so, d1.keys() - d2.keys() = keys_in_d1_missing_from_d2
[00:46:21] I'll fix it
[01:03:45] my mind is not helping
[01:03:51] I will do it tomorrow
[01:11:28] Anyway, it's awesome work :)
[01:11:40] thanks halfak :)
[01:13:05] hello, what is Wikipedia AI up to these days?
[01:17:10] hare: hey, on my behalf: I published a new version of Kian (a data mining tool for Wikidata) and made progress on the anti-vandal bot for Wikidata
[01:17:10] https://github.com/Ladsgroup/Kian
[01:17:12] https://github.com/Ladsgroup/wb-vandalism
[01:42:09] hare, I've mostly been working on some fundamental data processing utilities. E.g. http://pythonhosted.org/mwxml/, http://pythonhosted.org/mwtypes/ and https://github.com/yuvipanda/python-mwapi. I'm doing this in parallel with my use of them to run some analyses at the WMF. E.g. I'm applying the article quality model to the beginning and end of a 6-month period and looking at how article quality changed over time.
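A minimal sketch of the dict-diff-as-datasource idea discussed above, using the fact that `dict.keys()` returns a set-like view (so `d1.keys() - d2.keys()` works without converting to `set()`). The `dict_diff` name and structure here are illustrative, not the actual wb-vandalism code:

```python
# Sketch of a generalized dict differ: compute added/removed/changed keys
# in one pass using dict view set operations (no set() conversion needed).

def dict_diff(old, new):
    """Return (added, removed, changed) key sets between two dicts."""
    added = new.keys() - old.keys()       # keys only in `new`
    removed = old.keys() - new.keys()     # keys only in `old`
    common = old.keys() & new.keys()      # views also support intersection
    changed = {k for k in common if old[k] != new[k]}
    return added, removed, changed

# Hypothetical before/after claim dicts for a Wikidata-style item:
added, removed, changed = dict_diff(
    {"P31": "human", "P21": "male"},
    {"P31": "human", "P21": "female", "P106": "singer"},
)
```

Loading the differ once and exposing `added`/`removed`/`changed` from it is what makes this attractive as a single datasource rather than three separate computations.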
[01:43:15] First I wrote an API extraction strategy with mwapi, but that was too slow, so I switched to an XML processing strategy with mwxml.
[01:45:08] * halfak gets back to packing
[15:48:26] hey
[16:06:20] hello
[16:06:25] are we meeting?
[16:07:24] Amir1?
[16:07:59] hey, we have another meeting next week
[16:08:12] yes
[16:08:12] so I think we shouldn't have this one
[16:08:18] umm okay
[16:08:38] we are supposed to hold it normally, but I don't have much to say on Tuesday if we meet today
[16:08:54] so I didn't send you an email for it, but
[16:08:59] do you think you can run that bot :)
[16:09:05] to populate the word list page
[16:20:22] hmm
[16:20:23] tonight
[16:20:25] I promise
[16:20:30] you have my word
[16:20:40] White_Cat: ^
[19:06:58] http://www.schimpfwoerter.de/schimpfwoerter/fsk http://www.hyperhero.com/de/insults.htm
[19:07:03] courtesy of derhexer
[19:07:13] I think we can exploit these resources
[19:07:24] they have a few false positives, at least for German, though