[11:17:28] Amir1: awesome :D
[11:17:40] hey YuviPanda :)
[11:17:41] thanks
[11:17:57] I actually labeled 750 edits to finish it
[11:18:17] Amir1: I'm at a Tamil Wikipedia tech training event, am wondering if I should ask them to try to get models set up for tawiki? or are you guys too busy now?
[11:19:07] nah
[11:19:11] It would be awesome
[11:19:30] ok. I'll talk to some of 'em
[11:19:34] YuviPanda: let me start extracting bad words
[11:19:49] the first thing we want is reviewing them
[11:22:56] Amir1: right. is this process documented somewhere?
[11:23:37] We have some phab cards for other langs
[11:23:54] YuviPanda: but I'm not sure we documented the whole thing
[11:25:07] Amir1: that would be awesome too I think
[11:36:09] YuviPanda: https://phabricator.wikimedia.org/T107590
[11:36:17] Sorry it took so long to find it
[11:36:51] then we need this card: https://phabricator.wikimedia.org/T133563
[11:40:43] Amir1: np! Thanks :D You should put it on wiki somewhere too
[11:40:54] yeah, that would be easy
[11:41:03] doing it very soon
[11:41:14] let me make phab cards for Tamil
[12:08:17] Revision-Scoring-As-A-Service: Report of work since last report - https://phabricator.wikimedia.org/T128958#2254611 (Ladsgroup) a:Ladsgroup
[12:08:23] Revision-Scoring-As-A-Service: Report of work since last report - https://phabricator.wikimedia.org/T128958#2091293 (Ladsgroup) I do it on a weekly basis from now on.
[12:08:32] Revision-Scoring-As-A-Service: Report of work since last report - https://phabricator.wikimedia.org/T128958#2254613 (Ladsgroup)
[12:28:40] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254643 (Ladsgroup)
[12:28:47] YuviPanda: https://phabricator.wikimedia.org/T134105
[12:28:57] https://meta.wikimedia.org/wiki/Talk:Objective_Revision_Evaluation_Service#Proposed_section
[12:56:26] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254721 (Ladsgroup) Bad words should be in [[https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Word_lists/ta| the page in meta wiki]] very soon.
[12:56:26] [2] https://meta.wikimedia.org/wiki/https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Word_lists/ta
[12:56:43] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254722 (Ladsgroup) a:Ladsgroup
[12:57:04] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254643 (Ladsgroup)
[14:08:24] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254784 (Ladsgroup) It's generated. Now we need a native speaker review bad words :)
[14:08:38] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254785 (Ladsgroup) a:Ladsgroup>None
[14:11:37] Revision-Scoring-As-A-Service, wikilabels: [Investigate] Intermittent performance issues with wikilabels - https://phabricator.wikimedia.org/T130872#2254788 (Ladsgroup) I dug into logs and extracted cases that generating took more than 0.1 sec. P2980 is the result.
[14:24:21] Revision-Scoring-As-A-Service, ores: [Investigate] ORES spike of errored requests every hour - https://phabricator.wikimedia.org/T134109#2254813 (Ladsgroup)
[14:24:43] Revision-Scoring-As-A-Service, ores: [Investigate] ORES spike of errored requests every hour - https://phabricator.wikimedia.org/T134109#2254801 (Ladsgroup)
[15:04:24] o/
[15:05:02] halfak: o/
[15:05:16] Hey dude. Just looking at your post on Talk:ORES
[15:05:27] oh thanks
[15:05:52] do whatever you want to do with it and then put it in the main page
[15:06:03] kk :)
[15:06:08] (I did some stuff today)
[15:10:43] Amir1, cool. Anything else I should have a look at?
[15:11:04] a lot
[15:11:05] :D
[15:11:17] halfak: I made the announcement for Wikidata
[15:11:26] sent it to ai-l and wikidata-l
[15:13:23] Great!
[15:13:28] I haven't looked at my inbox yet :/
[15:14:08] Love the TLDR :)
[15:14:57] :D
[15:17:49] halfak: once you're done: https://phabricator.wikimedia.org/P2980
[15:18:01] these are the cases of intermittent performance issues
[15:19:26] we had two cases that took more than a sec to generate: 1429 msecs and 1194 msecs
[15:19:29] nothing higher
[15:20:14] Gotcha. So you felt it when submitting a label.
[15:20:29] Interesting. I did not expect that *this* would be so bad.
[15:20:40] The changes you made to label creation were merged, right?
[15:20:47] *merged and deployed*
[15:21:05] yeah
[15:21:44] I'm saying it's an issue because generally it takes between 10 and 100 msecs
[15:21:49] (if you check the logs)
[15:22:01] but we have some strange cases like this
[15:22:08] Yeah... It's not surprising that it is generally fast.
[15:22:15] WTF is going on with this postgres server?
[15:22:26] There must be some substantial other usage.
[15:22:33] I wonder if we can set up a Grafana for it.
[15:22:38] Do you know what machine it is?
[15:25:40] halfak: we have Ganglia
[15:25:45] let me find it for your
[15:25:47] *you
[15:26:07] https://ganglia.wikimedia.org/latest/?r=year&cs=&ce=&m=cpu_report&c=MySQL+eqiad&h=labsdb1004.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=NOGROUPS
[15:27:24] "MySQL eqiad"?
[15:31:55] I have no idea why it's named mysql eqiad
[15:32:25] but per what I'm told pgsql.eqiad.wmnet is an alias for labsdb1004.eqiad.wmnet
[15:33:04] halfak: https://ganglia.wikimedia.org/latest/?r=month&cs=&ce=&m=cpu_report&c=MySQL+eqiad&h=labsdb1004.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=NOGROUPS
[15:33:36] if you see, in week 14 it had lots of traffic (that's when the original and big intermittent issues occurred)
[15:33:59] What dates does that correspond to?
[15:34:43] mid-March
[15:34:52] (around the hackathon)
[15:44:14] Amir1, woah that was weird. I had a long lockup. No high CPU usage, but back now.
[15:44:42] strange
[15:44:46] is it old?
[15:44:54] maybe it's just worn out
[15:47:36] The laptop is ~2012. An i5 with 8GB of RAM. I have an original X1 Carbon. It's generally pretty high performance.
[15:47:53] But I'm running 14.04. I've been eyeing upgrading to 16.04 soon.
[15:48:15] I've got some finicky issues that I hope new versions of Gnome 3 and maybe some basic display drivers will fix.
[15:48:34] Intel's got some good open source patterns with regards to their drivers.
[15:49:37] I use 16.04 right now
[15:49:45] but the old one was 14.04
[15:50:17] also it's strange that printers still need drivers
[15:50:28] Seriously. WHAT YEAR IS IT?
[15:51:36] I heard somewhere, "it's official, driverless cars will come before driverless printers"
[15:53:24] Ha! That's a good way to put it.
[15:55:44] heh, a driverless printer saves time while a driverless car saves lives
[16:00:25] GhassanMas: USB 3.0 is also just faster but it's here, "sometimes" time is precious
[16:00:30] :D
[16:03:42] indeed.
mostly, market competition plays a critical role in the development of a particular product
[16:20:21] has the tokenizing been used in any of the models?
[16:58:52] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2254643 (Shanmugamp7) >>! In T134105#2254784, @Ladsgroup wrote: > It's generated. Now we need a native speaker review bad words :) Do you need some help here @Ladsgroup , i don't know what thi...
[17:57:21] Revision-Scoring-As-A-Service, revscoring: Tamil language utilities - https://phabricator.wikimedia.org/T134105#2255113 (Ladsgroup) @Shanmugamp7 Awesome, Thank you! What I need is two lists 1- bad words: lists of words that should not be in anywhere in Wikipedia. You can see list of them for English in...
[17:58:30] afk for a while, to watch spotlight
[19:00:49] o/
[19:00:59] was afk now back and working on reviewing a paper.
[21:59:16] Hi all!
[22:00:04] Amir1 halfak YuviPanda 0/
[22:01:03] hey :)
[22:01:29] I've been away from you all for a good reason although I suppose it seemed I was very enthusiastic about joining the research.
[22:04:25] Hm. Got disconnected for a sec
[22:05:17] Anyway: I was timid about asking a question about what looked like a simple problem but it wasn't so after all. :(
[22:07:00] So here it goes: how do you guys load a file of tsv of mixed type into an array in Python?
[22:08:39] *types: boolean and int
[22:10:13] I mean if this matrix is big I have to load it row by row and multiply booleans by 1 or something like that. Is there a cleaner solution - like some sort of a function in sklearn?
[22:14:03] Because I need an array of values of a single type, not mixed ones, so I can manipulate it further. Right? Or is there something I'm missing? It's way easier in Matlab because you can treat anything as a number - float.
[22:19:59] I guess halfak knows this, so I'll contact him by email.
It seems that I had to pluck up the courage and talk about my problem so publicly so I could continue with this work. :)
[22:21:03] It's just that I've been away from programming for a long time. :/ :)
[22:21:56] Gtg now, hope I will be seeing you more. Bye
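
The question above about loading a TSV file with boolean and integer columns into a single-typed array did not get an answer in the channel. One possible approach is sketched here using only the Python standard library. Everything specific in it is an assumption made for illustration: the column names, the sample data, the spellings in `BOOL_VALUES`, and the helper names `parse_cell` and `load_tsv_as_ints` are hypothetical and do not come from the conversation.

```python
import csv
import io

# Assumed spellings of booleans in the file; adjust to match the real data.
BOOL_VALUES = {"true": 1, "false": 0}


def parse_cell(text):
    """Convert one TSV cell to an int, mapping booleans to 1/0."""
    key = text.strip().lower()
    if key in BOOL_VALUES:
        return BOOL_VALUES[key]
    return int(key)  # raises ValueError on anything that is not an integer


def load_tsv_as_ints(handle, has_header=True):
    """Read a TSV stream row by row into a list of all-integer rows."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the column names
    # Empty lines come back as empty lists and are dropped.
    return [[parse_cell(cell) for cell in row] for row in reader if row]


# Made-up sample standing in for the real file (hypothetical columns).
sample = io.StringIO(
    "is_anon\tbytes_changed\tis_reverted\n"
    "True\t120\tFalse\n"
    "False\t-35\tTrue\n"
)
rows = load_tsv_as_ints(sample)
print(rows)
```

For a real file, the same function can be given `open(path, newline="")` in place of the `io.StringIO` sample. Because every row ends up holding only integers, the result can then be handed to `numpy.array(rows)` if a NumPy array is needed for further manipulation.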