[15:56:28] halfak: o/ I hope you feel better today :)
[16:11:15] (CR) Ladsgroup: [C: -1] "I think it may have some problems with LocalSettings.php. Need to check more carefully" [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup)
[17:59:23] legoktm: hey, do you think it's the proper way to implement arrays in extension.json. line ~60 https://gerrit.wikimedia.org/r/#/c/265676/4/extension.json,cm
[17:59:28] ?
[17:59:59] someone pointed out this bug https://phabricator.wikimedia.org/T121378
[18:00:05] you commented
[18:10:51] Amir1: yeah, that looks right
[18:11:19] but maybe just OresDamagingThresholds: { "soft": 0.5, ... } no need to repeat ores-damaging multiple times :)
[18:11:53] oh yeah
[18:11:57] you're right
[18:12:01] I'll do it
[18:44:09] Hello
[18:45:43] YuviPanda, what's up with scipy. I can't get it to install with pip after these reboots.
[18:49:04] Some "libraries openblass not found", "libraries mkl,vml,guide not found" messages and many more up till "#warning "Using deprecated NumPy API, disable it by " \
[18:49:05] ^
[18:49:05] x86_64-linux-gnu-g++: internal compiler error: Killed (program cc1plus)".
[18:49:22] I googled it but I can't find a solution.
[18:49:43] *googled _them_
[18:57:30] o/
[18:57:38] o/
[18:58:11] o/ halfak :)
[18:58:23] I hope you feel better today
[18:58:29] pipivoj, can you paste the exact error?
[18:58:39] Amir1, I do. Not much time to hack though. Just an hour
[18:58:57] it's okay :)
[18:59:03] stderr or stdout?
[18:59:23] I made some notes in the work log
[18:59:23] pipivoj?
[18:59:26] stdout is quite large
[18:59:49] well, give me the part that is an error :P
[19:00:13] halfak: vagrant patch is merged so people can simply use ORES by "enable role ORES"
[19:00:19] There goes stderr:
[19:00:21] Command "/srv/paws/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-mwanrqr1/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-2etdf_rv-record/install-record.txt --single-version-externally-managed --compile --install-headers /srv/paws/include/site/python3.4/scipy" failed with error code 1 in /tmp/pip-build-mwanrqr1/scipy
[19:01:01] And this is the last part of stdout:
[19:01:02] In file included from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/ndarraytypes.h:1781:0,
[19:01:02] from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
[19:01:02] from scipy/sparse/sparsetools/sparsetools.h:5,
[19:01:03] from scipy/sparse/sparsetools/bsr.cxx:4:
[19:01:04] /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
[19:01:10] #warning "Using deprecated NumPy API, disable it by " \
[19:01:12] ^
[19:01:14] x86_64-linux-gnu-g++: internal compiler error: Killed (program cc1plus)
[19:01:16] Please submit a full bug report,
[19:01:18] with preprocessed source if appropriate.
[19:01:20] See for instructions.
[19:01:22] Amir1, great! So that will start up a devserver? And how do we deploy a new version of ORES?
[19:01:22] error: Command "x86_64-linux-gnu-g++ -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -D__STDC_FORMAT_MACROS=1 -Iscipy/sparse/sparsetools -I/srv/paws/lib/python3.4/site-packages/numpy/core/include -I/usr/include/python3.4m -I/srv/paws/include/python3.4m -c scipy/sparse/sparsetools/bsr.cxx -o build/temp.linux-x86_64-3.4/scipy/sparse/sparsetools/bsr.o" failed with exit sta
[19:01:27] tus 4
[19:01:29]
[19:01:35] pipivoj, ahhh. Paste service like pastebin.com please :P
[19:01:42] ok
[19:02:09] it uses ores.wmflabs.org
[19:02:28] (the testwiki by default)
[19:03:28] Amir1, oH! That'll work.
[19:04:19] we can change everything in LocalSettings.php
[19:07:06] halfak: btw. It's break between semesters, I've nothing to do except ORES
[19:07:56] This'll help with the KDD writing. IME, it's easier to write when it is my primary creative outlet.
[19:08:42] halfak: http://pastebin.ca/3348621
[19:08:43] yeah, exactly
[19:08:59] I'm planning to get the KDD article to somewhere
[19:09:55] pipivoj, not seeing the error. It seems like the end is cut off.
[19:10:15] If you could give me /just the error/ -- not the compiling output or warnings -- that would be most helpful.
[19:10:39] Well I saved the stderr to a different file and already posted it here. Here, I'll paste it again
[19:10:52] Command "/srv/paws/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-mwanrqr1/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-2etdf_rv-record/install-record.txt --single-version-externally-managed --compile --install-headers /srv/paws/include/site/python3.4/scipy" failed with error code 1 in /tmp/pip-build-mwanrqr1/scipy
[19:11:48] ^ that's not enough. That line just says that there was an error.
[19:11:55] Hm
[19:12:04] How do I get more?
[19:12:28] I did "pip install scipy > stdout.txt 2> stderr.txt"
[19:13:15] pipivoj, something is wrong with your pasting. I don't see the error you pasted into the chat in your pastebin.ca
[19:16:01] well it's the first thing I thought it was error: line 124
[19:16:25] it's not openblass it was openblas :(
[19:17:16] 124 has "warning: ‘elsize’ may be used uninitialized in this function"
[19:18:06] Woops. No, it has "Block: zptsv"
[19:18:57] Could you try copy-pasting the last 100 lines of output that appear in the terminal?
[19:19:02] pipivoj, ^
[19:19:49] http://pastebin.ca/3348621 on line 124 is libraries openblas not found in ['/srv/paws/lib', '/usr/local/lib', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
[19:19:52] ok I will
[19:20:37] #warning "Using deprecated NumPy API, disable it by " \
[19:20:37] ^
[19:20:37] In file included from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/ndarrayobject.h:27:0,
[19:20:37] from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/arrayobject.h:4,
[19:20:38] from scipy/sparse/csgraph/_min_spanning_tree.c:242:
[19:20:42] /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:1634:1: warning: ‘_import_array’ defined but not used [-Wunused-function]
[19:20:45] _import_array(void)
[19:20:47] ^
[19:20:49] x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.4/scipy/sparse/csgraph/_min_spanning_tree.o -Lbuild/temp.linux-x86_64-3.4 -o build/lib.linux-x86_64-3.4/scipy/sparse/csgraph/_min_spanning_tree.cpython-34m.so
[19:20:54] building 'scipy.sparse.csgraph._reordering' extension
[19:20:56] compiling C sources
[19:20:58] C compiler: x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC
[19:21:01]
[19:21:05] compile options: '-I/srv/paws/lib/python3.4/site-packages/numpy/core/include -I/srv/paws/lib/python3.4/site-packages/numpy/core/include -I/usr/include/python3.4m -I/srv/paws/include/python3.4m -c'
[19:21:08] x86_64-linux-gnu-gcc: scipy/sparse/csgraph/_reordering.c
[19:21:12] In file included from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/ndarraytypes.h:1781:0,
[19:21:15] from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
[19:21:18] from /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/arrayobject.h:4,
[19:21:21] from scipy/sparse/csgraph/_reordering.c:242:
[19:21:23] /srv/paws/lib/python3.4/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_
[19:21:44] Please don't paste in IRC
[19:22:11] (1) I can't read it because the formatting is all ruined and (2) it fills our logs with nonsense.
[19:31:09] ok sorry
[19:32:55] http://pastebin.ca/3348739
[19:39:05] Sorry for delay, my kid was ready for bath so I had to bathe her.
[19:46:49] pipivoj, again the last line ends with "[?25h"
[19:46:56] It doesn't look like the full error.
[19:47:09] Line 119
[19:47:59] Ok I'll try install it again and see what I get.
[19:49:28] http://socio-technologist.blogspot.com/2016/01/notes-on-writing-wikipedia-vandalism.html
[19:49:31] Amir1, ^
[19:49:41] Will want to do better for the paper, but those are some thoughts.
[19:51:07] thanks halfak
[19:51:16] blogpost is blocked in Iran
[19:51:19] So, it turns out that "recall" has a name, but I'm not quite sure what TP+FR/ALL would be called.
[19:51:23] Amir1, lolwat
[19:51:29] * halfak emails a PDF
[19:51:32] it may take some time until I get a workaround
[19:51:35] thanks
[19:51:36] :)
[19:52:24] Amir1, I think I want to spend some time with the thresholds I proposed.
[19:52:54] halfak: FR?
[19:52:54] I think that ClueBot doesn't measure "False-positive rate" or "Recall" the way I expect.
[19:53:07] Sorry that was supposed to be FP
[19:53:11] False-positives
[19:53:40] Review-proportion = all true predictions / all obs. when recall is fixed at 95%.
[19:54:11] We might drop that recall value. We can, after all, put some review burden on people and their watchlists.
[19:54:55] Maybe we should do coloring in ScoredRevisions based on these recall bands.
[19:55:03] E.g. 95% is light yellow
[19:55:27] Meh. That's not as intuitive
[19:55:36] I need to think about that more.
[19:56:29] 95% light yellow, 75% orange, 50% red. Around 50%, we should be getting to the things that can be auto-reverted by bots.
[19:57:11] we can examine with different thresholds
[19:57:17] and see what happens
[19:57:40] halfak: I already got some people who can help with reviewing the new datasets for fa.wp
[19:57:51] tell me when you deployed a new changeset
[19:57:53] Amir1, +1. So one could say, "It's OK if I only find 50% of vandalism, but I want all of the stuff I review to be vandalism"
[19:58:08] Amir1, a new campaign?
[19:58:09] Sure!
[19:58:34] Email sent
[19:58:38] you told me that the number is not enough
[19:59:20] +1, unless we do some re-weighting of the input set.
[19:59:24] I need to explore that.
[19:59:38] Right now, none of the items in fawiki's test set are predicted true.
[19:59:48] halfak it looks it's the same stuff but without the "[?25h"
[20:00:26] Dunno if I should redirect again to a text file.
[20:00:37] halfak: regarding the threshold, that's my main reason to implement "soft" and "hard" thresholds for the extension
[20:01:53] "internal compiler error: Killed (program cc1plus)" This makes me wonder if it was sent an interrupt.
[20:02:06] Amir1, +1 Seems to work well for scored revisions.
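A minimal sketch of the recall-band idea from the messages above, assuming a list of model scores with matching true/false "damaging" labels. The function names, the sample data and the exact band cut-offs are hypothetical illustrations, not ORES or ScoredRevisions code. It picks the score threshold that reaches a target recall, reports the review proportion (share of all observations flagged at that threshold), and maps a score to a color band.

```python
import math


def threshold_for_recall(scores, labels, target_recall):
    """Return the highest score threshold whose recall is >= target_recall."""
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not positives:
        raise ValueError("need at least one positive observation")
    # To catch k positives we must accept every score at least as high
    # as the k-th highest positive score.
    k = max(1, math.ceil(target_recall * len(positives)))
    return positives[k - 1]


def review_proportion(scores, threshold):
    """Share of all observations that would be flagged for review."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


def color_band(score, bands):
    """Map a score to a color; bands are (threshold, color), strictest first."""
    for threshold, color in bands:
        if score >= threshold:
            return color
    return None  # below every band: no highlight


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    bands = [
        (threshold_for_recall(scores, labels, 0.50), "red"),
        (threshold_for_recall(scores, labels, 0.75), "orange"),
        (threshold_for_recall(scores, labels, 0.95), "light yellow"),
    ]
    for threshold, color in bands:
        print(color, threshold, review_proportion(scores, threshold))
```

Deriving the bands from recall targets rather than hard-coding score cut-offs means the colors keep their meaning if the underlying score distribution changes.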
[20:03:11] pipivoj, looks like you might be running out of memory
[20:03:15] http://stackoverflow.com/questions/19595944/trouble-installing-scipy-in-virtualenv-on-a-amazon-ec2-linux-micro-instance
[20:03:33] YuviPanda, could comment better. I'm not sure how much memory is available to a PAWS instance.
[20:03:47] But you shouldn't need to install scipy now that I think of it.
[20:03:57] It should already be installed for revscoring to work.
[20:09:56] * pipivoj quitting - will be back
[20:10:01] o/ pip
[20:10:29] So, I don't see a good way to balance the obs. I could use the same strategy that we use in SVC or we could just adjust how we think of probability.
[20:10:35] What do you think Amir1?
[20:11:07] If suddenly 50% probability corresponded to a much higher precision, would that be a big problem?
[20:12:02] I don't think we should give percentage to people
[20:12:05] Either way, it seems that it would be most desirable if tools like the ORES extension and ScoredRevisions could read a machine-readable endpoint to find out where to set its thresholds.
[20:12:21] So that we can safely change the underlying scores.
[20:12:28] halfak: hmm, installing scipy yourself into PAWS is going to be painful
[20:12:36] I should just install sklearn and scipy into it myself shouldn't I
[20:12:42] we should find a way to balance that and just let users decide on an intuitive "sensitivity meter"
[20:12:46] YuviPanda, +1
[20:12:48] I don't have revscoring installed into it yet
[20:12:59] so I guess I'll just do that now
[20:13:03] * YuviPanda goes to find that ticket
[20:13:10] Amir1, so, the only reason I can see to balance it is if we *are* going to show it to people.
[20:13:28] show percentage?
[20:13:32] Yeah.
[20:13:35] oh no, it's really bad idea
[20:13:49] So leave it unbalanced for now?
[20:13:55] yeah
[20:13:58] that's my idea
[20:14:07] This means that anything that is relying on our percentage to be consistent is going to be sad.
[20:14:22] E.g. if you still have that bot on fawiki, the score threshold is going to change.
[20:14:27] You might even set it at 50%
[20:15:25] we should let programmers that use our API know about changes
[20:15:32] maybe by sending email to AI-l
[20:15:50] Yeah. I'm thinking that this will be a rough switch even if we give lots of notice. We need a more stable output.
[20:15:52] Hmmm.
[20:16:32] https://phabricator.wikimedia.org/T120317#1933458
[20:16:33] hmm
[20:16:36] so
[20:16:38] I should install
[20:16:43] 'revscoring' and 'wikiclass' modules
[20:16:55] here is my idea: we need to have some developers that built upon our tools and then we have users that they have no idea about how ORES works,
[20:17:07] the users of tools should not get any percentage from ORES
[20:17:21] Amir1, +1 to that.
[20:17:39] But we might want them to get a recall or precision -- just not the "probability" field.
[20:17:40] they should just get "colors" or radio buttons of "soft and hard" threshold in their preferences
[20:18:11] we should give "precision", "recall" to devs
[20:18:13] not users
[20:18:29] Amir1, users could think effectively about these values, I think.
[20:19:00] mostly they don't know about precision or recall
[20:19:04] E.g. for a given score you can say (This has a X% chance of actually being vandalism)
[20:19:07] I talked to most of them in my wiki
[20:19:21] Right now, our probability doesn't really do that, but it could.
[20:20:01] I see
[20:20:13] in that case we might need scaling
[20:20:28] normal scaling
[20:20:35] turning these numbers to z scores
[20:20:36] So, I think that is *exactly* what we are getting with the Gradient Boost model.
[20:21:03] We've been living with crazy scaling issues with SVCs
[20:21:24] Moving away from support vectors should make these probability estimates a lot saner.
[20:22:35] +1
[20:22:39] let's give it a try
[20:23:44] OK. We'll be able to test out in staging first either way.
[20:24:40] * halfak continues work on editquality models.
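A rough sketch of what a machine-readable threshold table like the one proposed above might contain, assuming scored test observations with true/false labels. The function names, the JSON field names and the sample data are hypothetical; no such ORES endpoint is described in this log. It computes precision and recall at each candidate score threshold so a tool could choose its cut-off by precision or recall instead of by raw probability.

```python
import json


def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is predicted true."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    # None marks an undefined ratio (nothing predicted true, or no positives).
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


def threshold_table(scores, labels, thresholds):
    """Build a list of rows that can be serialized for client tools."""
    rows = []
    for t in thresholds:
        precision, recall = precision_recall_at(scores, labels, t)
        rows.append({"threshold": t, "precision": precision, "recall": recall})
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(json.dumps(threshold_table(scores, labels, [0.5, 0.85]), indent=2))
```

A client that reads such a table would keep working when the model behind the scores changes, because it selects a row by the precision or recall it wants rather than by a fixed score.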
[20:24:49] Will load up an fawiki campaign after this is done.
[20:25:53] thanks
[20:25:54] :)
[20:27:55] Our goodfaith classifier in fawiki kicks ass.
[20:28:25] yay
[20:28:34] we have lots of people working on it
[20:28:46] ORES has lots of fans in fa.wp
[20:28:47] At 50%, we get 93% recall and 99.9% precision
[20:29:16] The damaging classifier isn't as good. Maybe the vandal bot should target the goodfaith model instead :)
[20:30:45] Had to restart my modem.
[20:31:50] Halfak I had to install scipy before - before the reboots - because I couldn't use sklearn, it said it needed scipy
[20:31:57] The same again
[20:32:39] Looks like YuviPanda is going to work on it.
[20:32:56] He'll get the dependencies for revscoring and that includes sklearn
[20:33:05] yeah
[20:33:06] so
[20:33:17] that's the revscoring and editquality pip packages right?
[20:33:30] pipivoj: if I install it in the base everyone'll get it automatically
[20:33:43] that would be great
[20:34:17] halfak: can you confirm it's just the revscoring and editquality pip packages? :D sorry if you already did and I missed it
[20:34:24] Then my notebooks wouldn't complain if reboot happened.
[20:34:25] YuviPanda, +1 for pip
[20:34:54] ok
[20:34:57] let me do
[20:36:25] ok so
[20:36:26] I'm installing
[20:36:43] revscoring editquality wikiclass
[20:36:49] pipivoj: would you like any other libraries?
[20:36:51] sklearn-nn?
[20:36:57] yes
[20:37:07] Thanks YuviPanda :)
[20:37:09] its name is sklearn-neuralnetwork
[20:38:02] YuviPanda, if you have a minute when you are done, could you check out the staging and web nodes. They are running the "uwsgi" service rather than the "uwsgi-ores-web" service.
[20:38:15] This messes up our deploy scripts and it seemed weird enough to have you take a look.
[20:38:26] I'm worried puppet is doing something weird.
[20:38:37] halfak: I responded yesterday
[20:38:43] Oh! I missed it.
[20:38:44] you're right re: reboot
[20:38:53] I killed the uwsgi service and started uwsgi-ores-web
[20:39:01] and since we had the lb it was fine to do
[20:39:06] haven't on staging
[20:39:11] I need to fix puppet to not do that anymore
[20:39:21] Great. OK. I'll get the task made.
[20:39:28] yeah +1
[20:39:44] not too hard, just ensure that the 'uwsgi' service is dead everywhere
[20:40:11] +1
[20:40:41] pipivoj: am doing the install / builds now
[20:41:06] great
[20:41:32] https://phabricator.wikimedia.org/T124621
[20:41:35] YuviPanda, ^
[20:41:52] thanks halfak
[20:42:06] we're using assignment in the labs team to track some stuff, so am leaving this unassigned
[20:42:18] plus I hope to convince madhuvishy to do this instead of me :D
[20:42:41] I'm happy to bug madhu about it. Could be a fun problem for schana too.
[20:42:49] I've got to run now though.
[20:42:56] Gotta "go outside" and "be a human"
[20:42:57] Psssh
[20:42:58] o/
[20:43:40] I should probably do those things too eh
[20:46:29] Have to reset my modem again. Brb.
[20:47:18] installation is chugging along
[20:47:23] one of the negatives of having a low powered CPU
[20:47:28] is that docker builds take forever
[20:52:21] Back :)
[20:57:19] cool
[20:57:30] pipivoj: it's still building, going to be at least 30mins maybe more
[20:59:21] Ok, I was just wondering what happened cause it was faster with my "user" install.
[21:00:26] Ah, of course you are building it I was merely installing it. :)
[21:00:45] pipivoj: yeah, this is rebuilding the entire docker image
[21:00:55] and since it isn't in my cache anymore it's rebuilding *all the libraries*
[21:40:34] (PS5) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684)
[21:40:50] (CR) jenkins-bot: [V: -1] Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup)
[21:46:45] (CR) Hoo man: "All other changes look good to me." (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[22:23:06] (PS6) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684)
[22:55:51] (CR) Ladsgroup: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[23:06:59] (CR) Ladsgroup: [C: 2] Update authors += ladsgroup [extensions/ORES] - https://gerrit.wikimedia.org/r/264607 (owner: Awight)
[23:07:48] :D
[23:07:57] (Merged) jenkins-bot: Update authors += ladsgroup [extensions/ORES] - https://gerrit.wikimedia.org/r/264607 (owner: Awight)
[23:15:06] (CR) Hoo man: "re" (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[23:44:51] YuviPanda, still no sign of scipy. Nevermind, will try tomorrow. Bye