[00:00:11] what happens if you rm -rf ~/.env and start over again? [00:00:21] can you try the pip install -r method? [00:05:11] fajne: Another approach you can take in parallel, you'll probably have a much easier time working on ores-misc-01 [00:05:24] ores-misc-01.eqiad.wmflabs [00:06:14] you can have custom code there as well, and the environment is not as exotic. [00:06:30] i am in ores-misc [00:06:34] aha [00:06:58] compute was deleted, wasn't it? [00:07:04] yep [00:08:12] so yeah, i am in misc, in virtualenv, i ran one make target okay, but another one doesn't want to work) [00:09:06] I think your diagnosis is correct [00:09:22] sounds like python dependency hell. [00:09:28] now i am trying to install revscoring, ores and what not in wiki-ai repo [00:10:18] lemme do the same steps as you... [00:11:10] git clone https://github.com/wiki-ai/editquality.git [00:12:20] * awight taps foot [00:12:40] ok, i installed enchant [00:12:56] virtualenv ~/.env -p python3 [00:12:58] cool. [00:13:09] I was shying away from suggesting that, cos it needs to be a specific version [00:13:17] (>= 1.6.6, < 1.6.999) [00:13:52] source ~/.env/bin/activate [00:14:10] ok now rather than run setup, I'll try pip [00:14:15] specific ver of enchant? [00:14:21] cd editquality [00:14:22] yeah [00:14:24] i just used pyenchant [00:14:28] pip install -r requirements.txt [00:14:36] now nltk stopwords is the problem [00:14:37] * awight facepalms [00:14:42] omg it's building numpy [00:14:46] oh good! [00:14:52] there's a command for that... [00:15:16] python -m nltk.downloader stopwords [00:15:29] i wonder how many packages i have to install.. it may very well take the entire day. already took actually [00:15:36] thanks! [00:15:53] it's best for sanity to run this stuff under "screen" btw [00:16:11] that way you can reconnect if the internet or your laptop goes away [00:17:26] looks like we'll have to install the Russian dict as root. [00:17:57] k it's already on the box. [00:18:22] yep.
at least doesn't ask yet [00:18:54] it's thinking............................................. [00:19:28] nice. the pip install -r thing I was recommending fell flat on its face. [00:19:35] ImportError: No module named 'numpy' [00:19:52] scipy couldn't be built. [00:19:53] this one i've already installed [00:20:30] numpy, scipy and sklearn are usual suspects [00:21:49] readme of revscoring suggests some commands to run under debian, but the second one gave me an error [00:23:23] Which command? [00:23:41] Run apt-get install aspell-ar aspell-bn myspell-cs myspell-nl myspell-en-us myspell-en-gb myspell-en-au myspell-et voikko-fi myspell-fr myspell-de-at myspell-de-ch myspell-de-de myspell-he myspell-hu aspell-id myspell-it myspell-nb myspell-fa aspell-pl myspell-pt myspell-es aspell-sv aspell-ta myspell-ru myspell-uk hunspell-vi [00:24:13] Shouldn't hurt anything. myspell-ru was installed successfully, I think that's all you need. [00:24:14] wait, with sudo it actually worked. but still did not solve the pyenchant problem [00:24:49] now it's past [00:27:04] Interesting. I just had a breakthrough moment. [00:27:08] pip install wheel [00:27:11] pip install --upgrade pip [00:27:24] after that, I'm able to pip install -r requirements.txt and it uses pre-built wheels for everything. [00:29:02] hm [00:29:27] It sounds like you're past that point, but I'll be sure to add this to our docs [00:30:06] and i'll be sure to ask advice next time earlier [00:30:28] lol it's always reassuring to see other people fail in the same way [00:38:38] fajne: you see lots of dots? [00:38:55] exactly [00:39:12] and d sometimes [00:39:36] d: [00:39:43] my job here is done [00:39:56] hopefully you're in screen cos that extract step takes ages [00:40:13] i am not sure... [00:40:20] gtg but I'll be around tomorrow. good luck! [00:40:29] anyway, we cannot change anything right now [00:40:38] thanks! 
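For reference, the commands that ended up working in the session above can be collected into one script. This is only a minimal sketch under assumptions, not project documentation: it assumes a Debian-like host where `git`, `virtualenv`, enchant and the myspell-ru dictionary are already installed, and the `run` helper, the `setup_editquality` function and the `DRY_RUN` variable are hypothetical names added here so that the script prints the commands by default instead of executing them.

```shell
#!/usr/bin/env bash
# Recap of the commands from the chat, in the order they were reported to work.
# DRY_RUN, run() and setup_editquality() are illustrative helpers only;
# they are not part of editquality or revscoring.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"    # dry run: only show the command
    else
        "$@"           # real run: execute in the current shell
    fi
}

setup_editquality() {
    run git clone https://github.com/wiki-ai/editquality.git &&
    run virtualenv "$HOME/.env" -p python3 &&
    run source "$HOME/.env/bin/activate" &&
    # Installing wheel and upgrading pip first is what let pip pick up
    # pre-built wheels instead of compiling numpy and scipy.
    run pip install wheel &&
    run pip install --upgrade pip &&
    run cd editquality &&
    run pip install -r requirements.txt &&
    run python -m nltk.downloader stopwords
}

setup_editquality
```

Run as-is, the script only lists the eight steps; setting `DRY_RUN=0` would execute them, stopping at the first failure. As suggested in the chat, a real run is best started inside `screen` so the session survives a dropped connection.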
[00:40:53] Scoring-platform-team-Backlog, Documentation: Improve documentation about how to install ORES - https://phabricator.wikimedia.org/T170506#3433863 (awight) [00:45:31] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3433884 (awight) Update after discussing with @halfak: I'm going to train our trial model using the FR approved revisions, plus th... [00:45:40] Scoring-platform-team, Edit-Review-Improvements-RC-Page, MediaWiki-extensions-ORES, Wikidata, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Support ORES for propagated Wikidata edits - https://phabricator.wikimedia.org/T158025#3433885 (Mattflaschen-WMF) [00:45:56] Scoring-platform-team, Edit-Review-Improvements-RC-Page, MediaWiki-extensions-ORES, Wikidata, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Support ORES for propagated Wikidata edits - https://phabricator.wikimedia.org/T158025#3023878 (Mattflaschen-WMF) [14:02:11] Arg! We need to move this meeting time. [14:02:30] * halfak waits for awight to join a 7AM PDT meeting :S [14:11:01] you have meetings at 7am [14:14:11] Scoring-platform-team-Backlog, Wikidata: Impact of ORES on Wikidata: time-to-revert changes - https://phabricator.wikimedia.org/T141896#3435450 (Halfak) [14:52:20] halfak: Async conversation about ethical AI conversation... [14:52:46] halfak: I have a random, ad-hoc thought about how to approach this [14:53:11] but ggellerman has much better, systematic suggestions for how to do this bit.
[14:54:06] So, my instinct is to find and read a bunch of the literature, maybe giving myself some short and finite deadline [14:54:19] then ask like 100 questions in an etherpad [14:54:41] this seems like the bare minimum homework to "facilitate" [14:55:10] when I send out prompts however, the intro will be intentionally non-prescriptive [14:55:21] like, I won't be asking 100 leading questions. [14:56:11] Grace was suggesting wikimedia-l, which gave me a start [14:57:31] I'd be concerned about people feeling unsafe participating there--but on the other hand, diluting drama with fun convos might be positive [15:00:09] gtg commute [15:38:30] Scoring-platform-team, draftquality-modeling, artificial-intelligence: Test draftquality sentiment feature on Editquality - https://phabricator.wikimedia.org/T170177#3436032 (Sumit) [16:03:31] arg! had a meeting and now I'm biking to U. Will read scrollback and respond when I get there [16:03:32] o/ [16:38:56] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis - https://phabricator.wikimedia.org/T164331#3436361 (Catrope) @awight: A couple weeks a... [16:39:32] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Add a new config setting to enable ORES UI features - https://phabricator.wikimedia.org/T170500#3436364 (Catrope) Dupe of {T167908} ? At least its patch does exactly what this t...
[16:59:44] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Add a new config setting to enable ORES UI features - https://phabricator.wikimedia.org/T170500#3436451 (awight) [16:59:48] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3436454 (awight) [17:06:34] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3436494 (awight) Yes, that's perfect! My curren... [17:07:11] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Add a new config setting to enable ORES UI features - https://phabricator.wikimedia.org/T170500#3436501 (awight) [17:07:15] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3436502 (awight) [17:07:17] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis - https://phabricator.wikimedia.org/T164331#3436500 (awight) [17:14:48] o/ [17:15:00] Ended up joining a meeting half-way through my ride. [17:15:04] That was interesting. [17:15:10] And now my phone is almost dead.
[17:49:40] \ [17:49:42] }}} [17:49:42] }> [17:49:43] [TXT] 20150729.txt 2017-03-01 12:07 7.5K [17:49:43] [TXT] 20150730.txt 2017-03-01 12:07 40K [17:49:45] [TXT] 20150731.txt 2017-03-01 12:07 4.4K [17:56:47] tiny hands attack [17:59:42] halfak: I found something juicy--my OED has a few definitions of "ethical" meaning "concerned with the science of ethics", but none that include a virtue judgment, "succeeds in being consistent with some system of ethics". [18:01:37] IMO that would mean that there is no "unethical" or at least it's a neologism, and the opposite of ethical is a-ethical. anethical? [18:02:02] amoral. [18:06:46] funky. https://trends.google.com/trends/explore?date=all&q=unethical,ethical,ethics [18:08:16] school semesters? parliament somethings? [18:10:53] https://books.google.com/ngrams/graph?content=ethics%2C+ethical%2C+unethical&case_insensitive=on&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t4%3B%2Cethics%3B%2Cc0%3B%2Cs0%3B%3Bethics%3B%2Cc0%3B%3BEthics%3B%2Cc0%3B%3BETHICS%3B%2Cc0%3B.t4%3B%2Cethical%3B%2Cc0%3B%2Cs0%3B%3Bethical%3B%2Cc0%3B%3BEthical%3B%2Cc0%3B%3BETHICAL%3B%2Cc0%3B.t4%3B%2Cunethical%3B%2Cc0%3B%2Cs0%3B%3Bunethi [18:10:59] cal%3B%2Cc0%3B%3BUnethical%3B%2Cc0 [18:13:22] "moral" is six times more prevalent than "ethic*" in books, but roughly equal in web searches [18:17:02] (what's the copyright on google graphs?) [18:26:34] ethic_INF,unethic_INF,moral_INF ftw [18:43:17] awight, I think I want to somehow step back from the word "moral" because I'm not sure there's a wrong so much as a better good to strive for. [18:43:31] * halfak tries to think of better words [18:44:45] One thing I like about "the science of ethics" is that I feel it describes one of our key efforts -- identifying what Right(TM) is and building a framework to go in that direction. 
[18:44:55] Maybe we could focus on the term "best practices" [18:45:03] Or "human-focused" [18:46:20] +100 let's not use "moral", but I wanted to look at its cultural currency [18:47:03] gotcha. [18:56:28] https://books.google.com/ngrams/graph?content=ethical+*_NOUN&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t2%3B%2Cethical%20%2A_NOUN%3B%2Cc0%3B%2Cs0%3B%3Bethical%20principles_NOUN%3B%2Cc0%3B%3Bethical%20standards_NOUN%3B%2Cc0%3B%3Bethical%20theory_NOUN%3B%2Cc0%3B%3Bethical%20system_NOUN%3B%2Cc0%3B%3Bethical%20issues_NOUN%3B%2Cc0%3B%3Bethical%20values_NOUN%3B%2Cc0%3B%3Bethical% [18:56:34] 20problems_NOUN%3B%2Cc0%3B%3Bethical%20code_NOUN%3B%2Cc0%3B%3Bethical%20considerations_NOUN%3B%2Cc0%3B%3Bethical%20life_NOUN%3B%2Cc0 [19:00:35] Our AI doesn't seem to bump up against much in this synthesis, https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Values [19:01:17] I was curious what that would look like... and now wondering how we might eventually anchor our "reference" system of ethics [19:02:33] Anything in Sandvig that is useful there? [19:02:54] Maybe in http://www-personal.umich.edu/~csandvig/research/Auditing%20Algorithms%20--%20Sandvig%20--%20ICA%202014%20Data%20and%20Discrimination%20Preconference.pdf [19:07:13] Yeah the Sandvig was a great place to start. On the practical tip, they recommend analyzing using (deontological) given rules, and by reasoning about consequences. [19:07:23] I'll read that next one, too. [19:07:54] We could look into moral justifications of laws against disparate impact. [19:08:35] * halfak waits forever to download the editquality repo [19:08:37] *sigh* [19:08:38] I was looking around for potential intellectual authorities for our rules... always an interesting question. [19:08:45] This repo is dumb. 
I want to use gitlfs [19:08:47] >:( [19:09:15] I'm just about to add a ton more model files to it too [19:09:16] https://en.wikipedia.org/wiki/Ten-Point_Program [19:10:14] I'm almost serious in trotting that out to contrast with what we have to work with here [19:10:40] Hmm... we could do a "What we believe" [19:10:51] I don't think we should let all of the vandals out of jail though ;) [19:11:30] lol maybe jail is not improving their lives or ours [19:12:05] +1 the models need not destroy bandwidth around the world [19:13:03] halfak: Do you happen to have any entry points for understanding the world of corporate ethics? [19:13:24] * halfak immediately falls asleep [19:13:38] the pillow bursts into flames [19:13:54] the allegory of the ethicist [19:14:06] This looks a bit rarefied, http://www.journals.uchicago.edu/toc/et/current [19:16:01] ugh. I have to do this in tiny bites [19:16:13] Don't get lost. [19:16:15] It's deep. [19:16:24] There's a ton of practical things we can do right now :) [19:16:36] * halfak likes practical goods [19:16:55] ++ [19:17:16] * halfak builds models for tawiki, elwiki, and bnwiki [19:17:54] loll [19:19:07] Do you agree that the common usage of "ethical" seems to be "does the right things" rather than "thinks about issues of ethics"? [19:19:21] It's weird, cos I cannot prove that to myself using the internet [19:19:55] the n-gram books results especially make it look like "ethical" is only used to denote concern for ethics, in formal writing at least. [19:20:01] http://www.ethicssage.com/2015/01/are-you-an-ethical-person.html [19:20:12] * awight shudders [19:27:32] "ethical person" occurs 1/100th as often as "ethical issues" [19:28:32] Indeed. I still don't really see a problem with meaning "the science of ethics" when we say "ethical" though. [19:28:54] We don't know what Right is or how to achieve it, but we're going to be thinking about it and trying things.
[19:29:22] Still, I'd be totally cool with human-focused or "best practices" [19:30:03] fajne, how's the work going? [19:30:14] Were you able to log into ores-misc-01? [19:30:23] sure [19:31:40] making tuning_reports with max_features=null [19:31:55] halfak: Yep I don't see a problem either, sorry if I've changed my position since the last outburst :) [19:31:56] it has been running since 8.30 am [19:32:12] fajne, nice! Was just thinking about those tuning reports. [19:32:29] ... but I want to know what people are primed to hear w.r.t. these words [19:32:36] awight, that's a good point. [19:33:04] fyi https://trends.google.com/trends/explore?date=all&q=ethical%20issues,ethical%20person [19:33:22] Still baffled by the seasonal variations [19:33:57] https://trends.google.com/trends/explore?date=all&q=ethical%20issues,ethical%20person,unethical%20behavior,ethical%20behavior [19:34:16] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3437413 (Catrope) >>! In T167908#3436494, @awigh... [19:34:38] just curious, what so ethical/moral are you guys trying to write? [19:34:50] halfak: yah that's what I'm seeing [19:35:27] fajne: o/ I'm doing a tiny bit of background reading, to start a public thread about how we should present our work as engaging with ethics [19:35:52] fajne, specifically with making the nature and processes around ORES transparent and auditable. [19:41:58] 8D [19:43:24] halfak: I can ping you on the task if you prefer, but wanted to highlight this for you, https://phabricator.wikimedia.org/T167908#3437413 [19:43:30] PROBLEM - puppet on ores-web-04 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [19:44:20] actually. I feel pretty comfortable about my understanding of our chat. Replying now.
[19:46:19] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q4-Apr-Jun-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3437430 (awight) @Catrope Cool. I think I can p... [19:46:58] cool [19:47:00] Thanks [19:57:34] will slowly dump notes into: https://etherpad.wikimedia.org/p/ORES_and_ethics [20:13:57] RECOVERY - puppet on ores-web-04 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [20:33:59] nap time. [21:32:29] wiki-ai/revscoring#1098 (codez266-sqassets - 2db0a06 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/253391317 [21:32:50] That's right :) [21:32:54] * halfak nods decisively [21:40:26] * halfak watches fajne's tuning jobs chugging away :) [21:43:37] randomly curious, have you looked into doing tuning on hadoop? I've found great benefit in being able to train 100 models in parallel [21:44:35] ebernhardson, I looked into it briefly. It would boost us quite a bit. [21:56:34] fajne, how many models are you tuning? [22:33:00] OK so I need to patch old 1.3.x code so I'm going to work from that branch [22:33:03] This is dumb. [22:33:15] I guess eventually we'll just rebase master onto the 1.3.x branch [22:33:17] >:( [22:35:14] wiki-ai/revscoring#1104 (1.3.x - 87f745f : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/253409321 [22:37:01] wiki-ai/revscoring#1105 (1.3.x - 25d848b : halfak): The build failed. https://travis-ci.org/wiki-ai/revscoring/builds/253409796 [22:37:09] hush travis [22:37:14] I already fixed that [22:39:33] wiki-ai/revscoring#1106 (1.3.x - 540d665 : halfak): The build was broken. https://travis-ci.org/wiki-ai/revscoring/builds/253410549 [22:39:49] you're a liar :P [22:39:52] Oh! I know [22:41:12] there. chill dude [22:44:22] wiki-ai/revscoring#1107 (1.3.x - 0fcea15 : halfak): The build was fixed.
https://travis-ci.org/wiki-ai/revscoring/builds/253411641 [22:49:57] OK new models building. :) [22:50:00] I'm off [22:50:02] o/ [22:50:05] have a good one! [23:23:24] halfak: sorry, missed your question. I started doing all but realized it would take forever and focused on ruwiki only. And it still takes forever. [23:25:59] ebernhardson: that's a great idea. I was having a similar thought, but without the layer of standardization... that we should somehow be able to queue lots of these jobs which take forever. [23:26:30] So you're already doing this for search? [23:35:10] I see you mentioning "xgboost" in Phabricator tasks... [23:44:42] As for scikit-learn integration, it looks like we can easily export trained scikit models to PMML and run on hadoop, but I'm not seeing general support for training