[01:11:45] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring: Implement sentences datascources & experiment with normalization. - https://phabricator.wikimedia.org/T148867#2743671 (Halfak) OK.  So I've been looking at getting clean sentences out of Wikipedia articles.  I get a lot of nonsense around tables.  It would be ni...
[06:25:54] <wikibugs>	 Revision-Scoring-As-A-Service: Review ORES Grafana metrics - https://phabricator.wikimedia.org/T149015#2743907 (awight) Open>Resolved a:awight The new graphs look great!
[06:27:23] <wikibugs>	 Revision-Scoring-As-A-Service: Review ORES Grafana metrics - https://phabricator.wikimedia.org/T149015#2743922 (awight) @Halfak  You might want to review the TODOs at [[ https://www.mediawiki.org/wiki/ORES/Metrics | mw:ORES/Metrics ]] as well.
[08:13:20] <HaeB>	 halfak: seen this? https://freedom-to-tinker.com/2016/08/24/language-necessarily-contains-human-biases-and-so-will-machines-trained-on-language-corpora/
[14:59:23] <wikibugs>	 Revision-Scoring-As-A-Service, ORES: Implement datasources_extracted metric - https://phabricator.wikimedia.org/T149199#2745036 (Halfak)
[14:59:31] <wikibugs>	 Revision-Scoring-As-A-Service, ORES: Implement datasources_extracted metric - https://phabricator.wikimedia.org/T149199#2745051 (Halfak) https://github.com/wiki-ai/ores/pull/176
[15:00:32] <wikibugs>	 Revision-Scoring-As-A-Service: Review ORES Grafana metrics - https://phabricator.wikimedia.org/T149015#2739711 (Halfak) {T149199}
[15:16:27] <wikibugs>	 Revision-Scoring-As-A-Service: Review ORES Grafana metrics - https://phabricator.wikimedia.org/T149015#2745111 (Halfak) Also {{done}} for reviewing those todos.
[15:45:54] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, ORES: Meta ORES: UI for reviewing how ORES classifies you and your stuff - https://phabricator.wikimedia.org/T148700#2745208 (Capt_Swing) @halfak this is probably part of the plan already, but wanted to lobby for it if it isn't: there are two use cases for refutatio...
[15:48:08] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, ORES: Meta ORES: UI for reviewing how ORES classifies you and your stuff - https://phabricator.wikimedia.org/T148700#2745213 (Halfak) Agreed.  I'd include edits (or pages or users) you are concerned about in "your stuff".  I'd like to pursue this, but right now, it...
[15:50:16] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, ORES: Meta ORES: UI for reviewing how ORES classifies you and your stuff - https://phabricator.wikimedia.org/T148700#2745219 (Capt_Swing) Would non-engineering support be helpful?
[15:52:21] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, ORES: Meta ORES: UI for reviewing how ORES classifies you and your stuff - https://phabricator.wikimedia.org/T148700#2745221 (Halfak) Good Q.  Yes!  I'd like to get this type of system well documented before-hand.  We might even start imagining where we'd put a proo...
[16:38:17] <grrrit-wm>	 (PS8) Anomie: Action API integration for ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/313831 (https://phabricator.wikimedia.org/T143614)
[16:38:57] <grrrit-wm>	 (CR) Anomie: "I'm hoping the watchlist patch will be merged soon enough to avoid having to split things here." [extensions/ORES] - https://gerrit.wikimedia.org/r/313831 (https://phabricator.wikimedia.org/T143614) (owner: Anomie)
[16:41:04] <adamwight>	 halfak: tiny question--did you intend to have both bars & lines?  https://grafana.wikimedia.org/dashboard/db/ores?panelId=11&fullscreen&from=1477420624120&to=1477420972342
[16:42:50] <adamwight>	 And, a bigger question, I guess: celery worker threads seem to take about 1GB of resident memory, but only last a few seconds of CPU time.  That seems like a lot of overhead.
[16:45:21] <adamwight>	 meh--just convinced myself that they last a few minutes of wall time, so whatever about the initialization cost.
[16:45:44] <halfak>	 lol woops
[16:45:55] <halfak>	 That looks pretty silly, huh?
[16:46:08] <adamwight>	 a bit ;)
[16:46:36] <halfak>	 adamwight, yeah, we restart celery threads periodically because of some weird internal memory issues. 
[16:46:49] <halfak>	 But they'll score at least 100 times before restarting
[16:46:50] <adamwight>	 good idea
[16:46:52] <halfak>	 Might be 1000
[16:47:41] <halfak>	 Removed the bars from that plot
[16:47:50] <adamwight>	 btw, are we happy with the 1s median response time?
[16:47:56] <adamwight>	 I was going to try to profile a bit, but if that sounds like reasonable performance, I won't bother
[16:48:35] <halfak>	 adamwight, it's not what I want, but it's pretty good. 
[16:48:38] <adamwight>	 confirmed :)
[16:48:51] <halfak>	 We have a profiler in `revscoring extract` that you can use to explore. 
[16:49:21] <halfak>	 It's hard to just use python's profile to make sense of things.  Our profiler will tell you the time to solve each dependency. 
[16:49:53] <halfak>	 It works nicely for finding out how long it takes to get text (IO) and how long to process that text (CPU) for the two "datasources" that do that
[16:50:55] <adamwight>	 ok great
[16:50:58] <adamwight>	 While I'm causing ADHD, I was wondering...  I made a request to the API, then didn't find my score in the redis cache...
[16:55:06] <adamwight>	 https://ores.wikimedia.org/v2/scores/enwiki/damaging/745065890/ should -> ores:enwiki:damaging:745065890:* AIUI
[16:55:38] <adamwight>	 meanwhile, my browser reports X-Cache:"cp1061 miss, cp2012 miss, cp4002 miss, cp4001 hit/1"
[16:55:39] <adamwight>	 which makes me feel funny.
[16:56:44] <halfak>	 adamwight, need to include the version number of the model in the redis key
[16:56:46] <adamwight>	 Should we be investigating whether we're going through frontend cache by mistake?
[16:57:12] <halfak>	 ores:enwiki:damaging:0.1.2:745065890
[16:57:18] <adamwight>	 yah I thought the wildcard at the end would cover that, e.g. "keys ores:enwiki:damaging:745065890:*"
[16:57:34] <halfak>	 Ohh... does the version come after the ID?
[16:57:41] <adamwight>	 I thought
[16:57:51] <halfak>	 Oh!  It does!
[16:57:52] * adamwight reads code
[16:58:20] <halfak>	 https://github.com/wiki-ai/ores/blob/master/ores/score_caches/redis.py#L42
[16:58:36] <halfak>	 adamwight, are you connected to the right redis port?
[16:58:51] <halfak>	 we have a redis for celery and a redis for caching
[16:59:32] <adamwight>	 I believe: :6380 for the score cache
[16:59:35] <adamwight>	 I see lots of other keys using "randomkey"
[16:59:41] <adamwight>	 e.g. ores:viwiki:reverted:25399141:0.1.2
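
The key layout discussed above (revision ID first, then model version) can be sketched as follows. This is only an illustration based on the example keys quoted in the log; the helper names `score_key` and `score_key_pattern` are hypothetical and are not taken from the ORES code base.

```python
def score_key(wiki, model, rev_id, version, prefix="ores"):
    """Join the parts in the order seen in the log: prefix:wiki:model:rev_id:version."""
    return ":".join(str(part) for part in (prefix, wiki, model, rev_id, version))


def score_key_pattern(wiki, model, rev_id, prefix="ores"):
    """Glob pattern matching every model version cached for one revision."""
    return score_key(wiki, model, rev_id, "*", prefix=prefix)


if __name__ == "__main__":
    # Reproduces the key found via "randomkey" in the conversation.
    print(score_key("viwiki", "reverted", 25399141, "0.1.2"))
    # Reproduces the wildcard lookup used with the redis "keys" command.
    print(score_key_pattern("enwiki", "damaging", 745065890))
```

Because the version is the last component, a trailing `*` matches all versions for a revision, which is why the wildcard lookup in the log should have found the key if it were present.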
[17:00:22] <adamwight>	 It's... really strange though that my revision doesn't appear in the score cache.
[17:00:30] <halfak>	 Agreed. 
[17:00:34] <adamwight>	 The TTL on those entries is 15 years or something
[17:00:53] <halfak>	 It uses an LRU strategy to stay within memory
[17:01:04] <adamwight>	 so even if it had been calculated and was being served from mystery varnish, there should still be a key
[17:01:07] <adamwight>	 oh.  that might explain it then
[17:01:14] <adamwight>	 & would also confirm that we're being varnished.
[17:01:43] <halfak>	 Hm... we shouldn't be being varnished, but it could be that we've not deployed the cache-avoiding headers yet
[17:01:43] <adamwight>	 Which is a problem cos it bypasses model version invalidation
[17:02:01] * adamwight is busy making problems to solve :)
[17:02:21] <halfak>	 adamwight, right
[17:02:22] <adamwight>	 I don't think we should have to use anti-caching headers cos we aren't supposed to be included in the varnish "misc" config
[17:02:27] <halfak>	 Looks like we need a deploy
[17:02:35] <halfak>	 We have anti-caching headers merged. 
[17:02:51] <halfak>	 I've been working on a revscoring upgrade that required regenerating all of the models. 
[17:02:56] <halfak>	 So that's due 
[17:03:10] <halfak>	 I should be able to send that to WMFLabs today or tomorrow. 
[17:03:38] <halfak>	 Amir1 is currently looking into why a recent deploy that fixes some logging issues failed. 
[17:03:42] <halfak>	 So we're blocked on that.
[17:04:28] <halfak>	 Example use of @nocache: https://github.com/wiki-ai/ores/blob/master/ores/wsgi/routes/v2/scores.py#L16
[17:04:34] <adamwight>	 kk that would be a good time to test this varnish theory
[17:04:36] <adamwight>	 although, I was going to simply delete the cached score from redis
[17:04:51] <halfak>	 https://github.com/wiki-ai/ores/blob/master/ores/wsgi/util.py#L89
[17:05:15] <adamwight>	 O_o
[17:05:53] <adamwight>	 We would verify those headers by making a curl request from the labs server, maybe
[17:06:13] <halfak>	 Oh!  You don't think they'd get passed through?
[17:07:09] <adamwight>	 I know the headers would be slightly changed, maybe those would pass through but I wouldn't trust it
[17:09:09] <halfak>	 Seems like they should pass through to downstream secondary caches -- like the browser's 
[17:11:38] <halfak>	 I'm going to go grab lunch.  I'll read scrollback when I get back.  Also, I think I'll summarize what's needed in order to get an ORES deploy done. 
[17:11:44] <halfak>	 until then, 
[17:11:49] <adamwight>	 There are directives for the browser, and also for varnish
[17:29:58] <adamwight>	 halfak|Lunch: fwiw, it does look like our cache-control header should be disabling varnish caching: https://github.com/wikimedia/operations-puppet/blob/production/modules/varnish/templates/vcl/wikimedia-common.inc.vcl.erb#L331
[17:35:18] <adamwight>	 It's getting creepy.  https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&s=by+name&c=Misc%2520Web%2520caching%2520cluster%2520ulsfo&tab=m&vn=&hide-hf=false
[17:35:24] <adamwight>	 cp4001: down.
[17:35:34] * adamwight hard refreshes
[17:35:52] <adamwight>	 X-Cache:"cp1061 miss, cp2012 miss, cp4002 miss, cp4001 hit/2"
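
A small sketch of how the X-Cache values quoted above could be read programmatically. The "host status[/hits]" entry format is assumed purely from the two header values in this log, and `parse_x_cache` and `served_from_cache` are hypothetical helper names.

```python
import re

# One entry looks like "cp4001 hit/2" or "cp1061 miss" (format assumed from the log).
ENTRY_RE = re.compile(r"(\S+)\s+([a-z]+)(?:/(\d+))?")


def parse_x_cache(value):
    """Return a list of (host, status, hits) tuples; hits is None when absent."""
    entries = []
    for item in value.split(","):
        match = ENTRY_RE.fullmatch(item.strip())
        if match is None:
            continue  # skip anything that does not fit the assumed format
        host, status, hits = match.groups()
        entries.append((host, status, int(hits) if hits is not None else None))
    return entries


def served_from_cache(value):
    """True if any cache layer listed in the header reports a hit."""
    return any(status == "hit" for _, status, _ in parse_x_cache(value))


if __name__ == "__main__":
    header = "cp1061 miss, cp2012 miss, cp4002 miss, cp4001 hit/2"
    print(parse_x_cache(header))
    print(served_from_cache(header))
```

Read this way, the hit count on cp4001 going from 1 to 2 between the two requests in the log is what suggests the response was being served from a cache layer rather than recomputed.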
[17:56:05] <adamwight>	 Oh.  oh
[17:56:26] <adamwight>	 I must be on the wrong boxes?
[17:56:38] <adamwight>	 Is this the labs or the production project?  https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores
[18:02:19] <wikibugs>	 Revision-Scoring-As-A-Service: Spike: Is Varnish caching ORES responses? - https://phabricator.wikimedia.org/T149223#2745755 (awight)
[18:03:50] <wikibugs>	 Revision-Scoring-As-A-Service: Spike: Is Varnish caching ORES responses? - https://phabricator.wikimedia.org/T149223#2745773 (awight) I think I must be looking at staging boxes.  Can someone help me access the production cluster?
[18:05:38] <ToAruShiroiNeko>	 hmm varnish + nginx of ores caching
[18:06:01] <adamwight>	 ToAruShiroiNeko: hi!
[18:06:06] <ToAruShiroiNeko>	 greetings
[18:06:19] <adamwight>	 yeah it's causing me a raised eyebrow
[18:06:39] <adamwight>	 most of the creepy is probably explained by accidentally probing the staging boxes
[18:07:06] <adamwight>	 unfortunately I have to run for a few hours...
[18:29:25] <halfak>	 I'm back, but I got hammered by messages so I'm working through them now
[20:24:57] <wikibugs>	 Revision-Scoring-As-A-Service, rsaas-articlequality, Research-and-Data-2016-17-Q1, User-Ladsgroup: Generate recent article quality scores for English Wikipedia - https://phabricator.wikimedia.org/T135684#2746284 (Halfak) (145058 + 33858) / (27118 + 5820) = 5.43 X as many GA+ articles
[22:42:04] <wikibugs>	 Revision-Scoring-As-A-Service, Wikilabels: Load new stratified edit_type campaign - https://phabricator.wikimedia.org/T149256#2746772 (Halfak)
[22:42:12] <wikibugs>	 Revision-Scoring-As-A-Service, Wikilabels: Load new stratified edit_type campaign - https://phabricator.wikimedia.org/T149256#2746791 (Halfak) https://meta.wikimedia.org/wiki/Research_talk:Automated_classification_of_edit_types/Work_log/2016-10-26
[22:57:34] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring: Implement sentences datascources & experiment with normalization. - https://phabricator.wikimedia.org/T148867#2746866 (Halfak) OK. I've managed to build a simple sub-parser for handling tables.  Essentially, I'll parse an entire table into one giant "sentence"....