[07:35:14] Scoring-platform-team, Beta-Cluster-Infrastructure, ORES, Beta-Cluster-reproducible, Wikimedia-log-errors: InvalidArgumentException for betacluster eswiki - https://phabricator.wikimedia.org/T197584#4296536 (Ladsgroup) Yes, After only one edit, the system finds the new models and update the d...
[07:41:14] Scoring-platform-team, Beta-Cluster-Infrastructure, ORES, Beta-Cluster-reproducible, Wikimedia-log-errors: InvalidArgumentException for betacluster eswiki - https://phabricator.wikimedia.org/T197584#4298806 (awight) Open>Resolved a:awight @Ladsgroup This is certainly a corner case...
[07:45:42] Scoring-platform-team, Beta-Cluster-Infrastructure, ORES, Beta-Cluster-reproducible, Wikimedia-log-errors: InvalidArgumentException for betacluster eswiki - https://phabricator.wikimedia.org/T197584#4298811 (Ladsgroup) I agree with that and we should have some sort of phabricator ticket to tr...
[08:03:03] Scoring-platform-team, MediaWiki-extensions-ORES: Gracefully handle edge case where wiki is not receiving edits - https://phabricator.wikimedia.org/T197647#4298827 (awight)
[08:03:37] Scoring-platform-team, MediaWiki-extensions-ORES: Gracefully handle edge case where wiki is not receiving edits - https://phabricator.wikimedia.org/T197647#4298843 (awight) p:Triage>Low
[08:07:56] (PS1) Awight: Simplify allowed schemas [extensions/JADE] - https://gerrit.wikimedia.org/r/440997
[08:19:42] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Puppet has 11 failures. Last run 3 minutes ago with 11 failures. Failed resources (up to 3 shown): Service[nscd],Service[nslcd],Exec[set debconf flag seen for wireshark-common/install-setuid],Service[prometheus-node-exporter]
[08:47:42] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[08:55:54] (PS1) Awight: double -> single quotes [extensions/JADE] - https://gerrit.wikimedia.org/r/441002
[08:55:56] (PS1) Awight: Validate only one preferred judgment [extensions/JADE] - https://gerrit.wikimedia.org/r/441003
[08:56:37] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Puppet has 5 failures. Last run 3 minutes ago with 5 failures. Failed resources (up to 3 shown): Service[rsyslog],Service[diamond],Exec[remove_uwsgi_initd],Service[uwsgi]
[08:56:59] (CR) jerkins-bot: [V: -1] Validate only one preferred judgment [extensions/JADE] - https://gerrit.wikimedia.org/r/441003 (owner: Awight)
[09:00:05] (CR) jerkins-bot: [V: -1] Validate only one preferred judgment [extensions/JADE] - https://gerrit.wikimedia.org/r/441003 (owner: Awight)
[09:12:54] (PS2) Awight: Validate only one preferred judgment [extensions/JADE] - https://gerrit.wikimedia.org/r/441003
[09:15:16] Scoring-platform-team, JADE: Surface JADE validation errors - https://phabricator.wikimedia.org/T197653#4299006 (awight)
[09:47:43] PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 10.68.17.182 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[09:48:07] PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 10.68.17.182 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[09:49:07] PROBLEM - check load on ORES-web01.Experimental is CRITICAL: connect to address 10.68.17.182 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[09:56:39] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: connect to address 10.68.17.182 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[09:58:07] RECOVERY - check disk on ORES-web01.Experimental is OK: DISK OK
[09:58:36] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[09:59:07] RECOVERY - check load on ORES-web01.Experimental is OK: OK - load average: 0.90, 2.57, 1.78
[09:59:43] RECOVERY - check users on ORES-web01.Experimental is OK: USERS OK - 0 users currently logged in
[10:26:37] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Puppet has 19 failures. Last run 3 minutes ago with 19 failures. Failed resources (up to 3 shown): Service[ssh],Package[ntp],Service[systemd-timesyncd],Service[exim4]
[10:40:20] Scoring-platform-team (Current), JADE: Write a letter to JADE stakeholders - https://phabricator.wikimedia.org/T197668#4299275 (awight)
[10:41:25] Scoring-platform-team, Beta-Cluster-Infrastructure, ORES, Beta-Cluster-reproducible, Wikimedia-log-errors: InvalidArgumentException for betacluster eswiki - https://phabricator.wikimedia.org/T197584#4299291 (MarcoAurelio) Thanks for your assistance :-)
[10:54:36] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:36:37] Scoring-platform-team, ORES, editquality-modeling, artificial-intelligence: Duplicated feature name in editquality - https://phabricator.wikimedia.org/T197679#4299529 (awight)
[12:41:58] Scoring-platform-team (Current), ORES, editquality-modeling, artificial-intelligence: Duplicated feature name in editquality - https://phabricator.wikimedia.org/T197679#4299556 (awight)
[12:42:06] Scoring-platform-team (Current), ORES, editquality-modeling, artificial-intelligence: Duplicated feature name in editquality - https://phabricator.wikimedia.org/T197679#4299529 (awight) https://github.com/wiki-ai/editquality/pull/163
[12:42:54] Scoring-platform-team (Current), ORES, editquality-modeling, artificial-intelligence: Duplicated feature name in editquality - https://phabricator.wikimedia.org/T197679#4299559 (awight) It would be nice if revscoring threw an exception on duplicate feature names.
[12:50:31] wiki-ai/editquality#341 (dedupe_name - 00e9310 : Adam Wight): The build failed. https://travis-ci.org/wiki-ai/editquality/builds/394086208
[13:09:27] wiki-ai/editquality#341 (dedupe_name - 00e9310 : Adam Wight): The build failed. https://travis-ci.org/wiki-ai/editquality/builds/394086208
[14:00:05] wiki-ai/editquality#341 (dedupe_name - 00e9310 : Adam Wight): The build passed. https://travis-ci.org/wiki-ai/editquality/builds/394086208
[16:54:48] o/
[16:54:59] o/
[17:06:08] codezee, sorry I didn't get to the draft topic paper yesterday. I'm struggling with interruptions in the SF office.
[17:06:20] Today is a holiday so I'm here by myself. So I should make some good progress today.
[17:19:10] Scoring-platform-team (Current), Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000#4300361 (Ottomata) Hey hm I just brain bounced with @Jallemandou for a bit, and now I have some...
[18:03:54] Scoring-platform-team (Current), Edit-Review-Improvements-RC-Page, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-This-Quarter), and 4 others: Selenium tests for ORES - https://phabricator.wikimedia.org/T184451#4300491 (Aklapper)
[18:04:35] Scoring-platform-team (Current), Beta-Cluster-Infrastructure, Recommendation-API, Patch-For-Review, Release-Engineering-Team (Watching / External): What to do with deployment-sca03? - https://phabricator.wikimedia.org/T184501#4300493 (Aklapper)
[18:05:45] Scoring-platform-team (Current), Beta-Cluster-Infrastructure, ORES, Puppet, User-Ladsgroup: Puppet broken on deployment-ores01 due to missing hieradata - https://phabricator.wikimedia.org/T184478#4300508 (Aklapper)
[18:40:13] o/ awight
[19:19:41] halfak: hey! Wanna see something weird... https://github.com/adamwight/ores-lime/blob/master/Explain%20edit%20quality.ipynb
[19:21:13] "Identify which features are booleans. Can we introspect the features?"
[19:21:23] * awight perks up
[19:21:23] Feature.returns has what you want.
[19:21:27] oho great
[19:21:32] E.g. Feature.returns == bool if it returns a bool
[19:22:18] It doesn't look like I can inform LIME about integers vs. floats, but I think the results will be better if I at least mark all the bools
[19:23:28] The "explanations" are looking reasonable, but I'm not sure if they're useful.
[19:24:10] I need to do some background reading... Seems like positive explanations are more intuitive.
[19:26:53] Yeah. I'm not sure how to interpret this exactly. E.g. is 0.03 a change to the log odds?
[19:27:11] The raw probability?
[19:27:28] I imagine it is using the feature weights + feature values to make some estimates.
[19:31:30] I haven't found a rigorous explanation of the explanations yet, only that the float is a "weight"
[19:31:35] this is the paper, https://arxiv.org/pdf/1602.04938.pdf
[19:34:32] The "compare_injected_features" stuff at the end of my notebook suggests that the weights do seem to make sense locally, that LIME is correctly picking up the features which would affect the outcome the most if perturbed.
[19:36:19] There's a data structure in the gradient boosting model that has those weights.
[19:36:24] Not sure exactly what they mean.
[19:36:31] gotta run for lunch!
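Note on the bool-marking idea from 19:21 to 19:22: a minimal sketch, assuming only what the chat says, namely that each feature exposes a `returns` attribute holding its return type. The `Feature` class below is a hypothetical stand-in, not revscoring's own class, and the feature names are invented for illustration. The resulting index list is the kind of value LIME's tabular explainer takes as `categorical_features`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in for a feature with a declared return type."""
    name: str
    returns: type


def boolean_feature_indices(features):
    """Return the positions of features whose declared return type is bool."""
    return [i for i, feature in enumerate(features) if feature.returns == bool]


# Invented example features; real ones would come from the model's feature list.
features = [
    Feature("chars_added", int),
    Feature("user_is_anon", bool),
    Feature("proportion_badwords", float),
    Feature("user_is_bot", bool),
]

bool_indices = boolean_feature_indices(features)
print(bool_indices)
```

Integers and floats are left unmarked here, matching the remark that LIME cannot be told about them separately.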
[19:36:39] wat
[19:36:41] ok
[19:36:52] formula 1 in the paper seems to define the explanation weights
[19:38:00] I wanted to use the text explainer, which does something like scores after removing each word in turn, but it's not clear how to do that for a diff.
[19:38:14] So I may just try it on draft topic first.
[21:02:34] https://github.com/adamwight/ores-lime/blob/master/Explain%20draft%20topic.ipynb
[21:02:37] halfak: ^
[21:03:07] It takes 5 minutes to calculate one of these, but it might be useful for improving our model...
[21:03:21] Wait... how did you get the words?
[21:03:22] "von" is pretty funny
[21:03:29] lol @ "von"
[21:03:39] I'm pretty sure it just removes one word at a time
[21:03:48] Oh! That makes sense.
[21:03:54] or eh maybe all occurrences of a word
[21:03:55] 'cause we don't use a bag of words.
[21:04:00] So it is guessing that we do.
[21:04:10] exactly.
[21:04:43] This is cool :D
[21:05:02] Not sure what to do with it yet, but it is fun.
[21:05:16] Yeah, would be cooler if I weren't running a million word2vec scoring passes single-threaded
[21:08:38] The words with negative weights are strange, I guess we're normalizing the probabilities to sum to 1 so any word that contributes to another category counts against us.
[21:10:10] The explanations would probably be clearer if we could eliminate that effect by letting the probabilities each range 0..1 independently.
[21:12:41] I think it would ultimately be the same,
[21:12:57] It makes sense if the categories are exclusive. And in this case, they are.
[21:13:10] The problem is that the classifier is non-exclusive -- which we don't account for.
[21:13:20] But it works OK in practice.
[21:15:08] It's seeming increasingly wrong as I think about it... This screws up our thresholds, right?
[21:16:48] The threshold for a single label should be independent of what's happening with other labels, but in this case we'll scale the probability down if other labels are also likely.
[21:18:21] But I'm confused by "the categories are exclusive", maybe that's where I'm getting tripped up.
[21:19:29] oho, I was wrong about the normalization.
[21:19:38] np.sum(score(original_text)) -> 1.1168222992594836
[21:19:48] In that case, never mind me.
[21:21:09] Now I'm confused about the words with negative weights, but happy to continue not understanding at this point, if you think it makes sense.
[21:41:56] awight, actually, that's not true. A label *is* independent of other labels.
[21:42:35] I think the words with negative weights suggest that an article is unlikely to be in the target class.
[21:43:05] https://en.wikipedia.org/wiki/Von is unlikely to be about biology
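Note on the 21:19:38 check: a rough sketch of why a sum above 1 rules out normalization across labels. It assumes, for illustration only, that independent per-label probabilities come from a sigmoid on each raw score and that normalized ones come from a softmax. The raw scores are made up and do not come from the draft topic model.

```python
import math


def independent_probabilities(raw_scores):
    """Sigmoid per label: each probability ranges over 0..1 on its own."""
    return [1.0 / (1.0 + math.exp(-score)) for score in raw_scores]


def normalized_probabilities(raw_scores):
    """Softmax: labels compete, so the probabilities always sum to 1."""
    peak = max(raw_scores)  # subtract the max for numerical stability
    weights = [math.exp(score - peak) for score in raw_scores]
    total = sum(weights)
    return [weight / total for weight in weights]


raw_scores = [1.2, -0.5, -2.0, -3.0]  # invented values, one per label

independent = independent_probabilities(raw_scores)
normalized = normalized_probabilities(raw_scores)

print(sum(independent))  # can exceed 1 when several labels are plausible
print(sum(normalized))   # 1 up to floating-point rounding
```

Under the softmax assumption, raising one label's raw score lowers every other label's probability, which is the threshold worry raised at 21:16:48. Under independent sigmoids the other labels are unaffected.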