[00:36:37] wiki-ai/revscoring#1509 (typos - e0802c2 : Adam Wight): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/411065095
[00:49:15] wiki-ai/revscoring#1511 (joblib_pickling - 3618211 : Adam Wight): The build failed. https://travis-ci.org/wiki-ai/revscoring/builds/411067732
[01:06:42] wiki-ai/revscoring#1513 (joblib_pickling - 7365986 : Adam Wight): The build was fixed. https://travis-ci.org/wiki-ai/revscoring/builds/411070946
[04:59:15] Scoring-platform-team, DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (Marostegui) >>! In T200297#4466074, @awight wrote: >>>! In T200297#4464608, @Marostegui wrote: >> What does: "our r...
[08:16:10] (PS1) Legoktm: Define NS_JADE_TALK for .namespaces.php [extensions/JADE] - https://gerrit.wikimedia.org/r/449957 (https://phabricator.wikimedia.org/T200978)
[08:39:17] (CR) Addshore: [C: 2] Define NS_JADE_TALK for .namespaces.php [extensions/JADE] - https://gerrit.wikimedia.org/r/449957 (https://phabricator.wikimedia.org/T200978) (owner: Legoktm)
[08:45:42] (Merged) jenkins-bot: Define NS_JADE_TALK for .namespaces.php [extensions/JADE] - https://gerrit.wikimedia.org/r/449957 (https://phabricator.wikimedia.org/T200978) (owner: Legoktm)
[08:50:26] (CR) jenkins-bot: Define NS_JADE_TALK for .namespaces.php [extensions/JADE] - https://gerrit.wikimedia.org/r/449957 (https://phabricator.wikimedia.org/T200978) (owner: Legoktm)
[10:32:30] o/
[10:32:42] https://github.com/wiki-ai/ores/pull/257#issuecomment-409506523
[11:09:43] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ORES/+/449768
[13:18:29] Scoring-platform-team, MediaWiki-extensions-ORES, editquality-modeling, User-Ladsgroup, artificial-intelligence: [Spec] Use `reverted` models in ORES review tool - https://phabricator.wikimedia.org/T146378 (Ladsgroup) @jmatazzoni: Sorry for bothering. I didn't know that. So reverted model is...
[13:43:26] o/
[14:15:45] halfak: easy merge: https://github.com/wiki-ai/ores/pull/257#issuecomment-409506523
[14:18:03] I talked to bblack today. It seems it's good to go. I looked at the PoolCounter stuff and it's rather easy, scary but easy
[14:19:00] Cool!
[14:19:03] I'll have a look
[14:19:11] I'm digging into the remaining mwbase issue
[14:19:44] {{merged}}
[14:19:45] [1] https://meta.wikimedia.org/wiki/Template:merged
[14:20:09] Confirmed that travis is in fact getting the new version of mwbase.
[14:20:17] Maybe the new version of mwbase doesn't have the fix in place?
[14:21:04] afk for some stuff
[14:42:10] I have confirmed that the 0.1.1 wheel does have the fix in place.
[14:42:13] Strange.
[14:47:02] OK that's it. I'm installing python 3.4
[14:47:13] No more checking this with travis only >:(
[15:18:20] back now
[15:31:23] awight: hey, do you have time to check https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ORES/+/449768 ?
[15:31:58] * halfak starts own hair on fire
[15:32:04] This OrderedDict thing is so weird!
[15:32:23] I'm following a recipe for python 3 and it's still broken in 3.4
[15:32:31] Everything works as expected in 3.5
[15:32:39] Maybe 3.4 is buggy?
[15:33:46] halfak: AFAIK, up to 3.5
[15:33:58] OrderedDict was built in Python
[15:34:04] in 3.5 it switched to C
[15:34:16] so anything less than 3.5 will fail :/
[15:37:23] But... still it's a simple super() that is failing to call across its own private (mangled) attributes
[15:37:48] It's so weird to have it say something like "OrderedDict has no attribute _OrderedDict__map"
[15:38:15] But then you can directly call the attribute via __getattribute__("_OrderedDict__map")
[15:38:32] And it works!
[15:38:35] WTF python
[15:38:38] What are you doing?
[15:39:49] The exact same line of code works directly on the console!
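Note on the "private (mangled) attributes" mentioned above: the following is a minimal sketch of how Python's name mangling behaves in general. It does not reproduce the Python 3.4 OrderedDict failure being debugged in the channel, and the class names `Base` and `Child` are hypothetical, chosen only for illustration.

```python
class Base:
    def __init__(self):
        # Inside Base, the name "__map" is compiled as "_Base__map".
        self.__map = {"a": 1}

    def base_lookup(self, key):
        # Also compiled inside Base, so this reads self._Base__map.
        return self.__map[key]


class Child(Base):
    def child_lookup(self, key):
        # Inside Child, the same spelling is compiled as "_Child__map",
        # which was never assigned, so this raises AttributeError.
        return self.__map[key]


obj = Child()
print(obj.base_lookup("a"))   # uses _Base__map
print(obj._Base__map)         # the mangled name can be reached directly

try:
    obj.child_lookup("a")
except AttributeError as err:
    print("AttributeError:", err)
```

The mangled prefix is taken from the class whose body contains the code, not from the object's type, which is why an attribute can be reported as missing under one mangled name while still being reachable under another.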
[15:39:51] super(OrderedDict, d1).__getattribute__("_OrderedDict__map")
[15:40:16] * halfak pours more gas on head and re-lights
[15:51:30] awight, Amir1: why joblib pickling and not something like dill?
[15:54:38] oh god that is a dreadful story re. OrderedDict
[15:54:59] halfak: Only cos I hadn’t dug into the other alternatives!
[15:55:15] * awight grumbles about puns being harder to search for
[15:55:25] “dill pickling” not
[15:55:46] awight, what did you figure out re. memory sharing and joblib?
[15:56:31] I’m not seeing anything particularly exciting jump out of the dill docs
[15:56:51] halfak: Haven’t checked yet, these were just random notes I read last thing before knocking off for the night
[16:00:10] https://pythonhosted.org/joblib/parallel.html#working-with-numerical-data-in-shared-memory-memmaping
[16:00:16] I don’t think we get it for free
[16:00:29] Gotcha. Might not be worth merging a change until we have a solid understanding of what is gained.
[16:00:31] Plus… we’re probably already sharing the numpy arrays correctly
[16:00:38] * halfak runs to meeting
[16:00:52] +1, the disk savings are nice but I’d like to time stuff.
[16:01:59] Aha! There are disk savings! That's interesting :)
[16:02:53] > 2x smaller
[16:02:54] ll
[16:03:09] just because joblib has inline compression we can enable, so it’s not magic or anything
[16:03:36] Everyone says it’s much faster for big numpy arrays, like 1e6 elements
[16:04:31] I think I can write a small thing to unpickle and joblib serialize all models, actually. It should be easy to benchmark ORES startup.
[16:04:42] If that’s even a thing we care about… I’ll also check memory usage.
[16:05:57] kk relocating to the office in a minute.
[17:03:46] * halfak is back from meetings
[17:04:16] sheesh. BART is the worst
[17:05:18] Amir1 & harej, I've talked to VC about the situation with the DBs and she wasn't really aware of the severity. She's asked me to mobilize a larger group to plan next steps. My plan is to get Mark and Corey in a room to talk about what is going on with page/revision.
[17:05:22] awight, ^
[17:05:34] ookay, ty
[17:05:48] ‘larger group’ sounds like a fail but here’s to trying.
[17:06:06] I've also talked to the research folks and apparently Product is also not aware of the issues and how it's likely to block new development work.
[17:06:11] I'm not sure what adding more cooks to the kitchen accomplishes
[17:06:38] In this case, the platform evolution team needs to have this on their plan because it shouldn't be *our* problem.
[17:07:12] And if the situation is really that 0% growth is the only acceptable state, then we need to pull the fire alarm.
[17:08:00] YES. that.
[17:08:44] And for real, suggesting we use wiki pages at a reasonable and controllable rate should not be causing this kind of pain. It’s almost like some people don’t get the entire idea of a wiki.
[17:08:59] It’s not an ice palace to maintain in perpetuity.
[17:11:32] (CR) Awight: [C: 2] "That should do it!" (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[17:13:51] awight: at this point, what would you say the odds are of there being a central wiki?
[17:14:17] i.e., how much thought should i be investing in figuring out how to make it work?
[17:14:20] harej: The odds that we’ll be forced to do that?
[17:14:23] Pretty high, sadly.
[17:16:01] (CR) Ladsgroup: Join decomposition on maintenance/PurgeScoreCache.php (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[17:16:29] If we do go down that route it’s some extra work we’ll have to sign up for.
[17:17:14] It’s so terrible.
[17:17:19] The community part, mostly.
[17:17:38] harej, I think there's some potential in federation.
[17:17:50] (CR) Awight: [C: 2] Join decomposition on maintenance/PurgeScoreCache.php (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[17:18:03] halfak: What do you mean by that?
[17:18:10] Like, get other depts to care about our plight?
[17:18:13] awight, harej: if we had a good way to federate changes and direct linking between a separate wiki and a client wiki, then we might be able to do what we want.
[17:18:46] Looking at Risker's essay, it's really about changes showing up at the right people and the right people having the privs to deal with problems.
[17:19:04] *right places
[17:19:29] Does that exist yet?
[17:19:33] E.g. wikidata changes are starting to show up in people's watchlists.
[17:19:40] I don't think it's something we can just tie into
[17:19:56] I think it's a huge overhead if we were to try and tackle it for our use case, but I'm not sure.
[17:20:42] If we can piggyback off of the code that allows Wikidata edits to show up in change streams, it makes our work easier.
[17:21:00] Amir1 did some work on that
[17:21:04] Amir1, any opinion?
[17:21:37] yup, it's possible to inject rc records on other wikis, it should be rather straightforward
[17:22:10] for wikidata it's more complex for several reasons, the most important one being that lots of pages might be subscribed to an entity on wikidata
[17:22:41] Scoring-platform-team (Current), ORES: Explore alternative model serializations - https://phabricator.wikimedia.org/T201047 (awight)
[17:23:13] but if we want to inject to another wiki, we need to define page id (otherwise it doesn't show up in people's watchlist) but I think it's not that hard
[17:23:23] and 1:1 relation makes things a little bit easier
[17:23:25] Amir1: Do the changes show up in isolation, or as something attached to a particular page?
[17:23:47] awight: they show up attached to pages
[17:24:04] So we could have judgements of edits of a page on enwiki show up for people who watch the page on enwiki.
[17:24:05] * awight looks for examples.
[17:24:16] halfak: yup, that would work
[17:24:36] And all judgments for enwiki things show up in enwiki RC?
[17:24:50] https://en.wikipedia.org/w/index.php?title=Barack_Obama&action=info
[17:25:08] halfak: yup, for wikidata they are hidden by default under an option
[17:25:19] https://www.irccloud.com/pastebin/2fbi5st0/
[17:26:13] so if someone changes statements of the item for Barack Obama on Wikidata, it will show up associated with his article and will show up in the watchlist of people who watched the article
[17:26:22] Hmm.. It seems that is pretty specific to wikidata
[17:26:23] Hum, this should have made wikidata edits show up but nothing is marked with the “D”
[17:26:24] https://en.wikipedia.org/wiki/Special:RecentChanges?hidebots=1&hidecategorization=1&limit=500&days=7&enhanced=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&urlversion=2
[17:26:39] the RC page suuux
[17:27:06] I see some, awight
[17:27:25] I got it too, had to remove some filters. Buggy it seems.
[17:27:26] Ctrl-F for "(Q"
[17:28:39] it's sparse because it's rather narrow and only gets injected when the change affects the page (for example the statement for birth date is used and date of birth has changed on wikidata)
[17:28:43] How about this: I work on a contingency plan for a central wiki that addresses as many problems of having a central wiki as possible.
[17:28:46] I think it’s just doing an “OR” is all, so wikidata changes were diluted
[17:29:07] harej: That sounds great, and I’m sorry about the potential for wasted work.
[17:29:21] I don't waste a single thought :)
[17:29:29] Good luck planning around the immature community :p that’s like 2 wasted years right there.
[17:30:26] Amir1: try hovering over a “D” in the RC feed. I get raw HTML
[17:30:29] (PS1) Ladsgroup: Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] (wmf/1.32.0-wmf.15) - https://gerrit.wikimedia.org/r/450066 (https://phabricator.wikimedia.org/T200680)
[17:30:58] Also, the timestamp becomes a link, which is inconsistent with other lines.
[17:31:16] awight: nice catch, probably it used to be an XSS vector :))
[17:31:22] hahaha
[17:31:36] This is great that it exists, though!
[17:32:21] It’s actually better than the way we would present JADE RCs which were edited on the same wiki…. I wonder if we can still tap into the attach-to-page thing.
[17:34:11] * halfak runs to lunch
[17:34:15] back soon
[17:43:08] Scoring-platform-team, Cloud-Services, Cloud-VPS, ORES: Keep wmflabs scoring boxes up-to-date - https://phabricator.wikimedia.org/T168478 (Ladsgroup) Open>stalled It's blocked on T169247
[17:52:58] wiki-ai/revscoring#1516 (joblib_pickling - 6889cb3 : Adam Wight): The build was broken. https://travis-ci.org/wiki-ai/revscoring/builds/411401223
[18:00:44] 365M submodules/editquality/models
[18:00:45] 153M submodules/editquality/models_new
[18:00:50] that is a thing.
[18:09:16] Scoring-platform-team (Current), ORES: Explore alternative model serializations - https://phabricator.wikimedia.org/T201047 (awight) Size change with joblib and its default zlib compression: ``` 121M submodules/articlequality/models 55M submodules/articlequality/models_new 13M submodules/draftquality/m...
[18:13:06] (CR) Ladsgroup: [C: 2] "SWAT" [extensions/ORES] (wmf/1.32.0-wmf.15) - https://gerrit.wikimedia.org/r/450066 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[18:14:47] (Merged) jenkins-bot: Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] (wmf/1.32.0-wmf.15) - https://gerrit.wikimedia.org/r/450066 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[18:17:11] Machine learning is so fantastic.
[18:17:22] Do you all know about MarI/O and LuigI/O?
[18:18:39] (CR) jenkins-bot: Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] (wmf/1.32.0-wmf.15) - https://gerrit.wikimedia.org/r/450066 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[18:19:51] awight: Awesome!
[18:22:44] It immediately deleted all old scores from wikidata, it was supposed to take more than a day :D
[18:23:14] Scoring-platform-team, Edit-Review-Improvements-RC-Page, Growth-Team, MediaWiki-extensions-ORES, and 3 others: Index on oresc_probability, temporarily or permanently - https://phabricator.wikimedia.org/T175778 (Ladsgroup)
[18:23:18] Scoring-platform-team (Current), MediaWiki-extensions-ORES, Patch-For-Review, User-Ladsgroup: Run PurgeScoreCache.php on all wikis that have ORES enabled - https://phabricator.wikimedia.org/T200680 (Ladsgroup) Open>Resolved
[18:25:04] WORD
[18:25:16] harej: link, please
[18:26:57] https://www.youtube.com/watch?v=qv6UVOQ0F44
[18:28:47] that's the original concept, this is a live stream of someone applying the same concept but to the original super mario bros https://www.youtube.com/watch?v=ARaeAh_BQt8
[18:36:18] (CR) jerkins-bot: [V: -1] Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[18:36:57] ridiculous.
[18:45:45] Scoring-platform-team (Current), MediaWiki-Database, Schema-change, TechCom-RFC (TechCom-Approved), User-Ladsgroup: Use index on rc_this_oldid - https://phabricator.wikimedia.org/T139012 (Ladsgroup) a:Ladsgroup I implement this in order to improve performance of ores extension.
[19:01:50] (CR) Ladsgroup: [C: 2] "Kicking jenkins" [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[19:06:04] * awight spins a baseball bat in the background, seeing if jenkins is still moving when Amir1 is done
[19:06:54] I was talking about it in releng, it seems we have disk space issues atm :/
[19:09:15] harej: y you don’t like the {{nutshell}}?
[19:09:15] [2] https://meta.wikimedia.org/wiki/Template:nutshell
[19:10:02] Ah I see, maybe because it clashed with {{hatnote}}
[19:10:02] [3] https://meta.wikimedia.org/wiki/Template:hatnote
[19:10:49] I probably didn't mean to remove it, but in any case it probably should be written as a proper lead section. I'll work on that.
[19:14:07] I think it’s good like this, just didn’t thinking about how they would render.
[19:14:16] err *hadn’t given thought to
[19:14:43] Also, would you say there's a difference conceptually between JADE and the extension called JADE?
[19:15:01] I'm thinking of just merging the two pages together, but not if JADE has some existence as a concept outside of the MediaWiki extension
[19:15:26] Kind of like how VisualEditor and Wikibase technically exist outside of MediaWiki
[19:17:23] Scoring-platform-team (Current), MediaWiki-extensions-ORES, User-Ladsgroup: Use rc_timestamp index when joining to ores_classification - https://phabricator.wikimedia.org/T138444 (Ladsgroup)
[19:17:43] Ah good question
[19:17:58] Most of all, I wanted to separate the technical implementation from the conceptual explanation.
[19:38:36] OK I'm starting to think I'm going to drop OrderedDict. It's just broken in python 3.4. Yeesh
[19:39:17] lol even __eq__ is broken in python 3.4 for OrderedDict
[19:39:19] WTF
[19:39:31] Alternatively, maybe we could drop support for python 3.4
[19:39:36] what would that entail... Hmmm
[19:41:49] I thought WMCS was still on Python 3.4...
[19:42:43] harej@tools-bastion-03:~$ python3
[19:42:44] Python 3.4.3 (default, Nov 28 2017, 16:41:13)
[19:43:40] Though, the new WMCS images run Python 3.5
[19:44:14] So we'd essentially be dropping support for Toolforge and anything else running Debian... uh Trusty I think
[19:47:01] https://packages.debian.org/jessie/python3
[19:47:05] Jessie is on 3.4
[19:47:25] Probably what you were saying, though
[19:48:21] I'm done for the day
[19:48:22] o/
[19:55:55] Scoring-platform-team (Current), ORES: Explore alternative model serializations - https://phabricator.wikimedia.org/T201047 (awight) Runtime profile: ``` pickle: 2729 awight 20 0 2533940 1.770g 31676 S 0.3 5.0 0:27.36 python 2740 awigh...
[19:56:00] see ya
[20:01:18] halfak: Let me know what you think of https://github.com/wiki-ai/ores-diagrams/blob/master/ores_dependencies.pdf
[20:01:54] I’m trying to map the dependencies and components in order to see better what we might be able to extract into upstreamable libs
[20:02:47] I'm not sure how the diagram helps, but I do think there are some clear divides in the package.
[20:03:03] E.g. we can remove almost all of the wikitext features and revision_oriented datasources.
[20:03:08] Those are all mediawiki specific
[20:03:20] But all of the meta-features and meta-datasources should stay
[20:03:32] Oh! The feature extractor is MW specific.
[20:04:03] It looks like I need to change locations. I tried my best to not need to but it looks like I'll need to hop offline for 30 minutes >:(
[20:04:15] cool
[20:04:51] I’ll look at the feature extractor again, for some reason I thought the MW dependency was inside of the data sources
[20:07:11] The feature extractor (api) performs some optimizations for MW specifically. I think we just pull that extractor out and leave the base extractor class in.
[20:07:17] OK running away!
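Note on the serialization thread (T201047) earlier in the log, where the smaller model files are attributed to joblib's inline compression rather than "magic": joblib exposes this through the `compress` argument of `joblib.dump`. The sketch below shows the same idea using only the standard library (`pickle` plus `zlib`) as a stand-in, so it is not the revscoring or ORES code. The `fake_model` object is a hypothetical placeholder, and the size ratio it produces says nothing about real model files.

```python
import pickle
import zlib

# Hypothetical stand-in for a trained model: a dict holding a long list
# of floats. Real ORES models are scikit-learn objects, not this.
fake_model = {
    "weights": [i * 0.001 for i in range(100000)],
    "labels": ["damaging", "goodfaith"],
}

# Plain pickle, as in the existing model files.
raw = pickle.dumps(fake_model, protocol=pickle.HIGHEST_PROTOCOL)

# Same bytes passed through zlib, which is the kind of compression
# the size comparison in the log refers to.
packed = zlib.compress(raw, 3)

# Loading costs one extra decompression step.
restored = pickle.loads(zlib.decompress(packed))

print("pickle bytes:     ", len(raw))
print("pickle+zlib bytes:", len(packed))
print("round trip ok:    ", restored == fake_model)
```

The trade-off raised in the channel still applies: the disk savings come with extra CPU time at load, so startup and memory would need to be timed on the real models before drawing conclusions.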
[20:51:13] (Merged) jenkins-bot: Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[20:55:47] Lunching o/
[21:20:58] Scoring-platform-team, DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (Halfak) @Marostegui, essentially, we need JADE things to [be wiki pages](https://www.mediawiki.org/wiki/Everything_...
[21:39:43] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/450121 (owner: L10n-bot)
[22:01:29] Scoring-platform-team (Current), revscoring, User-Ladsgroup, artificial-intelligence: Rewrite scoring libraries to replace pywikibase with mwbase - https://phabricator.wikimedia.org/T194758 (Halfak) https://github.com/mediawiki-utilities/python-mwbase/pull/5 I give up
[22:03:42] hehe
[22:10:54] (CR) jenkins-bot: Join decomposition on maintenance/PurgeScoreCache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/449768 (https://phabricator.wikimedia.org/T200680) (owner: Ladsgroup)
[22:58:27] Scoring-platform-team (Current), JADE: Write JADE internal APIs to simplify integrations - https://phabricator.wikimedia.org/T198207 (awight)
[23:01:49] Scoring-platform-team (Current), ORES: Experiment with LIME integration for ORES, providing explanations for its predictions - https://phabricator.wikimedia.org/T196475 (awight)