[09:32:43] Scoring-platform-team, Scap, Patch-For-Review, Release-Engineering-Team (Kanban): Support git-lfs - https://phabricator.wikimedia.org/T180627#4193364 (demon) >>! In T180627#4192775, @awight wrote: > 15:50 < awight> twentyafterfour: bad news, my test LFS deployment failed to pull the files again >...
[13:50:29] halfak: around?
[14:03:22] o/
[14:03:26] Hey Amir1
[14:05:00] halfak: Morgen!
[14:05:12] Just shared a document with you, please check
[14:08:58] OK will look
[14:11:16] Ooh I like it.
[14:21:10] Amir1, OK if I invite Kaldari?
[14:39:40] halfak: sure
[15:26:18] Scoring-platform-team (Current), ORES, drafttopic-modeling, artificial-intelligence: Check drafttopic model memory usage - https://phabricator.wikimedia.org/T192293#4194371 (Halfak) Reading your notes, should we expect a ~2x increase in per-worker memory usage (based on a naive interpretation of...
[15:32:19] * halfak plans the trips of the next couple of months
[17:04:48] halfak: quick question, should I stop deploying the change on enwiki and implement the weighted sum situation first?
[17:05:29] I'm not sure. Let's chat about that.
[17:05:50] There's a minor amount of computation that must be done on the score doc to get a weighted_sum prediction.
[17:06:04] That computation is different from Wikidata, Enwiki, frwiki, etc.
[17:06:18] So there will need to be some configuration.
[17:06:29] This sounds a little complicated, but not crazy.
[17:07:02] Ultimately, you're computing sum(WEIGHT[class]*probability for class, probability in score_doc['probability'])
[17:07:17] WEIGHT[class] would come from the relevant wiki's config.
[17:07:34] building this is easy
[17:07:54] :)
[17:07:57] I like you
[17:07:59] :D
[17:08:26] :D I like to do easy things not things like management :D
[17:08:43] So, downside is that it's hard to tell exactly what class got the highest probability.
[17:08:47] the thing is it's SWAT right now and I added it to this SWAT so I need to know if I can move forward now
[17:09:26] OK. So let's talk about the cost of changing things after you deploy. What would it take to change to weighted_sum later?
[17:09:56] Alternatively, if we don't deploy now, how long do you think it will take us to be ready to deploy weighted_sum?
[17:10:14] one week at the most
[17:10:18] I heard from greg-g that SWATs will continue next week
[17:11:15] halfak: changing existing data in the database will be hard
[17:11:18] that's the reason
[17:12:02] because you don't know what type of data is there, with or without weighted_sum
[17:12:20] Gotcha. Let's delay then. When is the next SWAT we might hit?
[17:12:22] distinguishing that just by doing select and parsing it is hard
[17:12:47] there are three SWATs every day except Friday
[17:13:19] halfak: for the record, the windows will be open but don't expect RelEng members to be there :)
[17:14:20] * halfak plans some sketchy things for next week
[17:14:22] mwahahaha
[17:14:22] if no one is there, I deploy :D
[17:14:34] * Amir1 high fives halfak
[17:14:45] :DD:D:D:D
[17:15:08] OK so, we have some config for weighed sum in article quality. Let me grab that and let's chat about it.
[17:15:31] but seriously, as an "axillary member of SWAT team" (I invented this title) I do SWAT when no one is doing it
[17:16:39] halfak: the only question is that what number we should store? is it between 0-1 or 0-[# of classes]
[17:17:04] I'd go with normalized one for several reasons
[17:17:35] It *could* be between 0 and 1
[17:17:55] It would be nice to have it be between 0 and # of classes. But that's not critical.
[17:18:04] * halfak makes a gist
[17:18:16] https://github.com/wiki-ai/articlequality/blob/master/articlequality/utilities/extract_scores.py
[17:18:17] There'
[17:18:22] s this, but I don't like it.
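For readers following along, the weighted_sum computation described above (sum of WEIGHT[class] * probability over the score doc, then optionally scaled into 0-1) can be sketched roughly as follows. This is only an illustration, not the gist or the extension code: the `WEIGHTS` values and the example score doc are made up, and real weights would come from each wiki's config.

```python
# Hypothetical per-wiki weights (illustrative only); the chat says the real
# values would come from the relevant wiki's configuration.
WEIGHTS = {"Stub": 1, "Start": 2, "C": 3, "B": 4, "GA": 5, "FA": 6}


def weighted_sum(score_doc, weights):
    """Sum weight * probability over every class in the score doc."""
    return sum(
        weights[cls] * prob for cls, prob in score_doc["probability"].items()
    )


def normalized_weighted_sum(score_doc, weights):
    """Scale the weighted sum so the largest possible value is 1.0."""
    return weighted_sum(score_doc, weights) / max(weights.values())


def denormalize(value, weights):
    """Invert the scaling, e.g. so a UI can map the value back to a class."""
    return value * max(weights.values())


# Made-up score doc in the shape the chat refers to (score_doc['probability']).
example = {
    "probability": {
        "Stub": 0.4, "Start": 0.4, "C": 0.2, "B": 0.0, "GA": 0.0, "FA": 0.0,
    }
}
raw = weighted_sum(example, WEIGHTS)
stored = normalized_weighted_sum(example, WEIGHTS)
```

Storing the normalized value keeps the column range fixed regardless of how many classes a model has, at the cost of needing the per-wiki weights again to read it back.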
[17:20:07] * halfak works on a gist
[17:20:16] It'll be python but I'm sure you can php-ize :D
[17:21:33] Amir1: Here's the pad I was mentioning, https://etherpad.wikimedia.org/p/Scoring_platform_hackathon_ideas
[17:26:37] Amir1, https://gist.github.com/halfak/b925a2d45a3903a3e10dc5d6cd7c01b1
[17:27:18] normalizing and denormalizing are easy :)
[17:28:08] exactly
[17:28:57] what I'm worried is that it establishes this way and then we end with models with more than 10 classes which we can't fit into to the database
[17:29:04] Scoring-platform-team, Scap, Patch-For-Review, Release-Engineering-Team (Kanban): Support git-lfs - https://phabricator.wikimedia.org/T180627#4194644 (awight) > You know, we should probably install git-lfs everywhere we can, just like git, and do `git lfs install --global` as part of it. That wo...
[17:29:25] Amir1, this will work for models with more than 10 classes.
[17:30:13] in models with > 10 classes we might end up with weighted sums like 10.099, this can't be stored in ores_classification
[17:30:27] Nope. Not with the code I provided.
[17:30:35] unless we normalize the weighed sum
[17:30:55] is that what you mean?
[17:31:26] Oh yes. Check out my code.
[17:31:41] I normalize it so that the highest possible value is 1.0
[17:33:18] that was my idea so we are on the same page
[17:33:19] \o/
[17:33:32] :)
[17:34:09] So then UIs will be able to use the per-wiki config to de-normalize and arrive at a text representation of the predicted class.
[17:34:31] E.g. 1.84 = Start (Or maybe a little bit below a "Start"
[17:34:42] 1.84 post denormalization
[17:34:48] I guess in this case 0.308 = start
[17:35:20] I figure that out, that's details
[17:35:47] +1
[17:35:56] OK I'm off for lunch then. Thanks Amir1! :)
[17:36:09] Have fun, I need to leave soon for the terminal
[17:36:14] see you soon
[17:41:11] Amir1: for SoS, is the article quality data present now or still pending?
[17:41:36] awight: pending as we decided to do weighted sum first
[17:41:48] kk
[17:41:54] Should I still announce?
[17:42:04] no please :)
[17:42:38] got it, thanks
[17:48:24] (Abandoned) Awight: Update assets submodule with word2vec bin [services/ores/deploy] - https://gerrit.wikimedia.org/r/425939 (owner: Awight)
[17:49:41] halfak: o/
[17:50:30] awight: o/
[17:55:02] codezee: hi!
[17:55:11] We're on the brink of getting your model deployed FYI
[17:58:12] although I'm not having a stable internet connectivity these days, feel free to shoot me a mail if anything is needed from my side,
[17:58:25] will do
[18:06:15] halfak: Amir1: codezee: Can someone kick https://github.com/wiki-ai/drafttopic/pull/24 ?
[18:06:40] Mostly, I need a merge to trigger the git mirroring between servers...
[18:12:59] * awight eyes self-merge button
[18:17:55] merged :D
[18:18:09] thanks!
[18:29:21] Scoring-platform-team (Current), Release-Engineering-Team: Phabricator repo is not mirroring to gerrit as hoped - https://phabricator.wikimedia.org/T194295#4194783 (awight)
[18:29:40] Scoring-platform-team (Current), Release-Engineering-Team: Phabricator repo is not mirroring to gerrit as hoped - https://phabricator.wikimedia.org/T194295#4194793 (awight)
[18:29:42] Scoring-platform-team (Current), ORES, drafttopic-modeling, Patch-For-Review, artificial-intelligence: Deploy drafttopic model to ORES - https://phabricator.wikimedia.org/T176336#4194794 (awight)
[18:33:06] Scoring-platform-team (Current), ORES, drafttopic-modeling, artificial-intelligence: Check drafttopic model memory usage - https://phabricator.wikimedia.org/T192293#4194796 (awight) I'm pretty certain this won't cause 2x overall memory usage, although it is the upper bound. Some of RSS ends up b...
[18:35:21] halfak, awight: I've been messing with parsing talk pages more, and one thing that's become clear is that saving the parsed talk pages to a .csv file more or less destroys a large part of the formatting that mwchatter does (since csv's don't save formatting to individual cells). I tried storing in an xlsx file, but the same thing happened. Do either of you have thoughts on how to store the talk page data?
[18:36:18] I'm not sure what kind of formatting can be destroyed, can you give an example?
[18:36:45] FWIW, we mostly cache stuff as JSON, but that probably has the same issue as what you're seeing.
[18:37:07] Scoring-platform-team (Current), MediaWiki-extensions-ORES, User-Ladsgroup: Make wp10 rows be a squeezed to a weighted sum in ores_classification - https://phabricator.wikimedia.org/T194297#4194811 (Ladsgroup)
[18:37:55] So, this is how it looks before being stored: https://github.com/ewhit51/talkpage_scraper/blob/master/Test_Parse.ipynb
[18:38:39] After being stored in csv, all of the spacing/indentation/newline for each user's comments is lost, so it just looks like a block (can be seen here: https://github.com/ewhit51/talkpage_scraper/blob/master/Talk%20Page%20Scraper.ipynb
[18:39:22] Scoring-platform-team (Current), MediaWiki-extensions-ORES, User-Ladsgroup: Add option of keep forever for ores scores - https://phabricator.wikimedia.org/T194298#4194823 (Ladsgroup)
[18:39:50] I could cache as JSON, but I'm mostly concerned about needing to transfer from whatever file I store it in to Qualtrics so that participants can read/rate/react to the talk pages, and I'm not sure it I can do that with a JSON file
[18:40:10] ewhit_, I'm guessing there's some normalization going on when converting to the CSV format you are using.
[18:40:21] Yes, that's what it looks like
[18:40:41] There's nothing inherent in CSV that requires the destruction of whitespace.
[18:40:48] This is more a big or intentional limitation.
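To illustrate the point that CSV itself need not destroy whitespace, here is a minimal round-trip sketch. It assumes Python's standard `csv` module (the chat does not say which writer the notebook actually uses), and the user name and comment text are invented. The writer quotes any field containing a line break, and the reader restores it when the data is parsed rather than viewed as raw text.

```python
import csv
import io

# Invented talk-page comment with newlines and tab indentation.
comment = "First line\n\tindented reply\n\t\tdeeper reply"

# Write to an in-memory buffer; with a real file, open it with newline=""
# as the csv module documentation recommends.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["user", "comment"])
writer.writerow(["ExampleUser", comment])

# Parse the same data back with csv.reader instead of reading it raw.
buffer.seek(0)
rows = list(csv.reader(buffer))
restored = rows[1][1]
```

If the restored text still looks flattened after a round trip like this, the loss is more likely happening in whatever displays or imports the file than in the CSV format.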
[18:41:09] Scoring-platform-team (Current), MediaWiki-extensions-ORES, User-Ladsgroup: Write a maintaince script to populate wp10 data for articles - https://phabricator.wikimedia.org/T194300#4194844 (Ladsgroup)
[18:41:10] Hmm. So tabs/indents should be alright as well?
[18:41:32] OK so what I'm seeing here is not that line breaks are lost, but rather converted to "\n"
[18:41:34] Which is normal.
[18:42:02] Tabs should be converted to "\t"
[18:42:12] Is there a way to prevent that, or somehow deal with that so the output looks correct?
[18:43:05] Well, you can't have a line break in a CSV because line breaks have meaning.
[18:43:18] So you'll need to de-convert the line break when reading from the CSV
[18:43:22] Mmm, yes, that makes sense
[18:43:30] In this case, it looks like you're just reading the raw file.
[18:43:37] yes
[18:43:44] I imagine csv.reader will de-convert for you.
[18:44:02] ok, I'll check on that. Thank you!
[18:44:10] No problem :)
[19:04:38] awight, anything I can help with re. drafttopic?
[19:04:50] Oh! Maybe I can get wmflabs configured for a deployment!
[19:06:47] halfak: That would be awesome.
[19:06:56] Currently I'm stuck on making the drafttopic submodule, though.
[19:07:16] paladox and Hauskatze are helping tweak gerrit permissions...
[19:07:20] Anything I can help with?
[19:07:30] Probably not on that front
[19:08:01] the wheels still need to be rebuilt, if you need something to do while you make a sandwich :)
[19:08:26] well, this gerrit bureaucracy is getting worse :|
[19:08:32] ugh
[19:15:29] awight, I'll let you know if I get to wheels before you do.
[19:15:34] Hauskatze huh?
[19:15:38] Are we just rebuilding revscoring=2.2.3?
[19:15:56] halfak: yes, idk if that comes with other dependencies but I don't think so.
[19:16:08] Scoring-platform-team (Current), Gerrit, Release-Engineering-Team: Phabricator repo is not mirroring to gerrit as hoped - https://phabricator.wikimedia.org/T194295#4195003 (awight) We need a Gerrit admin to CR and submit this, https://gerrit.wikimedia.org/r/#/c/432145/
[19:16:32] I was going to wait for the proper submodule, but considering what a traffic jam this is, now thinking I'll just clone from github and symlink submodules/drafttopic
[19:16:41] for wheels-building
[19:17:36] * Hauskatze departs for dinner
[19:18:00] good luck with the phab-to-gerrit thing, maybe it'd be good to document how the whole thing is done for the future :)
[19:18:54] +1 thanks for figuring out the magic incantation!
[19:19:44] I need foods.
[19:19:51] biab
[19:20:16] Also, FYI the services window is overlapping with the MW train today, for some reason. I'm taking that to mean we don't have a window.
[19:20:28] I might improvise something later in the day, if the other blockers are resolved.
[19:33:08] Arg. need a reboot. BRB
[20:00:20] Scoring-platform-team (Current), Gerrit, Release-Engineering-Team: Phabricator repo is not mirroring to gerrit as hoped - https://phabricator.wikimedia.org/T194295#4195134 (Paladox) Open>Resolved
[20:00:31] halfak ^^
[20:00:38] https://gerrit.wikimedia.org/r/plugins/gitiles/scoring/ores/drafttopic/
[20:03:26] Oh good. I'm sure once awight is done getting mob'd he'll pick up where he left off.
[20:03:36] I'm working in wmflabs land where we can play fast and loose with github ;)
[20:03:49] lol
[20:13:47] * halfak tries to find the "assets" repo
[20:17:04] * halfak downloads the GoogleNews vectors yet again
[20:23:22] awight_mob: seems it's fixed https://gerrit.wikimedia.org/r/plugins/gitiles/scoring/ores/drafttopic/+/master
[20:34:42] IT"S ALIVE
[20:34:53] Or rather, I've got the deploy repo working :)
[20:35:01] Now to rebuild wheels and work on a staging deployment.
[20:35:29] Also, drafttopic is definitely fast-enough
[20:35:59] awight https://gerrit.wikimedia.org/r/plugins/gitiles/scoring/ores/drafttopic/ :)
[20:36:07] * awight blinks
[20:36:17] I was just grinning at the backscroll :)
[20:36:21] lol
[20:37:01] paladox: Hauskatze: Thanks for jumping on that Diffusion vs. Gerrit glitch, and in record time too!
[20:37:21] your welcome :)
[20:37:55] paladox: so what had to be done finally?
[20:38:06] Hauskatze the change i created :)
[20:38:16] twentyafterfour merged it after asking him :)
[20:38:20] (hes an admin on gerrit)
[20:38:29] ok, /me will remember for the future the procedure
[20:38:29] awight, https://github.com/wiki-ai/ores-wmflabs-deploy/pull/96
[20:38:38] Want to check to see if I'm doing something reasonable with word2vec?
[20:38:54] It's possible that either patch would have worked, but yeah that was some dastardly magic.
[20:39:48] +1 that it should be inherited through "parent" repos, or documented better if it needs to remain kludgey
[20:39:55] halfak: Cool, looking now.
[20:41:46] halfak: The way I solved word2vec was simply, ln -s submodules/assets/word2vec .
[20:42:04] Yes. That looks the same as my strategy
[20:42:07] Works great
[20:42:38] aha! I was looking at a single commit rather than the full diff
[20:43:13] codezee's relative search path ftw, surprisingly!
[20:44:22] * awight does not like how ores-labd-deploy is an unrelated repo
[20:45:10] Agreed. I'd like to see a better proposal so we can do that instead.
[20:45:18] Branches seem to be ... not exactly right.
[20:45:45] I started a Phab task to discuss, we should close that loop some day.
[20:46:03] I'm +1 branches but happy to hear a challenge
[20:46:30] branches are really nice for trying experimental stuff and porting it over when complete, or maintaining long-term variations
[20:46:42] seems to fit our usage
[20:46:56] also good for comparing across
[20:47:14] submodules come from somewhere else in wmflabs
[20:47:24] So it's awkward to manage that in a branch.
[20:47:29] that's fine, AFAIK
[20:47:33] But we could work around that by having submodules always come from prod.
[20:47:51] submodule rewriting is a reasonable difference between branches, IMO
[20:48:16] It's even possible to switch between them using "git submodule sync"
[20:49:29] * halfak runs "make deployment_wheels"
[20:49:32] it's happening
[20:49:38] :)
[20:50:32] I never could have guessed, after two years spent writing makefiles, that I would still enjoy them 20 years later.
[20:51:58] (PS2) Awight: Provision the drafttopic model [services/ores/deploy] - https://gerrit.wikimedia.org/r/432000 (https://phabricator.wikimedia.org/T176336)
[20:52:34] halfak: Ah, I have precache configured to score drafttopic. ^
[20:52:38] should we not?
[20:52:48] Oh yes. I missed that.
[20:53:26] branches :p
[20:57:20] (PS2) Awight: Point submodules at gerrit [services/ores/deploy] - https://gerrit.wikimedia.org/r/429846 (https://phabricator.wikimedia.org/T180627)
[20:57:22] (PS12) Awight: Add the assets submodule and word2vec, git-lfs enabled [services/ores/deploy] - https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627)
[20:57:24] (PS3) Awight: Provision the drafttopic model [services/ores/deploy] - https://gerrit.wikimedia.org/r/432000 (https://phabricator.wikimedia.org/T176336)
[20:57:27] halfak: would you be so kind as to CR ^
[21:03:39] Sure. Just finishing up wheels. I don't want to lose my train of thought
[21:03:56] all gu:t
[21:05:46] ah I'm jumping the gun anyway, trying to deploy to beta but we need the new wheels
[21:06:13] IIRC, revscoring 2.2.3 just patches an edge case where the text is empty when we try to extract features.
[21:06:23] Gotcha.
[21:06:51] * awight believes he can hear gerrit groaning about having targeted it with git-lfs traffic
[21:11:33] (Abandoned) Awight: Put all the wheel tools into the Makefile [services/ores/deploy] - https://gerrit.wikimedia.org/r/391562 (https://phabricator.wikimedia.org/T180496) (owner: Awight)
[21:12:28] (Abandoned) Awight: New ORES codfw cluster isn't provisioned yet [services/ores/deploy] - https://gerrit.wikimedia.org/r/387811 (https://phabricator.wikimedia.org/T165170) (owner: Awight)
[21:12:49] (Abandoned) Awight: Bump revscoring and ores [services/ores/deploy] (CELERY_4) - https://gerrit.wikimedia.org/r/391264 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[21:14:20] (Abandoned) Awight: [WIP] Send logs to logstash [services/ores/deploy] - https://gerrit.wikimedia.org/r/377553 (https://phabricator.wikimedia.org/T169586) (owner: Awight)
[21:15:02] halfak: http://ores-beta.wmflabs.org/v3/scores/enwiki/12345/drafttopic
[21:15:32] (PS1) Halfak: Updates for revscoring 2.2.3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/432287
[21:15:33] :D
[21:15:50] I've found that considering any topic with probability >= 0.05 is useful
[21:15:53] halfak: celery 4? that can't be right.
[21:16:23] Oh? We have the requirement in the wmflabs repo. I wonder if that is a remnant from some test.
[21:16:26] I was going to ask.
[21:16:29] It was in master.
[21:16:46] We should build the wheels directly in ores-prod rather than in ores-labs, for this reason
[21:16:54] that's why I ported the makefile over
[21:17:05] Yeah... well, both share wheels... so
[21:17:11] ooh
[21:17:14] d'oh
[21:17:16] oho
[21:17:24] looks like ores requires celery 3.1
[21:18:11] * halfak rebuilds.
[21:19:23] U know if pbena is coming to Barcelona?
[21:20:01] Not sure. I would expect him though.
[21:20:52] Just idly thinking about how cool sort-or-filter-by-topic might be
[21:22:06] halfak: fwiw this is what I get when rebuilding the wheels: https://phabricator.wikimedia.org/P7109
[21:22:16] awight, write down your ideas :D
[21:23:16] For topic sorting :D
[21:23:20] The thresholds stuff must already support the drafttopic model, I guess?
[21:23:29] Yes.
[21:23:35] reuse ftw
[21:23:45] Regretfully it's a little cumbersome, but not crazy.
[21:23:58] class names are verbose!
[21:24:03] If you want I can commit the wheels changes I made ^
[21:24:15] I'm just finishing up wheels
[21:24:17] kk
[21:24:23] Get offa my cookie!
[21:24:52] lol
[21:25:33] https://orig00.deviantart.net/b663/f/2013/156/1/d/cookie_licking_by_snowieguns-d67zhja.jpg
[21:26:27] (PS2) Halfak: Updates for revscoring 2.2.3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/432287
[21:26:32] OK should be good now.
[21:27:39] I got slightly different wheels, which is strange. But I accept.
[21:27:46] They look safe either way.
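The "probability >= 0.05" rule of thumb and the sort-or-filter-by-topic idea mentioned above could be sketched like this. It is only an illustration under assumptions: the score doc is assumed to carry a `probability` mapping of topic name to float, and the topic names and values are invented, not real drafttopic output.

```python
def topics_above(score_doc, threshold=0.05):
    """Return (topic, probability) pairs at or above threshold, highest first."""
    hits = [
        (topic, prob)
        for topic, prob in score_doc["probability"].items()
        if prob >= threshold
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Invented example; the labels are placeholders, not the model's classes.
example_doc = {
    "probability": {
        "Topic.A": 0.61,
        "Topic.B": 0.07,
        "Topic.C": 0.01,
        "Topic.D": 0.05,
    }
}
useful_topics = topics_above(example_doc)
```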
[21:27:53] (CR) Awight: [V: 2 C: 2] Updates for revscoring 2.2.3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/432287 (owner: Halfak)
[21:28:02] K i'll bump prod
[21:28:23] lol we don't have git-lfs on stat1005
[21:28:24] :|
[21:29:10] * awight scowls
[21:29:25] halfak: naw, it's there
[21:29:51] Hmm
[21:30:05] (PS4) Awight: Provision the drafttopic model [services/ores/deploy] - https://gerrit.wikimedia.org/r/432000 (https://phabricator.wikimedia.org/T176336)
[21:30:44] ahh I had the command wrong
[21:34:18] Scoring-platform-team: deployment-ores01 needs a bigger disk - https://phabricator.wikimedia.org/T194315#4195429 (awight)
[21:50:40] awight, confirmed that those wheels *work*
[21:50:42] :)
[21:50:50] OK back to reviewing.
[21:52:54] (CR) Halfak: [V: 2 C: 2] Provision the drafttopic model [services/ores/deploy] - https://gerrit.wikimedia.org/r/432000 (https://phabricator.wikimedia.org/T176336) (owner: Awight)
[21:53:26] (CR) Halfak: [V: 2 C: 2] Add the assets submodule and word2vec, git-lfs enabled [services/ores/deploy] - https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627) (owner: Awight)
[21:53:36] Hmm. Seems like I might have done this in a weird order.
[21:54:31] (CR) Halfak: [V: 2 C: 2] Point submodules at gerrit [services/ores/deploy] - https://gerrit.wikimedia.org/r/429846 (https://phabricator.wikimedia.org/T180627) (owner: Awight)
[21:54:40] ^ Way less gross than what was happening.
[21:54:53] I've got a 30 minute meeting and then I'm going to run away for the day.
[21:57:37] what is less gross?
[21:58:56] cool, we're merged.
[21:59:01] I'll see about production...
[21:59:19] URL rewriting is gross :)
[22:00:09] oh yeah dang
[22:15:29] (PS1) Awight: Remove scap submodule cruft [services/ores/deploy] - https://gerrit.wikimedia.org/r/432300
[22:16:39] halfak: ^
[22:17:09] (CR) Halfak: [V: 2 C: 2] Remove scap submodule cruft [services/ores/deploy] - https://gerrit.wikimedia.org/r/432300 (owner: Awight)
[22:17:14] ty
[22:26:15] halfak: We have a wheel conflict
[22:26:22] ../articlequality/requirements.txt:mwreverts
[22:26:22] ../editquality/requirements.txt:mwreverts >= 0.0.6, < 0.0.999
[22:26:41] iono why but that gave us mwreverts 0.0 and 0.1
[22:26:45] why is that a conflict?
[22:26:48] Ohhhh
[22:27:14] it should be... something about our makefile, I guess
[22:27:26] can I upgrade editquality, or should I remove the newer lib?
[22:28:32] (PS1) Awight: Remove newer mwreverts until we need it [research/ores/wheels] - https://gerrit.wikimedia.org/r/432307
[22:28:37] halfak: ^
[22:29:13] I'm surprised labs worked, are we not using scap there?
[22:29:57] No scap there
[22:30:03] aha k
[22:30:14] (CR) Halfak: [C: 2] Remove newer mwreverts until we need it [research/ores/wheels] - https://gerrit.wikimedia.org/r/432307 (owner: Awight)
[22:30:19] (CR) Halfak: [V: 2 C: 2] Remove newer mwreverts until we need it [research/ores/wheels] - https://gerrit.wikimedia.org/r/432307 (owner: Awight)
[22:30:26] Ahh yes. We can run without that.
[22:30:36] (PS1) Awight: Bump wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/432308
[22:30:51] (CR) Awight: [V: 2 C: 2] Bump wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/432308 (owner: Awight)
[22:36:37] OK I'm out of here.
[22:36:39] o/
[22:36:49] * halAFK weeds the yard.
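A wheel conflict like the mwreverts one above (two versions of the same package ending up in the wheels directory) can be spotted with a small check along these lines. This is a sketch, not part of the team's Makefile: it relies only on the standard wheel file naming convention (`name-version-...whl`), and the file names and version numbers in the example are invented.

```python
from collections import defaultdict


def find_duplicate_wheels(filenames):
    """Map each distribution that appears with several versions to those versions."""
    versions = defaultdict(set)
    for name in filenames:
        if not name.endswith(".whl"):
            continue
        # Wheel names start with "<distribution>-<version>-"; hyphens inside
        # the distribution name are written as underscores in wheel files.
        dist, version = name[: -len(".whl")].split("-")[:2]
        versions[dist.lower()].add(version)
    return {dist: sorted(vers) for dist, vers in versions.items() if len(vers) > 1}


# Invented listing; the exact versions are placeholders.
listing = [
    "mwreverts-0.0.6-py2.py3-none-any.whl",
    "mwreverts-0.1.4-py2.py3-none-any.whl",
    "revscoring-2.2.3-py2.py3-none-any.whl",
    "README.md",
]
duplicates = find_duplicate_wheels(listing)
```

Running something like this over `os.listdir()` of the wheels checkout before bumping the deploy repo would surface duplicates earlier than a failed deployment does.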
[22:39:18] (PS1) Awight: Remove another duplicate wheel [research/ores/wheels] - https://gerrit.wikimedia.org/r/432311
[22:39:28] (CR) Awight: [V: 2 C: 2] Remove another duplicate wheel [research/ores/wheels] - https://gerrit.wikimedia.org/r/432311 (owner: Awight)
[22:39:48] (PS1) Awight: Bump wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/432312
[22:39:58] (CR) Awight: [V: 2 C: 2] Bump wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/432312 (owner: Awight)
[22:41:42] anybody here familiar with SQuAD? https://rajpurkar.github.io/SQuAD-explorer/
[22:42:52] it's an AI training set (and secret test set) aimed at answering natural language questions by processing Wikipedia articles.
[22:43:40] Interesting, I wonder if they might consider doing something useful :)
[22:44:25] I can't tell whether there are any open source models that have been submitted, but I really want to figure out how to take whatever the best-in-class open source model is, and connect that up to a voice interface for Mycroft AI
[22:44:47] (My new open source, open hardware voice assistant came in the mail today!)
[22:45:36] (If you say 'tell me about [topic]', it can read from the Wikipedia article, but that's just scratching the surface)
[22:49:45] Scoring-platform-team (Current), drafttopic-modeling: Drafttopic breaks ?features parameter - https://phabricator.wikimedia.org/T194322#4195651 (awight)
[22:50:12] halAFK: ^ blocker?
[22:50:24] I'm not sure if anything is relying on "features" yet?
[22:51:10] awight: I rely on features for wp10 model.
[22:51:29] not for drafttopic though.
[22:51:37] Do you get features for all models, or like /v3/scores/12345/wp10?features=1 ?
[22:52:08] still using v2: https://ores.wikimedia.org/v2/scores/enwiki/wp10/123456/?features
[22:52:16] ok that should still work...
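The two URL shapes quoted in this log differ in path order: the v2 example is wiki/model/revision, while the v3 example is wiki/revision/model. A small helper that builds both, with no network access, might look like the following. The function name and the bare `?features` flag for v3 are assumptions; only the two example URLs come from the chat.

```python
def score_url(base, version, wiki, model, rev_id, features=False):
    """Build a score URL in the v2 or v3 path layout seen in the log."""
    if version == "v2":
        # v2 layout from the chat: /v2/scores/<wiki>/<model>/<rev_id>/
        path = f"/v2/scores/{wiki}/{model}/{rev_id}/"
    elif version == "v3":
        # v3 layout from the chat: /v3/scores/<wiki>/<rev_id>/<model>
        path = f"/v3/scores/{wiki}/{rev_id}/{model}"
    else:
        raise ValueError(f"unsupported API version: {version}")
    url = base.rstrip("/") + path
    if features:
        url += "?features"  # assumed to be accepted as a bare flag
    return url


v2_url = score_url("https://ores.wikimedia.org", "v2", "enwiki", "wp10", 123456, features=True)
v3_url = score_url("http://ores-beta.wmflabs.org", "v3", "enwiki", "drafttopic", 12345)
```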
[23:26:08] ORES/maintenance/PopulateDatabase.php broke
[23:26:16] It tries to get info from the ORES API for all modules, even ones that are disabled
[23:26:23] Scoring-platform-team: deployment-ores01 needs a bigger disk - https://phabricator.wikimedia.org/T194315#4195784 (awight) Strange, every time I recreate the machine as a "large", which is supposed to have a 80GB disk, I only see a 20GB root partition.
[23:26:37] arg
[23:27:51] RoanKattouw: This was the draftquality backfill?
[23:28:00] Maybe?
[23:28:06] I noticed you just deployed an ORES change
[23:28:11] But honestly I don't think so
[23:28:11] Look:
[23:28:27] AFAIK, we don't run PopulateDatabase on a cron
[23:28:43] https://www.irccloud.com/pastebin/RbqXQ2pv/
[23:29:25] No but I'm running it because I'm deploying ORES to four new wikis
[23:29:31] And that requires running PopulateDatabase.php
[23:29:46] Which makes an HTTP request to the ORES API for all models that it knows about, even ones that are not enabled
[23:29:49] I guess I can hack around this
[23:30:51] RoanKattouw: I'm out on a limb, but believe that PopulateDatabase is deprecated.
[23:30:58] Amir1: ^ Do you know about this?
[23:34:01] Scoring-platform-team: deployment-ores01 should get more puppet roles by default - https://phabricator.wikimedia.org/T194315#4195792 (awight)
[23:34:58] RoanKattouw: The missing ores_model rows will be populated once precached scores come back with the new models.
[23:35:10] That is why I think PopulateDatabase is no longer.
[23:35:22] So what else should I run?
[23:35:28] Nothing, AFAIK
[23:35:36] I still need something to create the rows in the ores_model table etc
[23:35:42] Otherwise I get fatal errors on Special:RC
[23:35:46] That's supposed to happen transparently
[23:35:50] * awight looks for the code
[23:36:35] https://phabricator.wikimedia.org/diffusion/EORS/browse/master/includes/ScoreFetcher.php;6142e3746be8b9bb1d254c009bf4450a5bf5b367$68
[23:37:00] checkModelVersion -> updateModelVersion
[23:37:56] Hmm so maybe RC hasn't been ported to this newfangled code
[23:38:20] You might need to wait for an edit
[23:38:26] Is cawiki S:RC currently broken?
[23:38:33] Only on mwdebug
[23:38:48] Also! The backtrace for the exception that I got goes through the code you mentioned, right?
[23:38:58] * awight double-takes
[23:39:23] I found it
[23:39:24] https://phabricator.wikimedia.org/diffusion/EORS/browse/master/includes/ScoreFetcher.php;6142e3746be8b9bb1d254c009bf4450a5bf5b367$43
[23:39:26] Line 43
[23:39:30] omg
[23:39:38] "if left empty, all configured models are queried"
[23:39:43] That is not how you determine which models are configured
[23:39:49] Because the data structure looks like this:
[23:40:04] https://www.irccloud.com/pastebin/a4XzFHMW/
[23:40:12] OK now I know what to live-hack
[23:40:51] * awight hackles raise
[23:41:52] getScores must be getting called with $models
[23:42:14] I have a fix
[23:42:25] I'm here for CR...
[23:45:18] Scoring-platform-team (Current): ores1001 is in some kind of death spiral - https://phabricator.wikimedia.org/T194329#4195813 (awight)
[23:46:59] (PS1) Catrope: ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327
[23:48:01] awight: --^^
[23:49:23] Scoring-platform-team: deployment-ores01 should get more puppet roles by default - https://phabricator.wikimedia.org/T194315#4195836 (awight)
[23:49:32] reading...
[23:49:57] Why is this not crashing everywhere...
[23:51:27] the heck. See includes/FetchScoreJob.php line 125
[23:51:36] yet we never use the return value!!
[23:51:54] (CR) jerkins-bot: [V: -1] ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327 (owner: Catrope)
[23:52:37] ah we do, actually
[23:52:43] That's private
[23:52:47] And the one I fixed is in a different file
[23:52:52] You're right we should reuse that logic though
[23:53:48] Just trying to understand why it's not already a flaming dumpster. I think we got lucky.
[23:54:22] (CR) Awight: [V: 2 C: 2] "That looks right, thanks!" [extensions/ORES] - https://gerrit.wikimedia.org/r/432327 (owner: Catrope)
[23:54:31] Probably some kind of double-bug.
[23:54:36] (PS2) Catrope: ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327
[23:54:38] Three lefts makes a right, I should mention.
[23:55:01] (CR) Awight: [C: 2] ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327 (owner: Catrope)
[23:58:05] (CR) jerkins-bot: [V: -1] ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327 (owner: Catrope)
[23:58:49] Ugh the tests fail on my patch, I'll figure out why later
[23:58:57] (CR) jerkins-bot: [V: -1] ScoreFetcher: Fix determination of enabled models [extensions/ORES] - https://gerrit.wikimedia.org/r/432327 (owner: Catrope)
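The enabled-versus-known models bug discussed above can be illustrated in Python, under a loudly stated assumption: the real configuration structure lives in a pastebin that is not reproduced in this log, so the shape below (model name mapped to options with an `enabled` flag) is hypothetical, as are the model entries. The point is only the difference between "every key in the config" and "every key whose entry is switched on".

```python
# Hypothetical config shape; the actual structure is not shown in the log.
MODELS_CONFIG = {
    "damaging": {"enabled": True},
    "goodfaith": {"enabled": True},
    "wp10": {"enabled": False},
    "draftquality": {"enabled": False},
}


def all_known_models(config):
    """What the buggy path effectively did: treat every known model as configured."""
    return list(config)


def enabled_models(config):
    """Only the models whose entry is explicitly enabled."""
    return [name for name, options in config.items() if options.get("enabled")]


requested_before = all_known_models(MODELS_CONFIG)
requested_after = enabled_models(MODELS_CONFIG)
```

With the first function a wiki that only enabled two models would still send requests for all four, which matches the symptom described for PopulateDatabase.php.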