[14:08:24] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027057 (10akosiaris) `Temporary is the new permanent` and by further reading, I understand that updates of that data would be rather infrequent, adding one less incentive in the long run to mi... [14:55:39] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027284 (10awight) Alright, thanks for talking this through. I like the idea of keeping these files in git-lfs, and I'm sure we can iron out any scap glitches. ASAP would be nice, but honestl... [15:50:56] o/ [15:51:06] Snow-pocalypse temporarily mitigated [15:51:14] I was out there for 3 hours. [15:51:18] Almost [15:51:26] And I'm in good shape! [15:51:36] yuck. [15:51:58] It was rain, then it was sort of snowing pellets. Then it transitioned to heavy, grainy snow. [15:52:14] That resulted in the worst thing to shovel. Heavy, hard, and sticky. [16:02:42] grainy snow ? like hail ? [16:03:19] and no, no we don't have a lot of snow over here so my terminology and imagination are rather limited :-) [16:03:48] but I am sure I beat you when talking about sandy beaches :P [16:03:54] 10Scoring-platform-team, 10ORES, 10editquality-modeling, 10User-Sebastian_Berlin-WMSE, and 2 others: Check ORES feedback for possible bugs - https://phabricator.wikimedia.org/T188896#4027680 (10Halfak) [16:03:55] so, how bad is it ? [16:04:50] https://en.wikipedia.org/wiki/Types_of_snow [16:04:51] wow [16:05:14] akosiaris, it is kind of like little bits of hail. [16:05:42] halfak: Just to round out the image, what are you shoveling? A narrow valley to your front door? A driveway? [16:06:10] A driveway, a patio, sidewalks. I like to do a good job for two reasons. [16:06:25] aha, so probably what I would call hail given my lack of experience.
Makes sense [16:06:32] (1) disabled people need the whole sidewalk cleared and (2) if you shovel good every time, it's always easier to shovel again with each snow. [16:06:58] akosiaris, it kind of makes a hissing noise while it is falling. For the most part, snowfall is silent. [16:07:14] * halfak uploads photos. [16:07:30] oh nice appreciated. thanks! [16:08:51] https://imgur.com/a/qVTJO [16:09:33] I didn't get a photo of the back patio or the path to the garage. I need to shovel out my dumpster in the back too so that the city can access it. [16:16:24] awight, looking at T188446 [16:16:27] T188446: Package word2vec binaries - https://phabricator.wikimedia.org/T188446 [16:16:40] It looks like we're going to push on git-lfs. [16:16:51] Do you have a good feeling for which blockers are really blockers? [16:17:01] Ah that’s right. If you don’t shovel to the bottom, you get an ice pack. [16:17:21] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027713 (10Halfak) [16:17:24] 10Scoring-platform-team, 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4027714 (10Halfak) [16:17:58] halfak: I think so. Scap is the only big blocker. [16:18:07] awight, right. I HATE that ice, so I'm really good. Got screwed earlier this year when I was traveling to SF during a snow storm with a thaw-freeze cycle. I was chipping ice every evening for 4 weeks! [16:18:29] https://phabricator.wikimedia.org/T180628 [16:18:29] halfak: hey around? [16:18:34] Hi Amir1 [16:18:58] We’ll just have to figure out the scap stuff. Luckily, I have a local environment for testing scap, and have been able to get through a few cycles of reviewing other scap features. [16:19:05] Do we actually need https://phabricator.wikimedia.org/T180628? I'm not sure I grok scap. [16:19:06] halfak: IRC Cloud is stupid, can you send me a private message? 
[16:20:25] halfak: I think we do need the client everywhere. scap doesn’t rsync, it runs command-line git on the target machines. [16:20:35] /o\ [16:21:47] Git-lfs+scap will probably timeout like a holy hand grenade. [16:29:36] awight, or be amazing? [16:29:41] maybe. [16:29:42] lolol [16:29:47] probably not [16:31:21] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027751 (10awight) Another random thought: we can build our repo incrementally, to test scap+git-lfs at each step. # Create a stub repo with scap config. # Convert to git-lfs and add a small fi... [16:41:03] * codezee hopes the improvised solve script doesn't blow up [16:45:18] halfak: the cache in extracted features is a dict of {: } right? [16:45:36] codezee, right [16:45:44] Also, I figured out the problem. :) [16:45:47] I think. [16:46:25] I think the issue is with garbage cleanup not clearing caches after extracting features for a single revision. [16:46:55] halfak: So, I’m running low on articles to ingest. It’s time to synthesize my argument… Any suggestions for how to put together a sciencey argument when all I have are just a heap of random observations? It feels wrong to make the random argument I want to make, and use quotes to support. Feels unsciencey. [16:47:14] halfak: yeah i was heading in that direction bec individual solvers work fine so caches are persisting [16:47:28] codezee: Not sure I sent this to you yet? It might help: https://pypi.python.org/pypi/memory_profiler [16:47:35] cool. I think we're on the same page codezee [16:48:03] awight: i looked at it but using it with drafttopic api proved to be a pain, i could use mprof but it provides profiling time-wise [16:48:05] awight, re. the workshop submission, I usually recommend a tight loop of iteration on an outline before expanding. [16:48:23] codezee: ah, rats! [16:48:28] the "python -m profiler..." 
version didn't seem to work out of the box because of the way our scripts are called [16:49:02] awight, I usually tell my grad student collaborators to put together a very short outline -- no more than 20 lines. Bonus points for 10. And then use that to tell me the story. [16:49:08] halfak: for sure +1 on the outline, I'm just surprised that the process is to write the argument as if it's an editorial. [16:49:08] From there, we iterate. [16:49:31] All manuscripts are stories. [16:49:35] :D [16:49:41] Now I know. [16:54:55] halfak: is it possible that due to multiprocessing it's loading the word2vec binary in each thread? because that could also very well explain the issue [16:55:20] codezee, not sure. Run a test :P [16:55:34] codezee, it *shouldn't* need to do that. [16:56:33] the single solver script is painfully slow [16:58:43] codezee: fwiw, you can check that by logging whether the binary is loaded before or after forking. [16:59:33] awight: you mean add logging code while loading and see how many times it executes? [17:02:00] That's a thing you could do also [17:02:10] but I meant, log when it's loaded, and log when forking happens [17:02:28] You can get the python thread ID with a log format string like... [17:02:38] format: "%(asctime)s %(levelname)s %(name)s [P%(process)d T%(thread)d]: %(message)s" [17:02:53] I guess that would be enough, w/o the fork logging. [17:04:06] 10Scoring-platform-team (Current), 10editquality-modeling, 10artificial-intelligence: makefile generate makes impossible rules when no model is available - https://phabricator.wikimedia.org/T188777#4027846 (10Halfak) a:03Halfak [17:04:24] off topic. I have to bring my bike by the shop *again*, all the same stuff is broken [17:04:37] I've carried the bike farther than it's carried me.
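The format string awight quotes above can be wired into the standard `logging` module to stamp every line with the emitting process and thread, which is enough to see whether the word2vec binary loads once or once per worker. A minimal sketch (the logger name is hypothetical, not drafttopic's actual logger):

```python
import logging
import os

# The format string suggested above: stamp each log line with process and
# thread IDs so you can tell which worker loaded the vectors, and whether
# loading happened before or after forking.
fmt = "%(asctime)s %(levelname)s %(name)s [P%(process)d T%(thread)d]: %(message)s"

# Render one record by hand to show what a line looks like; in the real
# script you would just pass format=fmt to logging.basicConfig().
formatter = logging.Formatter(fmt)
record = logging.LogRecord(
    name="drafttopic.extract",  # hypothetical logger name
    level=logging.INFO,
    pathname="extract_from_text.py",
    lineno=0,
    msg="loading word2vec vectors",
    args=(),
    exc_info=None,
)
line = formatter.format(record)
pid = os.getpid()
print(line)
```

If the same "loading word2vec vectors" message appears under several different `P<pid>` values, each worker is loading its own copy.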
[17:08:31] cool, i'll try that [17:11:07] yeah that is not the problem, the code itself specifies that loading vectors happens once and those vectors are passed around in the feature thing [17:11:28] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:11:37] funnily enough that part of it was written by me and I didn't remember :P [17:11:57] :) Nice to find breadcrumbs that are still tasty [17:15:13] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:16:56] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:20:00] I need to relocate and carry the bike back to my sloppy mechanics. argh. [17:26:39] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:27:09] hmm puppet failing on ores* is that expected? [17:27:54] paladox: Unexpected, but when it's multiple servers my spidey-senses are not stimulated... [17:28:02] paladox, not sure. Maybe cloud is doing some maintenance? [17:28:03] ah [17:28:07] awight, same for me [17:28:15] halfak i think so, i don't think it's cloud maint [17:28:29] are these stretch instances? [17:29:09] ok, biab. [17:31:08] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:32:36] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:34:48] halfak i've notified the cloud team, who are looking into it now :) [17:34:53] failing on my instances too [17:43:08] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues [17:47:19] PROBLEM - puppet on ORES-lb02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:50:52] paladox, thank you :) [17:51:00] you're welcome :) [17:51:11] halfak: did you know that according to opensym.org, you work at the "Wikipedia Foundation"? [17:51:21] lol [17:51:29] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:51:38] Speaking of which, I blocked off a bunch of time for the Teahouse paper today. [17:51:42] J-Mo, ^ [17:51:48] cool cool. just about to send you a link to the template [17:51:48] PROBLEM - puppet on ORES-worker06.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:52:04] I'm working in Overleaf. You can sign in with your ORCID [17:53:11] cool [17:53:35] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:55:14] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:56:59] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:06:44] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:11:09] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:12:39] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:18:40] OK lunch time and then some paper work. [18:19:11] I'll be aiming to get the fawiki stuff ready today.
Amir1 can you give me a good name for the pilot labeling campaign? [18:19:18] fawiki article quality ^ [18:23:09] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:27:24] PROBLEM - puppet on ORES-lb02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:31:29] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:31:49] PROBLEM - puppet on ORES-worker06.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:33:39] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:35:14] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:36:59] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:40:58] RECOVERY - puppet on ORES-worker05.experimental is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [18:40:59] halfak|Lunch: "کیفیت مقالات (۲۰۱۷)" [18:41:19] means article quality 2017. what do you think? [18:42:21] halfak|Lunch it should be recovering now [18:42:38] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [18:44:43] RECOVERY - puppet on ORES-worker02.experimental is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [18:46:26] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [18:46:44] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [18:46:49] RECOVERY - puppet on ORES-lb02.Experimental is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:47:47] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028330 (10Halfak) I'd be interested in creating a gerrit-only repo with this type of asset in it. We can always move it to phab later if we want. [18:51:00] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028375 (10mmodell) We'll need to ask @demon what is needed to get a git-lfs repo on gerrit. I think he's out sick today, hopefully he's feeling better tomorrow. [18:51:09] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:51:18] RECOVERY - puppet on ORES-worker06.experimental is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [18:52:33] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4007994 (10Paladox) @mmodell we support git-lfs in gerrit now. :) Chad installed the plugin a few months ago, and i know the setup to enable repos. Which repo do we want to enable this on?... [18:52:39] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:54:05] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:54:37] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028448 (10mmodell) @paladox: I thought it was limited by some permissions in gerrit. Doesn't it need to be enabled separately somehow or do all repos have git-lfs automatically now? 
[18:55:58] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028484 (10Paladox) @mmodell yeh, we add the repo in project.config in All-Projects. It's not automatically. [18:56:10] RECOVERY - puppet on ORES-worker01.experimental is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [19:01:09] RECOVERY - puppet on ORES-worker04.experimental is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [19:01:20] o/ [19:01:38] RECOVERY - puppet on ORES-worker03.experimental is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [19:01:40] 10Scoring-platform-team, 10Collaboration-Team-Triage, 10MediaWiki-extensions-ORES, 10Regression: ORES extension highlights edits that are patrolled - https://phabricator.wikimedia.org/T187337#4028568 (10Catrope) The big problem here is that the patrol flag is not visible to users who don't have the patrol... [19:02:29] o/ [19:06:04] reeelocating [19:08:18] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028685 (10Halfak) https://gerrit.wikimedia.org/r/#/projects/research/ores/wheels,dashboards/default is where we want it right now. Would 50GB be too high of a ceiling? That's about 10x what... [19:34:53] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028824 (10Paladox) @halfak ok increased the resources ^^ [19:37:08] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028832 (10Halfak) Great. Thank you. It looks like {T180628} is our last blocker until we can start experimenting with this. [19:37:08] halfak: the ores-wheels repo doesn’t seem to be where we need git-lfs, at this point. 
[19:37:35] awight, I think we should start calling that the "assets" repo and put the word2vec bins in there [19:37:38] Well, it might eventually make sense, but I don’t want to damage any of our production repos. [19:37:59] Oh I see. It might be easier to start with a new repo then? [19:38:05] IMO yes [19:38:20] I'm OK with that. Maybe we can make a repo called "assets" and work from that. [19:38:23] ores-assets [19:38:26] ores/assets whatever [19:38:27] My thinking is that we want to avoid a situation where we suddenly can’t deploy wheels for a month. [19:38:39] +1 I like it [19:38:48] and we can move wheels over there when we’re convinced that it works. [19:39:10] I was asking akosiaris, if we should even be experimenting with this new repo as a standalone thing, with its own scap config... [19:42:25] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028862 (10awight) @Paladox Apologies, we discussed in IRC and we'd like to change the plan slightly. Please disable git-lfs on the wheels repo, and we'll create a new re... [19:42:57] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028866 (10Halfak) +1 ^ [19:43:23] halfak: “research/ores/assets” or “mediawiki/services/ores/assets”, you think? [19:43:55] I hate these naming schemes. [19:44:08] What are the other repos named? [19:44:25] 10Scoring-platform-team, 10Collaboration-Team-Triage, 10MediaWiki-extensions-ORES, 10Regression: ORES extension highlights edits that are patrolled - https://phabricator.wikimedia.org/T187337#4028870 (10jmatazzoni) 05Open>03declined I'm going to write a task for what Roan suggests above and decline this. [19:44:27] I took those from existing schemes [19:44:37] research/ores/wheels.git [19:44:42] mediawiki/services/ores/deploy.git [19:44:48] damn. hmm [19:44:54] lessee... [19:45:07] All of these names are bad. 
[19:45:09] Let's do mediawiki/services/ores/assets I think. [19:45:31] kk, fwiw that makes it parallel to our ores-prod-deploy repo, rather than parallel to current wheels [19:47:32] awight halfak i can create the repo in gerrit if you want? [19:47:57] paladox: ty, we [19:48:08] paladox: we're still trying to figure out where to put it. Any thoughts? [19:48:34] awight gerrit, but have it sync to phab? [19:48:46] ie gerrit then phab mirrors from it [19:48:49] paladox: I don't think so, cos Phab isn't ready for git-lfs [19:48:54] ah [19:48:57] awight gerrit then [19:49:06] halfak: looking through https://gerrit.wikimedia.org/r/#/admin/projects/ for "inspiration". [19:49:12] paladox: +1, we just need to decide on a path [19:49:16] ah [19:50:00] awight if it's for ores, maybe [19:50:05] research/ores/assets ? [19:50:13] or ores/assets? [19:50:25] Seems like it would be better if we had ores be called "services/ores/deploy" and this new one be "services/ores/assets" [19:50:35] The "mediawiki" prefix doesn't really make sense. [19:51:08] Though I suppose, ultimately, ORES is a "mediawiki supporting service" right now. [19:51:14] It's a bit more general than that. [19:51:36] & not quite research either [19:51:46] scoring/ores/assets? [19:52:00] +1 ^^ [19:52:03] awight, I don't dislike that [19:52:18] Are we going to rename all the other ones then too? :) [19:52:31] lol [19:52:34] hell no [19:52:38] Or will we have 3 competing standards? [19:52:42] but we can make new repos with git-lfs there. [19:53:31] scoring/ores/editquality — or maybe just scoring/editquality? [19:53:52] hm bad example, I guess we don't host that in gerrit. [19:55:01] awight, good to think about though. [19:55:24] mediawiki/extensions/ORES/scoring/research/services/editquality [19:55:33] makes sense to me [19:55:35] ;) [19:55:37] * awight dodges tomatoes [19:55:38] scoring/ores/deploy/, scoring/ores/wheels/, scoring/ores/assets/ [19:55:43] Woops. [19:55:50] Oh wait no. That looks right.
[19:55:53] +1 [19:56:15] we can eventually move the research- repo, although it doesn't need lfs-ification [19:56:16] paladox, if we rename a repo in gerrit, is there a redirect left in the old name? [19:56:25] halfak nope [19:56:30] Can there be? [19:56:40] halfak though you could write in the description the new location [19:56:48] harrr [19:56:49] hahaha [19:57:10] yeah... hmm. [19:57:32] IMO nobody outside our team needs to care that we moved it. [19:57:46] The phab mirror can stand its ground. [19:57:57] Good point. [19:58:28] paladox, is renaming a repo a thing? Or are we really just creating a new repo and loading in the git history? [19:58:46] seems effectively the same [19:58:52] halfak nope, you would have to create the new repo and copy over the repo. [19:59:03] ie you could git push --mirror [19:59:08] awight, +1. But then there's gerrit stuff in the history :) [19:59:15] ooh nuts [19:59:24] git clone --mirror [20:00:59] I shall go ahead with requesting scoring/ores/assets, I assume? [20:01:25] awight, only if you file a task for renaming all of the other repos. :P [20:01:34] gauntlet thrown. [20:01:38] And then put it on the backlog so we can ignore it officially. [20:01:55] ah but—we're only renaming the ones that don't benefit from LFS conversion [20:02:34] Why not rename all of 'em? [20:03:24] maybe I'm wrong about how git-lfs history will work... [20:03:33] but I don't want us to have the bloated histories [20:03:40] Oh, we'll re-write history [20:03:54] All the commits will have the same stuff, but it will seem like we always had git-lfs :) [20:03:56] AIUI, we re-write history, then need to push that to an empty repo [20:04:09] Oh... That must be a gerrit thing. [20:04:12] I don't think we can push rewritten history to an existing repo [20:04:19] I'll ask...
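The `git clone --mirror` / `git push --mirror` move that paladox describes above can be sketched locally. Here two throwaway bare repos stand in for the old and new Gerrit paths (all paths and the commit message are hypothetical):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Bare repos standing in for the old and new Gerrit locations.
git init -q --bare old.git
git init -q --bare new.git

# Seed the "old" repo with one commit.
git clone -q old.git seed
(
  cd seed
  git config user.email you@example.org
  git config user.name "You"
  echo wheels > wheels.txt
  git add wheels.txt
  git commit -qm "initial commit"
  git push -q origin HEAD
)

# The move itself: mirror-clone the old location, mirror-push everything
# (all branches and tags) to the new one.
git clone -q --mirror old.git mirror.git
git -C mirror.git push -q --mirror "$tmp/new.git"

# The new repo now carries the full history.
git -C new.git log --oneline
```

Against real Gerrit the two paths would be the old and new repo URLs; `--mirror` copies every ref, which is why there is no server-side redirect to worry about, only stale clones.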
[20:04:47] nope you can't push with a rewritten history [20:04:50] that's a git thing [20:04:54] awight ^^ [20:06:23] paladox, of course you can [20:06:25] push -f [20:06:27] :P [20:06:31] oh [20:06:32] rebase == rewriting history [20:06:36] oh [20:06:59] The only time it won't work is if you change the root revision. [20:09:01] halfak: I'll rename wikiclass -> articlequality in the meantime. Was that the name we wanted? [20:10:08] yes. When we make that transition, we should do it all at once. We need to make changes in the code. [20:10:37] +1 [20:11:36] fetch [20:12:10] * awight grabs a bone [20:13:14] 10Scoring-platform-team, 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4029020 (10awight) We're currently thinking that we want to normalize our repo locations in gerrit, and introduce git-lfs in the new locations.... [20:13:36] awight halfak i can set up those repos [20:13:42] would you like me to do that? [20:14:21] paladox: Thanks! I started adding to the Gerrit/New repositories/Requests page, do you still want that for paperwork reasons? [20:14:36] awight you can if you want :) [20:14:38] yeh [20:14:42] lol [20:14:43] ok [20:15:45] awight i guess you want to have this group https://gerrit.wikimedia.org/r/#/admin/groups/uuid-cddbf2315647ba438f5741826fffaeedfdcdfe8a own the repos [20:15:53] (seems the group is not visible to me) [20:16:01] paladox: perfect, ty [20:16:05] :) [20:16:08] nicely done for not being able to see it ;-) [20:16:34] heh [20:16:43] awight it will inherit from All-Projects [20:16:54] unless you have a project in mind you want it to inherit from? [20:17:19] awight would you like me to also tick "Create initial empty commit"? [20:17:41] paladox: Only for the repos that I've commented should be empty. [20:17:53] ok [20:18:42] awight i get "Group research-ores does not exist or is not visible to you." [20:19:13] d'oh [20:19:26] lemme add you.
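halfak's point above, that rewritten history can be pushed as long as you force it, is easy to demonstrate with a throwaway repo. An amend stands in here for the history rewrite (an lfs migration would rewrite every commit the same way); everything below is a local sketch:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare origin.git
git clone -q origin.git work
cd work
git config user.email you@example.org
git config user.name "You"

echo a > file.txt
git add file.txt
git commit -qm "original commit"
git push -q origin HEAD

# Rewrite history: amending replaces the tip commit with a new object,
# just like a rebase or an lfs migration would.
git commit -q --amend -m "rewritten commit"

# A plain push is rejected as a non-fast-forward...
git push -q origin HEAD 2>/dev/null && echo "plain push accepted" \
  || echo "plain push rejected"

# ...but a force push goes through.
git push -qf origin HEAD && echo "force push accepted"
```

Note that on Gerrit the force push additionally requires the "Forge"/force-push ACL on the target ref, which is a permission question rather than a git one.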
[19:19:50] done. [19:19:56] ok thanks :) [20:20:13] halfak: ^ fyi, temporarily adding paladox to our group ACL [20:20:28] https://gerrit.wikimedia.org/r/#/c/416760/ [20:20:34] i can't seem to merge that [20:20:37] i can +2 and v+2 [20:21:39] this is strange, as soon as the multiprocessing module tries to split observations into child processes, memory usage shoots up [20:21:51] i'll discuss this in tomorrow's backlog [20:22:29] as if the 3 GB vectors are getting replicated but I read that linux uses copy_on_write semantics and nothing is being written to the vectors as such [20:22:41] codezee: That does sound weird. AFAIK, you're right about copy-on-write. [20:23:42] just when the code hits "for observation in extractor_pool.imap(extract_and_cache, labelings)" it gets killed without even solving for any observation [20:23:54] https://github.com/wiki-ai/drafttopic/blob/extract-from-text/drafttopic/utilities/extract_from_text.py#L83 [20:24:21] i see that it's not even getting a chance to solve for one vector, so solved vectors filling up the cache is not an issue [20:24:33] I mean * solve for one article [20:26:02] What is len(extractors)? [20:26:53] 8 [20:26:55] 10Scoring-platform-team (Current), 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4029065 (10awight) [20:26:59] awight: ^ [20:27:36] humm [20:27:58] let me try with 4 [20:28:25] Sounds like 8 should be fine, but yeah good idea.
Drop to 1 :) [20:29:48] well with 4 it did get killed but somewhat later after loading the vectors than in case of 8 [20:30:49] trying 1 [20:32:20] :0 :0 this is against logic, even with one extractor it's getting killed because of OOM but logically that should work because the solver script i wrote works fine in a single thread [20:32:36] it seems python's multiprocessing is messing inside the code somewhere with the data [20:32:47] :) I think you have your prey in sight [20:32:50] bugs for dinner ;-) [20:33:15] Does it work if you eliminate the Pool.imap and call directly? [20:33:20] haha, bugs would be bad for me, besides already had dinner long time back :P [20:33:54] yeah, as i mentioned i have a script that does almost the same thing in a single thread, reading observations one by one, solving, and writing back [20:34:02] it's super slow but is working fine [20:34:38] and RAM mem usage is pegged at 3.62G [20:35:34] that's intense. [20:37:30] yeah we'd better be able to share that among all workers on a machine. [20:44:40] codezee: What are "labelings"? [20:44:43] https://stackoverflow.com/questions/38084401/when-is-copy-on-write-invoked-for-python-multiprocessing-across-class-methods [20:44:57] apparently that argument will be pickled and duplicated for each process. [20:49:31] awight: labelings are observations [20:50:21] collection of individual entries containing rev_id and text [20:50:36] ooh are there a billion? [20:50:42] 93000 [20:50:59] awight: maybe it's the data that's getting replicated, and i'm banging my head on vectors, not sure tho [20:51:24] Can you try with a really short list of labelings? [20:55:09] I'm confused about why the data would be replicated, AIUI you should get a generator which doesn't actually duplicate that data, and then observations are passed to the 8 threads one at a time, so it's only pickling and duplicating 1-8 observations at a time.
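The intended behavior awight describes above can be sketched with a toy stand-in: a large structure built once before the pool forks, so workers inherit it (copy-on-write on Linux) while `imap` pickles only one small item at a time. All names here are hypothetical, not drafttopic's actual code, and the sketch assumes a fork-based start method:

```python
import multiprocessing as mp

# Stand-in for the 3 GB word2vec vectors: built at module level, i.e. before
# the Pool forks, so workers inherit it via copy-on-write rather than each
# receiving a pickled copy.
VECTORS = {i: i * 2 for i in range(100_000)}

def extract_and_cache(observation):
    # Only `observation` is pickled and shipped to the worker;
    # VECTORS is looked up in the inherited address space.
    rev_id = observation["rev_id"]
    return rev_id, VECTORS[rev_id % len(VECTORS)]

def main():
    # A generator, as in the extraction script: items are consumed lazily,
    # one per worker at a time, never materialized as a full list.
    labelings = ({"rev_id": i} for i in range(10))
    with mp.Pool(processes=2) as pool:
        return list(pool.imap(extract_and_cache, labelings))

if __name__ == "__main__":
    results = main()
    print(results[:3])  # → [(0, 0), (1, 2), (2, 4)]
```

If memory still balloons under this pattern, the suspects are reference-count writes dirtying the shared pages (CPython touches every object's refcount, which defeats copy-on-write over time) or something materializing the generator before the fork.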
which is my thinking too [20:55:39] But if there's a bug where the entire list is passed as an argument to the first extractor… for example, an extra array around the args, [ [1, 2, 3…] ] [20:56:41] trying with 100 labelings [20:58:35] read_observations looks like it does what we expect. Mysterious [21:02:06] even with 100 observations it OOMs but when i limit the vectors to 150000 (150 thousand) instead of 3000000 (3 million) it seems to work [21:03:30] argh [21:03:39] So this is a limit on how much of word2vec is loaded? [21:03:52] yes [21:04:32] kk [21:05:07] terrible to hear, it sounds unpleasant to debug Python forking, memory management, and garbage collection. [21:06:38] it seems to work with 1 million vectors and 2 extractors as well as 8 extractors so num extractors might not be a problem [21:08:21] alright going now [21:10:07] good find! [21:10:22] That narrows down what could be going wrong, although nothing is occurring to me :) [21:11:14] yeah it's the size of the vectors; for earlier experiments i was using 150 thousand words as i was hoping that many eng words might be useful, even now with 150 thousand it works fine [21:11:44] when we scale up, to say 1 million words, both memory and time become an issue [21:11:45] How much free memory does your test machine have? [21:12:25] currently extraction is running with 150k words on ores-stat, you can see half is being used [21:12:51] although extraction is not taking up all of that half [21:13:26] I'm baffled... [21:45:20] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4029180 (10demon) Just following up: 50gb is fine for now sure. We've got several TB of free space :) [22:11:44] OKAY writing stuff done for the day.
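One cheap way to chase an OOM like the one debugged above is to log each process's peak RSS next to its PID: if every worker reports roughly the full vector size, copy-on-write sharing is not happening. A sketch using the stdlib `resource` module (note `ru_maxrss` is KiB on Linux but bytes on macOS, and the module is Unix-only):

```python
import os
import resource

def log_peak_rss(tag):
    # Peak resident set size of THIS process so far; call it from both the
    # parent and the workers to compare their footprints.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"[P{os.getpid()}] {tag}: peak RSS {peak}")
    return peak

before = log_peak_rss("before loading vectors")
vectors = [float(i) for i in range(1_000_000)]  # stand-in for the word2vec data
after = log_peak_rss("after loading vectors")
```

Dropping a `log_peak_rss(...)` call into the worker function (next to the per-process logging format discussed earlier) would show directly whether each of the 8 extractors holds its own copy.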
[22:11:51] Now on to fawiki article quality [22:14:34] 10Scoring-platform-team (Current), 10articlequality-modeling, 10User-Ladsgroup, 10artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#4029236 (10Halfak) @Ladsgroup, please re-review :) [22:50:33] halfak: I think there are bidi issues, can I make a follow-up PR for that? [22:50:56] until later o/ [22:52:38] Amir1, yes please! [23:20:04] https://github.com/wiki-ai/wikilabels-wmflabs-deploy/pull/45 [23:20:09] halfak: ^ [23:36:01] Thanks Amir1 [23:36:07] I'll aim for a deploy tomorrow :)