[14:08:24] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027057 (10akosiaris) `Temporary is the new permanent` and by further reading, I understand that updates of that data would be rather infrequent, adding one less incentive in the long run to mi... [14:55:39] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027284 (10awight) Alright, thanks for talking this through. I like the idea of keeping these files in git-lfs, and I'm sure we can iron out any scap glitches. ASAP would be nice, but honestl... [15:50:56] o/ [15:51:06] Snow-pocalypse temporarily mitigated [15:51:14] I was out there for 3 hours. [15:51:18] Almost [15:51:26] And I'm in good shape! [15:51:36] yuck. [15:51:58] It was rain, then it was sort of snowing pellets. Then it transitioned to heavy, grainy snow. [15:52:14] That resulted in the worst thing to shovel. Heavy, hard, and sticky. [16:02:42] grainy snow ? like hail ? [16:03:19] and no, no we don't have a lot of snow over here so my terminology and imagination are rather limited :-) [16:03:48] but I am sure I beat you when talking about sandy beaches :P [16:03:54] 10Scoring-platform-team, 10ORES, 10editquality-modeling, 10User-Sebastian_Berlin-WMSE, and 2 others: Check ORES feedback for possible bugs - https://phabricator.wikimedia.org/T188896#4027680 (10Halfak) [16:03:55] so, how bad is it ? [16:04:50] https://en.wikipedia.org/wiki/Types_of_snow [16:04:51] wow [16:05:14] akosiaris, it is kind of like little bits of hail. [16:05:42] halfak: Just to round out the image, what are you shoveling? A narrow valley to your front door? A driveway? [16:06:10] A driveway, a patio, sidewalks. I like to do a good job for two reasons. [16:06:25] aha, so probably what I would call hail given my lack of experience.
Makes sense [16:06:32] (1) disabled people need the whole sidewalk cleared and (2) if you shovel good every time, it's always easier to shovel again with each snow. [16:06:58] akosiaris, it kind of makes a hissing noise while it is falling. For the most part, snowfall is silent. [16:07:14] * halfak uploads photos. [16:07:30] oh nice appreciated. thanks! [16:08:51] https://imgur.com/a/qVTJO [16:09:33] I didn't get a photo of the back patio or the path to the garage. I need to shovel out my dumpster in the back too so that the city can access it. [16:16:24] awight, looking at T188446 [16:16:27] T188446: Package word2vec binaries - https://phabricator.wikimedia.org/T188446 [16:16:40] It looks like we're going to push on git-lfs. [16:16:51] Do you have a good feeling for which blockers are really blockers? [16:17:01] Ah that’s right. If you don’t shovel to the bottom, you get an ice pack. [16:17:21] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027713 (10Halfak) [16:17:24] 10Scoring-platform-team, 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4027714 (10Halfak) [16:17:58] halfak: I think so. Scap is the only big blocker. [16:18:07] awight, right. I HATE that ice, so I'm really good. Got screwed earlier this year when I was traveling to SF during a snow storm with a thaw-freeze cycle. I was chipping ice every evening for 4 weeks! [16:18:29] https://phabricator.wikimedia.org/T180628 [16:18:29] halfak: hey around? [16:18:34] Hi Amir1 [16:18:58] We’ll just have to figure out the scap stuff. Luckily, I have a local environment for testing scap, and have been able to get through a few cycles of reviewing other scap features. [16:19:05] Do we actually need https://phabricator.wikimedia.org/T180628? I'm not sure I grok scap. [16:19:06] halfak: IRC Cloud is stupid, can you send me a private message? 
[16:20:25] halfak: I think we do need the client everywhere. scap doesn’t rsync, it runs command-line git on the target machines. [16:20:35] /o\ [16:21:47] Git-lfs+scap will probably timeout like a holy hand grenade. [16:29:36] awight, or be amazing? [16:29:41] maybe. [16:29:42] lolol [16:29:47] probably not [16:31:21] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4027751 (10awight) Another random thought: we can build our repo incrementally, to test scap+git-lfs at each step. # Create a stub repo with scap config. # Convert to git-lfs and add a small fi... [16:41:03] * codezee hopes the improvised solve script doesn't blow up [16:45:18] halfak: the cache in extracted features is a dict of {: } right? [16:45:36] codezee, right [16:45:44] Also, I figured out the problem. :) [16:45:47] I think. [16:46:25] I think the issue is with garbage cleanup not clearing caches after extracting features for a single revision. [16:46:55] halfak: So, I’m running low on articles to ingest. It’s time to synthesize my argument… Any suggestions for how to put together a sciencey argument when all I have are just a heap of random observations? It feels wrong to make the random argument I want to make, and use quotes to support. Feels unsciencey. [16:47:14] halfak: yeah i was heading in that direction bec individual solvers work fine so caches are persisting [16:47:28] codezee: Not sure I sent this to you yet? It might help: https://pypi.python.org/pypi/memory_profiler [16:47:35] cool. I think we're on the same page codezee [16:48:03] awight: i looked at it but using it with drafttopic api proved to be a pain, i could use mprof but it provides profiling time-wise [16:48:05] awight, re. the workshop submission, I usually recommend a tight loop of iteration on an outline before expanding. [16:48:23] codezee: ah, rats! [16:48:28] the "python -m profiler..." 
version didn't seem to work out of the box because of the way our scripts are called [16:49:02] awight, I usually tell my grad student collaborators to put together a very short outline -- no more than 20 lines. Bonus points for 10. And then use that to tell me the story. [16:49:08] halfak: for sure +1 on the outline, I'm just surprised that the process is to write the argument as if it's an editorial. [16:49:08] From there, we iterate. [16:49:31] All manuscripts are stories. [16:49:35] :D [16:49:41] Now I know. [16:54:55] halfak: is it possible that due to multiprocessing it's loading the word2vec binary in each thread? because that could also very well explain the issue [16:55:20] codezee, not sure. Run a test :P [16:55:34] codezee, it *shouldn't* need to do that. [16:56:33] the single solver script is painfully slow [16:58:43] codezee: fwiw, you can check that by logging whether the binary is loaded before or after forking. [16:59:33] awight: you mean add logging code while loading and see how many times it executes? [17:02:00] That's a thing you could do also [17:02:10] but I meant, log when it's loaded, and log when forking happens [17:02:28] You can get the python thread ID with a log format string like... [17:02:38] format: "%(asctime)s %(levelname)s %(name)s [P%(process)d T%(thread)d]: %(message)s" [17:02:53] I guess that would be enough, w/o the fork logging. [17:04:06] 10Scoring-platform-team (Current), 10editquality-modeling, 10artificial-intelligence: makefile generate makes impossible rules when no model is available - https://phabricator.wikimedia.org/T188777#4027846 (10Halfak) a:03Halfak [17:04:24] off topic. I have to bring my bike by the shop *again*, all the same stuff is broken [17:04:37] I've carried the bike farther than it's carried me.
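The format string awight quotes above can be wired into the standard `logging` module to stamp every line with the emitting process and thread, which is enough to see whether the word2vec binary loads once or once per worker. A minimal sketch (the logger name is hypothetical, not drafttopic's actual logger):

```python
import logging
import os

# The format string suggested above: stamp each log line with process and
# thread IDs so you can tell which worker loaded the vectors, and whether
# loading happened before or after forking.
fmt = "%(asctime)s %(levelname)s %(name)s [P%(process)d T%(thread)d]: %(message)s"

# Render one record by hand to show what a line looks like; in the real
# script you would just pass format=fmt to logging.basicConfig().
formatter = logging.Formatter(fmt)
record = logging.LogRecord(
    name="drafttopic.extract",  # hypothetical logger name
    level=logging.INFO,
    pathname="extract_from_text.py",
    lineno=0,
    msg="loading word2vec vectors",
    args=(),
    exc_info=None,
)
line = formatter.format(record)
pid = os.getpid()
print(line)
```

If the same "loading word2vec vectors" message appears under several different `P<pid>` values, each worker is loading its own copy.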
[17:08:31] cool, i'll try that [17:11:07] yeah that is not the problem, the code itself specifies that loading vectors happens once and those vectors are passed around in the feature thing [17:11:28] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:11:37] funnily enough that part of it was written by me and I didn't remember :P [17:11:57] :) Nice to find breadcrumbs that are still tasty [17:15:13] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:16:56] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:20:00] I need to relocate and carry the bike back to my sloppy mechanics. argh. [17:26:39] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:27:09] hmm puppet failing on ores* is that expected? [17:27:54] paladox: Unexpected, but when it's multiple servers my spidey-senses are not stimulated... [17:28:02] paladox, not sure. Maybe cloud is doing some maintenance? [17:28:03] ah [17:28:07] awight, same for me [17:28:15] halfak i think so, i don't think it's cloud maint [17:28:29] are these stretch instances? [17:29:09] ok, biab. [17:31:08] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:32:36] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:34:48] halfak i've notified the cloud team, who are looking into it now :) [17:34:53] failing on my instances too [17:43:08] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues [17:47:19] PROBLEM - puppet on ORES-lb02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:50:52] paladox, thank you :) [17:51:00] you're welcome :) [17:51:11] halfak: did you know that according to opensym.org, you work at the "Wikipedia Foundation"? [17:51:21] lol [17:51:29] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:51:38] Speaking of which, I blocked off a bunch of time for the Teahouse paper today. [17:51:42] J-Mo, ^ [17:51:48] cool cool. just about to send you a link to the template [17:51:48] PROBLEM - puppet on ORES-worker06.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:52:04] I'm working in Overleaf. You can sign in with your ORCID [17:53:11] cool [17:53:35] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:55:14] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:56:59] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:06:44] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:11:09] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:12:39] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:18:40] OK lunch time and then some paper work. [18:19:11] I'll be aiming to get the fawiki stuff ready today.
Amir1 can you give me a good name for the pilot labeling campaign? [18:19:18] fawiki article quality ^ [18:23:09] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:27:24] PROBLEM - puppet on ORES-lb02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:31:29] PROBLEM - puppet on ORES-worker05.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:31:49] PROBLEM - puppet on ORES-worker06.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:33:39] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:35:14] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:36:59] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:40:58] RECOVERY - puppet on ORES-worker05.experimental is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [18:40:59] halfak|Lunch: "کیفیت مقالات (۲۰۱۷)" [18:41:19] means article quality 2017. what do you think? [18:42:21] halfak|Lunch it should be recovering now [18:42:38] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [18:44:43] RECOVERY - puppet on ORES-worker02.experimental is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [18:46:26] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [18:46:44] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [18:46:49] RECOVERY - puppet on ORES-lb02.Experimental is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:47:47] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028330 (10Halfak) I'd be interested in creating a gerrit-only repo with this type of asset in it. We can always move it to phab later if we want. [18:51:00] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028375 (10mmodell) We'll need to ask @demon what is needed to get a git-lfs repo on gerrit. I think he's out sick today, hopefully he's feeling better tomorrow. [18:51:09] PROBLEM - puppet on ORES-worker03.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:51:18] RECOVERY - puppet on ORES-worker06.experimental is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [18:52:33] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4007994 (10Paladox) @mmodell we support git-lfs in gerrit now. :) Chad installed the plugin a few months ago, and i know the setup to enable repos. Which repo do we want to enable this on?... [18:52:39] PROBLEM - puppet on ORES-worker04.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [18:54:05] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:54:37] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028448 (10mmodell) @paladox: I thought it was limited by some permissions in gerrit. Doesn't it need to be enabled separately somehow or do all repos have git-lfs automatically now? 
[18:55:58] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028484 (10Paladox) @mmodell yeh, we add the repo in project.config in All-Projects. It's not automatically. [18:56:10] RECOVERY - puppet on ORES-worker01.experimental is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [19:01:09] RECOVERY - puppet on ORES-worker04.experimental is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [19:01:20] o/ [19:01:38] RECOVERY - puppet on ORES-worker03.experimental is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [19:01:40] 10Scoring-platform-team, 10Collaboration-Team-Triage, 10MediaWiki-extensions-ORES, 10Regression: ORES extension highlights edits that are patrolled - https://phabricator.wikimedia.org/T187337#4028568 (10Catrope) The big problem here is that the patrol flag is not visible to users who don't have the patrol... [19:02:29] o/ [19:06:04] reeelocating [19:08:18] 10Scoring-platform-team (Current), 10Packaging: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028685 (10Halfak) https://gerrit.wikimedia.org/r/#/projects/research/ores/wheels,dashboards/default is where we want it right now. Would 50GB be too high of a ceiling? That's about 10x what... [19:34:53] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028824 (10Paladox) @halfak ok increased the resources ^^ [19:37:08] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028832 (10Halfak) Great. Thank you. It looks like {T180628} is our last blocker until we can start experimenting with this. [19:37:08] halfak: the ores-wheels repo doesn’t seem to be where we need git-lfs, at this point. 
[19:37:35] awight, I think we should start calling that the "assets" repo and put the word2vec bins in there [19:37:38] Well, it might eventually make sense, but I don’t want to damage any of our production repos. [19:37:59] Oh I see. It might be easier to start with a new repo then? [19:38:05] IMO yes [19:38:20] I'm OK with that. Maybe we can make a repo called "assets" and work from that. [19:38:23] ores-assets [19:38:26] ores/assets whatever [19:38:27] My thinking is that we want to avoid a situation where we suddenly can’t deploy wheels for a month. [19:38:39] +1 I like it [19:38:48] and we can move wheels over there when we’re convinced that it works. [19:39:10] I was asking akosiaris, if we should even be experimenting with this new repo as a standalone thing, with its own scap config... [19:42:25] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028862 (10awight) @Paladox Apologies, we discussed in IRC and we'd like to change the plan slightly. Please disable git-lfs on the wheels repo, and we'll create a new re... [19:42:57] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4028866 (10Halfak) +1 ^ [19:43:23] halfak: “research/ores/assets” or “mediawiki/services/ores/assets”, you think? [19:43:55] I hate these naming schemes. [19:44:08] What are the other repos named? [19:44:25] 10Scoring-platform-team, 10Collaboration-Team-Triage, 10MediaWiki-extensions-ORES, 10Regression: ORES extension highlights edits that are patrolled - https://phabricator.wikimedia.org/T187337#4028870 (10jmatazzoni) 05Open>03declined I'm going to write a task for what Roan suggests above and decline this. [19:44:27] I took those from existing schemes [19:44:37] research/ores/wheels.git [19:44:42] mediawiki/services/ores/deploy.git [19:44:48] damn. hmm [19:44:54] lessee... [19:45:07] All of these names are bad. 
[19:45:09] Let's do mediawiki/services/ores/assets I think. [19:45:31] kk, fwiw that makes it parallel to our ores-prod-deploy repo, rather than parallel to current wheels [19:47:32] awight halfak i can create the repo in gerrit if you want? [19:47:57] paladox: ty, we [19:48:08] paladox: we're still trying to figure out where to put it. Any thoughts? [19:48:34] awight gerrit, but have it sync to phab? [19:48:46] ie gerrit then phab mirrors from it [19:48:49] paladox: I don't think so, cos Phab isn't ready for git-lfs [19:48:54] ah [19:48:57] awight gerrit then [19:49:06] halfak: looking through https://gerrit.wikimedia.org/r/#/admin/projects/ for "inspiration". [19:49:12] paladox: +1, we just need to decide on a path [19:49:16] ah [19:50:00] awight if it's for ores, maybe [19:50:05] research/ores/assets ? [19:50:13] or ores/assets? [19:50:25] Seems like it would be better if we had ores be called "services/ores/deploy" and this new one be "services/ores/assets" [19:50:35] The "mediawiki" prefix doesn't really make sense. [19:51:08] Though I suppose, ultimately, ORES is a "mediawiki supporting service" right now. [19:51:14] It's a bit more general than that. [19:51:36] & not quite research either [19:51:46] scoring/ores/assets? [19:52:00] +1 ^^ [19:52:03] awight, I don't dislike that [19:52:18] Are we going to rename all the other ones then too? :) [19:52:31] lol [19:52:34] hell no [19:52:38] Or will we have 3 competing standards? [19:52:42] but we can make new repos with git-lfs there. [19:53:31] scoring/ores/editquality — or maybe just scoring/editquality? [19:53:52] hm bad example, I guess we don't host that in gerrit. [19:55:01] awight, good to think about though. [19:55:24] mediawiki/extensions/ORES/scoring/research/services/editquality [19:55:33] makes sense to me [19:55:35] ;) [19:55:37] * awight dodges tomatoes [19:55:38] scoring/ores/deploy/, scoring/ores/wheels/, scoring/ores/assets/ [19:55:43] Woops. [19:55:50] Oh wait no. That looks right.
[19:55:53] +1 [19:56:15] we can eventually move the research- repo, although it doesn't need lfs-ification [19:56:16] paladox, if we rename a repo in gerrit, is there a redirect left in the old name? [19:56:25] halfak nope [19:56:30] Can there be? [19:56:40] halfak though you could write in the description the new location [19:56:48] harrr [19:56:49] hahaha [19:57:10] yeah... hmm. [19:57:32] IMO nobody outside our team needs to care that we moved it. [19:57:46] The phab mirror can stand its ground. [19:57:57] Good point. [19:58:28] paladox, is renaming a repo a thing? Or are we really just creating a new repo and loading in the git history? [19:58:46] seems effectively the same [19:58:52] halfak nope, you would have to create the new repo and copy over the repo. [19:59:03] ie you could git push --mirror [19:59:08] awight, +1. But then there's gerrit stuff in the history :) [19:59:15] ooh nuts [19:59:24] git clone --mirror [20:00:59] I shall go ahead with requesting scoring/ores/assets, I assume? [20:01:25] awight, only if you file a task for renaming all of the other repos. :P [20:01:34] gauntlet thrown. [20:01:38] And then put it on the backlog so we can ignore it officially. [20:01:55] ah but—we're only renaming the ones that don't benefit from LFS conversion [20:02:34] Why not rename all of 'em? [20:03:24] maybe I'm wrong about how git-lfs history will work... [20:03:33] but I don't want us to have the bloated histories [20:03:40] Oh, we'll re-write history [20:03:54] All the commits will have the same stuff, but it will seem like we always had git-lfs :) [20:03:56] AIUI, we re-write history, then need to push that to an empty repo [20:04:09] Oh... That must be a gerrit thing. [20:04:12] I don't think we can push rewritten history to an existing repo [20:04:19] I'll ask...
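The `git clone --mirror` / `git push --mirror` move that paladox describes above can be sketched locally. Here two throwaway bare repos stand in for the old and new Gerrit paths (all paths and the commit message are hypothetical):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Bare repos standing in for the old and new Gerrit locations.
git init -q --bare old.git
git init -q --bare new.git

# Seed the "old" repo with one commit.
git clone -q old.git seed
(
  cd seed
  git config user.email you@example.org
  git config user.name "You"
  echo wheels > wheels.txt
  git add wheels.txt
  git commit -qm "initial commit"
  git push -q origin HEAD
)

# The move itself: mirror-clone the old location, mirror-push everything
# (all branches and tags) to the new one.
git clone -q --mirror old.git mirror.git
git -C mirror.git push -q --mirror "$tmp/new.git"

# The new repo now carries the full history.
git -C new.git log --oneline
```

Against real Gerrit the two paths would be the old and new repo URLs; `--mirror` copies every ref, which is why there is no server-side redirect to worry about, only stale clones.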
[20:04:47] nope you can't push with a rewritten history [20:04:50] that's a git thing [20:04:54] awight ^^ [20:06:23] paladox, of course you can [20:06:25] push -f [20:06:27] :P [20:06:31] oh [20:06:32] rebase == rewriting history [20:06:36] oh [20:06:59] The only time it won't work is if you change the root revision. [20:09:01] halfak: I'll rename wikiclass -> articlequality in the meantime. Was that the name we wanted? [20:10:08] yes. When we make that transition, we should do it all at once. We need to make changes in the code. [20:10:37] +1 [20:11:36] fetch [20:12:10] * awight grabs a bone [20:13:14] 10Scoring-platform-team, 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4029020 (10awight) We're currently thinking that we want to normalize our repo locations in gerrit, and introduce git-lfs in the new locations.... [20:13:36] awight halfak i can set up those repos [20:13:42] would you like me to do that? [20:14:21] paladox: Thanks! I started adding to the Gerrit/New repositories/Requests page, do you still want that for paperwork reasons? [20:14:36] awight you can if you want :) [20:14:38] yeh [20:14:42] lol [20:14:43] ok [20:15:45] awight i guess you want to have this group https://gerrit.wikimedia.org/r/#/admin/groups/uuid-cddbf2315647ba438f5741826fffaeedfdcdfe8a own the repos [20:15:53] (seems the group is not visible to me) [20:16:01] paladox: perfect, ty [20:16:05] :) [20:16:08] nicely done for not being able to see it ;-) [20:16:34] heh [20:16:43] awight it will inherit from All-Projects [20:16:54] unless you have a project in mind you want it to inherit from? [20:17:19] awight would you like me to also tick "Create initial empty commit"? [20:17:41] paladox: Only for the repos that I've commented should be empty. [20:17:53] ok [20:18:42] awight i get "Group research-ores does not exist or is not visible to you." [20:19:13] d'oh [20:19:26] lemme add you.
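halfak's point above, that rewritten history can be pushed as long as you force it, is easy to demonstrate with a throwaway repo. An amend stands in here for the history rewrite (an lfs migration would rewrite every commit the same way); everything below is a local sketch:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare origin.git
git clone -q origin.git work
cd work
git config user.email you@example.org
git config user.name "You"

echo a > file.txt
git add file.txt
git commit -qm "original commit"
git push -q origin HEAD

# Rewrite history: amending replaces the tip commit with a new object,
# just like a rebase or an lfs migration would.
git commit -q --amend -m "rewritten commit"

# A plain push is rejected as a non-fast-forward...
git push -q origin HEAD 2>/dev/null && echo "plain push accepted" \
  || echo "plain push rejected"

# ...but a force push goes through.
git push -qf origin HEAD && echo "force push accepted"
```

Note that on Gerrit the force push additionally requires the "Forge"/force-push ACL on the target ref, which is a permission question rather than a git one.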
[19:19:50] done. [19:19:56] ok thanks :) [20:20:13] halfak: ^ fyi, temporarily adding paladox to our group ACL [20:20:28] https://gerrit.wikimedia.org/r/#/c/416760/ [20:20:34] i can't seem to merge that [20:20:37] i can +2 and v+2 [20:21:39] this is strange, as soon as the multiprocessing module tries to split observations into child processes, memory usage shoots up [20:21:51] i'll discuss this in tomorrow's backlog [20:22:29] as if the 3 GB vectors are getting replicated but I read that linux uses copy_on_write semantics and nothing is being written to the vectors as such [20:22:41] codezee: That does sound weird. AFAIK, you're right about copy-on-write. [20:23:42] just when the code hits "for observation in extractor_pool.imap(extract_and_cache, labelings)" it gets killed without even solving for any observation [20:23:54] https://github.com/wiki-ai/drafttopic/blob/extract-from-text/drafttopic/utilities/extract_from_text.py#L83 [20:24:21] i see that it's not even getting a chance to solve for one vector, so solved vectors filling up the cache is not an issue [20:24:33] I mean * solve for one article [20:26:02] What is len(extractors)? [20:26:53] 8 [20:26:55] 10Scoring-platform-team (Current), 10Gerrit, 10ORES, 10Operations, 10Patch-For-Review: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#4029065 (10awight) [20:26:59] awight: ^ [20:27:36] humm [20:27:58] let me try with 4 [20:28:25] Sounds like 8 should be fine, but yeah good idea.
Drop to 1 :) [20:29:48] well with 4 it did get killed but somewhat later after loading the vectors than in case of 8 [20:30:49] trying 1 [20:32:20] :0 :0 this is against logic, even with one extractor it's getting killed because of OOM but logically that should work because the solver script i wrote works fine in a single thread [20:32:36] it seems python's multiprocessing is messing inside the code somewhere with the data [20:32:47] :) I think you have your prey in sight [20:32:50] bugs for dinner ;-) [20:33:15] Does it work if you eliminate the Pool.imap and call directly? [20:33:20] haha, bugs would be bad for me, besides already had dinner long time back :P [20:33:54] yeah, as i mentioned i have a script that does almost the same thing in a single thread, reading observations one by one, solving, and writing back [20:34:02] it's super slow but is working fine [20:34:38] and RAM mem usage is pegged at 3.62G [20:35:34] that's intense. [20:37:30] yeah we'd better be able to share that among all workers on a machine. [20:44:40] codezee: What are "labelings"? [20:44:43] https://stackoverflow.com/questions/38084401/when-is-copy-on-write-invoked-for-python-multiprocessing-across-class-methods [20:44:57] apparently that argument will be pickled and duplicated for each process. [20:49:31] awight: labelings are observations [20:50:21] collection of individual entries containing rev_id and text [20:50:36] ooh are there a billion? [20:50:42] 93000 [20:50:59] awight: maybe it's the data that's getting replicated, and i'm banging my head on vectors, not sure tho [20:51:24] Can you try with a really short list of labelings? [20:55:09] I'm confused about why the data would be replicated, AIUI you should get a generator which doesn't actually duplicate that data, and then observations are passed to the 8 threads one at a time, so it's only pickling and duplicating 1-8 observations at a time.
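The intended behavior awight describes above can be sketched with a toy stand-in: a large structure built once before the pool forks, so workers inherit it (copy-on-write on Linux) while `imap` pickles only one small item at a time. All names here are hypothetical, not drafttopic's actual code, and the sketch assumes a fork-based start method:

```python
import multiprocessing as mp

# Stand-in for the 3 GB word2vec vectors: built at module level, i.e. before
# the Pool forks, so workers inherit it via copy-on-write rather than each
# receiving a pickled copy.
VECTORS = {i: i * 2 for i in range(100_000)}

def extract_and_cache(observation):
    # Only `observation` is pickled and shipped to the worker;
    # VECTORS is looked up in the inherited address space.
    rev_id = observation["rev_id"]
    return rev_id, VECTORS[rev_id % len(VECTORS)]

def main():
    # A generator, as in the extraction script: items are consumed lazily,
    # one per worker at a time, never materialized as a full list.
    labelings = ({"rev_id": i} for i in range(10))
    with mp.Pool(processes=2) as pool:
        return list(pool.imap(extract_and_cache, labelings))

if __name__ == "__main__":
    results = main()
    print(results[:3])  # → [(0, 0), (1, 2), (2, 4)]
```

If memory still balloons under this pattern, the suspects are reference-count writes dirtying the shared pages (CPython touches every object's refcount, which defeats copy-on-write over time) or something materializing the generator before the fork.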
which is my thinking too [20:55:39] But if there's a bug where the entire list is passed as an argument to the first extractor… for example, an extra array around the args, [ [1, 2, 3…] ] [20:56:41] trying with 100 labelings [20:58:35] read_observations looks like it does what we expect. Mysterious [21:02:06] even with 100 observations it OOMs but when i limit the vectors to 150000 (150 thousand) instead of 3000000 (3 million) it seems to work [21:03:30] argh [21:03:39] So this is a limit on how much of word2vec is loaded? [21:03:52] yes [21:04:32] kk [21:05:07] terrible to hear, it sounds unpleasant to debug Python forking, memory management, and garbage collection. [21:06:38] it seems to work with 1 million vectors and 2 extractors as well as 8 extractors so num extractors might not be a problem [21:08:21] alright going now [21:10:07] good find! [21:10:22] That narrows down what could be going wrong, although nothing is occurring to me :) [21:11:14] yeah it's the size of the vectors; for earlier experiments i was using 150 thousand words as i was hoping that many eng words might be useful, even now with 150 thousand it works fine [21:11:44] when we scale up, to say 1 million words, both memory and time become an issue [21:11:45] How much free memory does your test machine have? [21:12:25] currently extraction is running with 150k words on ores-stat, you can see half is being used [21:12:51] although extraction is not taking up all of that half [21:13:26] I'm baffled... [21:45:20] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4029180 (10demon) Just following up: 50gb is fine for now sure. We've got several TB of free space :) [22:11:44] OKAY writing stuff done for the day.
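One cheap way to chase an OOM like the one debugged above is to log each process's peak RSS next to its PID: if every worker reports roughly the full vector size, copy-on-write sharing is not happening. A sketch using the stdlib `resource` module (note `ru_maxrss` is KiB on Linux but bytes on macOS, and the module is Unix-only):

```python
import os
import resource

def log_peak_rss(tag):
    # Peak resident set size of THIS process so far; call it from both the
    # parent and the workers to compare their footprints.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"[P{os.getpid()}] {tag}: peak RSS {peak}")
    return peak

before = log_peak_rss("before loading vectors")
vectors = [float(i) for i in range(1_000_000)]  # stand-in for the word2vec data
after = log_peak_rss("after loading vectors")
```

Dropping a `log_peak_rss(...)` call into the worker function (next to the per-process logging format discussed earlier) would show directly whether each of the 8 extractors holds its own copy.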
[22:11:51] Now on to fawiki article quality [22:14:34] 10Scoring-platform-team (Current), 10articlequality-modeling, 10User-Ladsgroup, 10artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#4029236 (10Halfak) @Ladsgroup, please re-review :) [22:50:33] halfak: I think there are bidi issues, can I make a follow-up PR for that? [22:50:56] until later o/ [22:52:38] Amir1, yes please! [23:20:04] https://github.com/wiki-ai/wikilabels-wmflabs-deploy/pull/45 [23:20:09] halfak: ^ [23:36:01] Thanks Amir1 [23:36:07] I'll aim for a deploy tomorrow :)