[00:00:23] would you specify an array of classes, in declining order of preference?
[00:00:56] or is the code to always use the DBA implementation, if available?
[00:01:17] I guess that if it is available, you're running Zend, so you probably want to use it
[00:01:29] so the $option could be restricted to specifying which class should be used as a fallback
[00:02:07] DBA is not necessarily the fastest, I think we had it patched to disable DBA while writing while we were still using Zend
[00:02:14] since the implementation is so broken
[00:02:45] so I would be inclined to treat the configuration parameter as a hard requirement, and throw an exception if DBA is requested but not available
[00:03:04] then if no configuration parameter is given, try DBA first, then low-mem PHP
[00:03:40] if you can't reject low-mem PHP and fall back to high-memory then it's not really a complete fallback array
[00:04:56] maybe we should just leave Reader::open alone then, and keep the implementation in this patch usable only when directly and explicitly constructed
[00:07:19] could do
[00:08:44] yeah, I think that would be best. so $wgLCStoreCDBReader (or whatever) would default to 'Cdb\Reader', meaning the default would be to choose between the DBA and low-mem implementations, but it could be set to Cdb\Reader\MemoryBuffered as well.
[00:29:04] updated the patch
[01:30:51] looks like cool work, ori!
[01:31:59] * robla has nothing else to add...he's just procrastinating on writing an email he needs to write :-)
[05:41:10] TimStarling: you can still get a substantial benefit from just reading the hash table in the first 2048 bytes into memory
[05:41:19] and then doing fseeks / freads for the actual data blobs
[05:41:43] yeah, makes sense
[05:42:50] that might be best then. and that way it can just replace PHP.php.
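(The header-caching idea at 05:41 can be sketched as follows. This is an illustrative Python re-implementation of the standard cdb lookup, not the actual PHP Cdb\Reader code: the first 2048 bytes of a cdb file hold 256 (position, slot-count) pairs, which can be read into memory once, so that only slot probes and record reads need fseek/fread. The `make_cdb` helper and all class/function names here are my own, for demonstration only.)

```python
import io
import struct

def cdb_hash(key: bytes) -> int:
    # djb's cdb hash: h = ((h << 5) + h) ^ c, starting from 5381
    h = 5381
    for c in key:
        h = ((h * 33) ^ c) & 0xFFFFFFFF
    return h

def make_cdb(records):
    """Build a minimal cdb image in memory (illustrative writer)."""
    out = io.BytesIO()
    out.write(b"\0" * 2048)            # placeholder for the pointer table
    tables = [[] for _ in range(256)]  # (hash, record pos) per table
    for key, value in records:
        h = cdb_hash(key)
        tables[h & 255].append((h, out.tell()))
        out.write(struct.pack("<II", len(key), len(value)) + key + value)
    header = b""
    for entries in tables:
        nslots = 2 * len(entries)
        header += struct.pack("<II", out.tell(), nslots)
        slots = [(0, 0)] * nslots
        for h, pos in entries:
            i = (h >> 8) % nslots
            while slots[i][1] != 0:    # linear probe for a free slot
                i = (i + 1) % nslots
            slots[i] = (h, pos)
        for h, pos in slots:
            out.write(struct.pack("<II", h, pos))
    out.seek(0)
    out.write(header)                  # fill in the real pointer table
    return out.getvalue()

class HeaderCachedReader:
    """Keeps the 2048-byte pointer table in memory; seeks only for
    hash slots and record data, as suggested above."""
    def __init__(self, fh):
        self.fh = fh
        fh.seek(0)
        self.header = fh.read(2048)    # 256 (pos, nslots) pairs

    def get(self, key: bytes):
        h = cdb_hash(key)
        pos, nslots = struct.unpack_from("<II", self.header, (h & 255) * 8)
        if nslots == 0:
            return None
        for probe in range(nslots):
            i = ((h >> 8) + probe) % nslots
            self.fh.seek(pos + i * 8)
            slot_hash, rec_pos = struct.unpack("<II", self.fh.read(8))
            if rec_pos == 0:
                return None            # empty slot: key absent
            if slot_hash == h:
                self.fh.seek(rec_pos)
                klen, dlen = struct.unpack("<II", self.fh.read(8))
                if self.fh.read(klen) == key:
                    return self.fh.read(dlen)
        return None
```

With the pointer table cached, a lookup costs at most a couple of seek/read pairs per probed slot instead of re-reading the header each time.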
The pay-off for reading the file in one gulp only comes when reading >150 keys, which is more than I expected
[05:44:54] though enwiki's main page calls Cdb\Reader\PHP::get 217 times, barack obama 716 times
[05:45:32] i'm not sure if those are all against l10n_cache-en.cdb, but they probably are
[05:55:09] there's another win to be had in re-ordering some of the reads to be sequential and in avoiding calling fseek if the file pointer is already where we want it to be
[06:44:06] this is 5 days old now and needs merging: https://gerrit.wikimedia.org/r/#/c/235406
[06:44:16] I will just self-merge it if nobody wants to review it
[06:45:00] this is the trouble with being the only MW developer on my team
[06:45:16] nobody to merge all these changesets for me
[06:58:02] thanks ori
[22:30:28] TimStarling: I updated the Cdb patch to read input in one-kilobyte chunks and attempt to fulfill subsequent reads from the buffer. It provides a ~50% speed improvement over the previous implementation, even when reading relatively few keys, and it does not add memory pressure.
[22:31:41] (over the current HEAD, that is -- it doesn't beat file_get_contents() when reading lots of keys, obviously)
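(The buffering strategy described at 22:30 -- reading in one-kilobyte chunks, serving subsequent reads from the buffer, and skipping the seek when the underlying file pointer is already in place -- can be sketched like this. The class and its instrumentation are illustrative inventions, not the actual patch, which is PHP.)

```python
import io

CHUNK = 1024  # the one-kilobyte read size mentioned above

class BufferedFile:
    """Wraps a file object: reads happen in 1 KB chunks, small reads are
    fulfilled from the buffer, and seeks are deferred so that the real
    fseek is skipped when the file pointer is already where we want it."""
    def __init__(self, fh):
        self.fh = fh
        self.buf = b""
        self.buf_start = 0   # file offset of buf[0]
        self.pos = 0         # logical position seen by callers
        self.raw_reads = 0   # instrumentation: actual underlying reads

    def seek(self, offset):
        # No real seek yet; the buffer may already cover this offset.
        self.pos = offset

    def read(self, n):
        start = self.pos - self.buf_start
        if 0 <= start and start + n <= len(self.buf):
            data = self.buf[start:start + n]   # fulfilled from buffer
        else:
            # Only seek the real file if its pointer isn't already there
            if self.fh.tell() != self.pos:
                self.fh.seek(self.pos)
            self.buf = self.fh.read(max(n, CHUNK))
            self.buf_start = self.pos
            self.raw_reads += 1
            data = self.buf[:n]
        self.pos += n
        return data
```

A run of small sequential reads (the common case when walking cdb slots and records) then costs one underlying read per kilobyte instead of one per call, which is consistent with the speed-up holding even for few keys while keeping memory bounded.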