[00:00:23] would you specify an array of classes, in declining order of preference?
[00:00:56] or is the code to always use the DBA implementation, if available?
[00:01:17] I guess that if it is available, you're running Zend, so you probably want to use it
[00:01:29] so the $option could be restricted to specifying which class should be used as a fallback
[00:02:07] DBA is not necessarily the fastest, I think we had it patched to disable DBA while writing while we were still using Zend
[00:02:14] since the implementation is so broken
[00:02:45] so I would be inclined to treat the configuration parameter as a hard requirement, and throw an exception if DBA is requested but not available
[00:03:04] then if no configuration parameter is given, try DBA first, then low-mem PHP
[00:03:40] if you can't reject low-mem PHP and fall back to high-memory then it's not really a complete fallback array
[00:04:56] maybe we should just leave Reader::open alone then, and keep the implementation in this patch usable only when directly and explicitly constructed
[00:07:19] could do
[00:08:44] yeah, I think that would be best. so $wgLCStoreCDBReader (or whatever) would default to 'Cdb\Reader', meaning the default would be to choose between the DBA and low-mem implementations, but it could be set to Cdb\Reader\MemoryBuffered as well.
[00:29:04] updated the patch
[01:30:51] looks like cool work, ori!
[01:31:59] * robla has nothing else to add...he's just procrastinating on writing an email he needs to write :-)
[05:41:10] TimStarling: you can still get a substantial benefit from just reading the hash table in the first 2048 bytes into memory
[05:41:19] and then doing fseeks / freads for the actual data blobs
[05:41:43] yeah, makes sense
[05:42:50] that might be best then. and that way it can just replace PHP.php.
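(The header-caching idea at 05:41 can be sketched as follows. This is an illustrative Python re-implementation of the standard cdb lookup, not the actual PHP Cdb\Reader code: the first 2048 bytes of a cdb file hold 256 (position, slot-count) pairs, which can be read into memory once, so that only slot probes and record reads need fseek/fread. The `make_cdb` helper and all class/function names here are my own, for demonstration only.)

```python
import io
import struct

def cdb_hash(key: bytes) -> int:
    # djb's cdb hash: h = ((h << 5) + h) ^ c, starting from 5381
    h = 5381
    for c in key:
        h = ((h * 33) ^ c) & 0xFFFFFFFF
    return h

def make_cdb(records):
    """Build a minimal cdb image in memory (illustrative writer)."""
    out = io.BytesIO()
    out.write(b"\0" * 2048)            # placeholder for the pointer table
    tables = [[] for _ in range(256)]  # (hash, record pos) per table
    for key, value in records:
        h = cdb_hash(key)
        tables[h & 255].append((h, out.tell()))
        out.write(struct.pack("<II", len(key), len(value)) + key + value)
    header = b""
    for entries in tables:
        nslots = 2 * len(entries)
        header += struct.pack("<II", out.tell(), nslots)
        slots = [(0, 0)] * nslots
        for h, pos in entries:
            i = (h >> 8) % nslots
            while slots[i][1] != 0:    # linear probe for a free slot
                i = (i + 1) % nslots
            slots[i] = (h, pos)
        for h, pos in slots:
            out.write(struct.pack("<II", h, pos))
    out.seek(0)
    out.write(header)                  # fill in the real pointer table
    return out.getvalue()

class HeaderCachedReader:
    """Keeps the 2048-byte pointer table in memory; seeks only for
    hash slots and record data, as suggested above."""
    def __init__(self, fh):
        self.fh = fh
        fh.seek(0)
        self.header = fh.read(2048)    # 256 (pos, nslots) pairs

    def get(self, key: bytes):
        h = cdb_hash(key)
        pos, nslots = struct.unpack_from("<II", self.header, (h & 255) * 8)
        if nslots == 0:
            return None
        for probe in range(nslots):
            i = ((h >> 8) + probe) % nslots
            self.fh.seek(pos + i * 8)
            slot_hash, rec_pos = struct.unpack("<II", self.fh.read(8))
            if rec_pos == 0:
                return None            # empty slot: key absent
            if slot_hash == h:
                self.fh.seek(rec_pos)
                klen, dlen = struct.unpack("<II", self.fh.read(8))
                if self.fh.read(klen) == key:
                    return self.fh.read(dlen)
        return None
```

With the pointer table cached, a lookup costs at most a couple of seek/read pairs per probed slot instead of re-reading the header each time.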
The pay-off for reading the file in one gulp only comes when reading >150 keys, which is more than I expected
[05:44:54] though enwiki's main page calls Cdb\Reader\PHP::get 217 times, barack obama 716 times
[05:45:32] i'm not sure if those are all against l10n_cache-en.cdb, but they probably are
[05:55:09] there's another win to be had in re-ordering some of the reads to be sequential and in avoiding calling fseek if the file pointer is already where we want it to be
[06:44:06] this is 5 days old now and needs merging: https://gerrit.wikimedia.org/r/#/c/235406
[06:44:16] I will just self-merge it if nobody wants to review it
[06:45:00] this is the trouble with being the only MW developer on my team
[06:45:16] nobody to merge all these changesets for me
[06:58:02] thanks ori
[22:30:28] TimStarling: I updated the Cdb patch to read input in one-kilobyte chunks and attempt to fulfill subsequent reads from the buffer. It provides a ~50% speed improvement over the previous implementation, even when reading relatively few keys, and it does not add memory pressure.
[22:31:41] (over the current HEAD, that is -- it doesn't beat file_get_contents() when reading lots of keys, obviously)
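(The buffering strategy described at 22:30 -- reading in one-kilobyte chunks, serving subsequent reads from the buffer, and skipping the seek when the underlying file pointer is already in place -- can be sketched like this. The class and its instrumentation are illustrative inventions, not the actual patch, which is PHP.)

```python
import io

CHUNK = 1024  # the one-kilobyte read size mentioned above

class BufferedFile:
    """Wraps a file object: reads happen in 1 KB chunks, small reads are
    fulfilled from the buffer, and seeks are deferred so that the real
    fseek is skipped when the file pointer is already where we want it."""
    def __init__(self, fh):
        self.fh = fh
        self.buf = b""
        self.buf_start = 0   # file offset of buf[0]
        self.pos = 0         # logical position seen by callers
        self.raw_reads = 0   # instrumentation: actual underlying reads

    def seek(self, offset):
        # No real seek yet; the buffer may already cover this offset.
        self.pos = offset

    def read(self, n):
        start = self.pos - self.buf_start
        if 0 <= start and start + n <= len(self.buf):
            data = self.buf[start:start + n]   # fulfilled from buffer
        else:
            # Only seek the real file if its pointer isn't already there
            if self.fh.tell() != self.pos:
                self.fh.seek(self.pos)
            self.buf = self.fh.read(max(n, CHUNK))
            self.buf_start = self.pos
            self.raw_reads += 1
            data = self.buf[:n]
        self.pos += n
        return data
```

A run of small sequential reads (the common case when walking cdb slots and records) then costs one underlying read per kilobyte instead of one per call, which is consistent with the speed-up holding even for few keys while keeping memory bounded.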