[DCOM-128] RedisCache uses IGBINARY which is not always available Created: 20/Oct/12  Updated: 20/Oct/12

Status: Open
Project: Doctrine Common
Component/s: Caching
Affects Version/s: 2.3
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Sander Marechal Assignee: Benjamin Eberlei
Resolution: Unresolved Votes: 0
Labels: None


 Description   

The RedisCache uses Redis::SERIALIZER_IGBINARY. See https://github.com/doctrine/common/blob/master/lib/Doctrine/Common/Cache/RedisCache.php line 47.

The problem is that the PHP Redis extension can be compiled without IGBINARY support. In that case, this code causes a fatal error because the constant does not exist.

For example, the DotDeb package of php5-redis (often used on Debian systems) is compiled without IGBINARY support.

The code should probably check whether the constant exists and, if it does not, default to Redis::SERIALIZER_PHP.
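
A minimal sketch of such a check, assuming PHP's built-in defined() is used to test for the class constant. The helper name pickSerializerConstant is hypothetical and not part of Doctrine Common; it takes the class name as a parameter only so that it can be tried without the Redis extension installed.

```php
<?php
// Hypothetical helper (not part of Doctrine Common): returns the name of the
// serializer constant to use, preferring igbinary when the class defines it.
function pickSerializerConstant(string $class): string
{
    $preferred = $class . '::SERIALIZER_IGBINARY';

    // defined() also accepts "Class::CONSTANT" names and returns false
    // instead of raising a fatal error when the constant is missing.
    return defined($preferred) ? $preferred : $class . '::SERIALIZER_PHP';
}

// Assumed usage inside RedisCache::setRedis(), with the real extension loaded:
// $redis->setOption(Redis::OPT_SERIALIZER, constant(pickSerializerConstant('Redis')));
```

With a build that lacks igbinary, the helper would return the name 'Redis::SERIALIZER_PHP', so the constant() lookup never touches the missing constant.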






[DCOM-130] Paths in Doctrine\Common\Cache\FileCache could create large directory indexes Created: 23/Oct/12  Updated: 10/May/13

Status: Open
Project: Doctrine Common
Component/s: Caching
Affects Version/s: 2.3
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: R Churchill Assignee: Benjamin Eberlei
Resolution: Unresolved Votes: 0
Labels: None
Environment: Any



 Description   

With the way FileCache currently creates paths, the cache directory can in theory contain up to 16^12 subdirectories, which is a very large number. Hashed path schemes like this are usually meant to restrict the number of entries in any one directory.

Compare with git, for example, where the directories are arranged as:

00/
1c/
...
ff/

with the objects stored inside those directories. This is much more manageable: if you happen to type ls in the cache directory, you get a listing of at most 256 directories. PhpThumb does something similar when caching images.

How about something like this for getFilename():

$idHash = md5($id);
$path = substr($idHash, 0, 2) . DIRECTORY_SEPARATOR . substr($idHash, 2, 2) . DIRECTORY_SEPARATOR . substr($idHash, 4);
$path = $this->directory . DIRECTORY_SEPARATOR . $path;

return $path . $id . $this->extension;

It is not nearly as elegant, but I think it has better properties for the file system. I would also be tempted to use one of the SHA family hashes and to leave the $id out of the filename, but perhaps including it is helpful for debugging?
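
The SHA variant mentioned above could look roughly like this. It is only a sketch: the function name hashedCachePath and its parameters are hypothetical, it is written as a standalone function rather than as the FileCache method, and it uses sha1() as one possible member of the SHA family.

```php
<?php
// Hypothetical standalone variant of the proposal (not Doctrine code).
// Two 2-character levels, then the rest of the hash as the file name.
function hashedCachePath(string $directory, string $id, string $extension): string
{
    $hash = sha1($id); // 40 hexadecimal characters

    return $directory
        . DIRECTORY_SEPARATOR . substr($hash, 0, 2)
        . DIRECTORY_SEPARATOR . substr($hash, 2, 2)
        . DIRECTORY_SEPARATOR . substr($hash, 4)
        . $extension;
}

// Example call with made-up values:
echo hashedCachePath('/tmp/cache', 'user.42', '.data'), PHP_EOL;
```

Each of the two directory levels holds at most 256 entries. The trade-off is the one raised above: without the $id in the file name, a cache file can no longer be matched to its id by looking at the path.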



 Comments   
Comment by Julian Higman [ 10/May/13 ]

We hit this problem in a live system: with a lot of cached items, the number of subdirectories that FileCache creates can exceed the number that an ext3 filesystem allows in a single directory (about 32000).

After that, an attempt to cache a new item can get an error like this:

mkdir() [function.mkdir]: Too many links

Our solution was similar to that suggested:


    protected function getFilename($id) {
        // Glue goes first: the legacy (array, glue) argument order was removed in PHP 8.
        $path = implode(DIRECTORY_SEPARATOR, str_split(md5($id), 2));
        $path = $this->directory . DIRECTORY_SEPARATOR . $path;
        return $path . DIRECTORY_SEPARATOR . $id . $this->extension;
    }

It splits the md5 of the item id into parts of length 2, rather than the original 12. This creates a deeply nested structure (16 levels), but no directory can ever hold more than 256 subdirectories, so the limit on subdirectories per directory is never exceeded. It is also the subdirectory pattern that Apache mod_disk_cache uses by default.
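
To make the difference between the two chunk lengths concrete, here is a small sketch that only inspects the path segments. The helper name hashSegments and the sample id are made up for illustration and are not part of FileCache.

```php
<?php
// Illustrative helper (not Doctrine code): split the md5 of an id into
// path segments of the given length.
function hashSegments(string $id, int $chunkLength): array
{
    return str_split(md5($id), $chunkLength);
}

$id = 'some.cache.key'; // hypothetical cache id

$wide = hashSegments($id, 12); // 32 hex chars -> segments of 12, 12 and 8
$deep = hashSegments($id, 2);  // 32 hex chars -> 16 segments of 2

echo count($wide), ' levels, up to 16^12 entries in the top-level directory', PHP_EOL;
echo count($deep), ' levels, up to ', 16 ** 2, ' entries in each directory', PHP_EOL;
echo implode(DIRECTORY_SEPARATOR, $deep), PHP_EOL;
```

The 2-character split trades depth for width: more mkdir calls per cached item, but a hard ceiling of 256 subdirectories in any one directory.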





Generated at Wed May 22 18:50:21 UTC 2013 using JIRA 5.2.7#850-sha1:b2af0c8dc8537b36121c6a579fabbdf79fc919e5.