[DCOM-130] Paths in Doctrine\Common\Cache\FileCache could create large directory indexes Created: 23/Oct/12 Updated: 10/May/13 |
|
| Status: | Open |
| Project: | Doctrine Common |
| Component/s: | Caching |
| Affects Version/s: | 2.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | R Churchill | Assignee: | Benjamin Eberlei |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Any |
||
| Description |
|
The way paths are created within FileCache currently, there is a theoretical maximum of 16^12 directories in the cache directory, which is quite a large number. Usually schemes like this are used to restrict the number of files in one directory. Comparing with git, for example, the dirs are arranged 00/ and then the object store within those directories, which is a lot more manageable, say if you happen to type ls in the cache directory, you will get a maximum listing of 256 dirs. PhpThumb does something similar when caching images. How about something like this for getFilename(): $idHash = md5($id); return $path . $id . $this->extension; Not nearly so elegant, but I think this has better properties for the file system. Also I would be tempted to use one of the sha family hashes and not to include the $id within the filename, but perhaps this is helpful for debugging? |
| Comments |
| Comment by Julian Higman [ 10/May/13 ] |
|
We hit this problem in a live system - with a lot of cached items, the number of subdirectories that FileCache creates can exceed the number that an ext3 filesystem allows in a single directory (about 32000). After that, an attempt to cache a new item can get an error like this: mkdir() [function.mkdir]: Too many links Our solution was similar to that suggested:
protected function getFilename($id) {
$path = implode(str_split(md5($id), 2), DIRECTORY_SEPARATOR);
$path = $this->directory . DIRECTORY_SEPARATOR . $path;
return $path . DIRECTORY_SEPARATOR . $id . $this->extension;
}
It splits the md5 of the item id into parts of length 2, rather than the original 12. This creates a deeply nested structure, but which won't ever exceed the limit on number of subdirectories in any one directory. It's the same subdirectory pattern used by default by Apache mod_disk_cache, as well. |