Details
Description
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php,
starting on line 560:
protected function _getResultCacheId()
{
if ($this->_resultCacheId)
else
{ $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); }}
Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method
starting on line 508 can return false positives and thus break the user's assumptions about the ORM's behavior:
// Check result cache
if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) {
$id = $this->_getResultCacheId();
$cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id);
if ($cached === false)
{ // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; }else
{ // Cache hit. return $cached; }}
It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail
and hell will break loose.
I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally.
The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too.
Unless you expect the user to anticipate false positives, i. e.
Activity
| Field | Original Value | New Value |
|---|---|---|
| Description |
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php, starting on line 560: protected function _getResultCacheId() { if ($this->_resultCacheId) { return $this->_resultCacheId; } else { $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); } } Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method starting on line 508 can return false positives and thus break your app's assumptions on the ORM behavior: // Check result cache if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) { $id = $this->_getResultCacheId(); $cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id); if ($cached === false) { // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; } else { // Cache hit. return $cached; } } It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail and hell will break loose. I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally. The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too. Unless you expect the user to anticipate false positives, i. e. |
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php, starting on line 560: protected function _getResultCacheId() { if ($this->_resultCacheId) { return $this->_resultCacheId; } else { $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); } } Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method starting on line 508 can return false positives and thus break the user's assumptions about the ORM's behavior: // Check result cache if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) { $id = $this->_getResultCacheId(); $cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id); if ($cached === false) { // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; } else { // Cache hit. return $cached; } } It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail and hell will break loose. I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally. The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too. Unless you expect the user to anticipate false positives, i. e. |
| Summary | Cache System can potentially return wrong results due to use of MD5 hash codes as keys. A classic. | Caches can potentially return false positives due to use of MD5 hash codes as keys. A classic. |
| Status | Open [ 1 ] | Resolved [ 5 ] |
| Assignee | Roman S. Borschel [ romanb ] | Benjamin Eberlei [ beberlei ] |
| Fix Version/s | 2.0.1 [ 10114 ] | |
| Fix Version/s | 2.1 [ 10022 ] | |
| Resolution | Fixed [ 1 ] |
| Workflow | jira [ 12160 ] | jira-feedback [ 14656 ] |
| Workflow | jira-feedback [ 14656 ] | jira-feedback2 [ 16520 ] |
| Workflow | jira-feedback2 [ 16520 ] | jira-feedback3 [ 18773 ] |
- Request to http://www.doctrine-project.org/fisheye/ failed: Error in remote call to 'FishEye 0 (http://www.doctrine-project.org/fisheye/)' (http://www.doctrine-project.org/fisheye) [AbstractRestCommand{path='/rest-service-fe/search-v1/crossRepositoryQuery', params={query=DDC-892, expand=changesets[-21:-1].revisions[0:29],reviews}, methodType=GET}] : Received status code 503 (Service Temporarily Unavailable)
Has anyone taken note of this one?
As it seems, this critical bug has been sitting here for a month, untouched, and apparently even made it into the final release.
I would offer to fix it myself, but as the fix would consist in removing just one or two MD5 hash method calls, and it would probably have to be validated anyway by a core developer, there is actually not much technical effort I could save you.
Just a reminder, thanks for your time.