Details
Description
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php,
starting on line 560:
protected function _getResultCacheId()
{
if ($this->_resultCacheId)
else
{ $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); }}
Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method
starting on line 508 can return false positives and thus break the user's assumptions about the ORM's behavior:
// Check result cache
if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) {
$id = $this->_getResultCacheId();
$cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id);
if ($cached === false)
{ // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; }else
{ // Cache hit. return $cached; }}
It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail
and hell will break loose.
I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally.
The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too.
Unless you expect the user to anticipate false positives, i. e.
Activity
| Field | Original Value | New Value |
|---|---|---|
| Description |
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php, starting on line 560: protected function _getResultCacheId() { if ($this->_resultCacheId) { return $this->_resultCacheId; } else { $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); } } Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method starting on line 508 can return false positives and thus break your app's assumptions on the ORM behavior: // Check result cache if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) { $id = $this->_getResultCacheId(); $cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id); if ($cached === false) { // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; } else { // Cache hit. return $cached; } } It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail and hell will break loose. I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally. The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too. Unless you expect the user to anticipate false positives, i. e. |
I was just checking the execution path for efficiency and bumped into this block of code in /Doctrine/ORM/AbstracQuery.php, starting on line 560: protected function _getResultCacheId() { if ($this->_resultCacheId) { return $this->_resultCacheId; } else { $sql = $this->getSql(); ksort($this->_hints); return md5(implode(";", (array)$sql) . var_export($this->_params, true) . var_export($this->_hints, true)."&hydrationMode=".$this->_hydrationMode); } } Of course, MD5 hashes can never be guaranteed to be unique, so this block of the doExecute-Method starting on line 508 can return false positives and thus break the user's assumptions about the ORM's behavior: // Check result cache if ($this->_useResultCache && $cacheDriver = $this->getResultCacheDriver()) { $id = $this->_getResultCacheId(); $cached = $this->_expireResultCache ? false : $cacheDriver->fetch($id); if ($cached === false) { // Cache miss. $stmt = $this->_doExecute(); $result = $this->_em->getHydrator($this->_hydrationMode)->hydrateAll( $stmt, $this->_resultSetMapping, $this->_hints ); $cacheDriver->save($id, $result, $this->_resultCacheTTL); return $result; } else { // Cache hit. return $cached; } } It is not likely to happen, but possible. Rely on it a couple of million times in a production system, and one day, eventually, it will fail and hell will break loose. I think it is prohibitive to try to save memory by giving up correctness. If your program does not work, it's simply broken and there is no need to tune it unless you fix it functionally. The only safe solution would be to use a non-lossy hash (i. e. compression) method, if any. Plain query text would do fine too. Unless you expect the user to anticipate false positives, i. e. |
| Summary | Cache System can potentially return wrong results due to use of MD5 hash codes as keys. A classic. | Caches can potentially return false positives due to use of MD5 hash codes as keys. A classic. |
| Status | Open [ 1 ] | Resolved [ 5 ] |
| Assignee | Roman S. Borschel [ romanb ] | Benjamin Eberlei [ beberlei ] |
| Fix Version/s | 2.0.1 [ 10114 ] | |
| Fix Version/s | 2.1 [ 10022 ] | |
| Resolution | Fixed [ 1 ] |
| Workflow | jira [ 12160 ] | jira-feedback [ 14656 ] |
| Workflow | jira-feedback [ 14656 ] | jira-feedback2 [ 16520 ] |
| Workflow | jira-feedback2 [ 16520 ] | jira-feedback3 [ 18773 ] |