[DDC-851] Automerge of detached entities passed to doctrine Created: 31/Oct/10  Updated: 30/Dec/10

Status: Open
Project: Doctrine 2 - ORM
Component/s: ORM
Affects Version/s: 2.0-BETA4
Fix Version/s: None
Security Level: All

Type: Improvement Priority: Major
Reporter: Daniel Alvarez Arribas Assignee: Roman S. Borschel
Resolution: Unresolved Votes: 0
Labels: None


 Description   

This is a feature request.

Currently it is not possible to assign a detached entity to a relationship. You have to manually "merge" it, and only then you are able to assign it to relationships of managed objects.

This can become complicated to do. The way it is now, when assigning an entity to a relationship in a process using a large number of entities, the entity's state needs to be checked and the entity possibly merged - all in userland code. This adds a level of complexity and potential for errors, while it could be solved transparently and elegantly within the ORM. There are ways to implement it in userland code, too, with moderate effort (see below), but this does not change the fact that responsibility for implementing a purely technical feature is delegated to the user, who could be spending his time much better writing business code. And if the user actually implements it, it will clutter the application with non-problem-domain code.

To keep things simple, I propose Doctrine be extended to simply auto-merge any detached entities passed to it. That would save the programmer the manual tracking of object states and merge() calls.

This would be especially handy when using cascades, as keeping track of deep object graphs in userland code would duplicate substantial ORM functionality.

In programs that work with massive amounts of data, it is practically impossible to keep all entities managed due to resource constraints (see e.g. the batch processing patterns documented in the Doctrine 2 reference at http://www.doctrine-project.org/projects/orm/2.0/docs/reference/batch-processing/en). In a situation like that, one would probably simply flush and clear the entity manager regularly. Doctrine 2 currently forces the user to manually "merge" all persistent objects he/she still holds references to and wants to assign e.g. to other newly created persistable objects. I can not think of any reason why Doctrine 2 should not be able to do it automatically.

Below is another comment originally attached to the GitHub proposal, containing a userland implementation of the feature as a temporary fix, for whoever cares.

Here is a userland implementation for the functionality I am proposing, though I feel it is technical clutter that belongs into the ORM. Changing doctrine to be able to auto-merge unmanaged entities would be ideal. I thought I'd share this, for use as long as Doctrine 2 does not provide equivalent functionality. The implementation assumes all entities inherit from a base class (named "YourEntityBaseClass here") and intercepts the assignment to ToOne-relationships in a __set() method provided in that base class. For ToMany-relationships we extend ArrayCollection to intercept calls to add() and set() to accomplish the same.

As an alternative to defining a __set() method in a base class you could also implement the interception by changing any mutator methods you define in your entities. But that would bloat your code quickly as you define more and more relationship attributes on your entities.

The following __set() method implementation relies on reflection to parse the DocBlock-Comment with the Annotation and determine whether or not the property to be set is a ToOne-relationship.

   public function __set($name, &$value) {
      
      $reflectionClass = new ReflectionClass($this);
      
      $property = $reflectionClass->getProperty($name);

      if (   self::isToOneRelationship($property)
          && $value !== null) {
            
         $value = self::mergeIfDetached($value);
      }
      
      $this->$name = $value;
   }

The following is an implementation of mergeIfDetached(), that assumes there is a __get defined on the entity, to be able to access the protected mapped properties.

public static function mergeIfDetached(YourEntityBaseClass $dataObject) {

   $doctrineEntityManager = DB::getDoctrineEntityManager();

   if ($doctrineEntityManager->getUnitOfWork()->getEntityState($dataObject) == \Doctrine\ORM\UnitOfWork::STATE_DETACHED) {
       
      $dataObject = $doctrineEntityManager->merge($dataObject);
   }

   return $dataObject;
}

For your purposes, consider DB to be just a class holding a reference to the Doctrine entity manager.

Here are the helper methods for the reflection:

   private static function isToOneRelationship(ReflectionProperty $property) {

      return self::matchDoctrineAnnotation($property, self::$doctrineToOneRelationshipAnnotation);
   }

   private static function matchDoctrineAnnotation(ReflectionProperty $property, $pattern) {
      
      return preg_match('/\@' . $pattern . '/', $property->getDocComment()) != 0;
   }

Here is the drop-in-replacement class for use with ToMany-Relationships. It uses the static reloadIfDetached method defined in the entity base class:

use Doctrine\Common\Collections\ArrayCollection;


class Collection extends ArrayCollection {
   
    public function set($key, $value) {
       
       $value = YourEntityBaseClass::mergeIfDetached($value);
       
       parent::set($key, $value);
    }


    public function add($value) {
       
       $value = YourEntityBaseClass::mergeIfDetached($value);
       
       return parent::add($value);
    }
}

This approach keeps the amount of unnecessary code to a minimum, so that merges are not scattered throughout the problem-domain code.



 Comments   
Comment by Daniel Alvarez Arribas [ 29/Dec/10 ]

I have to note that the code I listed above turned out to be broken. There is nothing that guarantees that a data object just merged will not become detached again after being merged on assignment, unless the object is immediately persisted afterwards.

The correct solution would be to merge all data objects found through relationships for a given data object, right from the persistence manager, immediately before calling persist() on the data object.

I am currently using this solution (save() saves a data object safely for use within long-running batch jobs):


   public static function save(DataObject $dataObject) {
      
      self::mergeRelatedDataObjectsIfDetached($dataObject);
      
      self::$doctrineEntityManager->persist($dataObject);
   }
   

   public static function merge(DataObject $dataObject) {
      
      return self::$doctrineEntityManager->merge($dataObject);
   }


  protected static function mergeRelatedDataObjectsIfDetached(DataObject $dataObject) {
      
      $reflectionClass = new ReflectionClass($dataObject);
      
      $properties = $reflectionClass->getProperties();
      
      foreach ($properties as $property) {

         $propertyName = $property->getName();
         
         $propertyValue = $dataObject->__get($propertyName);
         
         
         if (MetadataReader::isToOneRelationship($property)) {
            
            if (   $propertyValue !== null
                && ! $propertyValue instanceof Proxy
                && self::isDetached($propertyValue)) {
               
               $relatedDataObject = self::merge($propertyValue);
               
               $dataObject->__set($propertyName, $relatedDataObject);
            }
         }
         else {
            
            if (MetadataReader::isToManyRelationship($property)) {
               
               $relatedDataObjects = $propertyValue->toArray();
               
               foreach ($relatedDataObjects as $index => $relatedDataObject) {
                  
                  if ( ! $relatedDataObject instanceof Proxy
                      && self::isDetached($relatedDataObject)) {
                     
                     $relatedDataObject = self::merge($relatedDataObject);
                     
                     
                     // Replace the entry in the collection with the merged copy.
                     
                     $propertyValue->set($index, $relatedDataObject);
                  }
               }
            }
         }
      }
   }
   
   
   protected static function isDetached(DataObject $dataObject) {
      
      return self::$doctrineEntityManager->getUnitOfWork()->getEntityState($dataObject) == UnitOfWork::STATE_DETACHED;
   }

I still wish there would be an automerge feature, kind of Hibernate's "update".

Comment by Daniel Alvarez Arribas [ 29/Dec/10 ]

Wrapped the code sections into proper code blocks...

Generated at Wed Oct 01 14:16:12 UTC 2014 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.