Doctrine 2 - ORM
  1. Doctrine 2 - ORM
  2. DDC-763

Cascade merge on associated entities can insert too many rows through "Persistence by Reachability"

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.x
    • Component/s: ORM
    • Security Level: All
    • Labels:
      None

      Description

      I think that the UnitOfWork needs to maintain a map of spl_object_hash($newEntity)->$managedEntity for entities that were persisted via reachability during a merge. doMerge should then only call persistNew if the original entity has not already been persisted (if it has already been persisted it should merge the managed entity from the map). The map should be maintained until a flush() or until the UnitOfWork is cleared. The reasoning is as follows.

      Imagine we have a simple doctor object with no associations:

      $doctor = new Doctor();
      $em->persist($doctor);
      $em->persist($doctor);
      $em->flush();
      

      After the first persist() $doctor is MANAGED so the second persist has no effect and this results in a single Doctor row.

      If we do the same thing using merge and persistence by reachability:

      $doctor = new Doctor();
      $em->merge($doctor);
      $em->merge($doctor);
      $em->flush();
      

      we get 2 Doctor rows being added.

      Obviously in this particular case we should use the return value from the first merge() as the parameter of the second merge which would give correct behaviour.

      However, now imagine one Doctor has many Patients and many Patients have one Doctor, all the associations have cascade merge enabled, and further assume that $d1 (Doctor id=1) is already in the database. We now attempt to create two patients and assign them to the existing doctor:

      $d1= new Doctor(); $d1->id = 1; // This is a DETACHED entity
      
      $p1 = new Patient();
      $p2 = new Patient();
      
      $d1->patients->add($p1); $p1->doctor = $d1;
      $d1->patients->add($p2); $p2->doctor = $d1;
      
      $em->merge($p1);
      $em->merge($p2);
      
      $em->flush();
      

      This actually results in 4 rows being added to the 'patients' table instead of 2, I think because $p1 and $p2 are getting persisted both as the root objects and then again from the patient->doctor->patients array. Since the cascade merging happens internally we can't replace the array contents with the managed return values without walking through the object graph (in which case there is no point in using cascade merge in the first place). Maintaining a map in UnitOfWork will allow doMerge to ensure it doesn't persist the same entities twice.

      I'm not sure, but this might be relevant for cascade persist too.

      P.S. Another bug report on this can be found at http://code.google.com/p/flextrine2/issues/detail?id=32 (it basically says the same thing with different entities).

      1. multipleaddmerge.diff
        1 kB
        Dave Keen
      2. DDC763Test.php
        7 kB
        Dave Keen
      3. 0149-DDC-763.patch
        16 kB
        Dave Keen

        Activity

        Hide
        Benjamin Eberlei added a comment -

        What bit of that huge patch is that? Can you extract it into another ticket if thats possible?

        Show
        Benjamin Eberlei added a comment - What bit of that huge patch is that? Can you extract it into another ticket if thats possible?
        Hide
        Benjamin Eberlei added a comment -

        I added it to "Working with Objects" and the descripton of Merge. Its not yet live on the site.

        Using this current workaround has a performance impact, since more SELECT statements have to be issued against the database.

        Show
        Benjamin Eberlei added a comment - I added it to "Working with Objects" and the descripton of Merge. Its not yet live on the site. Using this current workaround has a performance impact, since more SELECT statements have to be issued against the database.
        Hide
        Dave Keen added a comment -

        Apologies for not being clear - only the 3rd patch (multipleaddmerge.diff) is relevant to the 'DUPLICATE KEY' error I am now talking about, but I'll put it in a nother ticket if you prefer.

        Show
        Dave Keen added a comment - Apologies for not being clear - only the 3rd patch (multipleaddmerge.diff) is relevant to the 'DUPLICATE KEY' error I am now talking about, but I'll put it in a nother ticket if you prefer.
        Hide
        Benjamin Eberlei added a comment -

        please add a new ticket, patch looks good.

        Show
        Benjamin Eberlei added a comment - please add a new ticket, patch looks good.
        Hide
        Dave Keen added a comment -

        Created as DDC-875

        Show
        Dave Keen added a comment - Created as DDC-875

          People

          • Assignee:
            Benjamin Eberlei
            Reporter:
            Dave Keen
          • Votes:
            2 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated: