<!-- 
RSS generated by JIRA (5.2.7#850-sha1:b2af0c8dc8537b36121c6a579fabbdf79fc919e5) at Sun May 26 03:23:29 UTC 2013

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary add field=key&field=summary to the URL of your request.
For example:
http://www.doctrine-project.org/jira/si/jira.issueviews:issue-xml/DDC-763/DDC-763.xml?field=key&field=summary
-->
<rss version="0.92" >
<channel>
    <title>Doctrine Project</title>
    <link>http://www.doctrine-project.org/jira</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>5.2.7</version>
        <build-number>850</build-number>
        <build-date>21-02-2013</build-date>
    </build-info>

<item>
            <title>[DDC-763] Cascade merge on associated entities can insert too many rows through &quot;Persistence by Reachability&quot;</title>
                <link>http://www.doctrine-project.org/jira/browse/DDC-763</link>
                <project id="10032" key="DDC">Doctrine 2 - ORM</project>
                        <description>&lt;p&gt;I think that the UnitOfWork needs to maintain a map of spl_object_hash($newEntity)-&amp;gt;$managedEntity for entities that were persisted via reachability during a merge.  doMerge should then only call persistNew if the original entity has not already been persisted (if it has already been persisted it should merge the managed entity from the map).  The map should be maintained until a flush() or until the UnitOfWork is cleared.  The reasoning is as follows.&lt;/p&gt;

&lt;p&gt;Imagine we have a simple doctor object with no associations:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;$doctor = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Doctor();
$em-&amp;gt;persist($doctor);
$em-&amp;gt;persist($doctor);
$em-&amp;gt;flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After the first persist() $doctor is MANAGED so the second persist has no effect and this results in a single Doctor row.&lt;/p&gt;

&lt;p&gt;If we do the same thing using merge and persistence by reachability:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;$doctor = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Doctor();
$em-&amp;gt;merge($doctor);
$em-&amp;gt;merge($doctor);
$em-&amp;gt;flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;we get 2 Doctor rows being added.&lt;/p&gt;

&lt;p&gt;Obviously in this particular case we should use the return value from the first merge() as the parameter of the second merge which would give correct behaviour.&lt;/p&gt;

&lt;p&gt;However, now imagine one Doctor has many Patients and many Patients have one Doctor, all the associations have cascade merge enabled, and further assume that $d1 (Doctor id=1) is already in the database.  We now attempt to create two patients and assign them to the existing doctor:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;$d1 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Doctor(); $d1-&amp;gt;id = 1; &lt;span class=&quot;code-comment&quot;&gt;// This is a DETACHED entity
&lt;/span&gt;
$p1 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Patient();
$p2 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Patient();

$d1-&amp;gt;patients-&amp;gt;add($p1); $p1-&amp;gt;doctor = $d1;
$d1-&amp;gt;patients-&amp;gt;add($p2); $p2-&amp;gt;doctor = $d1;

$em-&amp;gt;merge($p1);
$em-&amp;gt;merge($p2);

$em-&amp;gt;flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This actually results in 4 rows being added to the &apos;patients&apos; table instead of 2. I think this is because $p1 and $p2 are persisted both as the root objects and then again from the patient-&amp;gt;doctor-&amp;gt;patients array.  Since the cascade merging happens internally, we can&apos;t replace the array contents with the managed return values without walking through the object graph (in which case there is no point in using cascade merge in the first place).  Maintaining a map in UnitOfWork will allow doMerge to ensure it doesn&apos;t persist the same entities twice.&lt;/p&gt;

&lt;p&gt;I&apos;m not sure, but this might be relevant for cascade persist too.&lt;/p&gt;

&lt;p&gt;P.S. Another bug report on this can be found at &lt;a href=&quot;http://code.google.com/p/flextrine2/issues/detail?id=32&quot; class=&quot;external-link&quot;&gt;http://code.google.com/p/flextrine2/issues/detail?id=32&lt;/a&gt; (it basically says the same thing with different entities).&lt;/p&gt;</description>
                <environment></environment>
            <key id="11812">DDC-763</key>
            <summary>Cascade merge on associated entities can insert too many rows through &quot;Persistence by Reachability&quot;</summary>
                <type id="4" iconUrl="http://www.doctrine-project.org/jira/images/icons/issuetypes/improvement.png">Improvement</type>
                                <priority id="3" iconUrl="http://www.doctrine-project.org/jira/images/icons/priorities/major.png">Major</priority>
                    <status id="1" iconUrl="http://www.doctrine-project.org/jira/images/icons/statuses/open.png">Open</status>
                    <resolution id="-1">Unresolved</resolution>
                    <security id="10000">All</security>
                        <assignee username="beberlei">Benjamin Eberlei</assignee>
                                <reporter username="ccapndave">Dave Keen</reporter>
                        <labels>
                    </labels>
                <created>Mon, 23 Aug 2010 05:55:30 +0000</created>
                <updated>Mon, 4 Jul 2011 21:47:46 +0000</updated>
                                                    <fixVersion>2.x</fixVersion>
                                <component>ORM</component>
                        <due></due>
                    <votes>2</votes>
                        <watches>2</watches>
                        <comments>
                    <comment id="14135" author="beberlei" created="Sun, 29 Aug 2010 04:59:05 +0000"  >&lt;p&gt;@Roman A possible fix for this in my opinion is another map in UnitOfWork $mergedEntities = array(); and a patch like this:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;diff --git a/lib/Doctrine/ORM/UnitOfWork.php b/lib/Doctrine/ORM/UnitOfWork.php
index 242d84b..1d0d8b3 100644
--- a/lib/Doctrine/ORM/UnitOfWork.php
+++ b/lib/Doctrine/ORM/UnitOfWork.php
@@ -1340,6 +1340,10 @@ class UnitOfWork &lt;span class=&quot;code-keyword&quot;&gt;implements&lt;/span&gt; PropertyChangedListener
             &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt;; &lt;span class=&quot;code-comment&quot;&gt;// Prevent infinite recursion
&lt;/span&gt;         }
 
+        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (isset($&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;-&amp;gt;mergedEntities[$oid])) {
+            &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; $&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;-&amp;gt;mergedEntities[$oid];
+        }
+
         $visited[$oid] = $entity; &lt;span class=&quot;code-comment&quot;&gt;// mark visited
&lt;/span&gt; 
         $class = $&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;-&amp;gt;em-&amp;gt;getClassMetadata(get_class($entity));
@@ -1468,6 +1472,8 @@ class UnitOfWork &lt;span class=&quot;code-keyword&quot;&gt;implements&lt;/span&gt; PropertyChangedListener
 
         $&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;-&amp;gt;cascadeMerge($entity, $managedCopy, $visited);
 
+        $&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;-&amp;gt;mergedEntities[$oid] = $managedCopy;
+
         &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; $managedCopy;
     }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    <comment id="14139" author="ccapndave" created="Sun, 29 Aug 2010 05:38:51 +0000"  >&lt;p&gt;I have tested this patch with my application and it fixes the problem in all my relevant test cases apart from one.  The test case that&apos;s failing is one that persists a bi-directional many to many relationship, so the associations interweave with each other (if you know what I mean).&lt;/p&gt;

&lt;p&gt;I wonder if perhaps doMerge needs to continue cascading even if it finds an item in $this-&amp;gt;mergedEntities.&lt;/p&gt;

&lt;p&gt;This is the Flextrine code that fails - it results in no entries in movie_artist.  This might also be related to &lt;a href=&quot;http://www.doctrine-project.org/jira/browse/DDC-758&quot; title=&quot;When merging many to many entites back into the repository changes to the associations are not respected&quot;&gt;&lt;del&gt;DDC-758&lt;/del&gt;&lt;/a&gt;?&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;m1 = new Movie();
m1.title = &quot;Movie 1&quot;;

m2 = new Movie();
m2.title = &quot;Movie 2&quot;;

a1 = new Artist();
a1.name = &quot;Artist 1&quot;;

a2 = new Artist();
a2.name = &quot;Artist 2&quot;;

m1.artists.addItem(a1); a1.movies.addItem(m1);
m1.artists.addItem(a2); a2.movies.addItem(m1);

m2.artists.addItem(a1); a1.movies.addItem(m2);
m2.artists.addItem(a2); a2.movies.addItem(m2);

// These translate to cascade merges on the server
em.persist(m1);
em.persist(m2);
em.persist(a1);
em.persist(a2);

// Now flush
em.flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    <comment id="14140" author="ccapndave" created="Sun, 29 Aug 2010 05:40:26 +0000"  >&lt;p&gt;P.S. This test passes if I translate em.persist() to $em-&amp;gt;persist() (not cascading) on the server instead of translating it to a cascade merge; not sure if that helps.&lt;/p&gt;</comment>
                    <comment id="14149" author="romanb" created="Mon, 30 Aug 2010 06:17:09 +0000"  >&lt;p&gt;I&apos;d really like to avoid introducing an additional instance variable just to solve this issue, but I have not yet found the time to really look into it.&lt;/p&gt;

&lt;p&gt;Does someone have a unit test for this already and can attach it to the issue? &lt;/p&gt;</comment>
                    <comment id="14198" author="romanb" created="Tue, 31 Aug 2010 14:56:58 +0000"  >&lt;p&gt;Rescheduling for RC1.&lt;/p&gt;</comment>
                    <comment id="14356" author="ccapndave" created="Mon, 13 Sep 2010 07:27:17 +0000"  >&lt;p&gt;Here is a functional test case containing three tests:&lt;/p&gt;

&lt;p&gt;testMultiMerge tests basic merging of two new entities, checking that only a single entity ends up in the database.  This passes with Benjamin&apos;s patch.&lt;/p&gt;

&lt;p&gt;testMultiCascadeMerge tests the more complex case of merging a OneToMany association. This also passes with Benjamin&apos;s patch.&lt;/p&gt;

&lt;p&gt;testManyToManyPersistByReachability tests the ManyToMany case described above and this fails with Benjamin&apos;s patch, probably because doMerge doesn&apos;t cascade down entities that it has already merged and some ManyToMany associations are being ignored.  It&apos;s a bit hard to be certain what is causing this, as even without Benjamin&apos;s patch this test would fail due to &lt;a href=&quot;http://www.doctrine-project.org/jira/browse/DDC-758&quot; title=&quot;When merging many to many entites back into the repository changes to the associations are not respected&quot;&gt;&lt;del&gt;DDC-758&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                    <comment id="14397" author="beberlei" created="Wed, 15 Sep 2010 16:38:30 +0000"  >&lt;p&gt;@Roman I thought about this issue; it&apos;s not possible without that additional map of merged entities. There is no way we can get that information from other sources.&lt;/p&gt;

&lt;p&gt;The problem is rather that the use case probably only applies to mass-merging scenarios and client-server serialization.&lt;/p&gt;</comment>
                    <comment id="14442" author="ccapndave" created="Tue, 21 Sep 2010 19:48:39 +0000"  >&lt;p&gt;Added another failing test case - adding the same entity from different ends of a many to many bi-directional association to check that there isn&apos;t an integrity constraint violation caused by Doctrine trying to add the same row twice.&lt;/p&gt;</comment>
                    <comment id="14443" author="ccapndave" created="Tue, 21 Sep 2010 20:14:42 +0000"  >&lt;p&gt;Attached a patch for this issue.&lt;/p&gt;</comment>
                    <comment id="14444" author="beberlei" created="Wed, 22 Sep 2010 03:13:36 +0000"  >&lt;p&gt;Can you comment on why all the additional stuff is necessary compared to my patch?&lt;/p&gt;</comment>
                    <comment id="14445" author="ccapndave" created="Wed, 22 Sep 2010 06:05:40 +0000"  >&lt;p&gt;It fixes the two additional test cases - testManyToManyPersistByReachability and testManyToManyDuplicatePersistByReachability.&lt;/p&gt;

&lt;p&gt;testManyToManyPersistByReachability was failing with your original patch because there are ManyToMany cases where an entity may have already been merged, but it&apos;s still necessary to add it to an association and continue to cascade.  Running the following with the original patch will miss out some of the associations.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;$m1 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Movie();
$m1-&amp;gt;title = &lt;span class=&quot;code-quote&quot;&gt;&quot;Movie 1&quot;&lt;/span&gt;;

$m2 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Movie();
$m2-&amp;gt;title = &lt;span class=&quot;code-quote&quot;&gt;&quot;Movie 2&quot;&lt;/span&gt;;

$a1 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Artist();
$a1-&amp;gt;name = &lt;span class=&quot;code-quote&quot;&gt;&quot;Artist 1&quot;&lt;/span&gt;;

$a2 = &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; Artist();
$a2-&amp;gt;name = &lt;span class=&quot;code-quote&quot;&gt;&quot;Artist 2&quot;&lt;/span&gt;;

$m1-&amp;gt;artists-&amp;gt;add($a1); $a1-&amp;gt;movies-&amp;gt;add($m1);
$m1-&amp;gt;artists-&amp;gt;add($a2); $a2-&amp;gt;movies-&amp;gt;add($m1);
$m2-&amp;gt;artists-&amp;gt;add($a1); $a1-&amp;gt;movies-&amp;gt;add($m2);
$m2-&amp;gt;artists-&amp;gt;add($a2); $a2-&amp;gt;movies-&amp;gt;add($m2);

$em-&amp;gt;merge($a1);
$em-&amp;gt;merge($a2);
$em-&amp;gt;flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The other change in my patch is to protect against this case.  It ensures that the following code doesn&apos;t add the same entity twice to a collection.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;$em-&amp;gt;merge($m1);
$em-&amp;gt;merge($m2);
$em-&amp;gt;merge($a2);
$em-&amp;gt;merge($a2);
$em-&amp;gt;flush();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    <comment id="14627" author="beberlei" created="Sun, 31 Oct 2010 02:19:03 +0000"  >&lt;p&gt;I am not sure if the issue here is rather multiple calls to merge that contain different parts of the same object-graph.&lt;/p&gt;

&lt;p&gt;There should be a very simple fix for this, call -&amp;gt;clear() after each merge.&lt;/p&gt;

&lt;p&gt;I am not sure if this patch drags us into a black hole of issues with merging.&lt;/p&gt;</comment>
                    <comment id="14649" author="ccapndave" created="Sun, 31 Oct 2010 08:48:37 +0000"  >&lt;p&gt;Calling -&amp;gt;clear() and -&amp;gt;flush() after each merge is a workaround for the simple case, but unless I am misunderstanding I don&apos;t think it&apos;s a solution for cases where the merging is happening automatically in cascadeMerge.  I&apos;ve actually encountered this issue in another project and scenario to do with creating REST APIs and merging JSON objects into entities, and applying the patch fixed it, so a) I think this issue might be more common than we first thought and b) the patch basically seems to work (plus it doesn&apos;t introduce any failing cases in the existing test suite).  I can actually still find one edge case to do with cascading merging interlinked many to many associations that this doesn&apos;t fix, but I was planning to open that as a new ticket after this.  My feeling is that the current merge already has issues and this definitely improves it.&lt;/p&gt;</comment>
                    <comment id="14652" author="beberlei" created="Mon, 1 Nov 2010 02:45:36 +0000"  >&lt;p&gt;It cannot happen inside a single merge: a single merge uses $visited to avoid infinite recursion, so each entity can only be merged once inside a single merge operation.&lt;/p&gt;</comment>
                    <comment id="14713" author="beberlei" created="Wed, 10 Nov 2010 17:50:54 +0000"  >&lt;p&gt;Added a note into the documentation about using EntityManager#clear between merging of entities which share subgraphs and cascade merge.&lt;/p&gt;

&lt;p&gt;Handling this issue in UnitOfWork will be declared an improvement, no longer a bug, and will be scheduled for later releases. The required changes to the core are too dangerous and big.&lt;/p&gt;</comment>
                    <comment id="14714" author="ccapndave" created="Thu, 11 Nov 2010 03:49:15 +0000"  >&lt;p&gt;Where in the docs is that?&lt;/p&gt;

&lt;p&gt;Just to summarize, the equivalent operation to having multiple merges and a single flush is to call merge followed by flush each time, with the whole thing surrounded by a transaction?  Does this have a big impact on performance?&lt;/p&gt;</comment>
                    <comment id="14715" author="ccapndave" created="Thu, 11 Nov 2010 04:49:18 +0000"  >&lt;p&gt;Ben - even given the decision not to implement this (and I do understand your thinking, as it is a major change), is there any reason not to implement the bit that ensures that the same entity isn&apos;t added to a collection twice during a merge?  I can&apos;t think of a situation where this should be allowed, and I have a use case where I get &apos;DUPLICATE KEY&apos; errors if this isn&apos;t there.&lt;/p&gt;

&lt;p&gt;Please see attached patch.&lt;/p&gt;</comment>
                    <comment id="14716" author="beberlei" created="Thu, 11 Nov 2010 06:35:37 +0000"  >&lt;p&gt;What bit of that huge patch is that? Can you extract it into another ticket if that&apos;s possible?&lt;/p&gt;</comment>
                    <comment id="14717" author="beberlei" created="Thu, 11 Nov 2010 06:36:52 +0000"  >&lt;p&gt;I added it to &quot;Working with Objects&quot; and the description of Merge. It&apos;s not yet live on the site.&lt;/p&gt;

&lt;p&gt;Using this current workaround has a performance impact, since more SELECT statements have to be issued against the database. &lt;/p&gt;</comment>
                    <comment id="14718" author="ccapndave" created="Thu, 11 Nov 2010 08:30:42 +0000"  >&lt;p&gt;Apologies for not being clear - only the 3rd patch (multipleaddmerge.diff) is relevant to the &apos;DUPLICATE KEY&apos; error I am now talking about, but I&apos;ll put it in another ticket if you prefer.&lt;/p&gt;</comment>
                    <comment id="14719" author="beberlei" created="Thu, 11 Nov 2010 08:35:47 +0000"  >&lt;p&gt;Please add a new ticket; the patch looks good.&lt;/p&gt;</comment>
                    <comment id="14720" author="ccapndave" created="Thu, 11 Nov 2010 08:51:11 +0000"  >&lt;p&gt;Created as &lt;a href=&quot;http://www.doctrine-project.org/jira/browse/DDC-875&quot; title=&quot;Merge can sometimes add the same entity twice into a collection&quot;&gt;&lt;del&gt;DDC-875&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                </comments>
                    <attachments>
                    <attachment id="10812" name="0149-DDC-763.patch" size="16017" author="ccapndave" created="Tue, 21 Sep 2010 20:14:42 +0000" />
                    <attachment id="10811" name="DDC763Test.php" size="6795" author="ccapndave" created="Tue, 21 Sep 2010 19:48:39 +0000" />
                    <attachment id="10858" name="multipleaddmerge.diff" size="1161" author="ccapndave" created="Thu, 11 Nov 2010 04:49:18 +0000" />
                </attachments>
            <subtasks>
        </subtasks>
        </item>
</channel>
</rss>