[MODM-92] Changing all data in embedMany collections can create duplicate embedded objects Created: 22/Oct/10 Updated: 23/Nov/10 Resolved: 23/Nov/10
|Project:||Doctrine MongoDB ODM|
|Reporter:||Jeremy Mikola||Assignee:||Jonathan H. Wage|
MongoDB 1.6.3, with Mongo ODM master as of 10/22/2010
Repeatable in the following use case, when the default pushPull strategy is used for an embedMany collection:
I investigated this as best I could, and traced the problem to BasicDocumentPersister::prepareUpdateData() and the data it receives from UnitOfWork's changeset.
The UoW changeset simply reports the old/new embedded collections as arrays of raw data. This makes BasicDocumentPersister unable to distinguish whether a $set or $pullAll/$pushAll is an appropriate course of action, since it cannot discern whether the embedded objects are actually different (by SPL object hash) or merely changed. Following the logic in prepareUpdateData(), we end up doing two things when processing the update for the collection field:
This chain of commands will then reach BasicDocumentPersister::update() later on and given the logic check within:
> if ((isset($update[$this->cmd . 'pushAll']) || isset($update[$this->cmd . 'pullAll'])) && isset($update[$this->cmd . 'set'])) ...
The $set command will be executed first, followed by $pullAll/$pushAll later. This $set actually changes the very data that $pullAll expects to remove, so $pullAll will do nothing but $pushAll will still insert a copy of our new data. The end result is that we now have two copies of our new data in the collection.
|Comment by Jeremy Mikola [ 22/Oct/10 ]|
Failing test case here: http://github.com/jmikola/mongodb-odm/commit/336a700ac6fe0cdf013fc864d8e70f43cb5b73e2
Temporary fix is to switch the mapping strategy to "set" instead of the default "pushPull"
I spoke with Bulat and believe the long-term fix for this will be having UoW preserve the SPL object hash of embedded objects in changsets, so that we can compare something more than arrays when deciding what MongoDB commands to utilize. For cases such as this, the $set is superfluous in my opinion - it should only be used if the same object remained in the collection and had its value(s) changed.
|Comment by Jonathan H. Wage [ 23/Nov/10 ]|
Thanks! This is now fixed with the refactoring we did the persisters last week!