[DDC-178] Query Hint for LOCK mechanisms plus support in $em->find() Created: 26/Nov/09  Updated: 04/Jul/10  Resolved: 04/Jul/10

Status: Resolved
Project: Doctrine 2 - ORM
Component/s: None
Affects Version/s: None
Fix Version/s: 2.0-BETA3
Security Level: All

Type: New Feature Priority: Major
Reporter: Benjamin Eberlei Assignee: Benjamin Eberlei
Resolution: Fixed Votes: 2
Labels: None

Attachments: File ddc178_locking.diff     File ddc178_pessimistic.diff    
Issue Links:
Reference
relates to DDC-410 review all sql to ensure that it work... Resolved
Sub-Tasks:
Key
Summary
Type
Status
Assignee
DDC-590 Create Documentation for new Lock sup... Sub-task Resolved Benjamin Eberlei  
DDC-591 Support Pessimistic Locks for Entitie... Sub-task Resolved Benjamin Eberlei  

 Description   

In some scenarios it is necessary to explicitly lock rows for update in a select query. My idea would be to support it twofold:

1. Add a LockMode Class:

final abstract class Lock
{
    const NONE = 0;
    const OPTIMISTIC = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}

2. Add methods to platforms that can add necessary READ/WRITE Lock query additions like FOR UPDATE/SHARED which afaik are supported by all rmdbs in some way.
3. Add a query hint "lockMode" which takes a Lock constant.
4. Add a query hint "lockVersion" which takes an integer or timestamp versioning value.
5. Change DQL Parser to apply lock mode if set, and if lock_mode = optimistic, add a where clause for the only top-level class (more not supported? Reread Evans DDD / Aggregates)
6. Change the $em->find() method to the following signature:

public function find($class, $identifier, $lockMode=0, $lockVersion=null);

And if the values are set, set the appropriate query hints.


Updated API specification

LockModes

final abstract class LockMode
{
    const NONE = 0;
    const OPTIMISTIC = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}

Constraints

  • LockMode::OPTIMISTIC requires entities to be versioned
  • LockMode::PESSIMISTIC_READ/WRITE works similarly for versioned as well as non-versioned entities. However, lock() after read only works for versioned entities.

API spec

$query->setLockMode(LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if any of the entities fetched (or fetch-joined) by the query are not versioned.
  • Otherwise proceed normally, SQL is not modified.

$query->setLockMode(LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no running transaction.
  • Modify the SQL with an appropriate locking clause (i.e. FOR UPDATE) that can be platform-specific, to acquire a pessimistic lock on all read entities.
  • Throw PessimisticLockException if lock(s) could not be obtained.

$em->find($entity, LockMode::OPTIMISTIC) (or findBy et al)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise proceeed normally.

$em->find($entity, LockMode::OPTIMISTIC, $version) (or findBy et al)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • find() entity normally, no SQL modification.
  • Throw OptimisticLockException if there is a version mismatch ($version != $entity->version)
  • Otherwise proceed normally.

$em->find($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Modify the SQL to incude an appropriate platform-specific pessimistic lock (i.e. FOR UPDATE)
  • throw PessimisticLockException If lock could not be obtained
  • Otherwise proceed normally
  • Refresh entity with lock when entity with id exists in identity map already

$em->refresh($entity, LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise proceed normally. (What about cascades here? Need to take that into account probably)

$em->refresh($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Ensure the SQL used for refreshing is modified accordingly with the platform-specific pessimistic locking clause.
  • Throw PessimisticLockException if the lock could not be obtained.
  • Otherwise proceed normally.

$em->lock($entity, LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise do nothing (NOOP).

$em->lock($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Throw PessimisticLockException if entity is not versioned (this is always a "lock after read")
  • Issue straight, minimal locking SQL (we probably must include the version column in the select), platform-specific. Note: Probably get the SQL from the persisters to account for different inheritance mapping strategies. The last part of the SQL, the locking clause, is taken from the platforms.
  • Throw PessimisticLockException if the lock could not be obtained.
  • Compare the newly read version with the old version. If they dont match throw PessimisticLockException (this means the entity was changed since it was read).
  • Otherwise proceed normally.


 Comments   
Comment by Guilherme Blanco [ 09/Dec/09 ]

We should be aware that PHP 5.3 now uses mysqlnd driver internally.
This means that queries like LOCK are applied via unbuffered queries, which may compromise subsequent data changes on DB table.

We need to do some testing before effectively apply any type of approach here.

Comment by Benjamin Eberlei [ 09/Dec/09 ]

What does this have to do with locking? I don't understand it Please elaborate

Comment by Benjamin Eberlei [ 09/Dec/09 ]

Attached a quick try for pessimistic lock.

Optimistic lock enforcement is much more difficult...

Comment by Benjamin Eberlei [ 09/Dec/09 ]

Update of the patch adding tests to the Query SELECT tests.

Comment by Roman S. Borschel [ 19/Dec/09 ]

Regarding Nr. 5, indeed more than one root entity is supported in DOctrine 2 when querying.

Example (assuming Customer and Employee are not related in any way):

$q = $em->createQuery("SELECT c, e FROM Customer c, Employee e WHERE c.hatsize = e.shoesize"

The result would (or should) be an array with both Customer and Employee objects.

Comment by Benjamin Eberlei [ 26/Jan/10 ]

We should probably go only for the PESSIMISTIC lock in the interface.

However it might be convenient to make optimistic locking possible via an additional parameter. Given that PHP is Stateless it might be a good pattern to do:

$id = $_POST['id'];
$lastKnownVersion = $_POST['version'];
$entity = $em->find('Yadda', $id, $lastKnownVersion);
// do stuff with $entity

$em->flush();
Comment by Benjamin Eberlei [ 13/Mar/10 ]

Hm given our discussion yesterday, we should translate the OPTIMISTIC_FORCE_INCREMENT into PHPs execution context (script per request):

final abstract class Lock
{
    const NONE = 0;
    const OPTIMISTIC_READ = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}

OPTIMISTIC_READ requires the fourth parameter of EM::find() to find being a positive integer and adds a version chck for that given version. This wont work with Dql though.

EntityRepository should also be changed, this affects:

EntityRepository::find($id, Lock $lockMode = null, $lockVersion = null);
EntityRpository::findOneBy(array $criteria, Lock $lockMode = null, $lockVersion = null);

Entity Repository would translate a $lockVersion != null into a $criteria. Then we onlly need to change:

StandardEntityPersister::load($criteria, $entity, $assoc, $hints = array(), $lockMode = null)
StandardEntityPersister::_getSelectEntitiesSQL($criteria, $assoc=null, $lockMode = null)

The all queries have to be done with DQL to get pessimistic locks imho. A DQL Query with locking would look like:

$q->setHint(Query::HINT_LOCK_MODE, Lock::PESSIMISTIC_READ);
Comment by Roman S. Borschel [ 13/Mar/10 ]

I would also like to see an EntityManager#lock() method. However, after re-reading the spec and testing with Hibernate 3.5.0-CR-2, I'm not sure we really need the *_FORCE_INCREMENT variations. In Hibernate, em.lock() with OPTIMISTIC doesnt do anything (it doesnt need to, versions are compared on update anyway). Even with OPTIMISTIC_FORCE_INCREMENT nothing happens, except the normal version update when the entity changed. Not sure whether this is a bug or intended. PESSIMISTIC_FORCE_INCREMENT works as expected.

So I currently think we should have the following as a start:

final abstract class LockMode
{
    const NONE = 1; // default
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 3;
}

The LockMode can then be used either in queries or with EntityManager#lock(). Moreso, these lock modes work independantly of whether the entity is versioned (@Version) or not. PESSIMISTIC_READ would acquire a shared read lock while PESSIMISTIC_WRITE would acquire an exclusive write lock.

What I'm not sure yet about are the following things:

1) Whether to add a query hint or an explicit setLockMode() method + $_lockMode property on the query. I'm tending towards the latter to give a more explicit API for locking (hints are a bit non-obvious and may be overlooked).

2) Whether we need any kind of OPTIMISTIC support in LockMode or *_FORCE_INCREMENT support. I currently think we should start without both.

Comment by Benjamin Eberlei [ 14/Mar/10 ]

In my opinion EntityManager#lock() is not necessary for PHP

I think it has a place in long running scripts, where you dont know if you have already a write lock for an entity you have retrieved from the session some time ago, however in PHP you certainly know if a script you are executing needs a read/write consistency lock or not.

This is also where optimistic force increment shines in Java, because you update the version column you are certain you have a read consistent version of your entity. In PHP the time-frame between retrievial and a call to lock() is just unnecessary small and could be solved by directly locking the entity.

This is why i think EntityManager::find needs to be able to set the version also. A long running script in php means:

1. Session A retrieves Entity with Id 1 and Version 1 and displays a form
2. Session B retrieves Entity with Id 1 and Version 1and displays a form
3. Session B retrieves Entity withId 1 and Version 1 and updates it to Version 2
4. Session A retrieves Entity with Id 1 and Version 2 and a lost update happens.

This would be the case without checking the version column. A pessimistic lock in this case is not possible, because it would be lost between point 1 and point 4. However an optimistic lock could check upon retrieval of the entity in point 4, if it is still at version 1.

Comment by Roman S. Borschel [ 14/Mar/10 ]

You dont always know at the point of reading an entity whether you want to lock it or not because these decisions can be made in different layers of the application, even during a single HTTP request. And pessimistic (online) locks are never retrieved in "long-running scripts" / "business transactions", i.e. held during user think time, that would be a concurrency nightmare! And the pessimistic locks in JPA are online locks (select ... for update), not offline locks, so there is no difference between Java and PHP when it comes to the pessimistic online locks (select ... for update).

We are really only talking about pessimistic online locks and optimistic offline locks here. (I dont even know if there is such a thing as an optimistic online lock).

All/Most of the examples on the following blog do not refer to long-running business transactions (transactions that span multiple database transactions with user-think time in between).
http://weblogs.java.net/blog/2009/07/30/jpa-20-concurrency-and-locking

Of course EntityManaher#find et al should be extended with optional lockMode and lockVersion arguments like you said but nevertheless there should still be EntityManager#lock().

Comment by Roman S. Borschel [ 14/Mar/10 ]

Let me clarify, pessimistic online locks only really make sense for the duration of a transaction since they're released at the end of a transaction. You dont want to hold such a transaction open during a long-running business transaction (user-think time / multiple requests). Thats why there is no difference in usage of such locks in Java or PHP.

Comment by Lukas Kahwe [ 14/Mar/10 ]

actually pessimistic (time limited) offline locks are also commonly used.

Comment by Roman S. Borschel [ 14/Mar/10 ]

Moreso, even if you do know you want to lock at the point of reading the entity (which may not be the case like I said above, these decisions can be made in different layers, or even in code of an extension that receives an existing object and now wants to lock it to do its work properly. Without em.lock() that would mean re-reading the whole entity), you may still not want to lock at the point of reading:

1) read entity
2) do some (potentially expensive) computations/calculations/whatever based on the state of the entity but we dont need a lock yet
3) after inspecting the outcome of 2) we decide we want to lock

(yes, locking after reading risks stale data but it locks for a shorter duration, its a compromise)

When you can only lock during reading, this would mean in such a case holding the lock unnecessarily long.

Comment by Lukas Kahwe [ 14/Mar/10 ]

Err .. what you just described is optimistic offline locking. Aka you read the version identifier, go off do your thing, commit .. possibly discover that there was a concurrent write ..

Comment by Roman S. Borschel [ 14/Mar/10 ]

@Lukas: Yea. We're really only talking about pessimistic online locks and optimistic offline locks here. I know there are pessimistic offline locks (D1 even has an implementation of that which I wrote) its not currently in the focus for D2 and can even be provided by an extension.

Comment by Benjamin Eberlei [ 14/Mar/10 ]

Ok given layering EntityManager#lock() has use-cases.

However from my experience I want to lock entities at the beginning of long running scripts, because these runs are more important to my business than those small runs from users.

Say i have an entity that at a certain point in time needs to calculate very expensive stuff, before I start the calculations i would want to make sure that these wont get busted by a user updating the entity during that run, so i would apply SELECT * FOR UPDATE before the calculations and not after them. At point 3 when you apply the lock, you only know if you have a consistent lock between 1 and 3 when you have a version field to check against, a pessimistic lock wont help there.

The pessimistic lock in EntityManager#lock() makes sense if you want to make sure that after the aquiring no changes happen, for example:

1. read entity
2. do some simple checks
3. realize after 2, we need to make expensive computations, lock() or refresh(entity, lockMode) here? (probably depends if the entity has a version column or not)
4. do expensive computations
5. flush

Comment by Roman S. Borschel [ 14/Mar/10 ]

Based on all the discussion so far I updated the main issue description with an API specification. Feel free to comment.

Comment by Roman S. Borschel [ 14/Mar/10 ]

Next on my list is working out the transaction semantics for all the cases, i.e. where to enforce that it is invoked inside an active transaction and in which cases the Optimistic/Pessimistic exceptions cause a transaction rollback (or whether there is actually a case in which they do not result in a rollback).

Comment by Roman S. Borschel [ 14/Mar/10 ]

First update for transaction requirements.

Comment by Roman S. Borschel [ 15/Mar/10 ]

Updated refresh() specification.

Comment by Benjamin Eberlei [ 03/Apr/10 ]

Add another condition for PESSIMITIC locks in combination with find()*

  • Refresh entity with lock when entity with id exists in identity map already
Comment by Benjamin Eberlei [ 03/Apr/10 ]

hm actually only EntityManager#lock($entity, $mode) has to be called.

Comment by Benjamin Eberlei [ 03/Apr/10 ]

Patch for the said functionality, its missing refresh() changes though.

Comment by Benjamin Eberlei [ 04/Apr/10 ]

I made the following observations on lock timeouts:

  • MySQL supports them for InnoDb at the Database Configuration level
  • Postgres has either NO WAIT or no timeout at all
  • Oracle has either NO WAIT, no timeout, or a specified number of miliseconds.
  • Sqlite has no locking

Unrelated but:

  • MsSql has a very weird row lock syntax, we need a new construct in AbstractPlatform for this.

Question:

Should we support something like lock timeouts via a query hint? If yes:

  • Should we only specifiy something like setHint(Query::HINT_LOCK_TIMEOUT_NOWAIT, true);
  • or like setHint(Query::HINT_LOCK_TIMEOUT, 0);

Vendors except Oracle and Postgres would simply ignore this option. Postgres would only support TIMEOUT = 0 as NOWAIT

Comment by Benjamin Eberlei [ 02/May/10 ]

The latest version of lock-support is available here:

http://github.com/beberlei/doctrine2/tree/lock-support

The Lock Support is now tested using Gearman Job Server allowing to have functional scenarios where waiting for lock releases is necessary, see:

http://www.whitewashing.de/blog/articles/129

Refresh() is still missing, I am not sure if this should be included in the 2.0 version (use-case is very slim).

Comment by Roman S. Borschel [ 19/May/10 ]

Great work so far. I think we can skip refresh() support for now, so post-2.0 if at all.

@"Should we support something like lock timeouts via a query hint?"

I think setHint(Query::HINT_LOCK_TIMEOUT, 0) would be good. That way we keep the possibility open for later enhancements regarding other timeout values (i.e. if features change on databases) without requiring public API changes. However, it should be clearly documented that this is a hint and not a guarantee and it should be documented which database vendors interpret the timeout in which way I think.
As a start, only having timeout = 0 => NOWAIT would be enough.

Comment by Roman S. Borschel [ 19/May/10 ]

You can reschedule the lock timeout / nowait to beta3 if you want that. I think we already have enough for beta2.

Comment by Roman S. Borschel [ 05/Jun/10 ]

Pushing outstanding work back to beta3.

Comment by Benjamin Eberlei [ 04/Jul/10 ]

Implemented, Lock Timeouts will be handled in a dedicated ticket.

Generated at Mon Jul 28 18:33:55 UTC 2014 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.