[DDC-821] Consider adding Query-Join as another join method for DQL Created: 29/Sep/10  Updated: 30/Jul/13  Resolved: 30/Jul/13

Status: Closed
Project: Doctrine 2 - ORM
Component/s: ORM
Affects Version/s: None
Fix Version/s: 2.x
Security Level: All

Type: New Feature Priority: Major
Reporter: Marc Hodgins Assignee: Guilherme Blanco
Resolution: Fixed Votes: 2
Labels: None


 Description   

Some ORM systems support an alternative to fetch-join queries, called a "query-join". See http://www.avaje.org/ebean/introquery_joinquery.html.

A query-join accomplishes the same as a fetch-join (hydrating a larger object graph across all associations types) but executes more than one SQL query in sequence in order to hydrate the requested portions of the graph in a result set. The first query retrieves data from the base entity/table and the next queries retrieve the data for the requested associations.

In some cases this approach is more efficient to a fetch-join:

(1) No data duplication in the SQL result as occurs in a fetch-join on to-many associations. Instead, this data is loaded through a second query. This saves network traffic, memory and general overhead in hydrating the returned results. In the case where large TEXT data is included in result sets, the savings here may be substantial.

(2) setFirstResult() and setMaxResult() are again effective (for pagination) and more importantly more efficient on these query-joins. The current DoctrineExtension solution to enable pagination on fetch-joins requires a series of queries to determine the target primary keys of the root entity of the query. The primary key lookup query requires DISTINCT or GROUP BY – which often triggers filesorts, temporary tables, etc (at least on MySQL) and greatly slows down the query. Query joins would not require this.

Possible implementation example:

// existing fetch-join
$query = $em->createQuery('SELECT c, o FROM Customers c JOIN c.orders o');
$query->setFirstResult(10)->setMaxResult(20); // doesn't do what you'd hope it would do, no ability to use this for pagination
$customersAndOrders = $query->getResult(); // array of Customer objects with Orders hydrated

// proposed query-join
$query = $em->createQuery('SELECT c, o FROM Customers c QUERY JOIN c.orders o');
$query->setFirstResult(10)->setMaxResult(20); // now works for pagination
$customersAndOrders = $query->getResult(); // array of Customer objects with Orders hyrdated
// this would execute a series of queries -- i.e. in SQL
// SELECT ... FROM customers LIMIT 10, 20
// SELECT ... FROM orders WHERE customer_id IN (.....)

and/or, could there be a way to trigger a "query-join" against an existing array of entities? for example

$query = $em->createQuery('SELECT c FROM Customers c'); // single query to fetch customers
$customers = $query->getResult(); // array of Customer objects
$em->join($customers, 'orders'); // fetch and hydrate the 'orders' association on each Customer using a single query

Perhaps at some point in the future Doctrine/DBAL could even make use of asynchronous queries (i.e. mysqlnd supports this) to allow these query-joins to run in parallel and the result would be more efficient paginated resultsets.

Thoughts/feedback?



 Comments   
Comment by Benjamin Eberlei [ 30/Sep/10 ]

There is another approach for this using several subqueries to build an IN clause, the Paginator extension supports this: http://github.com/beberlei/DoctrineExtensions

I rather go the extension approach than changing the DQL for this feature.

Comment by Benjamin Eberlei [ 30/Sep/10 ]

I just saw your second example, that is rather cool though and gets +1 from me.

I had the same idea for "not initialized proxies", i.e.

$em->getUnitOfWork()->initializeProxies('Customer');
Comment by Marc Hodgins [ 29/Dec/10 ]

Second example is a duplicate of DDC-734

Comment by Guilherme Blanco [ 30/Jul/13 ]

Fixed based on this related issue: http://www.doctrine-project.org/jira/browse/DDC-734

Generated at Tue Sep 16 05:07:53 UTC 2014 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.