Posted about 1 year ago by beberlei
By default DQL supports a limited subset of all the vendor-specific SQL functions common between all the vendors. However in many cases once you have decided on a specific database vendor, you will never change it during the life of your project. This decision for a specific vendor potentially allows you to make use of powerful SQL features that are unique to the vendor.
It is worth to mention that Doctrine 2 also allows you to handwrite your SQL instead of extending the DQL parser, which is sort of an advanced extension point. You can map arbitrary SQL to your objects and gain access to vendor specific functionalities using the
EntityManager#createNativeQuery()API.
The DQL Parser has hooks to register functions that can then be used in your DQL queries and transformed into SQL, allowing to extend Doctrines Query capabilities to the vendors strength. This post explains the Used-Defined Functions API (UDF) of the Dql Parser and shows some examples to give you some hints how you would extend DQL.
There are three types of functions in DQL, those that return a numerical value, those that return a string and those that return a Date. Your custom method has to be registered as either one of those. The return type information is used by the DQL parser to check possible syntax errors during the parsing process, for example using a string function return value in a math expression.
You can register your functions adding them to the ORM configuration:
<?php $config = new \Doctrine\ORM\Configuration(); $config->addCustomStringFunction($name, $class); $config->addCustomNumericFunction($name, $class); $config->addCustomDatetimeFunction($name, $class); $em = EntityManager::create($dbParams, $config);
The $name is the name the function will be referred to in the DQL query. $class is a
string of a class-name which has to extend Doctrine\ORM\Query\Node\FunctionNode.
This is a class that offers all the necessary API and methods to implement
a UDF.
In this post we will implement some MySql specific Date calculation methods, which are quite handy in my opinion:
Mysql's DateDiff function
takes two dates as argument and calculates the difference in days with date1-date2.
The DQL parser is a top-down recursive descent parser to generate the Abstract-Syntax Tree (AST) and uses a TreeWalker approach to generate the appropriate SQL from the AST. This makes reading the Parser/TreeWalker code managable in a finite amount of time.
The FunctionNode class I referred to earlier requires you to implement
two methods, one for the parsing process (obviously) called parse and
one for the TreeWalker process called getSql(). I show you the code for
the DateDiff method and discuss it step by step:
<?php /** * DateDiffFunction ::= "DATEDIFF" "(" ArithmeticPrimary "," ArithmeticPrimary ")" */ class DateDiff extends FunctionNode { // (1) public $firstDateExpression = null; public $secondDateExpression = null; public function parse(\Doctrine\ORM\Query\Parser $parser) { $parser->match(Lexer::T_IDENTIFIER); // (2) $parser->match(Lexer::T_OPEN_PARENTHESIS); // (3) $this->firstDateExpression = $parser->ArithmeticPrimary(); // (4) $parser->match(Lexer::T_COMMA); // (5) $this->secondDateExpression = $parser->ArithmeticPrimary(); // (6) $parser->match(Lexer::T_CLOSE_PARENTHESIS); // (3) } public function getSql(\Doctrine\ORM\Query\SqlWalker $sqlWalker) { return 'DATEDIFF(' . $this->firstDateExpression->dispatch($sqlWalker) . ', ' . $this->secondDateExpression->dispatch($sqlWalker) . ')'; // (7) } }
The Parsing process of the DATEDIFF function is going to find two expressions the date1 and the date2 values, whose AST Node representations will be saved in the variables of the DateDiff FunctionNode instance at (1).
The parse() method has to cut the function call "DATEDIFF" and its argument into pieces. Since the parser detects the function using a lookahead the T_IDENTIFIER of the function name has to be taken from the stack (2), followed by a detection of the arguments in (4)-(6). The opening and closing parenthesis have to be detected also. This happens during the Parsing process and leads to the generation of a DateDiff FunctionNode somewhere in the AST of the dql statement.
The ArithmeticPrimary method call is the most common denominator of valid
EBNF tokens taken from the DQL EBNF grammer
that matches our requirements for valid input into the DateDiff Dql function.
Picking the right tokens for your methods is a tricky business, but the EBNF
grammer is pretty helpful finding it, as is looking at the Parser source code.
Now in the TreeWalker process we have to pick up this node and generate SQL from it, which apprently is quite easy looking at the code in (7). Since we don't know which type of AST Node the first and second Date expression are we are just dispatching them back to the SQL Walker to generate SQL from and then wrap our DATEDIFF function call around this output.
Now registering this DateDiff FunctionNode with the ORM using:
<?php $config = new \Doctrine\ORM\Configuration(); $config->addCustomStringFunction('DATEDIFF', 'DoctrineExtensions\Query\MySql\DateDiff');
We can do fancy stuff like:
SELECT
p
FROM DoctrineExtensions\Query\BlogPost p
WHERE DATEDIFF(CURRENT_TIME(),
p.created) < 7
Often useful it the ability to do some simple date calculations in your DQL query using MySql's DATE_ADD function.
I'll skip the bla and show the code for this function:
<?php /** * DateAddFunction ::= * "DATE_ADD" "(" ArithmeticPrimary ", INTERVAL" ArithmeticPrimary Identifier ")" */ class DateAdd extends FunctionNode { public $firstDateExpression = null; public $intervalExpression = null; public $unit = null; public function parse(\Doctrine\ORM\Query\Parser $parser) { $parser->match(Lexer::T_IDENTIFIER); $parser->match(Lexer::T_OPEN_PARENTHESIS); $this->firstDateExpression = $parser->ArithmeticPrimary(); $parser->match(Lexer::T_COMMA); $parser->match(Lexer::T_IDENTIFIER); $this->intervalExpression = $parser->ArithmeticPrimary(); $parser->match(Lexer::T_IDENTIFIER); /* @var $lexer Lexer */ $lexer = $parser->getLexer(); $this->unit = $lexer->token['value']; $parser->match(Lexer::T_CLOSE_PARENTHESIS); } public function getSql(\Doctrine\ORM\Query\SqlWalker $sqlWalker) { return 'DATE_ADD(' . $this->firstDateExpression->dispatch($sqlWalker) . ', INTERVAL ' . $this->intervalExpression->dispatch($sqlWalker) . ' ' . $this->unit . ')'; } }
The only difference compared to the DATEDIFF here is, we additionally need the Lexer to access
the value of the T_IDENTIFIER token for the Date Interval unit, for example the MONTH in:
SELECT
p
FROM DoctrineExtensions\Query\BlogPost p
WHERE DATE_ADD(CURRENT_TIME(),
INTERVAL 4 MONTH) > p.created
The above method now only supports the specification using INTERVAL, to also
allow a real date in DATE_ADD we need to add some decision logic to the parsing
process (makes up for a nice excercise).
Now as you see, the Parsing process doesn't catch all the possible SQL errors, here we don't match for all the valid inputs for the interval unit. However where necessary we rely on the database vendors SQL parser to show us further errors in the parsing process, for example if the Unit would not be one of the supported values by MySql.
Now that you all know how you can implement vendor specific SQL functionalities in DQL, we would be excited to see user extensions that add vendor specific function packages, for example more math functions, XML + GIS Support, Hashing functions and so on.
For 2.0 we will come with the current set of functions, however for a future version we will re-evaluate if we can abstract even more vendor sql functions and extend the DQL languages scope.
Code for this Extension to DQL and other Doctrine Extensions can be found in my Github DoctrineExtensions repository.
Comments (11) [ add comment ]
devil's Posted by Colleen Dick about about 1 year ago.
I'm gonna play devils advocate here. Back in the bad ol' days when I actually hand wrote my queries I just used vendor specific functions such as DATEDIFF (even tho I sort of abstracted DB with PEAR DB) whenever I needed them. Would you mind providing a little more justification for why a developer should write all this wrapping, how much of it is boilerplate, if any and how it can be automatic, and what you have to know about internals & parsing in order to be able to hook one of these up?
I second the comments above Posted by Tayhimself about about 1 year ago.
I don't see the point of this either
Native Sql Posted by beberlei about about 1 year ago.
Yeah maybe i should have mentioned it in the blog post, it is still possible to use Native Sql in Doctrine 2 and map its result to an object structure, so its not necessary to extend the parser (which is pretty advanced) to have access to all the vendor specific functionalities.
That said, DQL is the cornerstone of Doctrine 2 and its incredible flexible and powerful. This is why we allow to add functionality to the Parser, so that one can make use of the powerful DQL features combined with vendor specific features.
We as Doctrine team are aware that extending the parser isn't something you would do on a daily basis on your project to get access to vendor functionalities, but we hope to get the word out that this is possible, and maybe find people that help implement one or two functions for any vendor so that we can wrap them all together in an extension project for Doctrines DQL in the near future.
generic -> specific mappings Posted by drm about about 1 year ago.
Does the parser somehow get notified about the connection type used?
In my case I had some trouble with porting MySQL to SQLite because of some MySQL specific functions. It would be cool if you could add generic functions which will map to vendor specific ones at parse time. That would seriously increase portability.
Platforms Posted by beberlei about about 1 year ago.
@drm: Yes the Parser has access to the AbstractPlatform instance, which abstract generation of vendor specific sql snippets, you could either do something like:
if ($platform instanceof MySQLPlatform) {
} else {
}
Or you can add your own platforms and extend their functionality, accessing that at parsing time. Platforms can be set during Doctrine\DBAL\Connection construction in the EntityManager's factory method.
Its OK Posted by romanb about about 1 year ago.
It's OK not to see the point (yet).
@drm: "It would be cool if you could add generic functions which will map to vendor specific ones at parse time. That would seriously increase portability."
This is exactly what is happening already for all built-in DQL functions :-) Of course custom functions can be made portable in a similar manner.
=
The fundamental idea to understand the point of this "feature", and this is nothing new, is that DQL is NOT SQL. It is a domain-specific language and more specifically a sort of object query language. What is demonstrated in this post is a way to customize this language in a fully integrated way, a way that does not feel "patched on" and does not bypass the strict syntax checks of the parser.
Now what is the point of adding non-portable functions to DQL? Database portability is one thing but its really not the main point of an ORM and neither is to "hide/abstract away SQL". The main point of an ORM is, and this may come as a shock, mapping (between an object model and a relational schema).
So even if custom DQL functions are not portable, they still have the advantage of being neatly integrated with DQL and thus all the OO abstraction provided by it (field names, class names, associations and not table names, column names, foreign keys, ...).
Implementing custom DQL functions for RDBMS specific functions is one thing but it does not have to stop here. What you have at your hands is the ability to influence the design of the domain-specific language DQL, to your particular application domain. You can have functions that encapsulate domain-specific logic and give them domain-specific names. What about CALCULATE_TAX, CART_TOTAL? My fantasy is lacking but you get the point.
You can use Doctrine 2 completely without using DQL at all, using native queries and hand-crafted ResultSetMapping's and you still get all the main functionality of the ORM: the mapping, state-management, lazy-loading, etc. (again, abstraction from SQL or rdbms-vendor-independance is not the main point of an ORM, just icing on the cake). DQL is "nothing more" than a domain-specific, object-oriented language (OO is the "domain" of the language) that outputs SQL + a ResultSetMapping.
good Posted by alexey_baranov about about 1 year ago.
good for us. thank you folks.
starting to catch on Posted by Colleen Dick about about 1 year ago.
I think I am just barely starting to get my head around this. What you're saying then is that someone (not ordinary devs) will eventually gather a package of (e.g. mySQL) vendor specific functions as an add-on to Doctrine DQL and give you all the abstraction and ORM stuff baked in. So you'd use Doctrine and the strap-on mysql package to be swapped out by the strap-on PostGRES package if you decided to change vendors, (hopefully you comment your deps) and once you hook up these packages you can use the specific fcns directly in your DQL instead of handwriting native queries where you have to fudge around with your result sets. I get the principle of integrating at a deeper level rather than patched on. What you're doing here is telling the DQL parser how to access vendor specific functions in the db engine so it can generate the queries for you LATER. Even though the name of the function may look the same, it's not -- it's a program to allow the query generator to correctly generate the native query. And it doesn't have to end with vendor provided functions.
Am I understanding correctly? And I think until the tools get more refined, the ORM situation will get more complex before it finally settles down and gets easier for us tool using apes. Kinda like front end was a gawd awful mess until Jquery came along and settled everyone's hash.
@Colleen Posted by romanb about about 1 year ago.
Yes, I think your understanding is correct. As mentioned previously this is a rather advanced feature though and most users probably will never come in touch with it and wont need it, nevertheless, we wanted to show it.
Custom DQL functions are a relatively easy and well-integrated way (i.e. strict parsing and syntax errors!) to customize DQL a bit to your needs without falling down to a native query, in which case you have to fiddle with the ResultSetMapping by hand, like you say. You can always do everything with native queries if you want and still let Doctrine do the mapping for you, you just have to create the ResultSetMappings by hand then, which can sometimes be non-trivial.
DQL is simply an abstraction for easy and convenient creation of:
1) A native (often even database-specific) SQL query 2) A ResultSetMapping that describes how to turn the SQL result set into a different structure (an object graph, most of the time).
You can use Doctrine without using DQL and still get all the main ORM functionality like tracking the state of your objects, writing object changes to the database, populating objects from SQL results, etc. (Yes, again, hiding SQL is not a main ORM functionality and neither is database vendor independance, not for us at least. These things are just additional benefits. We do not think developers should be shielded from SQL in any way. Instead we think a solid understanding and knowledge of both, relational databases & SQL as well as object-oriented programming, is the best foundation to make good use out of any ORM tool.)
Identifier? Posted by Vincent about 4 months ago.
Hello,
This line bothers me: $parser->match(Lexer::T_IDENTIFIER);
I'm wondering when the content of this token is checked, because custom function names are not a special token recognized by the lexer, and the parser does not seem to care about it either. So where is the magic?
Welcome back to the stoneage Posted by anon about 3 months ago.
Adding grammar logic to php is just wrong really. If I woulda wanna make my own language woulda done it in C not php lol, this is simply fail. Maybe instead add some hook pattern, take a look at some UX and learn something, yes UX applies to apis as well..
And on the other side, if you really insist of having this here, learn to implement all the vendor specific functions on it, and not cutting down functionality like you did here, before releasing a package. Overall doctrine is full of bugs, warnings and fatals everywhere, and this adds up to the beauty.