[DBAL-96] Make approach towards identifier quoting consistent Created: 26/Feb/11  Updated: 28/Apr/15

Status: Open
Project: Doctrine DBAL
Component/s: Platforms, Schema Managers
Affects Version/s: None
Fix Version/s: 2.6

Type: Improvement Priority: Major
Reporter: Benjamin Eberlei Assignee: Benjamin Eberlei
Resolution: Unresolved Votes: 2
Labels: None

Issue Links:
Duplicate
is duplicated by DBAL-120 MySql platform getAlterTableSQL does ... Resolved
Reference
relates to DBAL-45 Add CLI tool that checks for Reserved... Resolved
relates to DBAL-477 Just doublequote all schema names and... Open
is referenced by DBAL-40 Transparent table&column names escaping Open

 Description   
  • Make the use of `` a general approach for explicit quoting of identifiers
  • introduce AbstractPlatform::getRegularSQLIdentifierCase($identifier)
  • Introduce AbstractPlatform::isRegularIdentifier($identifier)
  • Fix Schema Assets not to lower-case, but to check for explicit quoting before.
  • Filter values of identifiers passed to all platform functions when they are used in information schema queries according to `` explicit quoting rules.

Problem: Schema is independent of a vendor, this means we have to pick a behavior, i propose SQL-92

This means:

  • strtoupper() ALL tables, column, index, foreign key names that are not quoted by ``
  • For any Quoted identifiers by `` the case is kept.
  • We can introduce a validator to detect a schema that cannot be implemented with a given vendor platform.

In conjunction with the SQL reserved keywords tickets we can then improve the DatabaseDriver considerably to detect identifier casings



 Comments   
Comment by Steve Müller [ 24/Dec/13 ]

Benjamin Eberlei this is an interesting approach and I like it. But I have some complaints about it.
1. I doubt users will be happy about forced default casing rules (ALL upper or ALL lower). Therefore we should think about adding a simple configuration option in DBAL allowing to override the default casing behaviour to the user's preference.
2. Using a consistent default casing means we ALWAYS have to quote identifiers as otherwise the underlying database could silently change the case again (don't know if this is an issue).
3. Introducing this approach in 2.x branch is a BC break as it breaks users' mixed-case identifier mappings.

For 2.x we should maybe at least make use of Identifier class throughout the platforms where necessary.

Comment by Sebastien Lavoie [ 28/Mar/15 ]

My 2 cents:

1. Users should not have to worry about platform-specific quoting when using the query builder or helpers, the DBAL should do that for you.
2. Users should be able to explicitly quote using a standard quote (`), the quote would then be converted to the platform’s quote upon SQL generation, without any case change.
3. DBAL should not needlessly quote, it adds bloat and it has been said in DBAL-40 that there is a performance hit.
4. DBAL should not change the case without the user’s knowledge.
5. A connection configuration option (normalize_case) could be added:
• uppercase: always convert unquoted identifiers to uppercase
• lowercase: always convert unquoted identifiers to lowercase
• platform: will use the default value for specific platform. For the case of case-sensitive platform, even when unquoted (MySQL on UNIX), do nothing.
• null (default): no normalization
6. Future versions of DBAL could change the default value to platform, but this would greatly reduce the risk of causing BC breaks at the beginning, giving time to test everything.
7. When using Doctrine\DBAL\Connection::query directly, you must do the quoting yourself since the SQL is executed directly.

Comment by Arthur Bodera [ 28/Mar/15 ]

Sebastien, ad 3. that is incorrect. Read the ticket more closely, look at the PR, look inside schema tool and platform classes. There is already a lot of quoting+unquoting being performed in 2.* and a lot of assumptions. Having quoting enabled across the board might actually increase performance in some cases, because there will be less scanning for keywords (see platform classes) and possibly less quoting/unquoting across Schema*.

The problem is, the quoting right now works in some places and in some platforms and is being performed only when schema/schematool/dql needs it, but is being ignored in all other cases. This means that columns like "group" or table names like "platform" will fail randomly depending on platform/rdbms you actually use. It's a nightmare with cross-platform apps and a struggle for single-platform apps, where your tables are named according to domain-rules and happen to overlap with some rdbms.

Quoting identifiers being "a bloat" is similar to saying, that implicit quoting values is a bloat. Although from security standpoint the former is much rarer, it's the same for portability and stability of the DBAL across platforms.

Comment by Andreas Prucha [ 28/Apr/15 ]

IMO the big problem is, that behaviour across the RDBMs may be completely different:

Some preserve case and are case-insenstive if not quoted (the nicest approach)
Some do not preserve case and normalize to lower or uppercase and are case-insensitive
And some do not preserve case and are case-sensitive (the worst)

The biggest issue arises if the DB needs to be used outside the Doctrine-Environment and all identifiers need to be quoted in statements.

Generated at Sat Aug 29 15:11:52 EDT 2015 using JIRA 6.4.10#64025-sha1:5b8b74079161cd76a20ab66dda52747ee6701bd6.