Doctrine DBAL
  1. Doctrine DBAL
  2. DBAL-96

Make approach towards identifier quoting consistent

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.6
    • Component/s: Platforms, Schema Managers
    • Labels:
      None

      Description

      • Make the use of `` a general approach for explicit quoting of identifiers
      • introduce AbstractPlatform::getRegularSQLIdentifierCase($identifier)
      • Introduce AbstractPlatform::isRegularIdentifier($identifier)
      • Fix Schema Assets not to lower-case, but to check for explicit quoting before.
      • Filter values of identifiers passed to all platform functions when they are used in information schema queries according to `` explicit quoting rules.

      Problem: Schema is independent of a vendor, this means we have to pick a behavior, i propose SQL-92

      This means:

      • strtoupper() ALL tables, column, index, foreign key names that are not quoted by ``
      • For any Quoted identifiers by `` the case is kept.
      • We can introduce a validator to detect a schema that cannot be implemented with a given vendor platform.

      In conjunction with the SQL reserved keywords tickets we can then improve the DatabaseDriver considerably to detect identifier casings

        Issue Links

          Activity

          Hide
          Steve Müller added a comment -

          Benjamin Eberlei this is an interesting approach and I like it. But I have some complaints about it.
          1. I doubt users will be happy about forced default casing rules (ALL upper or ALL lower). Therefore we should think about adding a simple configuration option in DBAL allowing to override the default casing behaviour to the user's preference.
          2. Using a consistent default casing means we ALWAYS have to quote identifiers as otherwise the underlying database could silently change the case again (don't know if this is an issue).
          3. Introducing this approach in 2.x branch is a BC break as it breaks users' mixed-case identifier mappings.

          For 2.x we should maybe at least make use of Identifier class throughout the platforms where necessary.

          Show
          Steve Müller added a comment - Benjamin Eberlei this is an interesting approach and I like it. But I have some complaints about it. 1. I doubt users will be happy about forced default casing rules (ALL upper or ALL lower). Therefore we should think about adding a simple configuration option in DBAL allowing to override the default casing behaviour to the user's preference. 2. Using a consistent default casing means we ALWAYS have to quote identifiers as otherwise the underlying database could silently change the case again (don't know if this is an issue). 3. Introducing this approach in 2.x branch is a BC break as it breaks users' mixed-case identifier mappings. For 2.x we should maybe at least make use of Identifier class throughout the platforms where necessary.
          Hide
          Sebastien Lavoie added a comment - - edited

          My 2 cents:

          1. Users should not have to worry about platform-specific quoting when using the query builder or helpers, the DBAL should do that for you.
          2. Users should be able to explicitly quote using a standard quote (`), the quote would then be converted to the platform’s quote upon SQL generation, without any case change.
          3. DBAL should not needlessly quote, it adds bloat and it has been said in DBAL-40 that there is a performance hit.
          4. DBAL should not change the case without the user’s knowledge.
          5. A connection configuration option (normalize_case) could be added:
          • uppercase: always convert unquoted identifiers to uppercase
          • lowercase: always convert unquoted identifiers to lowercase
          • platform: will use the default value for specific platform. For the case of case-sensitive platform, even when unquoted (MySQL on UNIX), do nothing.
          • null (default): no normalization
          6. Future versions of DBAL could change the default value to platform, but this would greatly reduce the risk of causing BC breaks at the beginning, giving time to test everything.
          7. When using Doctrine\DBAL\Connection::query directly, you must do the quoting yourself since the SQL is executed directly.

          Show
          Sebastien Lavoie added a comment - - edited My 2 cents: 1. Users should not have to worry about platform-specific quoting when using the query builder or helpers, the DBAL should do that for you. 2. Users should be able to explicitly quote using a standard quote (`), the quote would then be converted to the platform’s quote upon SQL generation, without any case change. 3. DBAL should not needlessly quote, it adds bloat and it has been said in DBAL-40 that there is a performance hit. 4. DBAL should not change the case without the user’s knowledge. 5. A connection configuration option (normalize_case) could be added: • uppercase: always convert unquoted identifiers to uppercase • lowercase: always convert unquoted identifiers to lowercase • platform: will use the default value for specific platform. For the case of case-sensitive platform, even when unquoted (MySQL on UNIX), do nothing. • null (default): no normalization 6. Future versions of DBAL could change the default value to platform, but this would greatly reduce the risk of causing BC breaks at the beginning, giving time to test everything. 7. When using Doctrine\DBAL\Connection::query directly, you must do the quoting yourself since the SQL is executed directly.
          Hide
          Arthur Bodera added a comment - - edited

          Sebastien, ad 3. that is incorrect. Read the ticket more closely, look at the PR, look inside schema tool and platform classes. There is already a lot of quoting+unquoting being performed in 2.* and a lot of assumptions. Having quoting enabled across the board might actually increase performance in some cases, because there will be less scanning for keywords (see platform classes) and possibly less quoting/unquoting across Schema*.

          The problem is, the quoting right now works in some places and in some platforms and is being performed only when schema/schematool/dql needs it, but is being ignored in all other cases. This means that columns like "group" or table names like "platform" will fail randomly depending on platform/rdbms you actually use. It's a nightmare with cross-platform apps and a struggle for single-platform apps, where your tables are named according to domain-rules and happen to overlap with some rdbms.

          Quoting identifiers being "a bloat" is similar to saying, that implicit quoting values is a bloat. Although from security standpoint the former is much rarer, it's the same for portability and stability of the DBAL across platforms.

          Show
          Arthur Bodera added a comment - - edited Sebastien, ad 3. that is incorrect. Read the ticket more closely, look at the PR, look inside schema tool and platform classes. There is already a lot of quoting+unquoting being performed in 2.* and a lot of assumptions. Having quoting enabled across the board might actually increase performance in some cases, because there will be less scanning for keywords (see platform classes) and possibly less quoting/unquoting across Schema*. The problem is, the quoting right now works in some places and in some platforms and is being performed only when schema/schematool/dql needs it, but is being ignored in all other cases. This means that columns like "group" or table names like "platform" will fail randomly depending on platform/rdbms you actually use. It's a nightmare with cross-platform apps and a struggle for single-platform apps, where your tables are named according to domain-rules and happen to overlap with some rdbms. Quoting identifiers being "a bloat" is similar to saying, that implicit quoting values is a bloat. Although from security standpoint the former is much rarer, it's the same for portability and stability of the DBAL across platforms.

            People

            • Assignee:
              Benjamin Eberlei
              Reporter:
              Benjamin Eberlei
            • Votes:
              2 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated: