[DDC-2134] Add referential integrity check for MySQL to console commands Created: 09/Nov/12  Updated: 09/Nov/12

Status: Open
Project: Doctrine 2 - ORM
Component/s: ORM
Affects Version/s: None
Fix Version/s: None
Security Level: All

Type: New Feature Priority: Minor
Reporter: Menno Holtkamp Assignee: Benjamin Eberlei
Resolution: Unresolved Votes: 0
Labels: None
Environment:

MySQL



 Description   

Today I spent some time solving a PHP 'White Screen of Death'. I traced it back to a Entity of which the proxy's __load() function was invoked because af a EXTRA_LAZY association. Due to incorrect database contents (the entry ID was changed due to an update: referential integrity broke), the __load() query resulted in no results. The EntityNotFoundException did for some reason not show up in our logs, probably because the lazy load was triggered by a magic __toString() function.

The cause is because of the way we populate or tables with domain data:

SET FOREIGN_KEY_CHECKS = 0;
#IMPORT STUFF from CSV
SET FOREIGN_KEY_CHECKS = 1;

MySQL does not trigger any errors when the foreign key checks are turned back on, leaving the table in an inconsistent state.

To prevent this, I found some information in this post: http://www.mysqlperformanceblog.com/2011/11/18/eventual-consistency-in-mysql/, which I used to come with the following queries

#Check the constraints of a specific database
SELECT *
	FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE 
		WHERE TABLE_SCHEMA = 'databaseName'
		AND REFERENCED_TABLE_NAME IS NOT NULL

#Generate table specific queries to find orphaned entries
SELECT CONCAT(
	 'SELECT ', GROUP_CONCAT(DISTINCT CONCAT(K.CONSTRAINT_NAME, '.', P.COLUMN_NAME,
	  ' AS `', P.TABLE_SCHEMA, '.', P.TABLE_NAME, '.', P.COLUMN_NAME, '`') ORDER BY P.ORDINAL_POSITION), ' ',
	 	'FROM ', K.TABLE_SCHEMA, '.', K.TABLE_NAME, ' AS ', K.CONSTRAINT_NAME, ' ',
	 		'LEFT OUTER JOIN ', K.REFERENCED_TABLE_SCHEMA, '.', K.REFERENCED_TABLE_NAME, ' AS ', K.REFERENCED_TABLE_NAME, ' ',
	 		' ON (', GROUP_CONCAT(CONCAT(K.CONSTRAINT_NAME, '.', K.COLUMN_NAME) ORDER BY K.ORDINAL_POSITION),
	 		') = (', GROUP_CONCAT(CONCAT(K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME) ORDER BY K.ORDINAL_POSITION), ') ',
	 		'WHERE ', K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME, ' IS NULL;'
	  )
    INTO OUTFILE '/tmp/verifyDatabaseTableIntegrity.sql'
    FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE K
      INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE P
        ON (K.TABLE_SCHEMA, K.TABLE_NAME) = (P.TABLE_SCHEMA, P.TABLE_NAME)
        AND P.CONSTRAINT_NAME = 'PRIMARY'
    WHERE K.TABLE_SCHEMA = 'databaseName'
      AND K.REFERENCED_TABLE_NAME IS NOT NULL
      GROUP BY K.CONSTRAINT_NAME;
	

By running the generated queries, we can now easily find the records that break referential integrity.

It might be an idea of adding this functionality to the orm:validate-schema, or a new orm:validate-database-integrity?


Generated at Fri Dec 19 13:41:07 UTC 2014 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.