[DBAL-65] No way to store binary data in PostgreSQL with Doctrine Created: 20/Nov/10  Updated: 09/Oct/12  Resolved: 20/Nov/10

Status: Resolved
Project: Doctrine DBAL
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Tomasz Jędrzejewski Assignee: Benjamin Eberlei
Resolution: Invalid Votes: 0
Labels: None
Environment:

PostgreSQL 8.4



 Description   

The type system introduced by Doctrine makes impossible to store binary data in PostgreSQL databases that use Unicode. The `text` type is mapped to `TEXT`, but any trial to place some binary data there ends up with a database error, like this:

SQLSTATE[22021]: Character not in repertoire: 7 ERROR: invalid byte sequence for encoding "UTF8": 0x9c

This is a critical limitation, because Doctrine cannot be used now in projects that for any reasons have to use PostgreSQL, and their databases must store binary data. Even if it cannot be fixed right now, it should be clearly pointed out in the documentation in "Known vendor issues".

A possible solution for this problem is creating an equivalent of 'text' field, called 'binary' or something like that. It must be a simple type that is mapped to the simplest, but large type available in the database engine without any form of data structure validation. For PostgreSQL, this could be 'blob', but other database engines can use different types.



 Comments   
Comment by Tomasz Jędrzejewski [ 20/Nov/10 ]

Just a small note why I consider this bug as quite serious: for many programmers and their projects the lack of both support for such content type and any information about the limitation can be very dangerous. It can be impossible to remove ORM, if such an issue is encountered in the implementation process, and trials to workaround it are time-consuming.

If I'm about to decide whether to use a particular ORM or not, I must have full information about ORM and database-specific limitations.

One more update: shame on me, obviously there is no "blob" type in PostgreSQL; in this database engine binary data could be represented by 'BYTEA'.

Comment by Benjamin Eberlei [ 20/Nov/10 ]

This is not an issue, there are two options to "solve" your problem in userland:

1. Create your own DBAL type - http://www.doctrine-project.org/projects/orm/2.0/docs/reference/basic-mapping/en#custom-mapping-types
2. Use columnDefinition Attribute of @column - http://www.doctrine-project.org/projects/orm/2.0/docs/reference/annotations-reference/en#ann_column

Comment by Tomasz Jędrzejewski [ 12/Dec/10 ]

I know I can create a custom type, but I'd like to have a portable binary type by default in Doctrine DBAL, not reinventing the wheel every time I want to have one. I consider binary data as one of the primitive types that every database engine supports.

Comment by Jon Wadsworth [ 09/Oct/12 ]

This is an old post but just in case somebody else finds it. There is no need to do any of the above to store binary data in Postgres. I had the same situation and was easily solved by compressing file, base64 encoding it, and finally serializing it.

public static function prepareFileforDatabase($file)

{ $compressor = new \Zend_Filter_Compress_Gz(); $file = $compressor->compress($file); $file = base64_encode($file); return serialize($file); }

We use Zend and you may be able to get away with not compressing if you wanted to avoid the extra overhead on your server. To undo it is exactly the opposite.

public static function prepareFileforPHP($file)

{ $compressor = new \Zend_Filter_Compress_Gz(); $file = unserialize($file); $file = base64_decode($file); return $compressor->decompress($file); }

Sorry for the code coming out in all one line, but you get the idea.

Generated at Fri Oct 31 22:25:11 UTC 2014 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.