[DCOM-41] Make annotation parser a bit cleverer Created: 09/Mar/11  Updated: 07/Apr/11  Resolved: 07/Apr/11

Status: Resolved
Project: Doctrine Common
Component/s: None
Affects Version/s: None
Fix Version/s: 2.0.2
Security Level: All

Type: Bug Priority: Major
Reporter: Johannes Schmitt Assignee: Benjamin Eberlei
Resolution: Fixed Votes: 0
Labels: None


From an initial look that I had at the annotation parser, it simply search for anything starting with an "@" somewhere in the doc comment and assumes that it is an annotation. Now, if someone uses the @ somewhere in his doc comment but not as an annotation, the parser breaks.

An example for this can be found in the Doctrine code base, the following comment will result in a parse error:

     * Annotation     ::= "@" AnnotationName ["(" [Values] ")"]
     * AnnotationName ::= QualifiedName | SimpleName | AliasedName
     * QualifiedName  ::= NameSpacePart "\" {NameSpacePart "\"}* SimpleName
     * AliasedName    ::= Alias ":" SimpleName
     * NameSpacePart  ::= identifier
     * SimpleName     ::= identifier
     * Alias          ::= identifier
     * @return mixed False if it is not a valid annotation.

Obviously the first @ is not used to refer to an annotation here. I think it makes sense to allow new annotations only to start at a new line, what do you think?

Comment by Johannes Schmitt [ 09/Mar/11 ]

The pull request is here:

Comment by Benjamin Eberlei [ 09/Mar/11 ]

Does this still work with nested annotations?

 * @annot({@annot2})
Comment by Johannes Schmitt [ 09/Mar/11 ]

Sure, my patch only changes the way how the first @ is found.

Comment by Roman S. Borschel [ 09/Mar/11 ]

First of all, thanks for the patch. I'm just not sure whether the added complexity (nontrivial regexp vs substr/strpos) and performance penalty (to be tested) can justify fixing these such edge-cases where an @ appears in the comment block that is not one of the already stripped inline docblocks. Just my two cents, I think this needs further investigation and more extensive testing.

Edit: Since the regex is "only" applied once per parsed docblock the perf. difference might be negligible but should be tested anyway. Remains the verification of the correctness of the regex, given more extensive testing and test-cases.

Comment by Johannes Schmitt [ 09/Mar/11 ]

I think performance should not be an issue since this data is cached anyway.

As for the necessity of this, it's a pain if you parse third party files and they use @ somewhere which then breaks the parser. Since Symfony2 is heavily relying on the annotation parser not only for Doctrine metadata, but for annotation metadata in general, imo this needs to be fixed. I'm not 100% happy with my solution since it covers not all, but only the most likely cases. Ideally, the lexer would have a better way to detect if an @ is used to mark the beginning of an annotation, and simply ignore the @ if it does not. I'm not familar with the lexer code, but you probably have a better understand of how it works and whether this would make sense.

EDIT: I've made an alternative implementation which you can find here: https://github.com/schmittjoh/common/commit/123315e21aff6d7bce7cb77ab798660d0e68b139

Comment by Benjamin Eberlei [ 05/Apr/11 ]

Fixed formatting of code block

Comment by Benjamin Eberlei [ 07/Apr/11 ]


Generated at Mon Nov 30 10:11:57 EST 2015 using JIRA 6.4.10#64025-sha1:5b8b74079161cd76a20ab66dda52747ee6701bd6.