Simple Parser Example

Extend the Doctrine\Common\Lexer\AbstractLexer class and implement the getCatchablePatterns, getNonCatchablePatterns, and getType methods. Here is a very simple example lexer implementation named CharacterTypeLexer. It tokenizes a string into T_UPPER, T_LOWER, and T_NUMBER tokens:

```php
<?php

use Doctrine\Common\Lexer\AbstractLexer;

class CharacterTypeLexer extends AbstractLexer
{
    const T_UPPER = 1;
    const T_LOWER = 2;
    const T_NUMBER = 3;

    protected function getCatchablePatterns()
    {
        // Match single ASCII letters and digits.
        return array(
            '[a-zA-Z0-9]',
        );
    }

    protected function getNonCatchablePatterns()
    {
        return array();
    }

    protected function getType(&$value)
    {
        // Digits first, then upper case, then lower case.
        if (is_numeric($value)) {
            return self::T_NUMBER;
        }

        if (strtoupper($value) === $value) {
            return self::T_UPPER;
        }

        if (strtolower($value) === $value) {
            return self::T_LOWER;
        }
    }
}
```
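
The order of the checks in getType matters, because a digit is unchanged by strtoupper and would otherwise be reported as T_UPPER. The following is a minimal sketch of that classification logic on its own, without doctrine/lexer; the function name classifyCharacter and the global constants are hypothetical stand-ins that mirror the class constants above, not part of the library:

```php
<?php

// Hypothetical stand-ins mirroring the CharacterTypeLexer constants.
const T_UPPER = 1;
const T_LOWER = 2;
const T_NUMBER = 3;

// Hypothetical helper that repeats the three checks from getType().
function classifyCharacter(string $value): ?int
{
    // Digits must be tested first: strtoupper('1') === '1' is also true.
    if (is_numeric($value)) {
        return T_NUMBER;
    }

    if (strtoupper($value) === $value) {
        return T_UPPER;
    }

    if (strtolower($value) === $value) {
        return T_LOWER;
    }

    // Mixed-case input such as 'aB' matches none of the checks.
    return null;
}

// Split a sample string into single characters and classify each one.
$types = array_map('classifyCharacter', str_split('1aB'));

print_r($types); // T_NUMBER, T_LOWER, T_UPPER, i.e. 3, 2, 1
```

Because the lexer's catchable pattern matches one character at a time, the fall-through case at the end should not be reached for letters and digits; the sketch returns null there only to make that case explicit.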

Use CharacterTypeLexer to extract an array of upper case characters:

```php
<?php

class UpperCaseCharacterExtracter
{
    private $lexer;

    public function __construct(CharacterTypeLexer $lexer)
    {
        $this->lexer = $lexer;
    }

    public function getUpperCaseCharacters($string)
    {
        $this->lexer->setInput($string);
        // Prime the lexer so that lookahead points at the first token.
        $this->lexer->moveNext();

        $upperCaseChars = array();
        while (true) {
            // Stop once there are no more tokens to read.
            if (!$this->lexer->lookahead) {
                break;
            }

            // Advance: the previous lookahead becomes the current token.
            $this->lexer->moveNext();

            if ($this->lexer->token['type'] === CharacterTypeLexer::T_UPPER) {
                $upperCaseChars[] = $this->lexer->token['value'];
            }
        }

        return $upperCaseChars;
    }
}

$upperCaseCharacterExtractor = new UpperCaseCharacterExtracter(new CharacterTypeLexer());
$upperCaseCharacters = $upperCaseCharacterExtractor->getUpperCaseCharacters('1aBcdEfgHiJ12');

print_r($upperCaseCharacters);
```

The variable $upperCaseCharacters contains all of the upper case characters:

```
Array
(
    [0] => B
    [1] => E
    [2] => H
    [3] => J
)
```

This is a simple example, but it demonstrates the low-level API that can be used to build more complex parsers.