laser.regularlanguage.regularexpression.parser
Class RegularExpressionParser

java.lang.Object
  extended by antlr.LLkParser
      extended by laser.regularlanguage.regularexpression.parser.RegularExpressionParser
All Implemented Interfaces:
RegularExpressionParserTokenTypes

public class RegularExpressionParser
extends antlr.LLkParser
implements RegularExpressionParserTokenTypes

The precedence rules for regular expressions are derived from the rules that grep and perl implement, plus Aho, Sethi, and Ullman's Compilers: Principles, Techniques, and Tools (the Dragon Book). They are as follows:
1. The regular operations (*, +, ?, ^k) have the highest precedence.
2. Concatenation (;) has the second highest precedence.
3. Choice (|) has the lowest precedence.
TODO: Whenever a SemanticException is thrown, it should include the line and column number.
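
The three precedence levels can be pictured with a small recursive-descent parser that fully parenthesizes its input. This is only an illustrative sketch, not the ANTLR-generated parser: the class name PrecedenceSketch, the single-letter labels, and the choice to apply a modifier to a leaf (which keeps the sketch free of left recursion) are all assumptions made for the example.

```java
/**
 * Hypothetical, self-contained sketch (not the ANTLR-generated parser):
 * a recursive-descent parser over single-letter labels that fully
 * parenthesizes its input so the precedence levels become visible.
 */
final class PrecedenceSketch {
    private final String input;
    private int pos = 0;

    private PrecedenceSketch(String input) {
        this.input = input;
    }

    /** Parses the whole input and returns it fully parenthesized. */
    static String parenthesize(String input) {
        PrecedenceSketch parser = new PrecedenceSketch(input);
        String result = parser.expression();
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + parser.pos);
        }
        return result;
    }

    // expression := term ('|' term)*          choice binds loosest
    private String expression() {
        String left = term();
        while (peek() == '|') {
            pos++;
            left = "(" + left + "|" + term() + ")";
        }
        return left;
    }

    // term := modified (';' modified)*        concatenation binds tighter
    private String term() {
        String left = modified();
        while (peek() == ';') {
            pos++;
            left = "(" + left + ";" + modified() + ")";
        }
        return left;
    }

    // modified := leaf ('*' | '+' | '?')?     modifiers bind tightest
    private String modified() {
        String operand = leaf();
        char c = peek();
        if (c == '*' || c == '+' || c == '?') {
            pos++;
            return "(" + operand + c + ")";
        }
        return operand;
    }

    // leaf := letter | '(' expression ')'
    private String leaf() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = expression();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at index " + pos);
            }
            pos++;
            return inner;
        }
        if (Character.isLetter(c)) {
            pos++;
            return String.valueOf(c);
        }
        throw new IllegalArgumentException("Expected a label or '(' at index " + pos);
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }
}
```

Under these assumptions, PrecedenceSketch.parenthesize("a;b*|c") groups the star first, then the concatenation, then the choice, giving ((a;(b*))|c).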


Field Summary
static java.lang.String[] _tokenNames
           
protected  AlphabetInterface alphabet_
           
 
Fields inherited from interface laser.regularlanguage.regularexpression.parser.RegularExpressionParserTokenTypes
CHOICE, COMMA, COMMENT, CONCAT, DASH, DOT, EMPTY, EOF, EPSILON, ESC, EXPONENT, IDENTIFIER, KLEENEPLUS, KLEENESTAR, LBRACE, LBRACKET, LETTER, LPAREN, NEWLINE, NULL_TREE_LOOKAHEAD, NUMBER, OPTION, POSINT, QUOTED_STRING, RBRACE, RBRACKET, RPAREN, TILDE, WHITESPACE
 
Constructor Summary
  RegularExpressionParser(TokenBuffer tokenBuf)
           
protected RegularExpressionParser(TokenBuffer tokenBuf, int k)
           
 
Method Summary
 int aPosInt()
          A positive integer
 java.lang.String aQuotedString()
          A quoted string
 TreeNode classSet()
          classSet := ( LBRACKET labelList RBRACKET | TILDE LBRACKET labelList RBRACKET ) A ClassSet is either a choice between a list of events (the rule without the TILDE), or a choice between a list of events that can't occur (the rule with the TILDE).
 Exponent exponent()
          exponent := ( aPosInt | LBRACE aPosInt (DASH (aPosInt)?)? RBRACE ) An exponent in a regular expression.
 TreeNode expression()
          expression := term (CHOICE term)* A regular expression is a list of terms separated by the CHOICE operator.
 LabelInterface label()
          A label in a regular expression.
 java.util.Set<LabelInterface> labelList()
          labelList := label (COMMA label)* A labelList is a comma-separated list of labels.
 TreeNode leafExpression()
          leafExpression := ( DOT | label | classSet | LPAREN expression RPAREN | EPSILON )
 TreeNode modifiedExpression()
          modifiedExpression := expression (modifier)?
 TreeNode modifier(TreeNode expression_)
          modifier := ( KLEENESTAR | KLEENEPLUS | OPTION | EXPONENT exponent ) A modifier to a regular expression.
 TreeNode regularExpression(AlphabetInterface alphabet)
          regularExpression := EMPTY | expression A regular expression is either empty or an expression.
 TreeNode term()
          term := modifiedExpression (CONCAT modifiedExpression)* A Term is a list of ModifiedExpressions separated by the CONCAT operator.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

alphabet_

protected AlphabetInterface alphabet_

_tokenNames

public static final java.lang.String[] _tokenNames
Constructor Detail

RegularExpressionParser

protected RegularExpressionParser(TokenBuffer tokenBuf,
                                  int k)

RegularExpressionParser

public RegularExpressionParser(TokenBuffer tokenBuf)
Method Detail

regularExpression

public final TreeNode regularExpression(AlphabetInterface alphabet)
                                 throws RecognitionException,
                                        TokenStreamException
regularExpression := EMPTY | expression

A regular expression is either empty or an expression. This is to ensure that empty will only appear as the root node.

Returns:
the TreeNode that represents this regularExpression
Throws:
RecognitionException
TokenStreamException

expression

public final TreeNode expression()
                          throws RecognitionException,
                                 TokenStreamException
expression := term (CHOICE term)*

A regular expression is a list of terms separated by the CHOICE operator. This ensures that concatenation has a higher precedence than choice.

Returns:
the TreeNode that represents this expression
Throws:
RecognitionException
TokenStreamException

term

public final TreeNode term()
                    throws RecognitionException,
                           TokenStreamException
term := modifiedExpression (CONCAT modifiedExpression)*

A Term is a list of ModifiedExpressions separated by the CONCAT operator. This ensures that the regular operations (*, +, ?, ^k) have a higher precedence than concatenation and that concatenation has a higher precedence than choice.

Returns:
the TreeNode that represents this term
Throws:
RecognitionException
TokenStreamException

modifiedExpression

public final TreeNode modifiedExpression()
                                  throws RecognitionException,
                                         TokenStreamException
modifiedExpression := expression (modifier)?

A ModifiedExpression is an expression followed by an optional modifier (*, +, ?, ^k). These modifiers have the highest precedence.

Returns:
the TreeNode that represents this modified expression
Throws:
RecognitionException
TokenStreamException

leafExpression

public final TreeNode leafExpression()
                              throws RecognitionException,
                                     TokenStreamException
leafExpression := ( DOT | label | classSet | LPAREN expression RPAREN | EPSILON )

A LeafExpression is either a ".", a single identifier, a list of identifiers, an expression in parentheses, or epsilon.

Returns:
the TreeNode that represents this leaf expression
Throws:
RecognitionException
TokenStreamException

modifier

public final TreeNode modifier(TreeNode expression_)
                        throws RecognitionException,
                               TokenStreamException
modifier := ( KLEENESTAR | KLEENEPLUS | OPTION | EXPONENT exponent )

A modifier to a regular expression. KLEENESTAR repeats a regular expression zero or more times, KLEENEPLUS one or more times, OPTION zero or one time, and EXPONENT a number of times determined by the exponent that follows.

Parameters:
expression_ - the TreeNode this Modifier applies to
Returns:
a TreeNode representing the modified regular expression
Throws:
RecognitionException
TokenStreamException
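
The repetition counts described for modifier can be read as (min, max) bounds. The sketch below is a hypothetical illustration: the class RepetitionSketch, the UNBOUNDED sentinel, and the use of the characters *, + and ? in place of the token types are assumptions, and the real method builds a TreeNode rather than an int pair.

```java
/**
 * Hypothetical sketch: maps a modifier symbol to the repetition bounds
 * described for KLEENESTAR, KLEENEPLUS, and OPTION.
 */
final class RepetitionSketch {
    /** Illustrative stand-in for "no upper bound". */
    static final int UNBOUNDED = Integer.MAX_VALUE;

    private RepetitionSketch() {
    }

    /** Returns {min, max} for '*', '+', or '?'. */
    static int[] bounds(char modifier) {
        switch (modifier) {
            case '*':
                return new int[] {0, UNBOUNDED}; // zero or more times
            case '+':
                return new int[] {1, UNBOUNDED}; // one or more times
            case '?':
                return new int[] {0, 1};         // zero or one time
            default:
                throw new IllegalArgumentException("Unknown modifier: " + modifier);
        }
    }

    /** True if the modifier permits exactly {@code count} repetitions. */
    static boolean allows(char modifier, int count) {
        int[] b = bounds(modifier);
        return count >= b[0] && count <= b[1];
    }
}
```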

label

public final LabelInterface label()
                           throws RecognitionException,
                                  TokenStreamException
A label in a regular expression.

Returns:
the label read in
Throws:
RecognitionException
TokenStreamException

classSet

public final TreeNode classSet()
                        throws RecognitionException,
                               TokenStreamException
classSet := ( LBRACKET labelList RBRACKET | TILDE LBRACKET labelList RBRACKET )

A ClassSet is either a choice between a list of events (the rule without the TILDE), or a choice between a list of events that can't occur (the rule with the TILDE).

Returns:
the TreeNode that represents this ClassSet
Throws:
RecognitionException
TokenStreamException
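
One way to picture the two ClassSet forms is as a set of events taken from an alphabet: the plain form keeps the listed labels, and the TILDE form keeps everything except them. The sketch below assumes that reading; ClassSetSketch, its expand method, and the use of String labels are hypothetical and are not part of this class's API.

```java
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch of ClassSet semantics over a finite alphabet of
 * String labels. Restricting both forms to the alphabet is an assumption.
 */
final class ClassSetSketch {
    private ClassSetSketch() {
    }

    /**
     * Returns the events the class set can match: the listed labels, or,
     * when negated (the TILDE form), every alphabet label not listed.
     */
    static Set<String> expand(Set<String> alphabet, Set<String> labels, boolean negated) {
        Set<String> result = new TreeSet<>(alphabet);
        if (negated) {
            result.removeAll(labels); // ~[a, b]: anything except a and b
        } else {
            result.retainAll(labels); // [a, b]: a or b
        }
        return result;
    }
}
```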

labelList

public final java.util.Set<LabelInterface> labelList()
                                              throws RecognitionException,
                                                     TokenStreamException
labelList := label (COMMA label)*

A labelList is a comma-separated list of labels.

Returns:
a Set containing the labels in the list. A label that is repeated in the list appears only once in the Set.
Throws:
RecognitionException
TokenStreamException
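
The de-duplicating behavior of the returned Set can be sketched without the parser. In the example below, LabelListSketch and its String labels are hypothetical; the real method returns a java.util.Set of LabelInterface and reads tokens rather than splitting text.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Hypothetical sketch: collects a comma-separated list of labels into a
 * Set, so a label that is repeated in the list is stored only once.
 */
final class LabelListSketch {
    private LabelListSketch() {
    }

    /** Splits on commas, trims each label, and rejects empty entries. */
    static Set<String> parse(String list) {
        Set<String> labels = new LinkedHashSet<>();
        // The -1 limit keeps trailing empty entries so "a," is rejected too.
        for (String part : list.split(",", -1)) {
            String label = part.trim();
            if (label.isEmpty()) {
                throw new IllegalArgumentException("Empty label in list: " + list);
            }
            labels.add(label); // adding a duplicate leaves the set unchanged
        }
        return labels;
    }
}
```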

exponent

public final Exponent exponent()
                        throws RecognitionException,
                               TokenStreamException
exponent := ( aPosInt | LBRACE aPosInt (DASH (aPosInt)?)? RBRACE )

An exponent in a regular expression. This can take one of three forms: a single positive integer (raise to exactly that power), a single positive integer followed by a dash (raise to a power greater than or equal to the number), or two positive integers with a dash between them (raise to a power that is greater than or equal to the first and less than or equal to the second).

Returns:
an Exponent object
Throws:
RecognitionException
TokenStreamException
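
The exponent forms can be illustrated as a (min, max) range. The sketch below is an assumption-laden stand-in, not the Exponent class this method returns: ExponentSketch, its fields, the UNBOUNDED sentinel, and the textual syntax with literal braces and a dash are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch of the exponent forms: "3", "{3}", "{3-}", "{3-5}".
 * This is not the Exponent class returned by the parser.
 */
final class ExponentSketch {
    /** Illustrative sentinel meaning "no upper bound". */
    static final int UNBOUNDED = -1;

    // Group 1: bare integer. Groups 2-4: braced form (first int, dash, second int).
    private static final Pattern FORM =
            Pattern.compile("([1-9]\\d*)|\\{([1-9]\\d*)(?:(-)([1-9]\\d*)?)?\\}");

    final int min;
    final int max;

    private ExponentSketch(int min, int max) {
        this.min = min;
        this.max = max;
    }

    /** Parses one exponent and returns its repetition range. */
    static ExponentSketch parse(String text) {
        Matcher m = FORM.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an exponent: " + text);
        }
        if (m.group(1) != null) {
            int n = Integer.parseInt(m.group(1));
            return new ExponentSketch(n, n);           // exactly n times
        }
        int lower = Integer.parseInt(m.group(2));
        if (m.group(3) == null) {
            return new ExponentSketch(lower, lower);   // braces, no dash: exactly n
        }
        if (m.group(4) == null) {
            return new ExponentSketch(lower, UNBOUNDED); // n or more times
        }
        return new ExponentSketch(lower, Integer.parseInt(m.group(4))); // n to m times
    }
}
```

The sketch does not check that the second integer is at least the first; whether the real parser enforces that is not stated here.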

aPosInt

public final int aPosInt()
                  throws RecognitionException,
                         TokenStreamException
A positive integer

Returns:
the int value of the number read in
Throws:
RecognitionException
TokenStreamException

aQuotedString

public final java.lang.String aQuotedString()
                                     throws RecognitionException,
                                            TokenStreamException
A quoted string

Returns:
the string read in
Throws:
RecognitionException
TokenStreamException