core/ZTokenizer.pas File Reference

Classes

class  IZTokenizer
 A tokenizer divides a string into tokens.
class  TZCCommentState
 This state will either delegate to a comment-handling state, or return a token with just a slash in it.
class  TZCommentState
 A CommentState object returns a comment from a reader.
class  TZCppCommentState
 This state will either delegate to a comment-handling state, or return a token with just a slash in it.
class  TZNumberState
 A NumberState object returns a number from a reader.
class  TZQuoteState
 A QuoteState returns a quoted string token from a reader.
class  TZSymbolRootNode
 This class is a special case of a SymbolNode.
class  TZSymbolState
 The idea of a symbol is a character that stands on its own, such as an ampersand or a parenthesis.
struct  TZToken
 A token represents a logical chunk of a string.
class  TZTokenizer
 Implements a default tokenizer object.
class  TZTokenizerState
 A TokenizerState returns a token, given a reader, an initial character read from the reader, and a tokenizer that is conducting an overall tokenization of the reader.
class  TZWhitespaceState
 A whitespace state ignores whitespace (such as blanks and tabs), and returns the tokenizer's next token.
class  TZWordState
 A WordState returns a word from a reader.

Typedefs

typedef array< TZToken > TZTokenDynArray
 Defines a dynamic array of tokens.
typedef set< TZTokenOption > TZTokenOptions

Enumerations

enum  TZTokenOption {
  toSkipUnknown, toSkipWhitespaces, toSkipComments, toSkipEOF,
  toUnifyWhitespaces, toUnifyNumbers, toDecodeStrings
}
 Defines options for tokenizing strings.
enum  TZTokenType {
  ttUnknown, ttEOF, ttFloat, ttInteger,
  ttHexDecimal, ttNumber, ttSymbol, ttQuoted,
  ttQuotedIdentifier, ttWord, ttKeyword, ttWhitespace,
  ttComment, ttSpecial
}
 Values of this enumeration represent a type of token, such as "number", "symbol" or "word".

Functions

protected AddDescendantLine (const string Value)
string Ancestry ()
public Create (TZSymbolNode Parent, Char Character)
TZSymbolNode DeepestRead (TStream Stream)
 Destroy ()
TZSymbolNode EnsureChildWithChar (Char Value)
TZSymbolNode FindChildWithChar (Char Value)
TZSymbolNode FindDescendant (const string Value)
TZSymbolNode UnreadToValid (TStream Stream)

Variables

FChildren
FValid
FParent
ReadNum
Char Character
TZSymbolNodeArray Children
if FirstChar = '/' then
TZSymbolNode Parent
function TZCCommentState.NextToken (Stream: TStream; FirstChar: Char; Tokenizer: TZTokenizer) var ReadChar
 Returns either a token with just a slash in it, or the results of delegating to a comment-handling state.
begin Result.TokenType
TZSymbolNode = class
 Fix for C++ Builder hpp generation bug - #817612.
 TZSymbolNodeArray = array of TZSymbolNode
Boolean Valid
Result.Value


Typedef Documentation

typedef array<TZToken> TZTokenDynArray

Defines a dynamic array of tokens.

Definition at line 122 of file ZTokenizer.pas.

typedef set<TZTokenOption> TZTokenOptions

Definition at line 101 of file ZTokenizer.pas.


Enumeration Type Documentation

enum TZTokenOption

Defines options for tokenizing strings.

Enumerator:
toSkipUnknown 
toSkipWhitespaces 
toSkipComments 
toSkipEOF 
toUnifyWhitespaces 
toUnifyNumbers 
toDecodeStrings 

Definition at line 90 of file ZTokenizer.pas.

enum TZTokenType

Values of this enumeration represent a type of token, such as "number", "symbol" or "word".

Enumerator:
ttUnknown 
ttEOF 
ttFloat 
ttInteger 
ttHexDecimal 
ttNumber 
ttSymbol 
ttQuoted 
ttQuotedIdentifier 
ttWord 
ttKeyword 
ttWhitespace 
ttComment 
ttSpecial 

Definition at line 68 of file ZTokenizer.pas.


Function Documentation

protected AddDescendantLine ( const string  Value  ) 

string Ancestry (  ) 

public Create ( TZSymbolNode  Parent, Char  Character )

TZSymbolNode DeepestRead ( TStream  Stream  ) 

Destroy (  ) 

TZSymbolNode EnsureChildWithChar ( Char  Value  ) 

TZSymbolNode FindChildWithChar ( Char  Value  ) 

TZSymbolNode FindDescendant ( const string  Value  ) 

TZSymbolNode UnreadToValid ( TStream  Stream  ) 


Variable Documentation

FChildren

Definition at line 315 of file ZTokenizer.pas.

FValid

Definition at line 316 of file ZTokenizer.pas.

FParent

Definition at line 317 of file ZTokenizer.pas.

ReadNum

Definition at line 1000 of file ZTokenizer.pas.

Char Character

See also:
FCharacter For reading

FCharacter For writing

Definition at line 348 of file ZTokenizer.pas.

TZSymbolNodeArray Children

See also:
FChildren For reading

FChildren For writing

Definition at line 345 of file ZTokenizer.pas.

if FirstChar = '/' then
  begin
    ReadNum := Stream.Read(ReadChar, 1)

Definition at line 1005 of file ZTokenizer.pas.

TZSymbolNode Parent

See also:
FParent For reading

FParent For writing

Definition at line 354 of file ZTokenizer.pas.

function TZCCommentState.NextToken (Stream: TStream; FirstChar: Char; Tokenizer: TZTokenizer) var ReadChar

Returns either a token with just a slash in it, or the results of delegating to a comment-handling state.

Definition at line 996 of file ZTokenizer.pas.

begin Result.TokenType

Definition at line 1002 of file ZTokenizer.pas.
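
The slash handling described for TZCCommentState (return a lone slash token, or delegate to a comment-handling state) can be illustrated with a small sketch. It is written in Python rather than Pascal and is not the ZeosLib implementation: the function name next_token_after_slash, the token labels "comment" and "symbol", and the handling of an unterminated comment are all assumptions made for illustration.

```python
import io

def next_token_after_slash(stream):
    """Hypothetical helper: called after a '/' has already been consumed.

    Returns a (token_type, text) pair. Only C-style /* ... */ comments
    are recognised in this sketch.
    """
    start = stream.tell()
    ch = stream.read(1)
    if ch == "*":
        # Delegate to comment handling: consume through the closing "*/".
        text = "/*"
        while True:
            ch = stream.read(1)
            if ch == "":
                break  # assumption: an unterminated comment ends at end of input
            text += ch
            # len >= 4 prevents "/*/" from counting as a closed comment
            if len(text) >= 4 and text.endswith("*/"):
                break
        return ("comment", text)
    # Not a comment: push the lookahead character back, return just the slash.
    stream.seek(start)
    return ("symbol", "/")

print(next_token_after_slash(io.StringIO("* note */ x")))
print(next_token_after_slash(io.StringIO("2")))
```

In both calls the stream is positioned just after the leading slash, mirroring the way NextToken receives the already-read FirstChar.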

TZSymbolNode = class

Fix for C++ Builder hpp generation bug - #817612.

A SymbolNode object is a member of a tree that contains all possible prefixes of allowable symbols.

Multi-character symbols appear in a SymbolNode tree with one node for each character.

For example, the symbol =:~ will appear in a tree as three nodes. The first node contains an equals sign, and has a child; that child contains a colon and has a child; this third child contains a tilde, and has no children of its own. If the colon node had another child for a dollar sign character, then the tree would contain the symbol =:$.
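
The tree shape in this example can be sketched as follows. This is an illustrative Python model, not the Pascal class: the names SymbolNode, ensure_child, add, and ancestry loosely mirror EnsureChildWithChar and Ancestry from the member list, and the dictionary of children is an assumption (the unit declares an array type, TZSymbolNodeArray).

```python
class SymbolNode:
    """One node per character; the root carries no character of its own."""

    def __init__(self, parent=None, character=""):
        self.parent = parent
        self.character = character
        self.children = {}   # assumption: keyed by character for brevity
        self.valid = False   # True when the path to this node is a complete symbol

    def ensure_child(self, ch):
        # Return the child for ch, creating it if it does not exist yet.
        if ch not in self.children:
            self.children[ch] = SymbolNode(self, ch)
        return self.children[ch]

    def add(self, symbol):
        # Walk (and extend) the tree one character at a time.
        node = self
        for ch in symbol:
            node = node.ensure_child(ch)
        node.valid = True

    def ancestry(self):
        # Concatenate the characters on the path from the root to this node.
        chars, node = [], self
        while node is not None:
            chars.append(node.character)
            node = node.parent
        return "".join(reversed(chars))

root = SymbolNode()
root.add("=:~")
root.add("=:$")
colon = root.children["="].children[":"]
print(sorted(colon.children))            # ['$', '~']
print(colon.children["~"].ancestry())    # =:~
```

Adding =:$ reuses the existing equals and colon nodes and only creates one new child under the colon node.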

The SymbolNode objects in a tree collaborate to read a (potentially multi-character) symbol from an input stream. A root node with no character of its own finds an initial node that represents the first character in the input. This node looks to see if the next character in the stream matches one of its children. If so, the node delegates its reading task to its child. This approach walks down the tree, pulling symbols from the input that match the path down the tree.

When a node does not have a child that matches the next character, we will have read the longest possible symbol prefix. This prefix may or may not be a valid symbol. Consider a tree that has had =:~ added and has not had =: added. In this tree, of the three nodes that contain =:~, only the first and third contain complete symbols. If, say, the input contains =:a, the colon node will not have a child that matches the 'a' and so it will stop reading. The colon node has to "unread": it must push back its character, and ask its parent to unread. Unreading continues until it reaches an ancestor that represents a valid symbol.
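
The read-then-unread behaviour described above can be sketched like this. The sketch is in Python and is only a model of the idea: deepest_read and unread_to_valid echo the DeepestRead and UnreadToValid members listed earlier, but their bodies, the next_symbol helper, and the explicit registration of = as a symbol are assumptions rather than the Pascal implementation.

```python
import io

class SymbolNode:
    def __init__(self, parent=None, character=""):
        self.parent = parent
        self.character = character
        self.children = {}
        self.valid = False

    def add(self, symbol):
        node = self
        for ch in symbol:
            if ch not in node.children:
                node.children[ch] = SymbolNode(node, ch)
            node = node.children[ch]
        node.valid = True

    def deepest_read(self, stream):
        # Walk down the tree while the next input character matches a child.
        node = self
        while True:
            pos = stream.tell()
            ch = stream.read(1)
            child = node.children.get(ch) if ch else None
            if child is None:
                stream.seek(pos)   # the character matched no child: put it back
                return node
            node = child

    def unread_to_valid(self, stream):
        # Push characters back until an ancestor is a complete symbol.
        node = self
        while not node.valid and node.parent is not None:
            stream.seek(stream.tell() - 1)
            node = node.parent
        return node

    def ancestry(self):
        chars, node = [], self
        while node is not None:
            chars.append(node.character)
            node = node.parent
        return "".join(reversed(chars))

def next_symbol(root, stream):
    # Hypothetical helper combining the two steps.
    return root.deepest_read(stream).unread_to_valid(stream).ancestry()

root = SymbolNode()
root.add("=")      # assumption: the single character is itself a symbol
root.add("=:~")    # "=:" is deliberately not added

stream = io.StringIO("=:a")
print(next_symbol(root, stream))   # =
print(stream.read())               # :a
```

With the input =:a the walk stops at the colon node, which is not a complete symbol, so the colon is pushed back and the equals sign alone is returned.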

Definition at line 272 of file ZTokenizer.pas.

TZSymbolNodeArray = array of TZSymbolNode

Definition at line 273 of file ZTokenizer.pas.

Boolean Valid

See also:
FValid For reading

FValid For writing

Definition at line 351 of file ZTokenizer.pas.

Result.Value

Definition at line 1003 of file ZTokenizer.pas.


Generated on Wed Dec 30 08:42:41 2009 for zeoslib by  doxygen 1.5.7.1