start-colorer in color:text<%>
Starts tokenizing the buffer for coloring and parenthesis matching.
(send a-color:text start-colorer token-sym-style get-token pairs) -> void
token-sym-style: (symbol? . -> . string?)
get-token: (input-port? . -> . (values any? symbol? (union false? symbol?) natural-number? natural-number?))
pairs: (listof (list/p symbol? symbol?))
token-sym-style is passed the first symbol returned by get-token (the token type, which is the second of its five values) and should return the name of the style used to color the token.
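As an illustration of the contract above, a token-sym-style procedure can be a plain mapping from type symbols to style names. The sketch below is hypothetical: the type symbols (string, number, error) and the "my-lang:..." style names are placeholders invented for the example, and a real client would use names of styles it has set up itself.

```scheme
;; Hypothetical mapping from token-type symbols to style names.
;; The symbols and the "my-lang:..." names are assumptions made
;; for this example; they are not defined by the framework.
(define (token-sym-style sym)
  (case sym
    ((string) "my-lang:string")
    ((number) "my-lang:number")
    ((error)  "my-lang:error")
    (else     "my-lang:other")))
```

Any procedure satisfying (symbol? . -> . string?) can be used; a case expression simply keeps the mapping in one place.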
get-token takes an input port and returns the next token as 5 values:
An unused value. This value is intended to represent the textual component of the token and may be used as such in the future.
A symbol describing the type of the token. This symbol is transformed into a style-name via the token-sym-style argument. The symbols 'white-space and 'comment have special meaning and should always be returned for white space and comment tokens respectively. The symbol 'no-color can be used to indicate that although the token is not white space, it should not be colored. The symbol 'eof must be used to indicate when all the tokens have been consumed.
Either a symbol indicating how the paren matcher should treat the token, or #f. The symbol should appear in the pairs argument.
The starting position of the token.
The ending position of the token.
get-token will usually be implemented with a lexer using the (lib "lex.ss" "parser-tools") library.
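A minimal sketch of such a get-token follows. It assumes a version of the parser-tools library whose lexer form provides the repetition operator, the whitespace and numeric character classes, any-char, and the (eof) pattern; older releases used a different regular-expression syntax. The type symbols 'number and 'parenthesis are hypothetical choices for this example, and the sketch does not attempt to handle strings or comments.

```scheme
(require (lib "lex.ss" "parser-tools"))

;; Each action returns the five values described above:
;; text, type symbol, paren symbol or #f, start position, end position.
(define get-token
  (lexer
   ;; A run of white space is a single 'white-space token.
   [(repetition 1 +inf.0 whitespace)
    (values lexeme 'white-space #f
            (position-offset start-pos) (position-offset end-pos))]
   ;; A run of digits is a 'number token (hypothetical type symbol).
   [(repetition 1 +inf.0 numeric)
    (values lexeme 'number #f
            (position-offset start-pos) (position-offset end-pos))]
   ;; Parens carry a paren symbol that must also appear in pairs.
   ["("
    (values lexeme 'parenthesis '|(|
            (position-offset start-pos) (position-offset end-pos))]
   [")"
    (values lexeme 'parenthesis '|)|
            (position-offset start-pos) (position-offset end-pos))]
   ;; End of input must be reported with the 'eof type symbol.
   [(eof)
    (values lexeme 'eof #f
            (position-offset start-pos) (position-offset end-pos))]
   ;; Any other single character is a token that is left uncolored,
   ;; so every position in the buffer belongs to exactly one token.
   [any-char
    (values lexeme 'no-color #f
            (position-offset start-pos) (position-offset end-pos))]))
```

The any-char fallback is a design choice for the sketch: it guarantees that the lexer never fails on unexpected input and that no position is skipped.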
get-token must obey the following invariants:
Every position in the buffer must be accounted for in exactly one token.
The token returned by get-token must rely only on the contents of the input port argument. This means that the tokenization of some part of the input cannot depend on earlier parts of the input.
No edit to the buffer can change the tokenization of the buffer prior to the token immediately preceding the edit. In the following example this invariant does not hold. If the buffer contains:
" 1 2 3
and the tokenizer treats the unmatched " as its own token (a string error token) and tokenizes 1, 2, and 3 separately, then an edit to make the buffer look like:
" 1 2 3"
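would result in a single string token, changing the tokenization of tokens that come before the one immediately preceding the edit. To handle these situations, get-token must treat the entire first line (the unmatched " together with everything after it) as a single token.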
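The rule for the unmatched quote can be sketched with a small helper that scans a line held in a string. This is an illustration of the tokenization rule only, not part of the framework; the name scan-string and the 'string and 'error type symbols are hypothetical.

```scheme
;; Given a line and the index of an opening double quote, return
;; two values: the token type and the index just past the token.
;; An unterminated string consumes the rest of the line as one
;; 'error token, so adding the closing quote later changes only
;; that one token.
(define (scan-string line start)
  (let loop ((i (+ start 1)))
    (cond
      ((>= i (string-length line)) (values 'error i))
      ((char=? (string-ref line i) #\") (values 'string (+ i 1)))
      (else (loop (+ i 1))))))
```

With this rule both versions of the buffer in the example produce exactly one token starting at the quote, so the edit that adds the closing quote does not disturb any earlier token.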
pairs is a list of the different kinds of matching parens. The third value returned by get-token is compared against this list to determine how the paren matcher should treat the token. For example, suppose pairs is
'((|(| |)|) (|[| |]|) (begin end))
This means that there are three kinds of parens. Any token that has 'begin as its third return value acts as an open for matching tokens with 'end. Similarly, any token with '|]| acts as a closing match for tokens with '|[|. When trying to correct a mismatched closing parenthesis, each closing symbol in pairs is converted to a string and tried as a closing parenthesis.
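The way a pairs list classifies paren symbols can be sketched as follows. The helpers paren-role and closing-strings are hypothetical names written for this illustration; they mirror the description above and are not the framework's own implementation.

```scheme
(define pairs '((|(| |)|) (|[| |]|) (begin end)))

;; Classify a paren symbol against a pairs list:
;; 'open if it is the first element of some pair,
;; 'close if it is the second element of some pair, otherwise #f.
(define (paren-role sym pairs)
  (cond
    ((assq sym pairs) 'open)
    ((memq sym (map cadr pairs)) 'close)
    (else #f)))

;; The closing symbols converted to strings, as candidates to try
;; when correcting a mismatched closing parenthesis.
(define (closing-strings pairs)
  (map (lambda (pair) (symbol->string (cadr pair))) pairs))
```

For the pairs value above, (paren-role 'begin pairs) is 'open, (paren-role '|]| pairs) is 'close, and (closing-strings pairs) is the list (")" "]" "end").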