Date: Fri, 10 May 2013 19:44:03 +0000 (UTC)
From: Jung-uk Kim <jkim@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-projects@freebsd.org
Subject: svn commit: r250476 - projects/flex-sf/usr.bin/lex
Message-ID: <201305101944.r4AJi3Dx021866@svn.freebsd.org>
Author: jkim
Date: Fri May 10 19:44:02 2013
New Revision: 250476
URL: http://svnweb.freebsd.org/changeset/base/250476

Log:
  Adopt some useful changes from NetBSD, e.g.,
  http://mail-index.netbsd.org/source-changes/2009/10/26/msg002361.html

Modified:
  projects/flex-sf/usr.bin/lex/lex.1

Modified: projects/flex-sf/usr.bin/lex/lex.1
==============================================================================
--- projects/flex-sf/usr.bin/lex/lex.1 Fri May 10 19:29:30 2013 (r250475)
+++ projects/flex-sf/usr.bin/lex/lex.1 Fri May 10 19:44:02 2013 (r250476)
@@ -1,8 +1,8 @@
 .\" $FreeBSD$
 .\"
-.TH FLEX 1 "April 1995" "Version 2.5"
+.TH FLEX 1 "May 1, 2013" "Version 2.5.37"
 .SH NAME
-flex \- fast lexical analyzer generator
+flex, lex \- fast lexical analyzer generator
 .SH SYNOPSIS
 .B flex
 .B [\-bcdfhilnpstvwBFILTV78+? \-C[aefFmr] \-ooutput \-Pprefix \-Sskeleton]
@@ -11,8 +11,8 @@ flex \- fast lexical analyzer generator
 .SH OVERVIEW
 This manual describes
 .I flex,
-a tool for generating programs that perform pattern-matching on text. The
-manual includes both tutorial and reference sections:
+a tool for generating programs that perform pattern-matching on text.
+The manual includes both tutorial and reference sections:
 .nf
 
     Description
@@ -95,19 +95,22 @@ programs which recognize lexical pattern
 .I flex
 reads
 the given input files, or its standard input if no file names are given,
-for a description of a scanner to generate. The description is in
-the form of pairs
+for a description of a scanner to generate.
+The description is in the form of pairs
 of regular expressions and C code, called
-.I rules. flex
+.I rules.
+.I flex
 generates as output a C source file,
 .B lex.yy.c,
 which defines a routine
 .B yylex().
 This file is compiled and linked with the
 .B \-ll
-library to produce an executable. When the executable is run,
+library to produce an executable.
+When the executable is run,
 it analyzes its input for occurrences
-of the regular expressions.
Whenever it finds one, it executes +of the regular expressions. +Whenever it finds one, it executes the corresponding C code. .SH SOME SIMPLE EXAMPLES First some simple examples to get the flavor of how one uses @@ -128,7 +131,8 @@ scanner is copied to the output, so the net effect of this scanner is to copy its input file to its output with each occurrence of "username" expanded. -In this input, there is just one rule. "username" is the +In this input, there is just one rule. +"username" is the .I pattern and the "printf" is the .I action. @@ -156,13 +160,15 @@ Here's another simple example: .fi This scanner counts the number of characters and the number of lines in its input (it produces no output other than the -final report on the counts). The first line +final report on the counts). +The first line declares two globals, "num_lines" and "num_chars", which are accessible both inside .B yylex() and in the .B main() -routine declared after the second "%%". There are two rules, one +routine declared after the second "%%". +There are two rules, one which matches a newline ("\\n") and increments both the line count and the character count, and one which matches any character other than a newline (indicated by the "." regular expression). @@ -223,7 +229,8 @@ A somewhat more complicated example: .fi This is the beginnings of a simple scanner for a language like -Pascal. It identifies different types of +Pascal. +It identifies different types of .I tokens and reports on what it has seen. .PP @@ -263,7 +270,8 @@ followed by zero or more letters, digits The definition is taken to begin at the first non-white-space character following the name and continuing to the end of the line. The definition can subsequently be referred to using "{name}", which -will expand to "(definition)". For example, +will expand to "(definition)". +For example, .nf DIGIT [0-9] @@ -308,7 +316,8 @@ Finally, the user code section is simply .B lex.yy.c verbatim. 
It is used for companion routines which call or are called -by the scanner. The presence of this section is optional; +by the scanner. +The presence of this section is optional; if it is missing, the second .B %% in the input file may be skipped, too. @@ -339,7 +348,8 @@ beginning with "/*") is also copied verb to the next "*/". .SH PATTERNS The patterns in the input are written using an extended set of regular -expressions. These are: +expressions. +These are: .nf x match the character 'x' @@ -425,7 +435,8 @@ operators, '-', ']', and, at the beginni .PP The regular expressions listed above are grouped according to precedence, from highest precedence at the top to lowest at the bottom. -Those grouped together have equal precedence. For example, +Those grouped together have equal precedence. +For example, .nf foo|bar* @@ -438,7 +449,8 @@ is the same as .fi since the '*' operator has higher precedence than concatenation, -and concatenation higher than alternation ('|'). This pattern +and concatenation higher than alternation ('|'). +This pattern therefore matches .I either the string "foo" @@ -478,7 +490,8 @@ The valid expressions are: These expressions all designate a set of characters equivalent to the corresponding standard C .B isXXX -function. For example, +function. +For example, .B [:alnum:] designates those characters for which .B isalnum() @@ -514,16 +527,19 @@ above .I will match a newline unless "\\n" (or an equivalent escape sequence) is one of the characters explicitly present in the negated character class -(e.g., "[^A-Z\\n]"). This is unlike how many other regular +(e.g., "[^A-Z\\n]"). +This is unlike how many other regular expression tools treat negated character classes, but unfortunately the inconsistency is historically entrenched. Matching newlines means that a pattern like [^"]* can match the entire input unless there's another quote in the input. 
.IP - A rule can have at most one instance of trailing context (the '/' operator -or the '$' operator). The start condition, '^', and "<<EOF>>" patterns +or the '$' operator). +The start condition, '^', and "<<EOF>>" patterns can only occur at the beginning of a pattern, and, as well as with '/' and '$', -cannot be grouped inside parentheses. A '^' which does not occur at +cannot be grouped inside parentheses. +A '^' which does not occur at the beginning of a rule or a '$' which does not occur at the end of a rule loses its special properties and is treated as a normal character. .IP @@ -555,10 +571,12 @@ A similar trick will work for matching a bar-at-the-beginning-of-a-line. .SH HOW THE INPUT IS MATCHED When the generated scanner is run, it analyzes its input looking -for strings which match any of its patterns. If it finds more than +for strings which match any of its patterns. +If it finds more than one match, it takes the one matching the most text (for trailing context rules, this includes the length of the trailing part, even -though it will then be returned to the input). If it finds two +though it will then be returned to the input). +If it finds two or more matches of the same length, the rule listed first in the .I flex @@ -580,7 +598,8 @@ input is scanned for another match. If no match is found, then the .I default rule is executed: the next character in the input is considered matched and -copied to the standard output. Thus, the simplest legal +copied to the standard output. +Thus, the simplest legal .I flex input is: .nf @@ -603,7 +622,8 @@ uses by including one of the special dir .B %pointer or .B %array -in the first (definitions) section of your flex input. The default is +in the first (definitions) section of your flex input. +The default is .B %pointer, unless you use the .B -l @@ -613,7 +633,8 @@ will be an array. 
The advantage of using .B %pointer is substantially faster scanning and no buffer overflow when matching -very large tokens (unless you run out of dynamic memory). The disadvantage +very large tokens (unless you run out of dynamic memory). +The disadvantage is that you are restricted in how your actions can modify .B yytext (see the next section), and calls to the @@ -632,7 +653,8 @@ to your heart's content, and calls to .B unput() do not destroy .B yytext -(see below). Furthermore, existing +(see below). +Furthermore, existing .I lex programs sometimes access .B yytext @@ -650,14 +672,17 @@ defines .B yytext to be an array of .B YYLMAX -characters, which defaults to a fairly large value. You can change +characters, which defaults to a fairly large value. +You can change the size by simply #define'ing .B YYLMAX to a different value in the first section of your .I flex -input. As mentioned above, with +input. +As mentioned above, with .B %pointer -yytext grows dynamically to accommodate large tokens. While this means your +yytext grows dynamically to accommodate large tokens. +While this means your .B %pointer scanner can accommodate very large tokens (such as matching entire blocks of comments), bear in mind that each time the scanner must resize @@ -679,10 +704,13 @@ with C++ scanner classes option; see below). .SH ACTIONS Each pattern in a rule has a corresponding action, which can be any -arbitrary C statement. The pattern ends at the first non-escaped -whitespace character; the remainder of the line is its action. If the +arbitrary C statement. +The pattern ends at the first non-escaped +whitespace character; the remainder of the line is its action. +If the action is empty, then when the pattern is matched the input token -is simply discarded. For example, here is the specification for a program +is simply discarded. 
+For example, here is the specification for a program which deletes all occurrences of "zap me" from its input: .nf @@ -730,7 +758,8 @@ Actions are free to modify .B yytext except for lengthening it (adding characters to its end--these will overwrite later characters in the -input stream). This however does not apply when using +input stream). +This however does not apply when using .B %array (see above); in that case, .B yytext @@ -754,7 +783,8 @@ corresponding start condition (see below .IP - .B REJECT directs the scanner to proceed on to the "second best" rule which matched the -input (or a prefix of the input). The rule is chosen as described +input (or a prefix of the input). +The rule is chosen as described above in "How the Input is Matched", and .B yytext and @@ -782,7 +812,8 @@ scanner normally executes only one actio Multiple .B REJECT's are allowed, each one finding the next best choice to the currently -active rule. For example, when the following scanner scans the token +active rule. +For example, when the following scanner scans the token "abcd", it will write "abcdabcaba" to the output: .nf @@ -802,7 +833,8 @@ if it is used in .I any of the scanner's actions it will slow down .I all -of the scanner's matching. Furthermore, +of the scanner's matching. +Furthermore, .B REJECT cannot be used with the .I -Cf @@ -824,7 +856,8 @@ token should be .I appended onto the current value of .B yytext -rather than replacing it. For example, given the input "mega-kludge" +rather than replacing it. +For example, given the input "mega-kludge" the following will write "mega-mega-kludge" to the output: .nf @@ -833,7 +866,8 @@ the following will write "mega-mega-klud kludge ECHO; .fi -First "mega-" is matched and echoed to the output. Then "kludge" +First "mega-" is matched and echoed to the output. 
+Then "kludge" is matched, but the previous "mega-" is still hanging around at the beginning of .B yytext @@ -869,7 +903,8 @@ are adjusted appropriately (e.g., .B yyleng will now be equal to .I n -). For example, on the input "foobar" the following will write out +). +For example, on the input "foobar" the following will write out "foobarbar": .nf @@ -880,7 +915,8 @@ will now be equal to .fi An argument of 0 to .B yyless -will cause the entire current input string to be scanned again. Unless you've +will cause the entire current input string to be scanned again. +Unless you've changed how the scanner will subsequently process its input (using .B BEGIN, for example), this will result in an endless loop. @@ -893,7 +929,8 @@ other source files. .B unput(c) puts the character .I c -back onto the input stream. It will be the next character scanned. +back onto the input stream. +It will be the next character scanned. The following action will take the current token and cause it to be rescanned enclosed in parentheses. .nf @@ -926,7 +963,8 @@ is that if you are using the contents of .I yytext, starting with its rightmost character and devouring one character to -the left with each call. If you need the value of yytext preserved +the left with each call. +If you need the value of yytext preserved after a call to .B unput() (as in the above example), @@ -939,7 +977,8 @@ Finally, note that you cannot put back to attempt to mark the input stream with an end-of-file. .IP - .B input() -reads the next character from the input stream. For example, +reads the next character from the input stream. +For example, the following is one way to eat up C comments: .nf @@ -986,18 +1025,20 @@ flushes the scanner's internal buffer so that the next time the scanner attempts to match a token, it will first refill the buffer using .B YY_INPUT -(see The Generated Scanner, below). This action is a special case +(see The Generated Scanner, below). 
+This action is a special case of the more general .B yy_flush_buffer() function, described below in the section Multiple Input Buffers. .IP - .B yyterminate() -can be used in lieu of a return statement in an action. It terminates +can be used in lieu of a return statement in an action. +It terminates the scanner and returns a 0 to the scanner's caller, indicating "all done". By default, .B yyterminate() -is also called when an end-of-file is encountered. It is a macro and -may be redefined. +is also called when an end-of-file is encountered. +It is a macro and may be redefined. .SH THE GENERATED SCANNER The output of .I flex @@ -1006,7 +1047,8 @@ is the file which contains the scanning routine .B yylex(), a number of tables used by it for matching tokens, and a number -of auxiliary routines and macros. By default, +of auxiliary routines and macros. +By default, .B yylex() is declared as follows: .nf @@ -1019,7 +1061,8 @@ is declared as follows: .fi (If your environment supports function prototypes, then it will be "int yylex( void )".) This definition may be changed by defining -the "YY_DECL" macro. For example, you could use: +the "YY_DECL" macro. +For example, you could use: .nf #define YY_DECL float lexscan( a, b ) float a, b; @@ -1027,7 +1070,8 @@ the "YY_DECL" macro. For example, you c .fi to give the scanning routine the name .I lexscan, -returning a float, and taking two floats as arguments. Note that +returning a float, and taking two floats as arguments. +Note that if you give arguments to the scanning routine using a K&R-style/non-prototyped function declaration, you must terminate the definition with a semi-colon (;). @@ -1036,7 +1080,8 @@ Whenever .B yylex() is called, it scans tokens from the global input file .I yyin -(which defaults to stdin). It continues until it either reaches +(which defaults to stdin). 
+It continues until it either reaches an end-of-file (at which point it returns the value 0) or one of its actions executes a .I return @@ -1058,7 +1103,8 @@ to scan from a source other than .I yyin), and initializes .I yyin -for scanning from that file. Essentially there is no difference between +for scanning from that file. +Essentially there is no difference between just assigning .I yyin to a new input file or using @@ -1096,8 +1142,8 @@ calls to read characters from The nature of how it gets its input can be controlled by defining the .B YY_INPUT macro. -YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)". Its -action is to place up to +YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)". +Its action is to place up to .I max_size characters in the character array .I buf @@ -1105,7 +1151,8 @@ and return in the integer variable .I result either the number of characters read or the constant YY_NULL (0 on Unix systems) -to indicate EOF. The default YY_INPUT reads from the +to indicate EOF. +The default YY_INPUT reads from the global file-pointer "yyin". .PP A sample definition of YY_INPUT (in the definitions @@ -1127,14 +1174,17 @@ one character at a time. When the scanner receives an end-of-file indication from YY_INPUT, it then checks the .B yywrap() -function. If +function. +If .B yywrap() returns false (zero), then it is assumed that the function has gone ahead and set up .I yyin -to point to another input file, and scanning continues. If it returns +to point to another input file, and scanning continues. +If it returns true (non-zero), then the scanner terminates, returning 0 to its -caller. Note that in either case, the start condition remains unchanged; +caller. +Note that in either case, the start condition remains unchanged; it does .I not revert to @@ -1167,9 +1217,11 @@ by assigning it to some other pointer. .SH START CONDITIONS .I flex -provides a mechanism for conditionally activating rules. 
Any rule +provides a mechanism for conditionally activating rules. +Any rule whose pattern is prefixed with "<sc>" will only be active when -the scanner is in the start condition named "sc". For example, +the scanner is in the start condition named "sc". +For example, .nf <STRING>[^"]* { /* eat up the string body ... */ @@ -1200,9 +1252,11 @@ The former declares .I inclusive start conditions, the latter .I exclusive -start conditions. A start condition is activated using the +start conditions. +A start condition is activated using the .B BEGIN -action. Until the next +action. +Until the next .B BEGIN action is executed, rules with the given start condition will be active and @@ -1218,14 +1272,16 @@ rules qualified with the start condition A set of rules contingent on the same exclusive start condition describe a scanner which is independent of any of the other rules in the .I flex -input. Because of this, +input. +Because of this, exclusive start conditions make it easy to specify "mini-scanners" which scan portions of the input that are syntactically different from the rest (e.g., comments). .PP If the distinction between inclusive and exclusive start conditions is still a little vague, here's a simple example illustrating the -connection between the two. The set of rules: +connection between the two. +The set of rules: .nf %s example @@ -1272,8 +1328,8 @@ start condition. .PP Also note that the special start-condition specifier .B <*> -matches every start condition. Thus, the above example could also -have been written; +matches every start condition. +Thus, the above example could also have been written; .nf %x example @@ -1287,7 +1343,8 @@ have been written; .PP The default rule (to .B ECHO -any unmatched character) remains active in start conditions. It +any unmatched character) remains active in start conditions. 
+It is equivalent to: .nf @@ -1297,7 +1354,8 @@ is equivalent to: .PP .B BEGIN(0) returns to the original state where only the rules with -no start conditions are active. This state can also be +no start conditions are active. +This state can also be referred to as the start-condition "INITIAL", so .B BEGIN(INITIAL) is equivalent to @@ -1307,7 +1365,8 @@ are considered good style.) .PP .B BEGIN actions can also be given as indented code at the beginning -of the rules section. For example, the following will cause +of the rules section. +For example, the following will cause the scanner to enter the "SPECIAL" start condition whenever .B yylex() is called and the global variable @@ -1329,7 +1388,8 @@ is true: .PP To illustrate the uses of start conditions, here is a scanner which provides two different interpretations -of a string like "123.456". By default it will treat it as +of a string like "123.456". +By default it will treat it as three tokens, the integer "123", a dot ('.'), and the integer "456". But if the string is preceded earlier in the line by the string "expect-floats" @@ -1383,12 +1443,14 @@ maintaining a count of the current input .fi This scanner goes to a bit of trouble to match as much -text as possible with each rule. In general, when attempting to write +text as possible with each rule. +In general, when attempting to write a high-speed scanner try to match as much possible in each rule, as it's a big win. .PP Note that start-conditions names are really integer values and -can be stored as such. Thus, the above could be extended in the +can be stored as such. +Thus, the above could be extended in the following fashion: .nf @@ -1418,7 +1480,8 @@ following fashion: Furthermore, you can access the current start condition using the integer-valued .B YY_START -macro. For example, the above assignments to +macro. 
+For example, the above assignments to .I comment_caller could instead be written .nf @@ -1499,8 +1562,8 @@ not including checking for a string that .fi .PP Often, such as in some of the examples above, you wind up writing a -whole bunch of rules all preceded by the same start condition(s). Flex -makes this a little easier and cleaner by introducing a notion of +whole bunch of rules all preceded by the same start condition(s). +Flex makes this a little easier and cleaner by introducing a notion of start condition .I scope. A start condition scope is begun with: @@ -1511,7 +1574,8 @@ A start condition scope is begun with: .fi where .I SCs -is a list of one or more start conditions. Inside the start condition +is a list of one or more start conditions. +Inside the start condition scope, every rule automatically has the prefix .I <SCs> applied to it, until a @@ -1558,14 +1622,16 @@ pops the top of the stack and switches t returns the top of the stack without altering the stack's contents. .PP The start condition stack grows dynamically and so has no built-in -size limitation. If memory is exhausted, program execution aborts. +size limitation. +If memory is exhausted, program execution aborts. .PP To use start condition stacks, your scanner must include a .B %option stack directive (see Options below). .SH MULTIPLE INPUT BUFFERS Some scanners (such as those which support "include" files) -require reading from several input streams. As +require reading from several input streams. +As .I flex scanners do a large amount of buffering, one cannot control where the next input will be read from by simply writing a @@ -1579,7 +1645,8 @@ which requires switching the input sourc To negotiate these sorts of problems, .I flex provides a mechanism for creating and switching between multiple -input buffers. An input buffer is created by using: +input buffers. 
+An input buffer is created by using: .nf YY_BUFFER_STATE yy_create_buffer( FILE *file, int size ) @@ -1592,9 +1659,11 @@ file and large enough to hold .I size characters (when in doubt, use .B YY_BUF_SIZE -for the size). It returns a +for the size). +It returns a .B YY_BUFFER_STATE -handle, which may then be passed to other routines (see below). The +handle, which may then be passed to other routines (see below). +The .B YY_BUFFER_STATE type is a pointer to an opaque .B struct yy_buffer_state @@ -1602,7 +1671,8 @@ structure, so you may safely initialize .B ((YY_BUFFER_STATE) 0) if you wish, and also refer to the opaque structure in order to correctly declare input buffers in source files other than that -of your scanner. Note that the +of your scanner. +Note that the .I FILE pointer in the call to .B yy_create_buffer @@ -1632,7 +1702,8 @@ Note that may be used by yywrap() to set things up for continued scanning, instead of opening a new file and pointing .I yyin -at it. Note also that switching input sources via either +at it. +Note also that switching input sources via either .B yy_switch_to_buffer() or .B yywrap() @@ -1644,7 +1715,8 @@ change the start condition. void yy_delete_buffer( YY_BUFFER_STATE buffer ) .fi -is used to reclaim the storage associated with a buffer. ( +is used to reclaim the storage associated with a buffer. +( .B buffer can be nil, in which case the routine does nothing.) You can also clear the current contents of a buffer using: @@ -1734,12 +1806,14 @@ feature is discussed below): .fi Three routines are available for setting up input buffers for -scanning in-memory strings instead of files. All of them create +scanning in-memory strings instead of files. +All of them create a new input buffer for scanning the string, and return a corresponding .B YY_BUFFER_STATE handle (which you should delete with .B yy_delete_buffer() -when done with it). They also switch to the new buffer using +when done with it). 
+They also switch to the new buffer using .B yy_switch_to_buffer(), so the next call to .B yylex() @@ -1757,7 +1831,8 @@ starting at location .PP Note that both of these functions create and scan a .I copy -of the string or bytes. (This may be desirable, since +of the string or bytes. +(This may be desirable, since .B yylex() modifies the contents of the buffer it is scanning.) You can avoid the copy by using: @@ -1795,7 +1870,8 @@ reflecting the size of the buffer. The special rule "<<EOF>>" indicates actions which are to be taken when an end-of-file is encountered and yywrap() returns non-zero (i.e., indicates -no further files to process). The action must finish +no further files to process). +The action must finish by doing one of four things: .IP - assigning @@ -1819,10 +1895,12 @@ as shown in the example above. .PP <<EOF>> rules may not be used with other patterns; they may only be qualified with a list of start -conditions. If an unqualified <<EOF>> rule is given, it +conditions. +If an unqualified <<EOF>> rule is given, it applies to .I all -start conditions which do not already have <<EOF>> actions. To +start conditions which do not already have <<EOF>> actions. +To specify an <<EOF>> rule for only the initial start condition, use .nf @@ -1855,15 +1933,16 @@ An example: The macro .B YY_USER_ACTION can be defined to provide an action -which is always executed prior to the matched rule's action. For example, +which is always executed prior to the matched rule's action. +For example, it could be #define'd to call a routine to convert yytext to lower-case. When .B YY_USER_ACTION is invoked, the variable .I yy_act gives the number of the matched rule (rules are numbered starting with 1). -Suppose you want to profile how often each of your rules is matched. The -following would do the trick: +Suppose you want to profile how often each of your rules is matched. 
+The following would do the trick: .nf #define YY_USER_ACTION ++ctr[yy_act] @@ -1871,8 +1950,8 @@ following would do the trick: .fi where .I ctr -is an array to hold the counts for the different rules. Note that -the macro +is an array to hold the counts for the different rules. +Note that the macro .B YY_NUM_RULES gives the total number of rules (including the default rule, even if you use @@ -1902,9 +1981,11 @@ but must be used when the scanner's inpu interactive to avoid problems due to waiting to fill buffers (see the discussion of the .B \-I -flag below). A non-zero value +flag below). +A non-zero value in the macro invocation marks the buffer as interactive, a zero -value as non-interactive. Note that use of this macro overrides +value as non-interactive. +Note that use of this macro overrides .B %option interactive , .B %option always-interactive or @@ -1918,8 +1999,9 @@ The macro .B yy_set_bol(at_bol) can be used to control whether the current buffer's scanning context for the next token match is done as though at the -beginning of a line. A non-zero macro argument makes rules anchored with -\&'^' active, while a zero argument makes '^' rules inactive. +beginning of a line. +A non-zero macro argument makes rules anchored with +'^' active, while a zero argument makes '^' rules inactive. .PP The macro .B YY_AT_BOL() @@ -1929,7 +2011,8 @@ will have '^' rules active, false otherw In the generated scanner, the actions are all gathered in one large switch statement and separated using .B YY_BREAK, -which may be redefined. By default, it is simply a "break", to separate +which may be redefined. +By default, it is simply a "break", to separate each rule's action from the following rule's. Redefining .B YY_BREAK @@ -1945,7 +2028,8 @@ This section summarizes the various valu in the rule actions. .IP - .B char *yytext -holds the text of the current token. It may be modified but not lengthened +holds the text of the current token. 
+It may be modified but not lengthened (you cannot append characters to the end). .IP If the special directive @@ -1957,7 +2041,8 @@ is instead declared where .B YYLMAX is a macro definition that you can redefine in the first section -if you don't like the default value (generally 8KB). Using +if you don't like the default value (generally 8KB). +Using .B %array results in somewhat slower scanners, but the value of .B yytext @@ -1967,7 +2052,8 @@ and .I unput(), which potentially destroy its value when .B yytext -is a character pointer. The opposite of +is a character pointer. +The opposite of .B %array is .B %pointer, @@ -1986,9 +2072,10 @@ holds the length of the current token. .B FILE *yyin is the file which by default .I flex -reads from. It may be redefined but doing so only makes sense before -scanning begins or after an EOF has been encountered. Changing it in -the midst of scanning will have unexpected results since +reads from. +It may be redefined but doing so only makes sense before +scanning begins or after an EOF has been encountered. +Changing it in the midst of scanning will have unexpected results since .I flex buffers its input; use .B yyrestart() @@ -2001,8 +2088,10 @@ at the new input file and then call the .B void yyrestart( FILE *new_file ) may be called to point .I yyin -at the new input file. The switch-over to the new file is immediate -(any previously buffered-up input is lost). Note that calling +at the new input file. +The switch-over to the new file is immediate +(any previously buffered-up input is lost). +Note that calling .B yyrestart() with .I yyin @@ -2012,7 +2101,8 @@ scanning the same input file. .B FILE *yyout is the file to which .B ECHO -actions are done. It can be reassigned by the user. +actions are done. +It can be reassigned by the user. .IP - .B YY_CURRENT_BUFFER returns a @@ -2021,7 +2111,8 @@ handle to the current buffer. .IP - .B YY_START returns an integer value corresponding to the current start -condition. 
You can subsequently use this value with +condition. +You can subsequently use this value with .B BEGIN to return to that start condition. .SH INTERFACING WITH YACC @@ -2033,7 +2124,8 @@ parser-generator. .I yacc parsers expect to call a routine named .B yylex() -to find the next input token. The routine is supposed to +to find the next input token. +The routine is supposed to return the type of the next token as well as putting any associated value in the global .B yylval. @@ -2051,9 +2143,11 @@ containing definitions of all the .B %tokens appearing in the .I yacc -input. This file is then included in the +input. +This file is then included in the .I flex -scanner. For example, if one of the tokens is "TOK_NUMBER", +scanner. +For example, if one of the tokens is "TOK_NUMBER", part of the scanner might look like: .nf @@ -2070,12 +2164,14 @@ part of the scanner might look like: .I flex has the following options: .TP -.B \-b +.B \-b, --backup Generate backing-up information to .I lex.backup. This is a list of scanner states which require backing up -and the input characters on which they do so. By adding rules one -can remove backing-up states. If +and the input characters on which they do so. +By adding rules one +can remove backing-up states. +If .I all backing-up states are eliminated and .B \-Cf @@ -2083,17 +2179,19 @@ or .B \-CF is used, the generated scanner will run faster (see the .B \-p -flag). Only users who wish to squeeze every last cycle out of their -scanners need worry about this option. (See the section on Performance -Considerations below.) +flag). +Only users who wish to squeeze every last cycle out of their +scanners need worry about this option. +(See the section on Performance Considerations below.) .TP .B \-c is a do-nothing, deprecated option included for POSIX compliance. .TP -.B \-d +.B \-d, \-\-debug makes the generated scanner run in .I debug -mode. Whenever a pattern is recognized and the global +mode. 
+Whenever a pattern is recognized and the global .B yy_flex_debug is non-zero (which is the default), the scanner will write to @@ -2105,21 +2203,22 @@ a line of the form: .fi The line number refers to the location of the rule in the file -defining the scanner (i.e., the file that was fed to flex). Messages -are also generated when the scanner backs up, accepts the +defining the scanner (i.e., the file that was fed to flex). +Messages are also generated when the scanner backs up, accepts the default rule, reaches the end of its input buffer (or encounters a NUL; at this point, the two look the same as far as the scanner's concerned), or reaches an end-of-file. .TP -.B \-f +.B \-f, \-\-full specifies .I fast scanner. No table compression is done and stdio is bypassed. -The result is large but fast. This option is equivalent to +The result is large but fast. +This option is equivalent to .B \-Cfr (see below). .TP -.B \-h +.B \-h, \-\-help *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***