
 This directory contains some examples illustrating techniques for extracting
 high performance from flex scanners.  Each program implements a simplified
 version of the Unix "wc" tool: read text from stdin and print the number of
 characters, words, and lines present in the text.  All programs were compiled
 using gcc (version unavailable, sorry) with the -O flag, and run on a
 SPARCstation 1+.  The input used was a PostScript file, mainly containing
 figures, with the following "wc" counts:

 	lines  words  characters
 	214217 635954 2592172


 The basic principles illustrated by these programs are:

 	- match as much text with each rule as possible
 	- adding rules does not slow you down!
 	- avoid backing up

 and the big caveat that comes with them is:

 	- you buy performance with decreased maintainability; make
 	  sure you really need it before applying the above techniques.

 See the "Performance Considerations" section of flexdoc for more
 details regarding these principles.


 The different versions of "wc":

 	mywc.c
 		a simple but fairly efficient C version

 	wc1.l	a naive flex "wc" implementation

 	wc2.l	somewhat faster; adds rules to match multiple tokens at once

 	wc3.l	faster still; adds more rules to match longer runs of tokens

 	wc4.l	fastest; still more rules added; hard to do much better
 		using flex (or, I suspect, hand-coding)

 	wc5.l	identical to wc3.l except one rule has been slightly
 		shortened, which introduces backing up

 Timing results (all times in user CPU seconds):

 	program   time   notes
 	-------   ----   -----
 	wc1       16.4   default flex table compression (= -Cem)
 	wc1        6.7   -Cf compression option
 	/bin/wc    5.8   Sun's standard "wc" tool
 	mywc       4.6   simple but better C implementation!
 	wc2        4.6   as good as the C implementation; built using -Cf
 	wc3        3.8   -Cf
 	wc4        3.3   -Cf
 	wc5        5.7   -Cf; ouch, backing up is expensive