
 [/==============================================================================
     Copyright (C) 2001-2015 Joel de Guzman
     Copyright (C) 2001-2011 Hartmut Kaiser

     Distributed under the Boost Software License, Version 1.0. (See accompanying
     file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 ===============================================================================/]

 [section:roman Roman Numerals]

 This example demonstrates:

 * The Symbol Table
 * Non-terminal rules

 [heading Symbol Table]

 The symbol table holds a dictionary of symbols, where each symbol is a sequence
 of characters. The template class can work efficiently with 8-, 16-, 32- and even
 64-bit characters. Mutable data of type T are associated with each symbol.

 Traditionally, symbol table management is maintained separately outside the BNF
 grammar through semantic actions. Contrary to standard practice, the Spirit
 symbol table class `symbols` is itself a parser, and a `symbols` object may be
 used anywhere in the EBNF grammar specification. It is an example of a dynamic
 parser. A dynamic parser is characterized by its ability to modify its behavior
 at run time. Initially, an empty symbols object matches nothing. At any time,
 symbols may be added or removed, thus dynamically altering its behavior.

 Each entry in a symbol table may have an associated mutable data slot. In this
 regard, one can view the symbol table as an associative container (or map) of
 key-value pairs where the keys are strings.

 The symbols class expects one template parameter to specify the data type
 associated with each symbol: its attribute. X3 provides versions of the
 symbols class in several namespaces, one per supported character encoding,
 including ascii, standard, standard_wide, iso8859_1, and unicode. The default
 symbol parser type in the main x3 namespace is standard.

 Here's a parser for roman hundreds (100..900) using the symbol table. Keep in
 mind that the data associated with each slot is the parser's attribute (which is
 passed to attached semantic actions).

     struct hundreds_ : x3::symbols<unsigned>
     {
         hundreds_()
         {
             add
                 ("C"    , 100)
                 ("CC"   , 200)
                 ("CCC"  , 300)
                 ("CD"   , 400)
                 ("D"    , 500)
                 ("DC"   , 600)
                 ("DCC"  , 700)
                 ("DCCC" , 800)
                 ("CM"   , 900)
             ;
         }

     } hundreds;

 Here's a parser for roman tens (10..90):

     struct tens_ : x3::symbols<unsigned>
     {
         tens_()
         {
             add
                 ("X"    , 10)
                 ("XX"   , 20)
                 ("XXX"  , 30)
                 ("XL"   , 40)
                 ("L"    , 50)
                 ("LX"   , 60)
                 ("LXX"  , 70)
                 ("LXXX" , 80)
                 ("XC"   , 90)
             ;
         }

     } tens;

 and, finally, for ones (1..9):

     struct ones_ : x3::symbols<unsigned>
     {
         ones_()
         {
             add
                 ("I"    , 1)
                 ("II"   , 2)
                 ("III"  , 3)
                 ("IV"   , 4)
                 ("V"    , 5)
                 ("VI"   , 6)
                 ("VII"  , 7)
                 ("VIII" , 8)
                 ("IX"   , 9)
             ;
         }

     } ones;

 Now we can use `hundreds`, `tens` and `ones` anywhere in our parser expressions.
 They are all parsers.

 [heading Rules]

 Up until now, we've been inlining our parser expressions, passing them directly
 to the `phrase_parse` function. The expression evaluates into a temporary,
 unnamed parser which is passed into the `phrase_parse` function, used, and then
 destroyed. This is fine for small parsers. As the expressions get more
 complicated, you will want to break them into smaller, easier-to-understand
 pieces, name them, and refer to them from other parser expressions by name.

 A parser expression can be assigned to what is called a "rule". There are
 various ways to declare rules. The simplest form is:

     rule<ID> const r = "some-name";

 [heading Rule ID]

 At the very least, the rule needs an identification tag. This ID can be any
 struct or class type and need not be defined; a forward declaration suffices.
 In subsequent tutorials, we will see that the rule ID can provide additional
 functionality for error handling and annotation.

 [heading Rule Name]

 The name is optional, but is useful for debugging and error handling, as
 we'll see later. Notice that rule `r` is declared `const`. Rules are
 immutable and are best declared as `const`. Rules are lightweight and can be
 passed around by value. A rule's only member variable is a `std::string`: its
 name.

 [note Unlike Qi (Spirit V2), X3 rules can be used with both `phrase_parse` and
 `parse` without having to specify the skip parser]

 [heading Rule Attributes]

 For our next example, there's one more rule form you should know about:

     rule<ID, Attribute> const r = "some-name";

 The Attribute parameter specifies the attribute type of the rule. You've seen
 that our parsers can have an attribute. Recall that the `double_` parser has
 an attribute of `double`. To be precise, these are /synthesized/ attributes.
 The parser "synthesizes" the attribute value. If you think of a parser as a
 function, its synthesized attribute is the function's return value.

 [heading Rule Definition]

 After having declared a rule, you need a definition for the rule. Example:

     auto const r_def = double_ >> *(',' >> double_);

 By convention, rule definitions have a `_def` suffix. Like rules, rule definitions
 are immutable and are best declared as `const`.

 [#__tutorial_spirit_define__]
 [heading BOOST_SPIRIT_DEFINE]

 Now that we have a rule and its definition, we tie the rule to its
 definition using the `BOOST_SPIRIT_DEFINE` macro, which finds the definition
 by appending `_def` to the rule's name:

     BOOST_SPIRIT_DEFINE(r);

 Behind the scenes, the macro defines a `parse_rule` function in the client
 namespace that tells X3 how to invoke the rule. Each rule defined using
 `BOOST_SPIRIT_DEFINE` therefore has its own overloaded `parse_rule` function.
 At parse time, Spirit X3 recursively calls the appropriate `parse_rule`
 function.

 [note `BOOST_SPIRIT_DEFINE` is variadic and may be used for one or more rules.
 Example: `BOOST_SPIRIT_DEFINE(r1, r2, r3);`]

 [heading Grammars]

 Unlike Qi (Spirit V2), X3 discards the notion of a grammar as a concrete
 entity for encapsulating rules. In X3, a grammar is simply a logical group of
 rules that work together, typically with a single top-level start rule which
 serves as the main entry point. X3 grammars are grouped using namespaces.
 The roman numeral grammar is a very nice and simple example of a grammar:

     namespace parser
     {
         using x3::eps;
         using x3::lit;
         using x3::_val;
         using x3::_attr;
         using ascii::char_;

         // Namespace-scope lambdas cannot have a capture-default, so use [].
         auto set_zero = [](auto& ctx){ _val(ctx) = 0; };
         auto add1000 = [](auto& ctx){ _val(ctx) += 1000; };
         auto add = [](auto& ctx){ _val(ctx) += _attr(ctx); };

         x3::rule<class roman, unsigned> const roman = "roman";

         auto const roman_def =
             eps                 [set_zero]
             >>
             (
                 -(+lit('M')     [add1000])
                 >>  -hundreds   [add]
                 >>  -tens       [add]
                 >>  -ones       [add]
             )
         ;

         BOOST_SPIRIT_DEFINE(roman);
     }

 Things to take notice of:

 * The start rule's attribute is `unsigned`.

 * `_val(ctx)` gets a reference to the rule's synthesized attribute.

 * `_attr(ctx)` gets a reference to the parser's synthesized attribute.

 * `eps` is a special Spirit parser that consumes no input and always
   succeeds. We use it to initialize the rule's synthesized attribute
   to zero before anything else. The actual parsing starts at
   `+lit('M')`, which parses roman thousands. Using `eps` this way is handy
   for pre- and post-initialization.

 * The rule `roman` and the definition `roman_def` are const objects.

 * The rule's ID is `class roman`. C++ allows you to declare the class
   directly in the template argument list, as you can see in the example:

     x3::rule<class roman, unsigned> const roman = "roman";
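
 To see what the grammar computes, consider the input `MCMXCIV`: `+lit('M')`
 adds 1000, `hundreds` matches `CM` (900), `tens` matches `XC` (90) and `ones`
 matches `IV` (4), giving 1000 + 900 + 90 + 4 = 1994. The sketch below
 reproduces that accumulation in plain C++ without Spirit. It is a
 hand-written model, not how X3 is implemented; `table`, `match_longest` and
 `roman_to_unsigned` are hypothetical names, and the model assumes each table
 picks its longest matching symbol.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a symbol table: (symbol, value) pairs.
using table = std::vector<std::pair<std::string, unsigned>>;

// Finds the longest symbol that matches `s` at `pos`, advances `pos` past it
// and returns its value. Returns 0 and leaves `pos` alone when nothing
// matches, which mirrors an optional parser such as `-hundreds [add]`.
inline unsigned match_longest(table const& t, std::string const& s,
                              std::size_t& pos)
{
    std::size_t best_len = 0;
    unsigned best_val = 0;
    for (auto const& entry : t)
    {
        std::string const& sym = entry.first;
        if (sym.size() > best_len && s.compare(pos, sym.size(), sym) == 0)
        {
            best_len = sym.size();
            best_val = entry.second;
        }
    }
    pos += best_len;
    return best_val;
}

// Model of the `roman` rule: true only if all of `s` was consumed.
inline bool roman_to_unsigned(std::string const& s, unsigned& result)
{
    static table const hundreds = {
        {"C", 100}, {"CC", 200}, {"CCC", 300}, {"CD", 400}, {"D", 500},
        {"DC", 600}, {"DCC", 700}, {"DCCC", 800}, {"CM", 900}};
    static table const tens = {
        {"X", 10}, {"XX", 20}, {"XXX", 30}, {"XL", 40}, {"L", 50},
        {"LX", 60}, {"LXX", 70}, {"LXXX", 80}, {"XC", 90}};
    static table const ones = {
        {"I", 1}, {"II", 2}, {"III", 3}, {"IV", 4}, {"V", 5},
        {"VI", 6}, {"VII", 7}, {"VIII", 8}, {"IX", 9}};

    std::size_t pos = 0;
    result = 0;                                   // eps [set_zero]
    while (pos < s.size() && s[pos] == 'M')       // -(+lit('M') [add1000])
    {
        result += 1000;
        ++pos;
    }
    result += match_longest(hundreds, s, pos);    // -hundreds [add]
    result += match_longest(tens, s, pos);        // -tens     [add]
    result += match_longest(ones, s, pos);        // -ones     [add]
    return pos == s.size();                       // r && iter == end
}
```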

 [heading Let's Parse!]

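     // `str` is assumed to be a std::string holding the input text.
     auto iter = str.begin();
     auto const end = str.end();
     unsigned result;

     bool r = parse(iter, end, roman, result);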

     if (r && iter == end)
     {
         std::cout << "-------------------------\n";
         std::cout << "Parsing succeeded\n";
         std::cout << "result = " << result << std::endl;
         std::cout << "-------------------------\n";
     }
     else
     {
         std::string rest(iter, end);
         std::cout << "-------------------------\n";
         std::cout << "Parsing failed\n";
         std::cout << "stopped at: \"" << rest << "\"\n";
         std::cout << "-------------------------\n";
     }

 `roman` is our roman numeral parser. This time around we are using the
 no-skipping version of the parse functions. We do not want to skip any spaces!
 We are also passing in an attribute, `unsigned result`, which will receive the
 parsed value.

 The full cpp file for this example can be found here:
 [@../../../example/x3/roman.cpp roman.cpp]

 [endsect]