runtime/interpreter/mterp/README.txt - platform/art - Git at Google

 rt "mterp" README

 NOTE: Find rebuilding instructions at the bottom of this file.


 ==== Overview ====

 Every configuration has a "config-*" file that controls how the sources
 are generated.  The sources are written into the "out" directory, where
 they are picked up by the Android build system.

 The best way to become familiar with the interpreter is to look at the
 generated files in the "out" directory.


 ==== Config file format ====

 The config files are parsed from top to bottom.  Each line in the file
 may be blank, hold a comment (line starts with '#'), or be a command.

 The commands are:

   handler-style <computed-goto|jump-table>

     Specify which style of interpreter to generate.  In computed-goto,
     each handler is allocated a fixed region, allowing transitions to
     be done via table-start-address + (opcode * handler-size). With
     jump-table style, handlers may be of any length, and the generated
     table is an array of pointers to the handlers.  This command is required,
     and must be the first command in the config file.

   handler-size <bytes>

     Specify the size of the fixed region, in bytes.  On most platforms
     this will need to be a power of 2.  For jump-table implementations,
     this command is ignored.

   import <filename>

     The specified file is included immediately, in its entirety.  No
     substitutions are performed.  ".cpp" and ".h" files are copied to the
     C output, ".S" files are copied to the asm output.

   asm-alt-stub <filename>

     When present, this command will cause the generation of an alternate
     set of entry points (for computed-goto interpreters) or an alternate
     jump table (for jump-table interpreters).

   fallback-stub <filename>

     Specifies a file to be used for the special FALLBACK tag on the "op"
     command below.  Intended to be used to transfer control to an alternate
     interpreter to single-step a not-yet-implemented opcode.  Note: should
     note be used on RETURN-class instructions.

   op-start <directory>

     Indicates the start of the opcode list.  Must precede any "op"
     commands.  The specified directory is the default location to pull
     instruction files from.

   op <opcode> <directory>|FALLBACK

     Can only appear after "op-start" and before "op-end".  Overrides the
     default source file location of the specified opcode.  The opcode
     definition will come from the specified file, e.g. "op OP_NOP arm"
     will load from "arm/OP_NOP.S".  A substitution dictionary will be
     applied (see below).  If the special "FALLBACK" token is used instead of
     a directory name, the source file specified in fallback-stub will instead
     be used for this opcode.

   alt <opcode> <directory>

     Can only appear after "op-start" and before "op-end".  Similar to the
     "op" command above, but denotes a source file to override the entry
     in the alternate handler table.  The opcode definition will come from
     the specified file, e.g. "alt OP_NOP arm" will load from
     "arm/ALT_OP_NOP.S".  A substitution dictionary will be applied
     (see below).

   op-end

     Indicates the end of the opcode list.  All kNumPackedOpcodes
     opcodes are emitted when this is seen, followed by any code that
     didn't fit inside the fixed-size instruction handler space.

 The order of "op" and "alt" directives are not significant; the generation
 tool will extract ordering info from the VM sources.

 Typically the form in which most opcodes currently exist is used in
 the "op-start" directive.

 ==== Instruction file format ====

 The assembly instruction files are simply fragments of assembly sources.
 The starting label will be provided by the generation tool, as will
 declarations for the segment type and alignment.  The expected target
 assembler is GNU "as", but others will work (may require fiddling with
 some of the pseudo-ops emitted by the generation tool).

 A substitution dictionary is applied to all opcode fragments as they are
 appended to the output.  Substitutions can look like "$value" or "${value}".

 The dictionary always includes:

   $opcode - opcode name, e.g. "OP_NOP"
   $opnum - opcode number, e.g. 0 for OP_NOP
   $handler_size_bytes - max size of an instruction handler, in bytes
   $handler_size_bits - max size of an instruction handler, log 2

 Both C and assembly sources will be passed through the C pre-processor,
 so you can take advantage of C-style comments and preprocessor directives
 like "#define".

 Some generator operations are available.

   %include "filename" [subst-dict]

     Includes the file, which should look like "arm/OP_NOP.S".  You can
     specify values for the substitution dictionary, using standard Python
     syntax.  For example, this:
       %include "arm/unop.S" {"result":"r1"}
     would insert "arm/unop.S" at the current file position, replacing
     occurrences of "$result" with "r1".

   %default <subst-dict>

     Specify default substitution dictionary values, using standard Python
     syntax.  Useful if you want to have a "base" version and variants.

   %break

     Identifies the split between the main portion of the instruction
     handler (which must fit in "handler-size" bytes) and the "sister"
     code, which is appended to the end of the instruction handler block.
     In jump table implementations, %break is ignored.

 The generation tool does *not* print a warning if your instructions
 exceed "handler-size", but the VM will abort on startup if it detects an
 oversized handler.  On architectures with fixed-width instructions this
 is easy to work with, on others this you will need to count bytes.


 ==== Using C constants from assembly sources ====

 The file "art/runtime/asm_support.h" has some definitions for constant
 values, structure sizes, and struct member offsets.  The format is fairly
 restricted, as simple macros are used to massage it for use with both C
 (where it is verified) and assembly (where the definitions are used).

 If a constant in the file becomes out of sync, the VM will log an error
 message and abort during startup.


 ==== Development tips ====

 If you need to debug the initial piece of an opcode handler, and your
 debug code expands it beyond the handler size limit, you can insert a
 generic header at the top:

     b       ${opcode}_start
 %break
 ${opcode}_start:

 If you already have a %break, it's okay to leave it in place -- the second
 %break is ignored.


 ==== Rebuilding ====

 If you change any of the source file fragments, you need to rebuild the
 combined source files in the "out" directory.  Make sure the files in
 "out" are editable, then:

     $ cd mterp
     $ ./rebuild.sh

 The ultimate goal is to have the build system generate the necessary
 output files without requiring this separate step, but we're not yet
 ready to require Python in the build.

 ==== Interpreter Control ====

 The mterp fast interpreter achieves much of its performance advantage
 over the C++ interpreter through its efficient mechanism of
 transitioning from one Dalvik bytecode to the next.  Mterp for ARM targets
 uses a computed-goto mechanism, in which the handler entrypoints are
 located at the base of the handler table + (opcode * 128).

 In normal operation, the dedicated register rIBASE
 (r8 for ARM, edx for x86) holds a mainHandlerTable.  If we need to switch
 to a mode that requires inter-instruction checking, rIBASE is changed
 to altHandlerTable.  Note that this change is not immediate.  What is actually
 changed is the value of curHandlerTable - which is part of the interpBreak
 structure.  Rather than explicitly check for changes, each thread will
 blindly refresh rIBASE at backward branches, exception throws and returns.
	rt "mterp" README

	NOTE: Find rebuilding instructions at the bottom of this file.


	==== Overview ====

	Every configuration has a "config-*" file that controls how the sources
	are generated. The sources are written into the "out" directory, where
	they are picked up by the Android build system.

	The best way to become familiar with the interpreter is to look at the
	generated files in the "out" directory.


	==== Config file format ====

	The config files are parsed from top to bottom. Each line in the file
	may be blank, hold a comment (line starts with '#'), or be a command.

	The commands are:

	handler-style <computed-goto\|jump-table>

	Specify which style of interpreter to generate. In computed-goto,
	each handler is allocated a fixed region, allowing transitions to
	be done via table-start-address + (opcode * handler-size). With
	jump-table style, handlers may be of any length, and the generated
	table is an array of pointers to the handlers. This command is required,
	and must be the first command in the config file.

	handler-size <bytes>

	Specify the size of the fixed region, in bytes. On most platforms
	this will need to be a power of 2. For jump-table implementations,
	this command is ignored.

	import <filename>

	The specified file is included immediately, in its entirety. No
	substitutions are performed. ".cpp" and ".h" files are copied to the
	C output, ".S" files are copied to the asm output.

	asm-alt-stub <filename>

	When present, this command will cause the generation of an alternate
	set of entry points (for computed-goto interpreters) or an alternate
	jump table (for jump-table interpreters).

	fallback-stub <filename>

	Specifies a file to be used for the special FALLBACK tag on the "op"
	command below. Intended to be used to transfer control to an alternate
	interpreter to single-step a not-yet-implemented opcode. Note: should
	note be used on RETURN-class instructions.

	op-start <directory>

	Indicates the start of the opcode list. Must precede any "op"
	commands. The specified directory is the default location to pull
	instruction files from.

	op <opcode> <directory>\|FALLBACK

	Can only appear after "op-start" and before "op-end". Overrides the
	default source file location of the specified opcode. The opcode
	definition will come from the specified file, e.g. "op OP_NOP arm"
	will load from "arm/OP_NOP.S". A substitution dictionary will be
	applied (see below). If the special "FALLBACK" token is used instead of
	a directory name, the source file specified in fallback-stub will instead
	be used for this opcode.

	alt <opcode> <directory>

	Can only appear after "op-start" and before "op-end". Similar to the
	"op" command above, but denotes a source file to override the entry
	in the alternate handler table. The opcode definition will come from
	the specified file, e.g. "alt OP_NOP arm" will load from
	"arm/ALT_OP_NOP.S". A substitution dictionary will be applied
	(see below).

	op-end

	Indicates the end of the opcode list. All kNumPackedOpcodes
	opcodes are emitted when this is seen, followed by any code that
	didn't fit inside the fixed-size instruction handler space.

	The order of "op" and "alt" directives are not significant; the generation
	tool will extract ordering info from the VM sources.

	Typically the form in which most opcodes currently exist is used in
	the "op-start" directive.

	==== Instruction file format ====

	The assembly instruction files are simply fragments of assembly sources.
	The starting label will be provided by the generation tool, as will
	declarations for the segment type and alignment. The expected target
	assembler is GNU "as", but others will work (may require fiddling with
	some of the pseudo-ops emitted by the generation tool).

	A substitution dictionary is applied to all opcode fragments as they are
	appended to the output. Substitutions can look like "$value" or "${value}".

	The dictionary always includes:

	$opcode - opcode name, e.g. "OP_NOP"
	$opnum - opcode number, e.g. 0 for OP_NOP
	$handler_size_bytes - max size of an instruction handler, in bytes
	$handler_size_bits - max size of an instruction handler, log 2

	Both C and assembly sources will be passed through the C pre-processor,
	so you can take advantage of C-style comments and preprocessor directives
	like "#define".

	Some generator operations are available.

	%include "filename" [subst-dict]

	Includes the file, which should look like "arm/OP_NOP.S". You can
	specify values for the substitution dictionary, using standard Python
	syntax. For example, this:
	%include "arm/unop.S" {"result":"r1"}
	would insert "arm/unop.S" at the current file position, replacing
	occurrences of "$result" with "r1".

	%default <subst-dict>

	Specify default substitution dictionary values, using standard Python
	syntax. Useful if you want to have a "base" version and variants.

	%break

	Identifies the split between the main portion of the instruction
	handler (which must fit in "handler-size" bytes) and the "sister"
	code, which is appended to the end of the instruction handler block.
	In jump table implementations, %break is ignored.

	The generation tool does not print a warning if your instructions
	exceed "handler-size", but the VM will abort on startup if it detects an
	oversized handler. On architectures with fixed-width instructions this
	is easy to work with, on others this you will need to count bytes.


	==== Using C constants from assembly sources ====

	The file "art/runtime/asm_support.h" has some definitions for constant
	values, structure sizes, and struct member offsets. The format is fairly
	restricted, as simple macros are used to massage it for use with both C
	(where it is verified) and assembly (where the definitions are used).

	If a constant in the file becomes out of sync, the VM will log an error
	message and abort during startup.


	==== Development tips ====

	If you need to debug the initial piece of an opcode handler, and your
	debug code expands it beyond the handler size limit, you can insert a
	generic header at the top:

	b ${opcode}_start
	%break
	${opcode}_start:

	If you already have a %break, it's okay to leave it in place -- the second
	%break is ignored.


	==== Rebuilding ====

	If you change any of the source file fragments, you need to rebuild the
	combined source files in the "out" directory. Make sure the files in
	"out" are editable, then:

	$ cd mterp
	$ ./rebuild.sh

	The ultimate goal is to have the build system generate the necessary
	output files without requiring this separate step, but we're not yet
	ready to require Python in the build.

	==== Interpreter Control ====

	The mterp fast interpreter achieves much of its performance advantage
	over the C++ interpreter through its efficient mechanism of
	transitioning from one Dalvik bytecode to the next. Mterp for ARM targets
	uses a computed-goto mechanism, in which the handler entrypoints are
	located at the base of the handler table + (opcode * 128).

	In normal operation, the dedicated register rIBASE
	(r8 for ARM, edx for x86) holds a mainHandlerTable. If we need to switch
	to a mode that requires inter-instruction checking, rIBASE is changed
	to altHandlerTable. Note that this change is not immediate. What is actually
	changed is the value of curHandlerTable - which is part of the interpBreak
	structure. Rather than explicitly check for changes, each thread will
	blindly refresh rIBASE at backward branches, exception throws and returns.