
 Introduction:

 The Flexible Filesystem Benchmark (FFSB) is a filesystem performance
 measurement tool.  It is a multi-threaded application (using
 pthreads), written entirely in C with cross-platform portability in
 mind.  It differs from other filesystem benchmarks in that the user
 may supply a profile to create custom workloads, while most other
 filesystem benchmarks use a fixed set of workloads.

 As of version 5.1, it supports seven different basic operations,
 multiple groups of threads with different operation mixtures,
 operation across multiple filesystems, and filesystem aging prior to
 benchmarking.


 Differences from version 4.0 and older:

 Version 5.0 and above represent almost a total re-write and many
 things have changed.  In version 5.0 and above FFSB moved to a
 time-regulated run versus doing a set number of different operations
 and timing the whole thing.  This is primarily to better deal with the
 use of multiple threadgroups which would otherwise not be synchronized
 at termination time.

 Additionally, the FFSB configuration file format has changed in
 version 5.0, although we do support old-style configuration files
 along with a run-time passed on the command line.  In this mode,
 version 5.0 and above ignores the iterations parameter, and simply
 uses the time specified on the command line.

 Behaviorally, most of the old operations are the same -- sequential
 reads and sequential writes work as they did before.  One change in
 version 5.0 is that the skip-read behavior (reading, then seeking
 forward a fixed amount, then reading again) has been removed; we now
 support fully randomized reads and writes at random offsets within
 the file.

 Version 4.0 didn't support overwrites (only appends) so we interpret
 writes in old config files to be append operations.

 On Linux, CPU utilization information will only be accurate for
 systems using NPTL.  Older LinuxThreads systems will probably only see
 zeros for CPU utilization because LinuxThreads is not POSIX-compliant.
 Version 4.0 and older could be recompiled to work on
 LinuxThreads, but in 5.0 and later we no longer support this.

 We no longer support the "outputfile" argument on the command line.
 Simply use tee or similar to capture the output.  FFSB unbuffers
 standard output for this purpose, and errors are sent to standard
 error.

 Global options:

 There are eight valid global options placed at the beginning of the
 profile.  Three of them are required: num_filesystems (number of
 filesystems), num_threadgroups (number of threadgroups), and time
 (running time of the benchmark).  The other five options are:

 directio   - each call to open() will be made using O_DIRECT
 alignio    - aligns all block operations for random reads and writes
              on 4k boundaries
 bufferedio - currently ignored; it is intended to use libc
              fread/fwrite instead of plain Unix read and write calls
 verbose    - currently ignored
 callout    - calls an external command and waits for its termination
              before FFSB begins the benchmark phase.  This is useful
              for synchronizing distributed clients, starting
              profilers, etc.

 They must be specified in the above order (num_filesystems,
 num_threadgroups, time, directio, alignio, bufferedio, verbose,
 callout).


 Filesystems:

 Filesystems are specified to FFSB in the form of a directory.  FFSB
 assumes that the filesystem is mounted at this directory and will not
 do any verification of this fact beyond ensuring it can read/write to
 the location.  So be careful to ensure something with enough space to
 handle the dataset is in fact mounted at the specified location.

 In the filesystem clause of the profile, one may set the starting
 number of files and directories as well as a minimum and maximum
 filesize for the filesystem.  One may also specify the blocksize
 used for creating the files separately in the filesystem clause.

 Also, if a filesystem is to be aged, a special threadgroup clause may
 be embedded in a filesystem clause to specify the operation mixture
 and number of threads used to age the filesystem.  This threadgroup is
 run until filesystem utilization reaches the specified amount.

 Inheritance --  if you are using multiple filesystems, all attributes
 except the location should be inherited from the previous filesystem.
 This is done to make it easier to add groups of similar filesystems.
 In this case, only the location is required in the filesystem clause.
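
 The inheritance rule can be pictured as a dictionary copy.  The Python
 sketch below is illustrative only and is not taken from the FFSB
 source: resolve_filesystems is a hypothetical helper, and it assumes
 that an option given explicitly in a later clause overrides the
 inherited value.

```python
def resolve_filesystems(clauses):
    """Fill in each filesystem clause from the one before it.

    `clauses` is a list of dicts, one per [filesystemX] clause, in order.
    Every option except "location" is copied from the previous (already
    resolved) filesystem; options set in the clause itself win
    (assumption, see the lead-in).
    """
    resolved = []
    for index, clause in enumerate(clauses):
        if index == 0:
            merged = dict(clause)  # filesystem0 has nothing to inherit from
        else:
            previous = resolved[-1]
            merged = {k: v for k, v in previous.items() if k != "location"}
            merged.update(clause)
        resolved.append(merged)
    return resolved
```

 With this reading, a second clause that contains only a location ends
 up with the same num_files, num_dirs, and file size limits as the
 first.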

 As of version 5.1, filesystem re-use is supported if a given
 filesystem hasn't been modified beyond its original specifications
 (number of files and directories is correct, and file sizes are within
 specifications).  This can be a huge time saver if one wishes to do
 multiple runs on the same data-set without altering it during a run,
 because the fileset doesn't need to be recreated before each run.

 To do this, specify "reuse=1" in the filesystem clause, and FFSB will
 verify the fileset first, and if it checks out it will use it.
 Otherwise, it will remove everything and re-create the filesets for
 that filesystem.

 Threadgroups:

 An arbitrary number of threadgroups with differing numbers of threads
 and operation mixes can be specified.  The operations are specified
 using a weighting for each operation; if an operation isn't specified,
 its weighting is assumed to be zero (not used).
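
 One way to read the weights is as relative frequencies: a thread picks
 its next operation with probability weight / (sum of all weights).
 The Python sketch below illustrates that reading.  It is not taken
 from the FFSB source, and the function name pick_operation is made up
 for the example.

```python
import random

def pick_operation(weights, rng=random):
    """Pick an operation name with probability proportional to its weight.

    `weights` maps operation names to non-negative weights; operations
    that are missing or weighted 0 are never chosen.
    """
    active = {op: w for op, w in weights.items() if w > 0}
    if not active:
        raise ValueError("at least one operation needs a weight greater than 0")
    ops = list(active)
    return rng.choices(ops, weights=[active[op] for op in ops], k=1)[0]
```

 Under this reading, a threadgroup with read_weight=4 and
 append_weight=1 would perform roughly four reads for every append.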

 "Think-time" for a threadgroup may also be specified in millisecond
 amounts using the "op_delay" parameter, where every thread will wait
 for the specified amount between each operation.

 Operations:

 All operations begin by randomly selecting a filesystem from the list
 of filesystems specified in the profile.  The distribution aims to be
 uniform across all filesystems.


 The seven operations are:

 reads  - read() calls with an overall amount and a blocksize; operates
          on existing files.  Care must be taken to ensure that the
          read amount is smaller than the size of any possible file.

          If read_random is specified, then each individual block
          will be read starting from a random point within the file, and
          this will continue until the entire amount specified has been
          read.  The offset of each random block will be totally
          random to the byte level, unless the "alignio" global parameter
          is on, in which case the reads will be 4096-byte aligned.
          Turning alignio on is generally recommended.


 readall - very similar to reads above, except it doesn't take an
           amount; it simply reads the entire file sequentially using the
           read_blocksize.  This is useful for situations where
           different filesystems have differently sized files, and
           sequential read patterns across all filesystems are desired.

 writes - write() calls with an overall amount and a blocksize.  This
          is an overwrite operation and will not enlarge an existing
          file; again, one must be careful not to specify a write amount
          that is larger than any possible file in the data set.

          If write_random is specified, then each individual block
          will be written starting from a random point within the file,
          and this will continue until the entire amount specified has
          been written out.  The offset of each random block will be
          totally random to the byte level, unless the "alignio" global
          parameter is on, in which case the writes will be 4096-byte
          aligned.  Turning alignio on is generally recommended.

          If the fsync_file parameter for the threadgroup is non-zero,
          then after all of the write calls are finished, fsync() will
          be called on the file descriptor before the file is closed.


 creates - creates a file using an open() call and determines the size
           randomly within the constraints (min_filesize and
           max_filesize) for the selected filesystem.  Write operations
           will be done using the same blocksize as is specified for the
           write operation.
 deletes - calls unlink() on a filename and removes it from the
           internal data structures.  One must be careful to ensure
           there are enough files to delete at all times, or else the
           benchmark will terminate.
 appends - calls write() using the append flag with an overall amount
           and a blocksize to be appended onto a randomly chosen file.
 metas   - this is actually a mix of several different directory
           operations.  Each "meta" operation consists of two directory
           creates, one directory remove, and a directory rename.
           These operations are all carried out separately from the
           other six operations.
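
 The effect of alignio on random reads and writes can be sketched as
 follows.  This is an illustration in Python rather than FFSB's own
 code; the helper name random_offset is hypothetical, and it assumes
 that an offset is chosen so the whole block fits inside the file.

```python
import random

ALIGNMENT = 4096  # alignio aligns random I/O on 4k boundaries

def random_offset(file_size, blocksize, alignio, rng=random):
    """Return a random offset at which one block of `blocksize` bytes fits."""
    if blocksize > file_size:
        raise ValueError("blocksize must not exceed the file size")
    last_start = file_size - blocksize  # highest offset that still fits
    if not alignio:
        return rng.randint(0, last_start)  # any byte offset
    # Choose among the 4096-byte-aligned offsets 0, 4096, 8192, ...
    return rng.randint(0, last_start // ALIGNMENT) * ALIGNMENT
```

 Without alignio, a 4096-byte block may start at any byte and straddle
 two 4k regions of the file; with alignio, every block starts on a 4k
 boundary.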

 Operation accounting:

 Each operation which uses a blocksize (reads, writes, creates, and
 appends) counts each read/write of a blocksize as an operation,
 whereas deletes and metas are considered single operations.
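
 As a worked example of this accounting rule, the Python sketch below
 counts the operations reported for a single call.  It is for
 illustration only: the function name is made up, and counting a
 trailing partial block as one operation is an assumption that this
 README does not specify.

```python
BLOCK_OPS = {"read", "write", "create", "append"}  # counted once per block
SINGLE_OPS = {"delete", "meta"}                    # counted once per call

def ops_counted(kind, amount=0, blocksize=0):
    """Return how many operations one call of `kind` contributes."""
    if kind in SINGLE_OPS:
        return 1
    if kind not in BLOCK_OPS:
        raise ValueError(f"unknown operation kind: {kind}")
    if blocksize <= 0:
        raise ValueError("blocksize must be positive")
    # Assumption: a trailing partial block still counts as one operation.
    return -(-amount // blocksize)  # ceiling division
```

 For instance, a write with write_size=40960 and write_blocksize=4096
 counts as 10 operations, while a delete counts as 1.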

 Running the benchmark:

 There are three phases to running the benchmark: aging, fileset
 creation, and the benchmark phase.

 The create phase is carried out across all filesystems simultaneously
 with one dedicated thread per filesystem.

 After the create phase, sync() is called to ensure all dirty data gets
 written out before the benchmark phase begins, and sync() is again
 called at the end of the benchmark phase.  The time in sync() at the
 end of the benchmark phase is counted as part of the benchmark phase.

 Caveats/Holes/Bugs:

 Aging, and aging across multiple filesystems simultaneously, have not
 been tested very much.

 If *any* i/o operation or system call/libc call fails, the benchmark
 will terminate immediately.

 The parser doesn't handle malformed or incorrect profiles very well
 (or at all).

 The parser doesn't check to make sure all of the appropriate options
 have been specified.  For example, if writes are specified in a
 threadgroup but write_blocksize isn't specified, the parser won't
 catch it, but the benchmark run will fail later on.


 Configuration Files (new style):

 New-style configuration allows for arbitrary newlines between lines,
 and comments using '#' at the start of a line.  It also allows tabs
 and other whitespace before and after configuration parameters.

 The new style configuration file is broken up into three main parts:

 global parameters, filesystems, and threadgroups

 The sections must be in the above order.

 Global parameters:

 Global parameters are described above; the first three are always
 required.  Example:

 ----------

 num_filesystems=1
 num_threadgroups=1
 time=30 		# time is in seconds

 directio=0 		# don't use direct io
 alignio=1  		# align random IOs to 4k
 bufferedio=0		# this does nothing right now
 verbose=0		# this does nothing right now

 			# calls an external command and waits;
 			# everything until the newline is taken,
 			# so you can have arbitrary parameters
 callout=synchronize.sh myhostname

 ---------

 All of these must appear in this order, though you can leave out the
 optional ones.

 Filesystems:

 Filesystems describe different logical sets of files residing in
 different directories.  There is no strict requirement that they
 actually be on different filesystems, only that the directory
 specified already exists.

 Filesystems are specified by a clause with a filesystem number like
 this:

 [filesystem0]
 	location=/mnt/testing/
 	num_files=10
 	num_dirs=1
 	max_filesize=4096
 	min_filesize=4096
 [end0]


 The clause must always begin with [filesystemX] and end with [endX]
 where X is the number of that filesystem.

 You should start with X = 0, and increment by one for each following
 filesystem.  If they are out of order, things will likely break.

 The required information for each filesystem is: location, num_files,
 num_dirs, max_filesize, and min_filesize.  Beyond those the following
 four options are supported:


 reuse=1 # check the filesystem to see if it is reusable

 	# filesystem aging, three components required
 	# takes agefs=1 to turn it on
 	# then a valid threadgroup specification
 	# then a desired utilization percentage

 agefs=1 # age the filesystem according to the following threadgroup
 	[threadgroup0]
 		num_threads=10
 		write_size=40960
 		write_blocksize=4096
 		create_weight=10
 		append_weight=10
 		delete_weight=1
 	[end0]
 desired_util=0.20	# In this case, age until the fs is 20% full

 create_blocksize=4096   # specify the blocksize to write()
 		        # for creating the fileset, defaults to 4096

 age_blocksize=4096      # specify the blocksize to write() for aging


 Also, to allow lazy people to use lots of filesystems, we support
 filesystem inheritance, which simply copies all options but the
 location from the previous filesystem clause if nothing is specified.
 Obviously, this doesn't work for filesystem0. (May not work for aging
 either?)

 Full-blown filesystem clause example:

 ----

 [filesystem0]

 	# required parts

 	location=/home/sonny/tmp
 	num_files=100
 	num_dirs=100
 	max_filesize=65536
 	min_filesize=4096

 	# aging part
 	agefs=0
 	[threadgroup0]
 		num_threads=10
 		write_size=40960
 		write_blocksize=4096
 		create_weight=10
 		append_weight=10
 		delete_weight=1
 	[end0]
 		desired_util=0.02	# age until 2% full

 	# other optional commands

 	create_blocksize=1024		# use a small create blocksize
 	age_blocksize=1024		# and smaller age create blocksize
 	reuse=0	                        # don't reuse it
 [end0]


 --

 Threadgroups:

 Threadgroups are very similar to filesystems in that any number of
 them can be specified in clauses, and they must be in order starting
 with threadgroup0.

 Example:

 ---

 [threadgroup0]
 	num_threads=32
 	read_weight=4
 	append_weight=1

 	write_size=4096
 	write_blocksize=4096

 	read_size=4096
 	read_blocksize=4096
 [end0]

 ---

 In a threadgroup clause, num_threads is required and must be at least
 1.  Then, at least one operation must be given a weight greater than 0
 to be a valid threadgroup.  Operations can be given a weighting of 0,
 and in this case they are ignored.

 Certain operations also require other options.  For example, if
 read_weight is greater than zero, then one must also include a
 read_size and a read_blocksize.  Here's the table of requirements and
 options:


 Operation        Requirements                          Options
 --               --                                    --
 read_weight      read_size, read_blocksize             read_random
 readall_weight   read_blocksize                        none
 write_weight     write_size, write_blocksize           write_random, fsync_file
 create_weight    write_blocksize or create_blocksize   none
 append_weight    write_blocksize, write_size           none
 delete_weight    none                                  none
 meta_weight      none                                  none
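
 Because the parser does not check these requirements itself, it can
 help to check a threadgroup before starting a long run.  The Python
 sketch below transcribes the table above into a small checker.  It is
 a hypothetical helper, not part of FFSB: it assumes the threadgroup
 options have already been read into a dictionary, and that
 create_blocksize (a filesystem option) has been copied into that
 dictionary if it is meant to satisfy create_weight.

```python
# Requirements per operation weight, transcribed from the table above.
# Each inner tuple lists alternatives: any one of them is sufficient.
REQUIREMENTS = {
    "read_weight": [("read_size",), ("read_blocksize",)],
    "readall_weight": [("read_blocksize",)],
    "write_weight": [("write_size",), ("write_blocksize",)],
    "create_weight": [("write_blocksize", "create_blocksize")],
    "append_weight": [("write_blocksize",), ("write_size",)],
    "delete_weight": [],
    "meta_weight": [],
}

def missing_options(threadgroup):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if int(threadgroup.get("num_threads", 0)) < 1:
        problems.append("num_threads must be at least 1")
    # Operations with a weight of 0 (or no weight at all) are ignored.
    active = [op for op in REQUIREMENTS if int(threadgroup.get(op, 0)) > 0]
    if not active:
        problems.append("at least one operation needs a weight greater than 0")
    for op in active:
        for alternatives in REQUIREMENTS[op]:
            if not any(name in threadgroup for name in alternatives):
                problems.append(f"{op} requires {' or '.join(alternatives)}")
    return problems
```

 Run against the threadgroup0 example above, this returns an empty
 list; remove write_blocksize from a group that has write_weight set
 and it reports the missing option.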


 Other threadgroup options:

 op_delay=10  # specify a wait between operations in milliseconds

 bindfs=3     # This allows you to restrict a threadgroup's operation
              # to a specific filesystem number.  Currently only
              # binding to one specific filesystem is supported.