=head1 Beyond make test

Test::Harness is responsible for running test scripts, analysing
their output and reporting success or failure. When I type
F<make test> (or F<./Build test>) for a module, Test::Harness is usually
used to run the tests (not all modules use Test::Harness but the
majority do).

To start exploring some of the features of Test::Harness I need to
switch from F<make test> to the F<prove> command (which ships with
Test::Harness). For the following examples I'll also need a recent
version of Test::Harness installed; 3.14 is current as I write.

For the examples I'm going to assume that we're working with a
'normal' Perl module distribution. Specifically I'll assume that
typing F<make> or F<./Build> causes the built, ready-to-install module
code to be available below ./blib/lib and ./blib/arch and that
there's a directory called 't' that contains our tests. Test::Harness
isn't hardwired to that configuration but it saves me from explaining
which files live where for each example.

Back to F<prove>; like F<make test> it runs a test suite - but it
provides far more control over which tests are executed, in what
order and how their results are reported. Typically F<make test>
runs all the test scripts below the 't' directory. To do the same
thing with prove I type:

    prove -rb t

The switches here are -r to recurse into any directories below 't'
and -b which adds ./blib/lib and ./blib/arch to Perl's include path
so that the tests can find the code they will be testing. If I'm
testing a module of which an earlier version is already installed
I need to be careful about the include path to make sure I'm not
running my tests against the installed version rather than the new
one that I'm working on.
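
Before trusting a test run it's worth checking which copy of the
module perl will actually load. This one-liner is only a sketch -
My::Module stands in for whichever module I'm really testing:

    perl -Iblib/lib -MMy::Module -le 'print $INC{"My/Module.pm"}'

If the path it prints points at the installed copy rather than
./blib/lib I know my include path needs fixing.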

Unlike F<make test>, typing F<prove> doesn't automatically rebuild
my module. If I forget to make before prove I will be testing against
older versions of those files - which inevitably leads to confusion.
I either get into the habit of typing

    make && prove -rb t

or - if I have no XS code that needs to be built - I use the modules
below F<lib> instead:

    prove -Ilib -r t

So far I've shown you nothing that F<make test> doesn't do. Let's
fix that.

=head2 Saved State

If I have failing tests in a test suite that consists of more than
a handful of scripts and takes more than a few seconds to run, it
rapidly becomes tedious to run the whole test suite repeatedly as
I track down the problems.

I can tell prove just to run the tests that are failing like this:

    prove -b t/this_fails.t t/so_does_this.t

That speeds things up but I have to make a note of which tests are
failing and make sure that I run those tests. Instead I can use
prove's --state switch and have it keep track of failing tests for
me. First I do a complete run of the test suite and tell prove to
save the results:

    prove -rb --state=save t

That stores a machine-readable summary of the test run in a file
called '.prove' in the current directory. If I have failures I can
then run just the failing scripts like this:

    prove -b --state=failed

I can also tell prove to save the results again so that it updates
its idea of which tests failed:

    prove -b --state=failed,save

As soon as one of my failing tests passes it will be removed from
the list of failed tests. Eventually I fix them all and prove can
find no failing tests to run:

    Files=0, Tests=0, 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
    Result: NOTESTS

As I work on a particular part of my module it's most likely that
the tests that cover that code will fail. I'd like to run the whole
test suite but have it prioritise these 'hot' tests. I can tell
prove to do this:

    prove -rb --state=hot,save t

All the tests will run but those that failed most recently will be
run first. If no tests have failed since I started saving state all
tests will run in their normal order. This combines full test
coverage with early notification of failures.

The --state switch supports a number of options; for example, to run
failed tests first followed by all remaining tests ordered by the
timestamps of the test scripts - and save the results - I can use:

    prove -rb --state=failed,new,save t

See the prove documentation (type prove --man) for the full list
of state options.

When I tell prove to save state it writes a file called '.prove'
('_prove' on Windows) in the current directory. It's a YAML document
so it's quite easy to write tools of your own that work on the saved
test state - but the format isn't officially documented so it might
change without (much) warning in the future.
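
Because it's YAML I can load it with the same YAML module I'd use
anywhere else. The sketch below is strictly read-only exploration -
the 'tests' key is what I find in my own .prove files, not a
documented interface, so it may change:

    use YAML;

    my $state = YAML::LoadFile( '.prove' );

    # List the test scripts recorded in the last saved run
    print "$_\n" for sort keys %{ $state->{tests} || {} };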

=head2 Parallel Testing

If my tests take too long to run I may be able to speed them up by
running multiple test scripts in parallel. This is particularly
effective if the tests are I/O bound or if I have multiple CPU
cores. I tell prove to run my tests in parallel like this:

    prove -rb -j 9 t

The -j switch enables parallel testing; the number that follows it
is the maximum number of tests to run in parallel. Sometimes tests
that pass when run sequentially will fail when run in parallel. For
example if two different test scripts use the same temporary file
or attempt to listen on the same socket I'll have problems running
them in parallel. If I see unexpected failures I need to check my
tests to work out which of them are trampling on the same resource
and rename temporary files or add locks as appropriate.
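
For the temporary file case the simplest fix is usually to stop
hard-coding a shared name and let File::Temp pick a unique one for
each test process:

    use File::Temp qw( tempfile );

    # A file no concurrently running test will share;
    # deleted automatically when this test exits
    my ( $fh, $filename ) = tempfile( UNLINK => 1 );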

To get the most performance benefit I want to have the test scripts
that take the longest to run start first - otherwise I'll be waiting
for the one test that takes nearly a minute to complete after all
the others are done. I can use the --state switch to run the tests
in slowest to fastest order:

    prove -rb -j 9 --state=slow,save t

=head2 Non-Perl Tests

The Test Anything Protocol (http://testanything.org/) isn't just
for Perl. Just about any language can be used to write tests that
output TAP. There are TAP-based testing libraries for C, C++, PHP,
Python and many others. If I can't find a TAP library for my language
of choice it's easy to generate valid TAP. It looks like this:

    1..3
    ok 1 - init OK
    ok 2 - opened file
    not ok 3 - appended to file

The first line is the plan - it specifies the number of tests I'm
going to run so that it's easy to check that the test script didn't
exit before running all the expected tests. The following lines are
the test results - 'ok' for pass, 'not ok' for fail. Each test has
a number and, optionally, a description. And that's it. Any language
that can produce output like that on STDOUT can be used to write
tests.
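
As a sketch, here's what a complete TAP-emitting test looks like in
plain Bourne shell - no testing library at all. The file name
t/shell.t is just an example; I'd run it with prove --exec sh or
give it a shebang line as described below:

```shell
#!/bin/sh

# The plan comes first
echo "1..2"

# Test 1: the current directory should exist
if [ -d . ]; then
    echo "ok 1 - current directory exists"
else
    echo "not ok 1 - current directory exists"
fi

# Test 2: deliberately check something false to show a failing line
if [ -e /no/such/file ]; then
    echo "ok 2 - found /no/such/file"
else
    echo "not ok 2 - found /no/such/file"
fi
```

prove parses that output exactly as it would a Perl test's.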

Recently I've been rekindling a two-decades-old interest in Forth.
Evidently I have a masochistic streak that even Perl can't satisfy.
I want to write tests in Forth and run them using prove (you can
find my gforth TAP experiments at
https://svn.hexten.net/andy/Forth/Testing/). I can use the --exec
switch to tell prove to run the tests using gforth like this:

    prove -r --exec gforth t

Alternately, if the language used to write my tests allows a shebang
line I can use that to specify the interpreter. Here's a test written
in PHP:

    #!/usr/bin/php
    <?php
    print "1..2\n";
    print "ok 1\n";
    print "not ok 2\n";
    ?>

If I save that as t/phptest.t the shebang line will ensure that it
runs correctly along with all my other tests.

=head2 Mixing it up

Subtle interdependencies between test programs can mask problems -
for example an earlier test may neglect to remove a temporary file
that affects the behaviour of a later test. To find this kind of
problem I use the --shuffle and --reverse options to run my tests
in random or reversed order.
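
Both switches combine with the options I've already used, so for
the distribution layout above that's:

    prove -rb --shuffle t
    prove -rb --reverse t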

=head2 Rolling My Own

If I need a feature that prove doesn't provide I can easily write my own.

Typically I'll want to change how TAP gets I<input> into and I<output>
from the parser. L<App::Prove> supports arbitrary plugins, and
L<TAP::Harness> supports custom I<formatters> and I<source handlers>
that I can load using either L<prove> or L<Module::Build>; there are
many examples to base my own on. For more details see L<App::Prove>,
L<TAP::Parser::SourceHandler> and L<TAP::Formatter::Base>.

If writing a plugin is not enough, I can write my own test harness;
one of the motives for the 3.00 rewrite of Test::Harness was to make
it easier to subclass and extend.

The Test::Harness module is a compatibility wrapper around TAP::Harness.
For new applications I should use TAP::Harness directly. As we'll
see, prove uses TAP::Harness.

When I run prove it processes its arguments, figures out which test
scripts to run and then passes control to TAP::Harness to run the
tests, parse, analyse and present the results. By subclassing
TAP::Harness I can customise many aspects of the test run.

I want to log my test results in a database so I can track them
over time. To do this I override the summary method in TAP::Harness.
I start with a simple prototype that dumps the results as a YAML
document:

    package My::TAP::Harness;

    use base qw( TAP::Harness );
    use YAML;

    sub summary {
        my ( $self, $aggregate ) = @_;
        print Dump( $aggregate );
        $self->SUPER::summary( $aggregate );
    }

    1;

I need to tell prove to use My::TAP::Harness. If My::TAP::Harness
is on Perl's @INC include path I can type:

    prove --harness=My::TAP::Harness -rb t

If I don't have My::TAP::Harness installed on @INC I need to provide
the correct path to perl when I run prove:

    perl -Ilib `which prove` --harness=My::TAP::Harness -rb t

I can incorporate these options into my own version of prove. It's
pretty simple. Most of the work of prove is handled by App::Prove.
The important code in prove is just:

    use App::Prove;

    my $app = App::Prove->new;
    $app->process_args(@ARGV);
    exit( $app->run ? 0 : 1 );

If I write a subclass of App::Prove I can customise any aspect of
the test runner while inheriting all of prove's behaviour. Here's
myprove:

    #!/usr/bin/env perl

    use lib qw( lib );    # Add ./lib to @INC
    use App::Prove;

    my $app = App::Prove->new;

    # Use custom TAP::Harness subclass
    $app->harness( 'My::TAP::Harness' );

    $app->process_args( @ARGV );
    exit( $app->run ? 0 : 1 );

Now I can run my tests like this:

    ./myprove -rb t

=head2 Deeper Customisation

Now that I know how to subclass and replace TAP::Harness I can
replace any other part of the harness. To do that I need to know
which classes are responsible for which functionality. Here's a
brief guided tour; the default class for each component is shown
in parentheses. Normally any replacements I write will be subclasses
of these default classes.

When I run my tests TAP::Harness creates a scheduler
(TAP::Parser::Scheduler) to work out the running order for the
tests, an aggregator (TAP::Parser::Aggregator) to collect and analyse
the test results and a formatter (TAP::Formatter::Console) to display
those results.

If I'm running my tests in parallel there may also be a multiplexer
(TAP::Parser::Multiplexer) - the component that allows multiple
tests to run simultaneously.

Once it has created those helpers TAP::Harness starts running the
tests. For each test it creates a new parser (TAP::Parser) which
is responsible for running the test script and parsing its output.

To replace any of these components I call one of these harness
methods with the name of the replacement class:

    aggregator_class
    formatter_class
    multiplexer_class
    parser_class
    scheduler_class

For example, to replace the aggregator I would write:

    $harness->aggregator_class( 'My::Aggregator' );

Alternately I can supply the names of my substitute classes to the
TAP::Harness constructor:

    my $harness = TAP::Harness->new(
        { aggregator_class => 'My::Aggregator' }
    );

If I need to reach even deeper into the internals of the harness I
can replace the classes that TAP::Parser uses to execute test scripts
and tokenise their output. Before running a test script TAP::Parser
creates a grammar (TAP::Parser::Grammar) to decode the raw TAP into
tokens, a result factory (TAP::Parser::ResultFactory) to turn the
decoded TAP results into objects and, depending on whether it's
running a test script or reading TAP from a filehandle, scalar or
array, a source (TAP::Parser::Source or TAP::Parser::Source::Perl)
or an iterator (created by TAP::Parser::IteratorFactory).

Each of these objects may be replaced by calling one of these parser
methods:

    source_class
    perl_source_class
    grammar_class
    iterator_factory_class
    result_factory_class
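
For example, assuming I've written My::Grammar as a subclass of
TAP::Parser::Grammar, I can ask the parser to use it like this:

    $parser->grammar_class( 'My::Grammar' );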

=head2 Callbacks

As an alternative to subclassing the components I need to change I
can attach callbacks to the default classes. TAP::Harness exposes
these callbacks:

    parser_args      Tweak the parameters used to create the parser
    made_parser      Just made a new parser
    before_runtests  About to run tests
    after_runtests   Have run all tests
    after_test       Have run an individual test script

TAP::Parser also supports callbacks; bailout, comment, plan, test,
unknown, version and yaml are called for the corresponding TAP
result types, ALL is called for all results, ELSE is called for all
results for which a named callback is not installed and EOF is
called once at the end of each TAP stream.

To install a callback I pass the name of the callback and a subroutine
reference to TAP::Harness or TAP::Parser's callback method:

    $harness->callback( after_test => sub {
        my ( $script, $desc, $parser ) = @_;
    } );

I can also pass callbacks to the constructor:

    my $harness = TAP::Harness->new({
        callbacks => {
            after_test => sub {
                my ( $script, $desc, $parser ) = @_;
                # Do something interesting here
            }
        }
    });

When it comes to altering the behaviour of the test harness there's
more than one way to do it. Which way is best depends on my
requirements. In general if I only want to observe test execution
without changing the harness' behaviour (for example to log test
results to a database) I choose callbacks. If I want to make the
harness behave differently subclassing gives me more control.

=head2 Parsing TAP

Perhaps I don't need a complete test harness. If I already have a
TAP test log that I need to parse all I need is TAP::Parser and the
various classes it depends upon. Here's the code I need to run a
test and parse its TAP output:

    use TAP::Parser;

    my $parser = TAP::Parser->new( { source => 't/simple.t' } );
    while ( my $result = $parser->next ) {
        print $result->as_string, "\n";
    }

Alternately I can pass an open filehandle as source and have the
parser read from that rather than attempting to run a test script:

    open my $tap, '<', 'tests.tap'
        or die "Can't read TAP transcript ($!)\n";
    my $parser = TAP::Parser->new( { source => $tap } );
    while ( my $result = $parser->next ) {
        print $result->as_string, "\n";
    }

This approach is useful if I need to convert my TAP-based test
results into some other representation. See TAP::Convert::TET
(http://search.cpan.org/dist/TAP-Convert-TET/) for an example of
this approach.
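
As a smaller sketch of the same idea, here's a reduction of a TAP
stream to a simple pass/fail count using the is_test and is_ok
methods on the parser's result objects:

    use TAP::Parser;

    my $parser = TAP::Parser->new( { source => 't/simple.t' } );
    my ( $pass, $fail ) = ( 0, 0 );

    while ( my $result = $parser->next ) {
        next unless $result->is_test;
        $result->is_ok ? $pass++ : $fail++;
    }

    print "passed $pass, failed $fail\n";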

=head2 Getting Support

The Test::Harness developers hang out on the tapx-dev mailing
list[1]. For discussion of general, language-independent TAP issues
there's the tap-l[2] list. Finally there's a wiki dedicated to the
Test Anything Protocol[3]. Contributions to the wiki, patches and
suggestions are all welcome.

    [1] L<http://www.hexten.net/mailman/listinfo/tapx-dev>
    [2] L<http://testanything.org/mailman/listinfo/tap-l>
    [3] L<http://testanything.org/>