Sunday 12 March 2017

Aggregating Reports in Multi-Module Maven Projects

With a highly complex multi-module Maven build, it can be very useful to use the Site generation feature of Maven to produce HTML reports for Unit and Integration Test results, as well as code coverage.  This can save scrolling back through pages of console output to work out which Unit Test has unexpectedly failed.

However, by default, Maven will produce a separate report for each module, which can mean having to click through many different modules to figure out which one contains the failure - especially when the module Name (shown in the Maven log) doesn't correspond directly to the path where it can be found.

Fortunately, Maven does have some report aggregating capabilities.  Less fortunately, these seem to be poorly documented online - either that, or everyone else figured it out with no help at all!

Module Configuration

I've set up a simple demo project to explore how this can be done effectively.  My scope included the following requirements:
  • Aggregate the Unit Test reports (Surefire) in a single place
  • Aggregate the Integration Test reports (Failsafe) in a single place
  • Aggregate the Code Coverage report (Jacoco) in a single place
Additionally, to match the real-world scenario that prompted this investigation, I wanted to cover the interesting edge case of an "Integration Test Only" module - ideally, failures in this module would be aggregated in with the Failsafe reports, and the aggregated code coverage report from Jacoco would include the coverage from the integration tests.

So, my starting position looks like this:
  • module1
  • module2
  • component-test (depends on module1 & module2)
The next step is to combine these three modules into a single Maven build, and here's where it gets interesting.  It seems that there are two separate principles in play in Maven, which are often conflated into a single implementation: Inheritance and Aggregation.

The best article I've found on the topic is a StackOverflow answer, which led me to the two possible solutions below:
  • parentAggregate
    • module1 (inherits from parentAggregate)
    • module2 (inherits from parentAggregate)
    • component-test (inherits from parentAggregate, depends on module1 & module2)
or
  • aggregate
    • parent
    • module1 (inherits from ../parent)
    • module2 (inherits from ../parent)
    • component-test (inherits from ../parent, depends on module1 & module2)
The key difference between these two solutions is the build order.  The parent module is always built before any child modules - so parentAggregate is built before module1 or module2.  However, with the separate parent and aggregate model, the aggregate module is built last, after all modules are completed.  This is quite interesting and important when it comes to creating aggregate reports for Jacoco.
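
To make the separate parent and aggregate layout concrete, here is a minimal sketch of two of the POMs involved.  The com.example groupId, the 1.0-SNAPSHOT version and the exact directory layout are assumptions for illustration only.  The aggregate POM only lists modules, while module1 only names its parent.

```xml
<!-- aggregate/pom.xml: aggregation only; it lists the modules but is nobody's parent -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>          <!-- hypothetical coordinates -->
  <artifactId>aggregate</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>parent</module>
    <module>module1</module>
    <module>module2</module>
    <module>component-test</module>
  </modules>
</project>

<!-- aggregate/module1/pom.xml: inheritance only; it points sideways at the parent -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>parent</artifactId>
    <version>1.0-SNAPSHOT</version>
    <!-- needed because the parent is a sibling directory, not the directory above -->
    <relativePath>../parent/pom.xml</relativePath>
  </parent>
  <artifactId>module1</artifactId>
</project>
```

In the parentAggregate model the same two roles collapse into one POM: it carries the modules list, and each child names it as parent with no relativePath, since the default of ../pom.xml already points at it.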

The project I've inherited uses the parentAggregate model, which at first glance seems to be the more elegant solution.  There's no need to specify the relative path for the parent, and the directory structure directly matches the inheritance model.  However, as we'll see, this isn't necessarily the optimal solution when it comes to aggregate reports.

For the rest of this post, I'm going to try and cover both models - highlighting the similarities and differences between the configuration required in each case.  At the end, I'll offer my opinion as to which solution is superior for my requirements, and why.

Executing Tests and Gathering Coverage

Let's start by configuring the Surefire, Failsafe and Jacoco plugins to actually execute the Unit Tests, Integration Tests and Code Coverage during the build.
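
A minimal sketch of such a configuration, not necessarily identical to the one used in the demo project, might look like this.  The jacoco.version property is an assumption and would need to be defined in the POM's properties; plugin versions for Surefire and Failsafe are omitted here for brevity and should be pinned in a real build.

```xml
<!-- Fragment for the <project> element of the "parent" POM -->
<build>
  <plugins>
    <!-- Surefire is bound to the test phase by default: no executions required -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
    </plugin>
    <!-- Failsafe does nothing unless its goals are bound explicitly -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <!-- Jacoco attaches its agent to both test runs -->
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>${jacoco.version}</version> <!-- hypothetical property -->
      <executions>
        <execution>
          <goals>
            <!-- writes target/jacoco.exec while Surefire runs -->
            <goal>prepare-agent</goal>
            <!-- writes target/jacoco-it.exec while Failsafe runs -->
            <goal>prepare-agent-integration</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```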

There are a few things to note here:
  • For the Surefire plugin, there's no need to specify any executions; the defaults work as expected
  • For both Failsafe and Jacoco, if you don't specify the executions, the plugin will not run
  • In the parentAggregate model, this should be added to the parentAggregate module, so it is inherited by each sub-module.  Similarly, in the separate aggregate/parent model, it must be added to the parent module, so it is inherited by each sub-module.
This configuration results in four outcomes during the Maven build cycle:
  • During the test phase, the Unit Tests are executed by the Surefire plugin.
  • While these are running, Jacoco gathers the code coverage and creates jacoco.exec.
  • During the integration-test phase (reached when you run verify), the Integration Tests are executed by the Failsafe plugin.
  • While these are running, Jacoco gathers the code coverage and creates jacoco-it.exec.
At this point though, we have no reports generated at all; if we want to examine the results of the unit tests, we need to look at each separate TXT or XML file generated by Surefire and Failsafe, and we have no way to read the coverage data.

Basic Reporting per Module

The next step is to configure Surefire Reports, Failsafe Reports and Jacoco Reports for each sub-module.  For now, we're just interested in making it easy to read the reports per sub-module, rather than aggregating the reports.
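
As a rough sketch (again, an illustration rather than the exact configuration from the demo project), the reporting section could be written as follows.  The jacoco.version property is an assumption, and other plugin versions are left out for brevity.

```xml
<!-- Fragment for the <project> element of the "parent" POM -->
<reporting>
  <plugins>
    <!-- Restrict the project info reports to the index page only -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-project-info-reports-plugin</artifactId>
      <reportSets>
        <reportSet>
          <reports>
            <report>index</report>
          </reports>
        </reportSet>
      </reportSets>
    </plugin>
    <!-- One plugin renders both the Surefire and the Failsafe results -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-report-plugin</artifactId>
      <reportSets>
        <reportSet>
          <reports>
            <report>report</report>
            <report>failsafe-report-only</report>
          </reports>
        </reportSet>
      </reportSets>
    </plugin>
    <!-- Coverage from jacoco.exec and jacoco-it.exec respectively -->
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>${jacoco.version}</version> <!-- hypothetical property -->
      <reportSets>
        <reportSet>
          <reports>
            <report>report</report>
            <report>report-integration</report>
          </reports>
        </reportSet>
      </reportSets>
    </plugin>
  </plugins>
</reporting>
```

The Surefire report plugin also offers a report-only goal, which renders existing results without forking a test run; swapping it in for report is a reasonable choice if the tests are always executed in a separate step.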


Again, the points of interest:
  • These are still added to the "parent" module, whichever that is.
  • For speed, I've disabled the default reports - only the index report is generated, which provides the index.html page with the project description.
  • The Failsafe reports are generated by the Surefire report plugin
  • For all three reports, the report is only generated if the relevant raw data exists.  This means that there will be no Jacoco Integration report if there is no jacoco-it.exec file for a module.
  • Also, since my component-test module has no production classes (it's test only), there will be no Jacoco report for this module either, as there's no way to calculate a percentage coverage without having the original classes to compare to.  Modules that do have both Unit and Integration tests would get both Jacoco reports.
At this point, we can easily get separate reports for each module, but we have no visibility of what is covered by our dedicated integration test module, and it's still tricky to find a test failure in a random module.

Executing Maven

A note now on how best to execute Maven.  If you do the obvious mvn clean install site, you will discover that failed unit tests prevent the Site from being generated for that module - so there are no reports to examine!  However, mvn site on its own will quite happily find the failed tests and add them to the report, but runs the tests a second time anyway.  I've found the best combination is this:
mvn -fn clean verify; mvn -DskipTests site

This ensures all the tests are run (-fn, Fail Never, means that the build will continue on failing Unit Tests, so you can see all the tests that fail, not just the first one), then runs the Site build as a separate task, without re-running the tests.
If your integration tests are particularly slow, you may want to run just test instead of verify.
You may want to add a third step - mvn site:stage - to put the Sites for all the modules into a single location. However, when we get to the point of generating aggregate reports, this becomes unnecessary.

Aggregated Test Reports

Finally, we get to the most interesting part of this post - generating the Aggregate reports.

For Surefire and Failsafe, this is quite easy to achieve.
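
A sketch of what this can look like is below.  The reportSet ids are made-up names, and the split into two reportSets is one possible arrangement; the essential parts are the aggregate flag and inherited=false.

```xml
<!-- Fragment for the <project> element of the top-level (aggregate) POM -->
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-report-plugin</artifactId>
      <reportSets>
        <!-- Unit Test results from every module in the reactor -->
        <reportSet>
          <id>aggregate-unit-tests</id> <!-- hypothetical id -->
          <inherited>false</inherited>
          <reports>
            <report>report-only</report>
          </reports>
          <configuration>
            <aggregate>true</aggregate>
          </configuration>
        </reportSet>
        <!-- Integration Test results from every module in the reactor -->
        <reportSet>
          <id>aggregate-integration-tests</id> <!-- hypothetical id -->
          <inherited>false</inherited>
          <reports>
            <report>failsafe-report-only</report>
          </reports>
          <configuration>
            <aggregate>true</aggregate>
          </configuration>
        </reportSet>
      </reportSets>
    </plugin>
  </plugins>
</reporting>
```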

This needs to be added to the aggregate module - for both the parentAggregate and the separate models, that's the top level module.  When using the separate parent and aggregate, the inherited=false setting is technically not required, but won't do any harm.
This will generate the Aggregate reports when the aggregate module site is built.  When using the separate parent and aggregate, this is the last module to be built, so it could be done as part of a single Maven execution - but as discussed above, it's useful to run a separate build for the Site to catch any Unit Test failures in the reports.
When using the combined parentAggregate model, the parent is built first, so if you try and do this all as one execution, the Aggregate report will be empty.  This is one reason some people prefer the separate parent and aggregate architecture.
In this model, since both the reportSets are set not to inherit, the child modules end up using the default reportSet only, and therefore generating the non-aggregated reports, as we require.

Aggregated Coverage Report

Generating an Aggregate report from Jacoco is, unfortunately, far more involved.  Jacoco provides a goal for an aggregate report (report-aggregate), but this goal only aggregates the reports for dependencies of the current module.

This gives us two different solutions - one for the parentAggregate model, and one for the separate model.

When using a parentAggregate, the recommended solution from Jacoco themselves is to create a separate reports module.  This module must depend on all the other modules - and, significantly, specify the test scope for Test-Only modules.  This ensures that Jacoco knows how to count lines of code - in Test-Only modules, the execution counts toward coverage, but the code itself does not need to be covered by tests.  The Jacoco Aggregate report is added in the dedicated reports module.

When using separate parent and aggregator modules, the aggregator is built last, so one would think that adding the Jacoco Aggregate report to this module would be sufficient, but alas, Jacoco still needs the dependencies to know the scope of each module.

Therefore, to generate the Aggregate report for Jacoco we must add the following to either the aggregator or the reports module, depending on the model chosen.
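
The following is a minimal sketch of that addition, assuming hypothetical com.example coordinates for the modules and a jacoco.version property defined elsewhere.  Note the test scope on the test-only module.

```xml
<!-- Fragment for the <project> element of the aggregator (or reports) POM -->
<dependencies>
  <!-- Production modules: their classes and execution data are both counted -->
  <dependency>
    <groupId>com.example</groupId>        <!-- hypothetical coordinates -->
    <artifactId>module1</artifactId>
    <version>${project.version}</version>
  </dependency>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>module2</artifactId>
    <version>${project.version}</version>
  </dependency>
  <!-- Test-only module: test scope means only its execution data is used -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>component-test</artifactId>
    <version>${project.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>

<reporting>
  <plugins>
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>${jacoco.version}</version> <!-- hypothetical property -->
      <reportSets>
        <reportSet>
          <inherited>false</inherited>
          <reports>
            <report>report-aggregate</report>
          </reports>
        </reportSet>
      </reportSets>
    </plugin>
  </plugins>
</reporting>
```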

In either architecture, the dependencies of the aggregator or the reports module must be maintained separately for the Aggregated Code Coverage report to continue to be accurate.  This is a burden on maintenance that would be better avoided.
In my opinion, using the separate aggregator module has the advantage here, as the dependencies can be updated in the same POM whenever the modules list is updated, whereas with a reports module, these would have to be updated in a separate place.

Final Conclusions

By executing the Build and the Site generation separately, we eliminate a lot of the confusion around aggregate reports almost accidentally - which may be why the internet as a whole is so quiet on this topic... perhaps most people do these separately by default!

mvn -fn clean verify; mvn -DskipTests site

Generating the Aggregated Test Reports is simple, whichever model you choose to implement.

Generating the Aggregated Jacoco Reports is difficult, whichever model you implement, although the combined parentAggregate does make this slightly worse.

In the end, the best solution is probably to generate the Aggregated Test Reports in Maven, and leave the Code Coverage report to a dedicated tool such as Sonar!