Comparison between JMLOK and JET

We compare JMLOK and JET with respect to the number of nonconformances detected and the coverage achieved by the generated test cases. For us, the coverage of an experimental unit x combines two measures: JavaCoverage(x), the block-instruction coverage reported by the EclEmma Eclipse plugin, and JMLCoverage(x), the block-instruction coverage collected manually.
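One plausible form of this combination, written here only as a sketch under the assumption that the two measures are averaged with equal weight, is:

\[ \mathit{Coverage}(x) \;=\; \frac{\mathit{JavaCoverage}(x) + \mathit{JMLCoverage}(x)}{2} \]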

In our comparison, we use four experimental units: Samples, JAccounting (we considered only its main class, Account, because of limitations of the JET tool), Mondex, and TransactedMemory (see details here). The results are displayed in Table 1:

Table 1. Results of the comparison

JET was able to reveal previously unseen nonconformances, especially in the Samples and JAccounting experimental units. However, we observed an important drawback: the tool is inconsistent about the nonconformances it discovers. For instance, in the JAccounting unit, different executions found different nonconformances: JET often detects zero nonconformances for this unit and then reports four in the next execution, whereas JMLOK always finds three nonconformances for the same unit.

Furthermore, the nonconformances detected by the tools are the same in the majority of cases. In JAccounting, however, there are two cases in which JET and JMLOK differ in the type assigned to the nonconformance: JMLOK assigned Evaluation while JET assigned Postcondition. This difference may be related to the compilers used by the tools (jmlc in JMLOK, and an extension of jmlc in JET).
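To illustrate how such a classification mismatch can arise (the class below is purely hypothetical and is not taken from JAccounting), consider a postcondition whose own evaluation may throw: when the pure method called inside the ensures clause fails, one runtime checker may report an error while evaluating the specification (Evaluation), while another may report the clause itself as violated (Postcondition).

// Hypothetical example; not taken from JAccounting.
public class IllustrativeAccount {
    private Double balance = 0.0;   // may become null after a faulty update

    //@ requires amount > 0;
    //@ ensures balance() == \old(balance()) + amount;
    public void deposit(double amount) {
        if (amount > 1000) {
            balance = null;         // injected fault, for illustration only
        } else {
            balance = balance + amount;
        }
    }

    // Pure query used inside the specification above; unboxing a null Double
    // throws a NullPointerException, so evaluating the ensures clause can
    // itself fail after the faulty branch of deposit() runs.
    public /*@ pure @*/ double balance() {
        return balance;
    }
}

Whether a checker labels this failure Evaluation or Postcondition depends on how its compiler wraps exceptions raised while checking assertions, which would be consistent with the jmlc versus extended-jmlc difference mentioned above.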

We also observed that the genetic algorithm in JET's backend makes its results differ between repeated executions. We did not observe this behavior in the RGT-based approach, despite its randomness: using the same setup, we always detected the same nonconformances across several executions of the tool.
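As a minimal sketch of why random generation can still be repeatable (this illustrates the general mechanism only, not JMLOK's internals), a generator whose choices come from a fixed-seed java.util.Random replays exactly the same sequence of inputs on every run:

import java.util.Random;

public class RepeatableGeneration {
    public static void main(String[] args) {
        // Fixed seed: every execution of this program prints the same values,
        // so tests built from these "random" choices are identical across runs.
        Random random = new Random(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(random.nextInt(1000));
        }
    }
}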

Considering test coverage, JMLOK generally performed better than JET; the only case in which JET achieved higher coverage was JAccounting. This result may be related to JET's requirements: no public fields can be assigned, and object sharing is not allowed, so the generated tests miss the parts of the programs that do not fulfill those requirements, which does not occur with JMLOK. Considering the number of nonconformances detected, JAccounting was likewise the only case in which JET performed better than JMLOK: four against three.
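As a hypothetical illustration of why those restrictions reduce coverage (the class below is not taken from our experimental units), a branch that can only be reached after assigning a public field and sharing one object between two others remains unexercised by tests that respect JET's rules:

public class SharedCounter {           // hypothetical class, for illustration only
    public int limit;                  // public field; tests under JET's rules never assign it

    private SharedCounter peer;        // set only by sharing one object between two counters

    public void link(SharedCounter other) {
        this.peer = other;
        other.peer = this;             // requires passing the same object to both sides
    }

    public boolean overLimit(int value) {
        // Reaching 'true' needs link() to have shared objects AND limit to have been
        // assigned a positive value; tests respecting JET's restrictions do neither,
        // so this branch stays uncovered.
        if (peer != null && limit > 0 && value > limit) {
            return true;
        }
        return false;
    }
}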

So, answering the research question "Does the RGT-based approach perform better than the JET tool?": yes; considering the number of detected nonconformances and the test coverage in our experimental units, the JMLOK tool performs better than JET. In other units these results may vary, although JET's limitations prevent tests involving dependencies between classes and packages, as well as the use of external libraries. We believe that an approach combining the best features of both tools would be well suited to nonconformance detection.