ContractSuggestor - Automatically Generating Contracts from Javadoc Comments

ContractSuggestor

ContractSuggestor automatically suggests contracts by analyzing natural-language tagged comments. The goal is to reduce the annotation burden, which developers cite as one of the obstacles to adopting Design by Contract (DBC), by letting them use contracts without learning a new syntax or even adopting an approach such as ContractJDoc.

ContractSuggestor infrastructure. First, the project received as a parameter passes through the tagged-comments extractor. Then, a machine learning algorithm, previously trained on the manually produced dataset, classifies each comment instance. Finally, a contract generator produces contracts for the desired property; to avoid modifying the projects' sources, the contracts are implemented as AspectJ aspects.
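The three stages above can be sketched roughly as follows. This is a minimal illustration, not ContractSuggestor's actual code: the function names, the dictionary layout of a comment instance, the keyword-based classifier (a toy stand-in for the trained model), and the shape of the emitted AspectJ aspect are all assumptions made for this example. The aspect template only covers single-parameter methods.

```python
# Hypothetical template for a non-null precondition expressed as an AspectJ aspect.
# The real tool's generated aspects may look different.
ASPECT_TEMPLATE = """public aspect {aspect_name} {{
    before(Object arg): execution(* {class_name}.{method_name}(..)) && args(arg) {{
        if (arg == null) {{
            throw new IllegalArgumentException("{param_name} must not be null");
        }}
    }}
}}
"""


def keyword_classifier(comment_text):
    """Toy stand-in for the trained classifier: flags obvious non-null wording."""
    text = comment_text.lower()
    return any(phrase in text for phrase in ("not be null", "not null", "non-null"))


def generate_non_null_aspect(class_name, method_name, param_name):
    """Build the AspectJ source text for one non-null parameter contract."""
    aspect_name = f"{class_name}{method_name[0].upper()}{method_name[1:]}NonNull"
    return ASPECT_TEMPLATE.format(
        aspect_name=aspect_name,
        class_name=class_name,
        method_name=method_name,
        param_name=param_name,
    )


def suggest_contracts(comments, classify=keyword_classifier):
    """Classify each extracted comment and emit an aspect for the positive ones."""
    return [
        generate_non_null_aspect(c["class"], c["method"], c["param"])
        for c in comments
        if classify(c["text"])
    ]
```

Because the contracts live in separate aspect files, the Java sources of the analyzed project stay untouched, which is the design choice described above.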

On this page we make available all the details needed to replicate the studies performed with ContractSuggestor.

Systems

First, we present the Python scripts for downloading the systems considered in each dataset, Non-null and Relational:

After downloading the systems, you may want to run Extract_comments on each system to collect all tagged comments. For this purpose, we make the Python script available here.
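For readers who want a feel for what collecting tagged comments involves, here is a minimal sketch. It is not the Extract_comments script itself: the function name, the regular expressions, and the set of tags handled are assumptions made for illustration.

```python
import re

# Matches a Javadoc block comment: /** ... */
JAVADOC_BLOCK = re.compile(r"/\*\*(.*?)\*/", re.DOTALL)
# Matches a tag line such as "@param name description" inside a block.
TAG = re.compile(r"@(param|return|throws|exception)\b\s*(.*)")


def extract_tagged_comments(java_source):
    """Return a list of (tag, text) pairs found in the Javadoc blocks of a source string."""
    results = []
    for block in JAVADOC_BLOCK.finditer(java_source):
        current = None
        for raw_line in block.group(1).splitlines():
            # Drop the leading " * " decoration of each Javadoc line.
            line = re.sub(r"^\s*\*?\s?", "", raw_line).rstrip()
            match = TAG.match(line)
            if match:
                if current:
                    results.append(current)
                current = (match.group(1), match.group(2).strip())
            elif current and line:
                # Continuation line of a multi-line tag description.
                current = (current[0], current[1] + " " + line.strip())
        if current:
            results.append(current)
    return results
```

A regex-based approach like this one is fragile compared with a real Java parser (for example, it ignores inline tags such as `{@code ...}`), so it should be read as an illustration of the idea only.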

Dataset

In case you do not want to run Extract_comments, we make our manually validated datasets available:

iPython Notebooks

In addition, to apply the whole approach, we make available the iPython Notebooks created for each property considered.

Contracts Generated

In the following folder, we present all contracts generated by ContractSuggestor for the evaluated systems.

Choosing a Machine Learning Algorithm

To choose the machine learning algorithm used in ContractSuggestor, we selected the top three algorithms in terms of precision, recall, F1-score, and accuracy on a first version of our non-null dataset (13 systems, 55,684 comment instances). Then, we applied AdaBoost, Multi-layer Perceptron, and Passive-Aggressive to eight systems for the non-null property. The results are shown below:

[Figure: Precision-Recall, Non-null Dataset]

Based on these results, we decided to keep using AdaBoost for the non-null property.

For the relational property, Passive-Aggressive outperformed AdaBoost and generated more useful contracts than the other two algorithms.
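The four metrics used in this comparison can be computed from a classifier's predictions as sketched below. This is a generic illustration for binary labels (1 = the comment expresses the property, 0 = it does not), not the code from the notebooks. The function names are hypothetical, and ranking by F1-score alone is an assumption made here to keep the example short.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, F1-score, and accuracy for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def top_k_by_f1(y_true, predictions, k=3):
    """Rank algorithms by F1-score; `predictions` maps algorithm name -> predicted labels."""
    scored = {name: binary_metrics(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scored, key=lambda name: scored[name]["f1"], reverse=True)[:k]
```

In practice the predicted labels would come from each trained classifier evaluated on held-out comment instances; the helper above only covers the scoring step.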

Replication package

A replication package for ContractSuggestor is also available online.