ALearner

ALearner is a tool that can learn likely assertions based on passed and failed test cases according to a set of initial assertions. ALearner can improve the learned assertions by using selective sampling to generate more test cases automatically based on current learned assertions.

Key Publication

Assertion Generation Through Active Learning.
Long H. Pham, Lyly Tran Thi, and Jun Sun.
Formal Methods and Software Engineering - 19th International Conference on Formal Engineering Methods, ICFEM 2017, Xi’an, China, November 13-17, 2017, Proceedings.

More Information

The source code of ALearner can be downloaded from Github.

Primitive Templates
The list of primitive templates includes:

– For boolean variables
+ x = true
+ x = false

– For int and long variables
+ x + y is not overflow
+ x – y is not overflow
+ x * y is not overflow
+ gcd(x,y) = z
+ x mod y = z

– For byte, short, int, long, float, double variables
+ x = a
+ x != MIN
+ x != a
+ x >= 0
+ x > 0
+ x = abs(y)
+ x = y ^ 2
+ x = y ^ 3
+ x = sqrt(y)
+ x >= y
+ x > y
+ x = y
+ ax + by = c
+ x + y = z
+ x – y = z
+ x * y = z
+ x / y = z
+ x ^ y = z

We also use SVM to learn assertions in the form of ax >= b and ax + by >= c.

Experiments
We experimented ALearner with and without selective sampling. The experimental subjects include projects pedrovgs/Algorithms, JodaOrg/joda-time, and JodaOrg/joda-money from GitHub, and 10 programs from SVComp (each program in SVComp is analyzed 20 times with random test cases).

We categorized each of learned assertions into 4 categories:
Necessary if it is necessary condition to avoid failure.
Sufficient if it is sufficient condition to avoid failure.
Correct if it is both necessary and sufficient.
Irrelevant if it is not necessary nor sufficient.

For SVComp programs, we consider if the learned assertions are correct or useful (weaker than user-defined precondition but still strong enough to prove the postcondition).

The summary of results are as follow:
– Project Algorithms:
+ With selective sampling: 69 corr, 10 necc, 2 suff, 4 irre
+ Without selective sampling: 64 corr, 15 necc, 3 suff, 6 irre

– Project joda-time:
+ With selective sampling: 83 corr, 43 necc, 0 suff, 7 irre
+ Without selective sampling: 25 corr, 31 necc, 37 suff, 60 irre

– Project joda-money:
+ With selective sampling: 16 corr, 9 necc, 0 suff, 0 irre
+ Without selective sampling: 16 corr, 9 necc, 0 suff, 5 irre

– SVComp programs:
+ With selective sampling: 90% useful, 80% correct
+ Without selective sampling: 20% useful, nearly 0% correct

We can see with more data from selective sampling, ALearner can filter out a lot of sufficient and irrelevant assertions. Especially, without selective sampling, we almost never learn any correct assertions for SVComp programs.

We also compare ALearner’s results with Daikon‘s results in the same set of subjects. In general, Daikon infers less necessary and correct assertions, but more sufficient and irrelevant assertions than ALearner.

You can download the details of results and test cases used in experiments from HERE.

People

SUN Jun
Professor
PHAM Hong Long
PHAM Hong Long
Research Fellow
Lyly TRAN L
Lyly TRAN
Research Assistant