Automated Software Engineering Day 1
The first day of the Automated Software Engineering (ASE) conference has passed. I wanted to highlight a few things that I saw and thought were interesting.  Sorry for posting my impressions a bit late, but I guess it’s better late than never. I’ll write a separate post for days two and three.
The first was a paper entitled “Increasing Test Granularity by Aggregating Unit Tests” by a group from the University of Nebraska. I think this is my favorite so far. The idea is to increase the effectiveness of automatically derived unit tests by aggregating them based on several properties. The technique for creating the original set of unit tests is also rather cool and was developed previously: it involves profiling system tests and extracting uses of individual functions as unit tests. This gives developers a very large set of unit tests, so large that not all of them may be executed during testing. So these guys developed a technique that combines the extracted unit tests into more coarsely grained tests (see the sketch below). These new tests not only execute faster, but can also lead to better bug discovery. Very interesting work in an area I know little about.
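To make the aggregation idea concrete, here is a minimal sketch of how I picture it, assuming each carved unit test is recorded as a call plus its observed result. The data shapes and function names here are my own guesses for illustration, not anything from the paper:

```python
# Hypothetical sketch of aggregating carved unit tests (names and data
# shapes are my own, not the paper's). Each carved test is one recorded
# call plus the result observed during the profiled system-test run;
# aggregation bundles related calls into a single coarser test.
import math

# A carved unit test: the function under test, its recorded arguments,
# and the result observed during the original system-test execution.
carved_tests = [
    (math.sqrt, (16.0,), 4.0),
    (math.sqrt, (81.0,), 9.0),
    (math.floor, (2.7,), 2),
]

def aggregate_by_function(tests):
    """Group carved tests by their target function into coarser tests."""
    groups = {}
    for func, args, expected in tests:
        groups.setdefault(func, []).append((args, expected))
    return groups

def run_aggregated(groups):
    """Run each group as one coarse test: all recorded calls, one verdict."""
    for func, cases in groups.items():
        failures = [(a, e) for a, e in cases if func(*a) != e]
        status = "PASS" if not failures else f"FAIL ({len(failures)} cases)"
        print(f"{func.__name__}: {status}")

run_aggregated(aggregate_by_function(carved_tests))
```

Grouping by target function is just one plausible aggregation key; the paper aggregates on several properties, and the payoff is fewer, faster coarse tests in place of a huge pile of tiny ones.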
Another paper that I kind of liked was “Random Test Run Length and Effectiveness”. I thought the work was interesting, but I am not sure about their conclusions. The whole point, at least as much as I understood it, was to create semi-random sequences of operations as test cases (there is a small sketch of the setup below). The idea is to exercise the system to try to find bugs whose effects propagate during execution, meaning that one may not observe the bug causing a problem until much later in the run. The issue here is that a longer test sequence is more likely to find a bug, but is far more difficult to actually use, because developers have to go through a lot of data and steps to recreate the bug in the first place. So these guys performed some experiments to try to discover the optimal length. As they probably expected, the best length varies from system to system. The part that I found most interesting wasn’t that result, but what happened when they ran it on different versions of the same system: earlier versions of the system needed shorter sequences to get good performance. I think those graphs show potential to be used in studying the evolution of software systems. The result is a bit obvious, but it may be a good starting point for someone to pick up. Interesting results, though not something they mentioned during the presentation.
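Here is a toy sketch of what I mean by length-bounded random testing. The system under test, the operations, and the invariant are all my own placeholder choices, not the paper's experimental setup:

```python
# A minimal sketch of length-bounded random testing (my own toy example,
# not the paper's setup). We drive a system under test -- here just a
# Python list -- with a random operation sequence of a fixed length, and
# keep the seed so a failing run can be replayed exactly.
import random

def random_run(length, seed):
    rng = random.Random(seed)          # seeded so the run is reproducible
    sut = []                           # "system under test": a plain list
    for step in range(length):
        op = rng.choice(["push", "pop", "clear"])
        if op == "push":
            sut.append(rng.randint(0, 100))
        elif op == "pop" and sut:
            sut.pop()
        else:
            sut.clear()
        # Placeholder invariant check; in a real harness this is where a
        # latent bug would finally surface, possibly many steps after the
        # operation that actually corrupted the state.
        assert len(sut) >= 0, f"invariant broken at step {step}, seed {seed}"
    return sut

# Longer runs exercise more state, but every failing run of length N means
# replaying and sifting through up to N steps to localize the bug.
for length in (10, 100, 1000):
    random_run(length, seed=42)
    print(f"run of length {length} completed")
```

The tradeoff the paper studies falls straight out of this shape: bump `length` up and you cover more behavior per run, but reproducing and debugging a failure gets proportionally more painful.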
There was also a poster session at the end of day 1, where I saw two rather interesting ideas. The first was a replay application that uses virtualization. Basically, it was like TiVo for services: when something broke, the developers could stop the VM and replay execution from before the bug appeared. Developers can also instrument the replay and monitor what is happening. Incredibly useful. Another poster I saw used clustering to restructure the interfaces found in C header files. The goal was to optimize the system for compilation by minimizing dependencies, so that when something changes the entire system does not need to be recompiled. Their system worked really well and came close to matching the hand-chosen clustering; a rough sketch of the idea follows.
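To make the header-clustering idea concrete, here is a rough sketch of how I understood it. The example data and the grouping rule (declarations with identical client sets share a header) are my own illustration, not the poster's actual algorithm:

```python
# A rough sketch of clustering declarations into headers to minimize
# recompilation (my own illustration, not the poster's algorithm).
# Declarations used by the same set of client files go into one header,
# so editing a declaration rebuilds only the clients that use it.
from collections import defaultdict

# Which client .c files use which declarations (hypothetical example data).
uses = {
    "parser.c": {"token_next", "token_peek"},
    "lexer.c":  {"token_next", "token_peek"},
    "report.c": {"format_error"},
}

def cluster_declarations(uses):
    """Group declarations whose client sets are identical into one header."""
    clients_of = defaultdict(set)
    for client, decls in uses.items():
        for decl in decls:
            clients_of[decl].add(client)
    headers = defaultdict(list)
    for decl, clients in clients_of.items():
        headers[frozenset(clients)].append(decl)
    return headers

for i, (clients, decls) in enumerate(cluster_declarations(uses).items()):
    print(f"header_{i}.h: {sorted(decls)}  (rebuilds: {sorted(clients)})")
```

With everything in one monolithic header, touching `format_error` would rebuild all three clients; after clustering, it rebuilds only `report.c`. A real tool would balance this against having too many tiny headers, which is presumably where the clustering comes in.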
Well, that was day 1. I hope you check out ASE, and if you can find them, read those papers; it’s very interesting work. Good conference so far, and I can’t wait until day 2.