Friday, May 8, 2009

Continuous Integration Lessons Learned

I got a lot of interest from my most recent post about The Ultimate Enterprise Java Build Solution, and it motivated me to dig up one of my favorite blog posts, which I originally published elsewhere back in December 2004. Some of the lessons are still relevant, but I could only find the post on the Wayback Machine at http://web.archive.org/web/20050528062417/blogs.apress.com/authors.php?author=Christopher+Judd, so I thought I would repost portions of it.

While using CruiseControl (CC), I learned the following valuable lessons that I want to share.

First, once every 24 hours is not frequent enough for continuous integration. As mentioned above, I used to set up CI environments to build once every 24 hours. When I initially set up CI, I was asked to set the builds to run every 4 to 6 hours. There were skeptics who believed that more frequent builds triggered by repository activity would interfere with a team so new to CI and cause unnecessary anxiety. However, waiting 4 to 6 hours between builds was impractical while I was first setting up and configuring CI, so I set the CI server to check the repository every minute and, if something had changed, to wait for 5 minutes of inactivity before starting the build. Fortunately, I forgot to change it to a less frequent interval, and the one-minute check made it into the final configuration. We discovered that the short interval actually produced the best results: everybody was more comfortable because they got immediate feedback. Plus, without frequent builds, one bad build could keep the red lava lamp lit all day.
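
The "poll every minute, build after 5 quiet minutes" rule can be sketched in a few lines of Java. This is only an illustration of the idea, not how CruiseControl implements it; the class name QuietPeriodTrigger and its methods are made up for this example.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of a quiet-period build trigger (not CruiseControl code).
class QuietPeriodTrigger {
    private final Duration quietPeriod;
    // Timestamp of the newest commit included in the last build; null if never built.
    private Instant lastBuiltCommit;

    QuietPeriodTrigger(Duration quietPeriod) {
        this.quietPeriod = quietPeriod;
    }

    // Called on every poll, for example once a minute.
    // newestCommit is the time of the most recent check-in seen in the repository.
    boolean shouldBuild(Instant newestCommit, Instant now) {
        if (newestCommit == null) {
            return false; // nothing has ever been checked in
        }
        boolean changed = lastBuiltCommit == null || newestCommit.isAfter(lastBuiltCommit);
        // Quiet means no check-in has happened for at least the quiet period.
        boolean quiet = !now.isBefore(newestCommit.plus(quietPeriod));
        if (changed && quiet) {
            lastBuiltCommit = newestCommit;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        QuietPeriodTrigger trigger = new QuietPeriodTrigger(Duration.ofMinutes(5));
        Instant commit = Instant.parse("2009-05-08T10:00:00Z");
        // Simulate one poll per minute after a single check-in.
        for (int minute = 1; minute <= 6; minute++) {
            Instant now = commit.plus(Duration.ofMinutes(minute));
            System.out.println("minute " + minute + ": build=" + trigger.shouldBuild(commit, now));
        }
    }
}
```

In this simulation the trigger stays quiet for the first four polls, fires once at minute 5, and does not fire again until a newer check-in arrives. The quiet period is what keeps a developer who is in the middle of several related check-ins from getting a broken build for a half-finished change.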

Second, there are multiple audiences for the builds, so there is a need for both a continuous integration build and a nightly build. The audience of the continuous build should be the developers themselves. Developers need quick feedback to give them confidence. They need to know that what they checked in to the repository works outside their development environment and that what they check out of the repository works. So, this build should focus on the code compiling, passing the unit tests and being able to be packaged and possibly deployed. The second audience is management and architects. Managers are often trying to collect metrics from frameworks like NCSS and JUnit (number of unit tests). Architects are often interested in code quality reports such as PMD and unit test code coverage. These types of reports take longer to produce and don’t need to run continuously. A separate build that runs at midnight is perfect for executing metric and code quality reports. Of course, developers should be able to run these reports at any point in time in their development environments, since the same build scripts should be used by both the developers and the CI environment.

Third, a CI web site like the ones built into Hudson and provided by Sonar is a very valuable communication tool. While developers need to be notified immediately of build problems via email, IM or lava lamps, others, such as managers, do not. A web site can provide the information they need at their convenience.

Fourth, lava lamps are a fun way to provide a visual indicator of the build. I initially thought the idea was rather hokey, but I was wrong. If you want to learn how to integrate lava lamps with CruiseControl, check out Mike Clark's Pragmatic Automation web site (http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Monitor/Devices/BubbleBubbleBuildsInTrouble.rdoc).

Tuesday, May 5, 2009

The Ultimate Enterprise Java Build Solution

Early in my career I took on the role of setting up and operating the build infrastructure of many of the projects I have consulted on. I started in this role before Apache Ant released its 1.0 version. I have struggled with using CruiseControl as my continuous integration server, including lava lamps for broken builds. Finally, I have also used and configured just about every code quality tool for Java and built a dashboard to try to combine all the results.

Now, after all these years, I think I have found the right solution for Enterprise Java builds. The solution involves 5 open source projects: Maven, Subversion, Hudson, Nexus and Sonar.

At the core of the solution is Apache Maven, a build, project and dependency management framework. Maven makes it easy to declaratively describe a project or collection of projects that generate artifacts like binary jars, source jars, doc jars, dependency lists and other artifacts. All these artifacts can be versioned to ensure all developers are using the right artifacts. These artifacts can also be published to a Maven repository, making distribution of the artifacts seamless.

In order for developers to collectively own code and integrate often, a source code repository is necessary. Subversion is a proven enterprise-scale repository that integrates well with many tools like Eclipse, Hudson and Maven. But there are many other quality source code repositories that could fit in Subversion's place, such as Git. The exact source code repository for this solution is not as important as having one, and having one that integrates well with the chosen tools.

One of the biggest challenges in developing software with a team of people is integrating the software, so the practice of continuously integrating has become a staple in many enterprises. After every developer check-in, a continuous integration server will check the code out, compile it and run all the unit tests. Hudson is possibly the easiest and most powerful continuous integration server available for Java. It has a very simple web console that makes creating and configuring build jobs a cinch, especially Maven jobs. Just in case that is not enough, it has a very nice plug-in system and community, making it very flexible and robust.

After Hudson builds the artifacts (jars) that developers need, it must publish them to a Maven repository hosted within the enterprise. Nexus is that Maven repository. It enables you to publish both release and snapshot artifacts, provides different views into the repository and supports searching for artifacts and even their contents. In addition, it can act as a proxy to external public Maven repositories, providing traceability into where artifacts came from as well as improving download performance. Both developers and Hudson can use Nexus to keep their local artifacts up to date, providing continuous integration for everybody all the time.
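
The split between release and snapshot artifacts follows Maven's convention that a version ending in -SNAPSHOT is a work in progress. Here is a small Java sketch of that routing decision. The class name ArtifactRouter and both repository URLs are hypothetical placeholders; in a real project Maven makes this choice itself from the repositories declared in the POM.

```java
// Hypothetical sketch: choose a target repository from a Maven version string.
class ArtifactRouter {
    // Placeholder URLs for illustration only; substitute your own Nexus host.
    static final String RELEASES = "http://nexus.example.com/repositories/releases";
    static final String SNAPSHOTS = "http://nexus.example.com/repositories/snapshots";

    // Maven treats any version ending in -SNAPSHOT as a snapshot.
    static boolean isSnapshot(String version) {
        return version.endsWith("-SNAPSHOT");
    }

    static String repositoryFor(String version) {
        return isSnapshot(version) ? SNAPSHOTS : RELEASES;
    }

    public static void main(String[] args) {
        System.out.println("1.0-SNAPSHOT -> " + repositoryFor("1.0-SNAPSHOT"));
        System.out.println("1.0 -> " + repositoryFor("1.0"));
    }
}
```

Keeping the two kinds of artifacts in separate repositories means the continuous build can republish a snapshot after every check-in without ever overwriting a released version.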

Finally, it is valuable to keep metrics about code quality. They can show whether the code is improving or declining, and they make it easier to identify problems, risky areas and bad practices. Sonar is a server that provides a dashboard into your code quality. It integrates with many common code quality tools like PMD, Checkstyle and FindBugs. It includes metrics for code coverage, unit testing and lines of code. The trending capabilities make it easy to identify patterns.
