Java Journal: Refactoring

My den meets all the criteria to be thought of as a software system. It takes input. The inputs are me and a bunch of reference material. It produces output. The output is a monthly Java column like the one you are reading right now. Seems like a pretty simple system with a well-defined interface of input and output. However, like most systems, it is far more complex than the interface would lead you to believe. Part of the reason for this complexity is that the system is doing something that it was not originally designed to do. In its infancy, my den was used for surfing the Web, watching TV, playing video games, and working on special projects. Later, it was primarily used for writing code for contract work and has slowly evolved into the column-writing system of the present. As my den passed through these various stages, it became somewhat less than tidy. There are stacks of books, piles of CDs, cables, and sticky notes galore. I remember purchasing a scanner a while back and even using it once or twice, so it must be under there somewhere. My den is exhibiting one of the fundamental laws of the universe. Specifically, my den is demonstrating the Second Law of Thermodynamics, which states that the entropy of an isolated system tends to increase over time. In the case of my den, I like to describe it as a "spontaneous increase in disorder." I've tried to explain this theory to my wife, but she insists that it won't harm the space-time continuum if I straighten up a little.

So it's time to clean the den, and I have a couple of ways I could go about it. One method would be to pull everything out of the bookshelves, file cabinets, and desk drawers, and then organize all my knickknacks and whatnots and, most importantly, throw things away that I don't need. I could unhook the computers and untangle the ball of wires that has accumulated behind the desk. This would be the right thing to do. There is only one problem with this method. From the time I begin cleaning until I have finished, I have disabled my Java column producing system. So there must be a better method. Another method would be to take an incremental approach, starting with something small like clearing off the desktop or organizing a drawer. This would allow me to improve the overall system without disabling it. Although the first method is probably preferable for getting everything done, the second method more closely resembles the way we have to make changes to software systems.

In his book Refactoring: Improving the Design of Existing Code, Martin Fowler describes refactoring as "the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." So, in the case of my den, a refactoring could be any one of the small tasks mentioned above, because although they would improve the internal structure of my den, they would not affect the external behavior.

In software systems, you have probably already done refactoring from time to time. Most people refactor when they are adding a new feature to a system and the current design is not flexible enough to accommodate it. So you refactor the design and then add the new feature. The result is that you get your new feature and improve your design. So if refactoring is a good thing to do, why do it only when you are forced to? Why not do it all the time? The most important goal when refactoring is to not alter the current behavior. Many people confuse refactoring with adding new functionality. Remember, you refactor first and then add the new functionality.

The only way to make sure that applying a refactoring has not changed the behavior of your system is to have good, automated tests. In Java, the best way to create automated tests is by using JUnit. For more information on JUnit, read my article on JUnit and automated testing. If you have good, automated testing, you can and should refactor all the time. Now, all you need to do is learn which refactorings are available to apply in a given situation and apply them.
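As a minimal sketch of the kind of safety net described above, the following code pins down the current output of a report method before any refactoring is attempted. The SimpleReport class, its print() method, and the report text are all hypothetical, invented for illustration. The check is written in plain Java so that it runs without extra libraries; in a real project, the same comparison would more likely live in a JUnit test method.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical report class, invented for this sketch.
class SimpleReport {
    private final PrintStream out;

    SimpleReport(PrintStream out) {
        this.out = out;
    }

    // The method to be refactored later; its output is the behavior to preserve.
    void print() {
        out.print("== Monthly Report ==\n");
        out.print("widgets: 3\n");
        out.print("-- end of report --\n");
    }
}

class ReportBehaviorCheck {
    // Runs the report against an in-memory stream and returns everything it printed.
    static String capture() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream stream = new PrintStream(buffer);
        new SimpleReport(stream).print();
        stream.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        // The expected text records the behavior as it is today.
        String expected = "== Monthly Report ==\nwidgets: 3\n-- end of report --\n";
        if (!expected.equals(capture())) {
            throw new AssertionError("report output changed");
        }
        System.out.println("report output unchanged");
    }
}
```

Run the check before and after each refactoring. If the captured text ever differs from the expected text, the change altered external behavior and was not a pure refactoring.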

There are several books on refactoring, but none even begins to compare to the original Refactoring by Martin Fowler mentioned above. Not only does he provide a great walkthrough example of how to apply refactorings, but he also provides a catalog of refactorings that he and his colleagues have come up with. Each refactoring in the catalog has a name, a summary, a motivation, the mechanics, and an example. Each of these properties is self-describing--except for possibly mechanics, which, simply put, are the steps that you would go through in order to apply a refactoring. Which brings up an interesting point. Any sort of task for which you can specify a series of steps to complete can be automated and built into a tool such as your favorite Integrated Development Environment (IDE). In fact, both JBuilder and Together Control Center offer some minimal, automated refactoring support.

Some Refactoring Examples

The refactoring that I use the most is called Extract Method. In general, when I have a method that has grown too large or that I find myself cutting and pasting code out of, then I know that it is time to apply Extract Method. The goal of Extract Method is to remove a logical block of code from a method either to make the original method easier to understand or to make a block of logic available to be reused somewhere else. For my example, I'll use a method that prints a report by printing a header, a body, and then a footer. A stub for this method would look like the following:

  public void complicatedReportPrintingMethod()
  {
    /* Several lines of code that print the report header
       ...
    */

    /* A bunch of code that prints the report body
       ...
    */

    /* Several lines of code that print the report footer
       ...
    */
  }


Rather than bog you down in a lot of code that you don't care about, I will just assume that each of the comment blocks is a significant number of lines of code.

Now, for the purposes of this example, let's say that you have a requirement to add a new report to your system that has the same header and footer as your current report. You will want to extract the code for the header and footer into their own methods. However, before doing any refactoring, make sure that you have good test coverage for the code that you are about to change. If needed, add some new tests. Then, run all the tests to make sure that they all pass. Only then should you apply the Extract Method refactoring twice, once to the header block of code and once to the footer block of code. The resulting code would look like the following:

  public void complicatedReportPrintingMethod()
  {
    this.printHeader();

    /* A bunch of code that prints the report body
       ...
    */

    this.printFooter();
  }

  public void printHeader()
  {
    /* Several lines of code that print the report header
       ...
    */
  }

  public void printFooter()
  {
    /* Several lines of code that print the report footer
       ...
    */
  }


Once the refactoring is done and you have verified that the automated tests all still pass, you could add the new report, which would call printHeader() and printFooter(). (Note: There are a lot of rules for how to extract a method that I am skipping over here for the sake of simplicity.) So now, suppose you decide that you want to add yet another report and your class is getting lots of independent methods in it. You might make an abstract base class for the reports and then have each report inherit from this new base class. When doing this, you will want to move printHeader() and printFooter() to this new abstract base class by applying a refactoring called Pull Up Method. As I mentioned earlier, some IDEs are now incorporating tools to help you do refactoring automatically. So, to do the Extract Method in the above example, all you would have to do is highlight the code that you wish to extract, select the appropriate refactoring, and then name the new method.
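To make the Pull Up Method step more concrete, here is one way the result might look. This is only a sketch under stated assumptions: the class names AbstractReport, SalesReport, and InventoryReport, the report text, and the use of a StringBuilder in place of a real printer are all invented for illustration.

```java
// Hypothetical base class; all names and report text are invented for this sketch.
abstract class AbstractReport {
    // Collects the report text; a StringBuilder stands in for a real printer.
    protected final StringBuilder out = new StringBuilder();

    // Pulled up from the original report class so every report can share them.
    protected void printHeader() {
        out.append("== Company Report ==\n");
    }

    protected void printFooter() {
        out.append("-- end of report --\n");
    }

    // Each concrete report decides what goes between the header and the footer.
    public abstract String print();
}

class SalesReport extends AbstractReport {
    @Override
    public String print() {
        printHeader();
        out.append("sales: 42\n");
        printFooter();
        return out.toString();
    }
}

class InventoryReport extends AbstractReport {
    @Override
    public String print() {
        printHeader();
        out.append("widgets in stock: 7\n");
        printFooter();
        return out.toString();
    }
}
```

In this sketch, each report object is meant to be printed once, because the text accumulates in the shared buffer. The header and footer logic now lives in a single place, so a change to either one reaches every report.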

Some Other Benefits

There are several other benefits to refactoring. First, some refactorings tie in tightly with design patterns. Design patterns are a proven way to reuse design. For more information on design patterns and design reuse, see my article on design patterns. An example of such a refactoring would be Replace Type Code with State/Strategy. The state and strategy design patterns really let you leverage the power of your objects by decoupling interfaces from implementations. Second, you will find that as you learn more and more of the refactoring techniques, you will write better code to begin with because you will produce the equivalent of already refactored code on the first pass. This is not to say that you won't want to still refactor more; there is always room for improvement, and in almost any significant piece of code, you will find many opportunities to refactor. Third, most refactoring techniques work in multiple languages, and while many are based on the premise that your language is object-oriented, many more are not and are applicable to sequential and object-oriented languages alike.
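As a rough illustration of Replace Type Code with State/Strategy, the sketch below shows the kind of code the refactoring might leave behind. The example is hypothetical: the TitleStyle interface, its two implementations, and the ReportTitle class are invented here and do not come from the Refactoring book.

```java
import java.util.Locale;

// Before the refactoring (shown only as a comment), an int type code drove a switch:
//   static final int PLAIN = 0, UPPER_CASE = 1;
//   switch (styleCode) { case PLAIN: ...; case UPPER_CASE: ...; }

// After: each former type code value becomes a strategy object.
interface TitleStyle {
    String apply(String title);
}

class PlainTitle implements TitleStyle {
    public String apply(String title) {
        return title;
    }
}

class UpperCaseTitle implements TitleStyle {
    public String apply(String title) {
        return title.toUpperCase(Locale.ROOT);
    }
}

// The client depends only on the interface, not on any particular implementation.
class ReportTitle {
    private final TitleStyle style;

    ReportTitle(TitleStyle style) {
        this.style = style;
    }

    String render(String title) {
        return style.apply(title);
    }
}
```

For instance, new ReportTitle(new UpperCaseTitle()).render("Sales") yields "SALES". Adding a new style means adding a new class rather than editing a switch statement.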

Where to Get More Information

In addition to the Refactoring book, there is a refactoring Web site where you can find up-to-date information and discussion of all the latest tools and techniques. I list only this one site because it is updated frequently and contains links to all the refactoring information in the known universe.

Recap

Refactoring is about changing the internal design of code without changing its external behavior. Don't confuse refactoring with adding functionality--refactor first, then add functionality. Learn to continuously refactor and add new automated tests if you are unsure of the test coverage of the code you are refactoring. Start by learning simple, common refactorings like Extract Method, and then build your confidence and branch out to others. Some refactorings like Replace Type Code with State/Strategy are particularly effective, and they allow you to incorporate proven design patterns into your system. Once you become confident with refactoring techniques, you will find that you write better code to begin with. Although some refactorings are geared toward object-oriented languages, there are many that can be used in sequential languages. Refactoring is how we fight the Second Law of Thermodynamics and keep our systems from becoming chaotic, so I think I will start with the bookshelves before it's too late.

Michael J. Floyd is an Extreme Programmer and the Software Engineering Technical Lead for DivXNetworks. He is also a consultant for San Diego State University.
