What I've Learned from (Nearly) Failing to Refactor Hudson
We've tried to refactor Hudson.java but without success; only later have I been able to refactor it successfully, thanks to the experience from the first attempt and more time. In any case it was a great learning opportunity.
I've also started to think that the refactoring process must be more rigorous to protect you from wandering too far your original goal and from getting lost in the eternal cycle of fixing something <-> discovering new problems. People tend to do depth-first refactoring changes that can easily lead them astray, far from where they actually need to go; it is important to stop periodically and look at where we are, where we are trying to get and whether we aren't getting lost and shouldn't just prune the current "branch" of refactorings and return to some earlier point and try perhaps a completely different solution. I guess that one of the key benefits of the Mikado method is that it provides you with this global overview - which gets easily lost when it is only in your head - and with points to roll-back to.
Don't use public fields. They make it really hard to replace a class with an interface.
Reflection and multithreading make it pretty difficult if not impossible to find out the dependencies of a particular piece of code and thus the impacts of its change. I'd hard time finding out all the places where Hudson.getInstance is invoked while its constructor is still running.
The steps can be seen very well from the corresponding GitHub commits. Summary:
The refactored code doesn't provide much added value yet but it is a good start for further refactorings (which I won't have the time to try :-( ), it got rid of the offending use of an instance while it is being created and the constructor code is simpler and better. The exercise took me about four pomodoros, i.e. little less than two hours.
If I had the time, I'd continue with extracting an interface from Hudson, moving its unrelated responsibilities to classes of their own (perhaps keeping the methods in Hudson for backwards compatibility and delegating to those objects) and I might even use some AOP magic to get a cleaner code while preserving binary compatibility (as Hudson/Jenkins actually already does).
and browse to http://localhost:8080/ (Jetty should pick changes to class files automatically).Convert the nested classes into top-level ones Provide a way to get instances of the classes without Hudson, e.g. as singletons Use the individual classes instead of Hudson wherever possible so that other classes depend only on the functionality they actually need instead of on the whole of Hudson
Building Hudson Introduction into the UI framework Stapler (its key feature is that it cleverly maps URLs to object hierarchies [and view files and action methods]), perhaps check also Stapler's reference
Lessons Learned
The two most important things we've learned are:- Never underestimate legacy code. It's for more complex and intertwined than you expect and it has more nasty surprises up in its sleeves than you can imagine.
- Never underestimate legacy code.
I've also started to think that the refactoring process must be more rigorous to protect you from wandering too far your original goal and from getting lost in the eternal cycle of fixing something <-> discovering new problems. People tend to do depth-first refactoring changes that can easily lead them astray, far from where they actually need to go; it is important to stop periodically and look at where we are, where we are trying to get and whether we aren't getting lost and shouldn't just prune the current "branch" of refactorings and return to some earlier point and try perhaps a completely different solution. I guess that one of the key benefits of the Mikado method is that it provides you with this global overview - which gets easily lost when it is only in your head - and with points to roll-back to.
Evils of Legacy Code
Use a dependency injection framework, for God's sake! Singletons and their manual retrieval really complicate testing and affect the flexibility of the code.Don't use public fields. They make it really hard to replace a class with an interface.
Reflection and multithreading make it pretty difficult if not impossible to find out the dependencies of a particular piece of code and thus the impacts of its change. I'd hard time finding out all the places where Hudson.getInstance is invoked while its constructor is still running.
Our Way to Failure and Success
There is a lot of refactoring that could be done with Hudson.java, for it is a typical God Class which additionally spreads its tentacles through the whole code base via its evil singleton instance being used by just about anyone for many different purposes. Gojko describes some of the problems worth removing.The Failure
We've tried to start small and "normalize" the singleton initialization, which isn't done in a factory method, but in the constructor itself. I haven't chosen the goal very well as it doesn't bring much value. The idea was to make it possible to have potentially also other implementations of Hudson - e.g. a MockHudson - but with respect to the state of the code it wasn't really feasible and even if it was, a simple Hudson.setInstance would perhaps suffice. Anyway we've tried to create a factory method and move the initialization of the singleton instance there but at the end we got lost in concurrency issues: there were either multiple instances of Hudson or the application deadlocked itself. We tried to move pieces of code around, but the dependencies wouldn't have let us do that.The Success
While reflecting on our failure I've come to the realization that the problem was that Hudson.getInstance() is called (many times) already during the execution of the Hudson's constructor by the objects used there and threads started from there. It is of course a hideous practice to access a half-baked instance before it is fully initialized. The solution is then simple: to be able to initialize the singleton field outside of the constructor, we must remove all calls to getInstance from its context.The steps can be seen very well from the corresponding GitHub commits. Summary:
- I used the "introduce factory" refactoring on the constructor
- I modified ProxyConfiguration not to use getInstance but to expect that the root directory will be set before its first use
- I moved the code that didn't need to be run from the constructor out, to the new factory method - this resulted in some, hopefully insignificant, reordering of the code
- Finally, I also moved the instance initialization to the factory method
The refactored code doesn't provide much added value yet but it is a good start for further refactorings (which I won't have the time to try :-( ), it got rid of the offending use of an instance while it is being created and the constructor code is simpler and better. The exercise took me about four pomodoros, i.e. little less than two hours.
If I had the time, I'd continue with extracting an interface from Hudson, moving its unrelated responsibilities to classes of their own (perhaps keeping the methods in Hudson for backwards compatibility and delegating to those objects) and I might even use some AOP magic to get a cleaner code while preserving binary compatibility (as Hudson/Jenkins actually already does).
Try it for Yourself!
Setup
Get the code
Get the code as .zip or via git:
git@github.com:iterate/coding-dojo.git # 50MB => takes a while
cd coding-dojo
git checkout -b mybranch INITIAL
Compile the Code
as described in the dojo's README.Run Jenkins/Hudson
cd coding-dojo/2011-04-26-refactoring_hudson/
cd maven-plugin; mvn install; cd .. # a necessary dependency
cd hudson/war; mvn hudson-dev:run
and browse to http://localhost:8080/ (Jetty should pick changes to class files automatically).
Further Refactorings
If you're the adventurous type, you can try to improve the code more by splitting out the individual responsibilities of the god class. I'd proceed like this:- Extract an interface from Hudson and use it wherever possible
- Move related methods and fields into (nested) classes of their own, the original Hudson's methods just delegate to them (the move method refactoring should be useful); for example:
- Management of extensions and descriptors
- Authentication & authorization
- Cluster management
- Application-level functionality (control methods such as restart, updates of configurations, management of socket listeners)
- UI controller (factoring this out would require re-configuration of Stapler)
Learning about Jenkins/Hudson
If you want to understand mode about what Hudson does and how it works, you may check:- Hudson's Architecture and optionally proceed with