“Diff it to me baby (aha, aha)”
A timely refactoring kata
押忍 – welcome to my venerable coding dojo, fellow software ninja! Today, we got a fun little refactoring exercise to tackle1: it goes by the name clockdiff
, a command line application that computes the difference between two wall clock times. Have you ever wondered how much time elapsed between 9:38 and 13:11? clockdiff
has you covered:
$ clockdiff 9:38 13:11
3:33
The good thing about clockdiff
is that it is functioning correctly. However, if you look under the hood, then you might find that the original source code could need some improvement here and there. And this is what we are tasked with today.
clockdiff
is written in the Go programming language, though no in-depth knowledge about Go is required. The original implementation is basically one consecutive blob of code inside the main
function:
package main
import ("fmt"; "os"; "strconv"; "strings")
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
time1 := strings.Split(os.Args[1], ":")
time2 := strings.Split(os.Args[2], ":")
if len(time1) != 2 || len(time1[0]) > 2 || len(time1[1]) != 2 ||
len(time2) != 2 || len(time2[0]) > 2 || len(time2[1]) != 2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
hours1, errH1 := strconv.Atoi(time1[0])
mins1, errM1 := strconv.Atoi(time1[1])
hours2, errH2 := strconv.Atoi(time2[0])
mins2, errM2 := strconv.Atoi(time2[1])
if errH1 != nil || errM1 != nil || errH2 != nil || errM2 != nil ||
hours1 < 0 || hours1 >= 24 || mins1 < 0 || mins1 >= 60 ||
hours2 < 0 || hours2 >= 24 || mins2 < 0 || mins2 >= 60 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
diff := (hours2*60 + mins2) - (hours1*60 + mins1)
if diff < 0 {
diff = diff * -1
}
fmt.Printf("%d:%02d\n", diff/60, diff%60)
}
Let’s see what we got here:
- First of all,
clockdiff
checks whether the right number of input arguments was provided.os.Args
is expected to contain 3 items: first the path of the program (populated by the Go runtime), and then two additional input values provided by the user. - Next, it parses both input values (e.g.
9:38
and13:11
). The algorithm for that is:- Split each string into an array of strings at the
:
delimiter. - Make sure that the split yielded two parts, namely the hour and minute part. The hour part must contain 1 or 2 characters; the minute part 2 characters.
- Convert the parts from strings to integers, using Go’s built-in
Atoi
function. - Check whether the conversion succeeded, and also verify some logical constraints: the hour part can’t be
24
or greater, the minute part can’t be60
or greater, and neither part can be a negative value.
- Split each string into an array of strings at the
clockdiff
then calculates the (absolute) difference between both times. It uses the amount of minutes since midnight for that computation.- Eventually, it outputs the result, formatted as wall clock time.
The refactoring
Our goal is to improve the code quality of the implementation without changing how clockdiff
behaves. Fortunately, there is a somewhat comprehensive suite of end-to-end tests, which helps us ensure to not introduce any regressions. We will approach the refactoring step-wise, where each of the following steps is guided by a software development principle.
1. Don’t repeat yourself
The probably most obvious issue with the original code is that it duplicates the exact same logic for parsing both input values. Unnecessary code repetition is not just prone to subtle bugs, but it might also mislead the code reader and make them assume that the two input values are treated distinctly for a particular reason. We should avoid this potential uncertainty and clear up our intent by extracting the common algorithm to a single, parametrised function that we invoke once per input value.
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
hours1, mins1, ok1 := ParseTime(os.Args[1])
hours2, mins2, ok2 := ParseTime(os.Args[2])
if !ok1 || !ok2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
diff := (mins2 + hours2*60) - (mins1 + hours1*60)
if diff < 0 {
diff = diff * -1
}
fmt.Printf("%d:%02d\n", diff/60, diff%60)
}
// ParseTime determines the hour and minute part from a
// serialized time (HH:MM).
func ParseTime(serializedTime string) (hours int, mins int, ok bool) {
parts := strings.Split(serializedTime, ":")
if len(parts) != 2 || len(parts[0]) > 2 || len(parts[1]) != 2 {
return 0, 0, false
}
hours, errH := strconv.Atoi(parts[0])
mins, errM := strconv.Atoi(parts[1])
if errH != nil || hours < 0 || hours >= 24 ||
errM != nil || mins < 0 || mins >= 60 {
return 0, 0, false
}
return hours, mins, true
}
In case you are unfamiliar with Go, the following things might be helpful for you to know:
- Functions can have multiple return values. The return values can (optionally) be named.
- It is conventional to use the last return value to communicate success or failure back to the caller. That value can either be an object of type
error
, or a boolean flag, which is usually calledok
. The latter is sufficient for simple cases like ours. - In case of failure, all other (success-related) return values are supposed to be disregarded by the caller.
- The comment above a function is a godoc comment. It starts with the name of the symbol which it is referring to.
2. Carve out the domain
Our ParseTime
function from the previous step returns the hour and minute part as distinct integers. While this does the job technically, it’s not overly expressive. After all, the hour and minute part are not independent values, which only happen to be returned by the same function by accident. Instead, they inherently belong together and only make sense in conjunction. We can express this through code, by introducing a dedicated data structure that encapsulates both values.
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
t1, ok1 := Parse(os.Args[1])
t2, ok2 := Parse(os.Args[2])
if !ok1 || !ok2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
diff := (t2.Minutes + t2.Hours*60) - (t1.Minutes + t1.Hours*60)
if diff < 0 {
diff = diff * -1
}
fmt.Printf("%d:%02d\n", diff/60, diff%60)
}
// ClockTime represents the time as it would be
// displayed on a wall clock.
type ClockTime struct {
Hours int
Minutes int
}
// Parse constructs a ClockTime object from its
// serialized form (HH:MM).
func Parse(serializedTime string) (t ClockTime, ok bool) {
parts := strings.Split(serializedTime, ":")
if len(parts) != 2 || len(parts[0]) > 2 || len(parts[1]) != 2 {
return ClockTime{}, false
}
hours, errH := strconv.Atoi(parts[0])
mins, errM := strconv.Atoi(parts[1])
if errH != nil || hours < 0 || hours >= 24 ||
errM != nil || mins < 0 || mins >= 60 {
return ClockTime{}, false
}
return ClockTime{hours, mins}, true
}
The notion of a clock time has now been made visible as an abstraction in our code. The ClockTime
struct doesn’t make any functional difference of course, its primary purpose is rather to be a programmatic representation of our domain concept.
The mere type signature of the Parse
method is now sufficient to inform us what the method is doing: it takes a string as input and parses the information into a ClockTime
object. Therefore, we can simplify the method name from ParseTime
to just Parse
, because the Time
suffix wouldn’t tell us anything new anymore. On the contrary: if we were to extract our domain-specific code into its own module, the suffix would even become redundant, as the method would then be invoked with a module prefix such as clocktime.Parse
.
3. Information hiding
For computing the time difference, we have to transform the input times to their respective amount of minutes since midnight. However, performing this calculation inside the main
function is not only cumbersome, but it also leaks implementation details from the domain logic – namely the facts that one hour is equivalent to 60 minutes, and that we use an intermediate form for carrying out the calculation. We can fix both issues at once by extracting the diffing algorithm to its own method. That way we can hide away internal details, and it also allows us to assign a meaningful name to the operation, which improves readability on the call side.
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
t1, ok1 := Parse(os.Args[1])
t2, ok2 := Parse(os.Args[2])
if !ok1 || !ok2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
diff := DiffAbs(t1, t2)
fmt.Printf("%d:%02d", diff.Hours, diff.Minutes)
}
// DiffAbs computes the absolute (non-negative) difference
// between two ClockTime objects.
func DiffAbs(t1 ClockTime, t2 ClockTime) ClockTime {
result := (t2.Minutes + t2.Hours*60) - (t1.Minutes + t1.Hours*60)
if result < 0 {
result = result * -1
}
return ClockTime{result / 60, result % 60}
}
type ClockTime struct { /* Unchanged; see above */ }
func Parse(serializedTime string) { /* Unchanged; see above */ }
4. Separation of concerns
Almost done. There is one additional tiny bit that’s bothering us, which is the fmt.Printf
statement for serialising the time into its canonical string representation. Remember: for the parsing step, we already introduced a Parse
method that is closely related to the ClockTime
data structure. Consequently, it suggests itself to separate the concern of stringification in the same fashion. That way, serialisation and de-serialisation both live closely together, and are tightly incorporated into the ClockTime
domain.
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
t1, ok1 := Parse(os.Args[1])
t2, ok2 := Parse(os.Args[2])
if !ok1 || !ok2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
fmt.Println(DiffAbs(t1, t2))
}
// String serializes the ClockTime in the format HH:MM.
func (t ClockTime) String() string {
return fmt.Sprintf("%d:%02d", t.Hours, t.Minutes)
}
type ClockTime struct { /* Unchanged; see above */ }
func Parse(string) { /* Unchanged; see above */ }
func DiffAbs(ClockTime, ClockTime) ClockTime { /* Unchanged; see above */ }
Note that the Go compiler automatically looks for a String
method if an object is used in a string context, so we don’t have to call it explicitly in fmt.Println
.
5. Don’t over-engineer
We already reached a good state with our refactoring, yet there would be more opportunities to overhaul the code – for example in the DiffAbs
function:
func DiffAbs(t1 ClockTime, t2 ClockTime) ClockTime {
result := (t2.Minutes + t2.Hours*60) - (t1.Minutes + t1.Hours*60)
if result < 0 {
result = result * -1
}
return ClockTime{result / 60, result % 60}
}
- The “magic number” of
60
is spelled out several times throughout the code. - The logic for converting
ClockTime
to “amount of minutes since midnight” is repeated for botht1
andt2
. - The logic for figuring out the absolute value of
result
could be extracted to a generic utility function.
While these are all valid points, I still wouldn’t do anything about them:
- The “magic number”:
60
is an eternal constant, so the value won’t ever change. To me, the integer literal is easier to reason about than a named variable such asMINS_PER_HOUR
. - The repetition: for a short program like
clockdiff
, and a short method likeDiffAbs
, I’d find such a minor duplication acceptable. If we were to re-use the “amount of minutes since midnight” mechanism in a different context, I’d introduce it as method on theClockDiff
struct, though. - The absolute value: Go’s standard library doesn’t provide a utility function for getting the absolute value of an integer, so we would have to create our own. For our program, I would only do that if we needed this logic in multiple places.
Just for illustration purpose, if we were to address the above points, the DiffAbs
function would look like this:
func DiffAbs(t1 ClockTime, t2 ClockTime) ClockTime {
MINS_PER_HOUR := 60
minsSinceMidnight := func(t ClockTime) int { return t.Hours*MINS_PER_HOUR + t.Minutes }
diff := mathAbs(minsSinceMidnight(t2) - minsSinceMidnight(t1))
return ClockTime{diff / MINS_PER_HOUR, diff % MINS_PER_HOUR}
}
func mathAbs(x int) int {
if x < 0 {
return x * -1
}
return x
}
I’m not convinced that this would really be a worthwhile improvement over our previous version. All in all, I personally would draw the line here, and leave the code as it was in the fourth step. After all, the potential “ugliness” is restricted to five lines of code, and hidden away inside a function. The same is true, by the way, for the implementation of our Parse
method, which I also don’t find overly elegant – but since it’s buried likewise, I can live with it as it is.
I have to admit, though, that my judgement is based on personal preference rather than on the rigorous application of programming principles. Thankfully, we are still humans, so we can make our own decisions based on good sense.
The result
Let’s see it all come together – this is our final implementation:
func main() {
if len(os.Args) != 3 {
fmt.Println("Error: wrong number of arguments")
os.Exit(1)
}
time1, ok1 := Parse(os.Args[1])
time2, ok2 := Parse(os.Args[2])
if !ok1 || !ok2 {
fmt.Println("Error: invalid input")
os.Exit(1)
}
fmt.Println(DiffAbs(time1, time2))
}
// ClockTime represents the time as it would be displayed on a wall clock.
type ClockTime struct {
Hours int
Minutes int
}
func (t ClockTime) String() string {
return fmt.Sprintf("%d:%02d\n", t.Hours, t.Minutes)
}
// Parse constructs a ClockTime object from its serialized form (HH:MM).
func Parse(serializedTime string) (t ClockTime, ok bool) {
parts := strings.Split(serializedTime, ":")
if len(parts) != 2 || len(parts[0]) > 2 || len(parts[1]) != 2 {
return ClockTime{}, false
}
hours, errH := strconv.Atoi(parts[0])
mins, errM := strconv.Atoi(parts[1])
if errH != nil || hours < 0 || hours >= 24 ||
errM != nil || mins < 0 || mins >= 60 {
return ClockTime{}, false
}
return ClockTime{hours, mins}, true
}
// DiffAbs computes the absolute (unsigned) difference between two ClockTime objects.
func DiffAbs(t1 ClockTime, t2 ClockTime) ClockTime {
diff := (t2.Hours*60 + t2.Minutes) - (t1.Hours*60 + t1.Minutes)
if diff < 0 {
diff = diff * -1
}
return ClockTime{diff / 60, diff % 60}
}
The main
function now gives a clear and succinct high-level overview of what the program does. The most important domain concept, the clock time, is represented prominently in the code. The individual operations live in their own functions, and you are able to understand their purpose without having to read their implementation.
Code purists might still favour the original implementation, however, and bring up some critique with our refactoring:
- The amount of code has grown by about 50%.
- The original code could be read in one go from top to bottom, whereas the refactored version consists of five distinct code units, which you have to jump around between.
These points are not totally unfounded. I think that this code kata is a good demonstration that code structure is often a trade-off. The original implementation was shorter and more concise without doubt, but it only achieved that at the expense of keeping important information implicit. For example, nowhere in the original code you find the concepts of “parsing a clock time” or “calculating an (absolute) diff” verbalised. The respective logic exists, but the code itself doesn’t help you recognise it directly. The refactored version does a better job at expressing these concepts, and the code is now almost self-explanatory, even if you’d strip the comments. Those benefits come at a cost – in our case we had to “pay” with 20 additional lines of code. To me, that investment is reasonable.
What’s your take on refactoring clockdiff
? If you like, you can play around with the code yourself.
-
Pro tip: the exercise is even more fun while listening to some good music. ↩︎
-
This is the original implementation of
clockdiff
. Our task is to improve the code, while preserving the behaviour. ↩︎