UPDATE: I’ve learned a lot since writing this post awhile back. I can’t say enough good things about Sandi Metz’s book, Practical Object-Oriented Design in Ruby. It’s recommended reading for any new programmer on my team.
“Exercism” is a great little Ruby Gem & Website made by Katrina Owen that allows you to get crowd-sourced code reviews on practice problems. I’ve completed the first five problems and just started working on the sixth. It’s been great to get feedback from other people and learn some great new tricks in Ruby!
One of the great features of the website is the ability to compare your first iteration with your final/approved version. I’d love to show you the refactoring I did for the “word count” exercise.
Here are the instructions for the challenge:
Sounds simple enough. Let’s do this!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Phrase
attr_accessor :words
def initialize(words)
@words = words
end
def word_count
word_list = {}
words.split(" ").each do |word|
word_list = process_word_in_list(word, word_list)
end
word_list
end
private
def process_word_in_list(word, word_list)
return word_list if word.empty?
word_list[word] ||= 0
word_list[word] += 1
word_list
end
end
Seemed pretty straight-forward at first. We’ll make a hash, then add to a counter inside the hash if the word shows up. Easy peasy.
But there are three issues:
word_count
method isn’t super easy to readprocess_word_in_list
method is crazy ugly(This is about to get much uglier before it gets better)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class Phrase
attr_accessor :words
def initialize(words)
@words = words
end
def word_count
prep_words_for_counting()
word_list = {}
words.split(" ").each do |word|
word_list = process_word_in_list(word, word_list)
end
word_list
end
private
def process_word_in_list(word, word_list)
word = parse_word(word)
return word_list if word.empty?
word_list[word] ||= 0
word_list[word] += 1
word_list
end
def prep_words_for_counting
words.gsub!(",", " ")
end
def parse_word(word)
word.gsub!(/[^a-zA-Z0-9]/, "")
word.downcase!
word
end
end
Oh lord. This thing is a beast. But I didn’t see much hope yet, so I sent it off for review.
The first review of my code exposed a few great learning points:
Hash.new(0)
creates a hash and sets “0” as the default value for every new key. This solves my word_list[word] ||= 0
problem at line 21.scan
Scan iterates through str, matching the pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block.
Some examples:
With my new-found Ruby knowledge, here’s v2:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
class Phrase
attr_accessor :words
def initialize(words)
@words = words
end
def word_count
word_list = Hash.new(0)
words.downcase.scan(/\w+/) do |word|
word_list[word] += 1
end
word_list
end
end
So much cleaner! Using scan
removed a huge portion of my checks and pattern matching. And although I think this code should get a good grade, I learned even more from the great community on Exercism.
Raymond Vellener from the Netherlands offered this simple piece of advice: “Great improvement from the last submission. Perhaps you could consider separating the word_count method?”
Definitely a nitpick, but I could see Ray’s point. The words/downcase/regex/array combo in the middle of that method could be more expressive. How about we extract these details into its own block?
The final result:
My new each_word
method is very expressive: it tells the reader exactly what will come from it, while moving the implementation details to a discreet private method.
That’s it! This exercise annoyed me at first, but I loved learning about scan
and found a great example of moving complex methods into my own Ruby block.
Have any other tips or tricks? Have you done any exercises on Exercism or need help completing one? Let me know on Twitter!
UPDATE
I’ve gotten some great feedback on Twitter about simplifying the Hash buildup using inject
or each_with_object
.
Jérémy Lecour gave me a good rule on when to use each: “For me: if you accumulate (build a Hash, …), use #each_with_object, but if you really mutate/change the object, then use #inject”. To that end, I’ve further simplified the example:
Writer. Musician. Adventurer. Nerd.
Purveyor of GIFs and dad jokes.