Monday, August 17, 2009

Python - Boggle Solver

I have been a long time fan of the game Boggle. Last year in one of my CS classes we were required to create a program in Java that would find all the words on any given Boggle board - a very fun project, and one that I found useful for a number of reasons. The other day I was messing around with my old Java implementation, and thought it would be good practice to convert it to Python. During the process, I learned a few new tricks about Python and also some other solutions to problems that may come up in other places.

The first useful thing I learned how to do was to create a class that acted as if it had two __init__ functions, which of course Python cannot handle. There are a couple of ways to do it, but my favorite implementation uses the "@classmethod" decorator. The "@classmethod" decorator makes the method static (not like in Java or C++ - use @staticmethod for that) - so within the method you can initialize a new object, set the instance variables then return the newly created object once you are done. Here is a code snippet from my Word class where I used this:


class Word:
def __init__(self):
self.letter_sequence = ""
self.used_board_coordinates = set()

@classmethod
def new(cls, row, column):
word = cls()
word.used_board_coordinates.add((row, column))
return word

@classmethod
def new_from_word(cls, word):
new_word = cls()
new_word.letter_sequence += word.letter_sequence
new_word.used_board_coordinates.update(word.used_board_coordinates)
return new_word



One of the constructors (called "new") accepts two parameters - the originating coordinates. It is called like so:


word = Word.new(row, column)


The second takes one parameter - another word. It creates a new word, and then copies over the data it needs. You call it much like the first:

wordCopy = Word.new_from_word(word)

Another problem I had to solve was prefix look-up. I have a large dictionary, and while searching for words on the board I only wanted to pursue directions that could potentially lead to actual words - so I needed to be able to quickly see if a letter sequence is a valid word prefix. I considered implementing a Trie or something similar, or using a modified Binary Search like I did in my Java program. I didn't really like either solution, so I came up with a solution that can take up a lot of memory, but runs extremely quickly - a simple set(). For each word loaded from my dictionary, I run a for loop that adds incrementing amounts of the word into my prefix set.


for index in xrange(len(word.strip()) + 1):
self.prefixes.add(word[:index]


So, when I receive the word "python" from my dictionary, I add "p", "py", "pyt", "pyth"...etc to my prefix set. I was first afraid at how long it would take to do this, since the dictionary I have has 170,000+ words in it. It takes less than 1 second, and the look-up times are top-notch.

I also found a use for a generator - it could have been done without it, but I like how it cleans up the code. I use one in the BoggleSolver class - it helps me loop through all of the valid moves (coordinates) for a given word. The rules are this for Boggle - you can only use each letter (coordinate) once when constructing a word, and you can only move to adjacent spaces that are on the board. Here an example from my code:


for row, column in self._get_valid_coodinates_for_word(word, row, column):
if(self.dictionary.contains_prefix(word.letter_sequence + self.board[row][column])):
self._find_words(Word.new_from_word(word), row, column)

def _get_valid_coodinates_for_word(self, word, row, column):
for r in range(row - 1, row + 2):
for c in range(column - 1, column + 2):
if r >= 0 and r < self.board.side_length and c >= 0 and c < self.board.side_length:
if ((r, c) not in word.used_board_coordinates):
yield r, c




There are a lot of "for"s and "if" statements there, I know - but I thought it made it more readable, and it doesn't affect the performance. So, to get the coordinates for the word, simply pass the word and the current coordinates to the _get_valid_coordinates_for_word generator, and you can iterate over the returned row, column values. Pretty slick, if you ask me.

The actual problem that this script solves may not be too interesting to others, but some of the problems it solves within itself might, so I am posting the source. If you want to use it, it is ready to be used at the command line. Type "python pyBoggle.py" then a space separated list of the letters on your board. The board must be a square - 3x3, 4x4, 5x5...etc for it to word (it checks).

Example:

python pyBoggle.py a d e g n u p t e m l p w e f t

You can download the source here.
Be sure to grab the dictionary here (Place it in the same folder as the script).

Here is the source if you wish to just view it:


import sys

class BoggleSolver:
def __init__(self, letter_list):
self.dictionary = Dictionary("C:\\Dropbox\\Programming Projects\\Python\\PyBoggle\\dictionary.txt")
self.board = Board(letter_list)
self.min_length = 4
self.found_words = set()

# Find all words starting from each coordinate position
for row in xrange(self.board.side_length):
for column in xrange(self.board.side_length):
self._find_words(Word.new(row, column), row, column)

def _find_words(self, word, row, column):
word.add_letter(self.board[row][column], row, column)

if (self._can_add_word(word)):
self.found_words.add(word.letter_sequence)

for row, column in self._get_valid_coodinates_for_word(word, row, column):
if(self.dictionary.contains_prefix(word.letter_sequence + self.board[row][column])):
self._find_words(Word.new_from_word(word), row, column)

def _can_add_word(self, word):
return len(word) >= self.min_length and self.dictionary.contains_word(word.letter_sequence)

def _get_valid_coodinates_for_word(self, word, row, column):
for r in range(row - 1, row + 2):
for c in range(column - 1, column + 2):
if r >= 0 and r < self.board.side_length and c >= 0 and c < self.board.side_length:
if ((r, c) not in word.used_board_coordinates):
yield r, c

class Board:
def __init__(self, letter_list):
self.side_length = len(letter_list) ** .5
if (self.side_length != int(self.side_length)):
raise Exception("Board must have equal sides! (4x4, 5x5...)")
else:
self.side_length = int(self.side_length)

self.board = []

index = 0
for row in xrange(self.side_length):
self.board.append([])
for column in xrange(self.side_length):
self.board[row].append(letter_list[index])
index += 1

def __getitem__(self, row):
return self.board[row]

class Word:
def __init__(self):
self.letter_sequence = ""
self.used_board_coordinates = set()

@classmethod
def new(cls, row, column):
word = cls()
word.used_board_coordinates.add((row, column))
return word

@classmethod
def new_from_word(cls, word):
new_word = cls()
new_word.letter_sequence += word.letter_sequence
new_word.used_board_coordinates.update(word.used_board_coordinates)
return new_word

def add_letter(self, letter, row, column):
self.letter_sequence += letter
self.used_board_coordinates.add((row, column))

def __str__(self):
return self.letter_sequence

def __len__(self):
return len(self.letter_sequence)

class Dictionary:
def __init__(self, dictionary_file):
self.words = set()
self.prefixes = set()
word_file = open(dictionary_file, "r")

for word in word_file.readlines():
self.words.add(word.strip())
for index in xrange(len(word.strip()) + 1):
self.prefixes.add(word[:index])

def contains_word(self, word):
return word in self.words

def contains_prefix(self, prefix):
return prefix in self.prefixes

if __name__ == "__main__":
boggleSolver = BoggleSolver(sys.argv[1:])
words = boggleSolver.found_words
print words



Enjoy.

1 comment:

Please comment!