# Chapter 2 (Part 3), Senellart & Blondel – Automatic Discovery of Similar Words

In Section 2.3, we get to the meat of Senellart & Blondel’s work: a graph-based method for finding similar words, using a dictionary as the source. Their method uses a *v* × *v* matrix, where *v* is the number of words, so that each row and column corresponds to a word in the dictionary. They compare their method and results with those of Kleinberg, who proposes a method for determining good Web *hubs* and *authorities*, and with the ArcRank and WordNet methods. They test the four methods on four words: *disappear*, *parallelogram*, *sugar*, and *science*. By and large, their method performs second best, after WordNet. They propose improving their results by taking a larger subgraph.
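For readers unfamiliar with hubs and authorities, here is a minimal sketch of the basic Kleinberg-style iteration. It is not the authors' method, only the idea it is compared against. The function name `hits`, the iteration count, and the four-node graph are all assumptions made for illustration.

```python
import numpy as np

def hits(adj, iterations=50):
    """Hub and authority scores by simple power iteration.

    adj[i, j] == 1 means there is an edge from node i to node j.
    Assumes the graph has at least one edge, so the norms are nonzero.
    """
    n = adj.shape[0]
    hubs = np.ones(n)
    auths = np.ones(n)
    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        auths = adj.T @ hubs
        auths /= np.linalg.norm(auths)
        # A good hub points to good authorities.
        hubs = adj @ auths
        hubs /= np.linalg.norm(hubs)
    return hubs, auths

# Hypothetical toy graph: nodes 0 and 1 both point to nodes 2 and 3.
adj = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

hubs, auths = hits(adj)
```

In this toy graph, nodes 0 and 1 end up with all of the hub weight and nodes 2 and 3 with all of the authority weight, which matches the intuition behind the two roles.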

The most interesting result is not so much the specific method as the observation that a dictionary, and the resulting vector space model, can be viewed as a (possibly weighted) directed graph. This is significant because graph theory offers many powerful theoretical tools that can be brought to bear on this class of problems.
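To make the dictionary-as-graph view concrete, here is a small sketch that turns a toy dictionary into a directed adjacency matrix. The dictionary entries are invented, and the edge convention (an edge from a headword to each word used in its definition) is an assumption for illustration. The `neighborhood` helper is likewise hypothetical, showing one plausible way to pull out a subgraph around a word.

```python
import numpy as np

# Hypothetical toy dictionary: headword -> words used in its definition.
toy_dictionary = {
    "sugar": ["sweet", "substance"],
    "sweet": ["sugar", "taste"],
    "taste": ["substance"],
    "substance": [],
}

words = sorted(toy_dictionary)
index = {w: i for i, w in enumerate(words)}

# adjacency[i, j] = 1 when word j appears in the definition of word i
# (assumed edge convention, see the note above).
adjacency = np.zeros((len(words), len(words)), dtype=int)
for head, definition in toy_dictionary.items():
    for used in definition:
        if used in index:
            adjacency[index[head], index[used]] = 1

def neighborhood(word):
    """Words that `word` points to, plus words that point to `word`."""
    i = index[word]
    out_links = {words[j] for j in np.flatnonzero(adjacency[i])}
    in_links = {words[j] for j in np.flatnonzero(adjacency[:, i])}
    return out_links | in_links
```

Once the dictionary is in this form, standard graph tools (in-degree, out-degree, subgraph extraction, eigenvector methods) apply directly. Replacing the 0/1 entries with counts or other weights would give the weighted variant.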

Overall, a good read, and some food for thought.

The Kleinberg reference is:

Kleinberg, J.M. Authoritative sources in a hyperlinked environment. *Journal of the ACM*, **46**(5): 604-632 (1999).