Time Limit: Java: 2000 ms / Others: 2000 ms
Memory Limit: Java: 65536 KB / Others: 65536 KB
It's easy to tell if two words are identical - just check the letters. But
how do you tell if two words are almost identical? And how close is "almost"?
There are lots of techniques for approximate word matching. One is to determine the best substring match, which is the number of common letters when the words are compared letter-byletter.
The key to this approach is that the words can overlap in any way. For example, consider the words CAPILLARY and MARSUPIAL. One way to compare them is to overlay them:
There is only one common letter (A). Better is the following overlay:
with two common letters (A and R), but the best is:
Which has three common letters (P, I and L).
The approximation measure appx(word1, word2) for two words is given by:
common letters * 2
length(word1) + length(word2)
Thus, for this example, appx(CAPILLARY, MARSUPIAL) = 6 / (9 + 9) = 1/3. Obviously, for any word W appx(W, W) = 1, which is a nice property, while words with no common letters have an appx value of 0.