Saturday, 5 April 2008

Using SOUNDEX with a Function Based Index in Oracle

The SOUNDEX function returns a phonetic representation of an English word. It can be used to compare words that are spelled differently, but sound alike.

For example.
SELECT ename
FROM emp
WHERE SOUNDEX(ename) = SOUNDEX('janes');
ENAME      
----------
JONES
JAMES
2 rows selected
However, a regular index on the ENAME column will not be used by Oracle if we include the SOUNDEX function.
CREATE INDEX emp_ix1 ON emp(ename);
Index created.

SELECT ename
FROM emp
WHERE SOUNDEX(ename) = SOUNDEX('janes');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1 Bytes=6)
1 0 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=4 Card=1 Bytes=6)
Instead, we must use a function based index.
CREATE INDEX emp_ix2 ON emp(SOUNDEX(ename));
Index created.

SELECT ename
FROM emp
WHERE SOUNDEX(ename) = SOUNDEX('janes');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=6)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=2 Card=1 Bytes=6)
2 1 INDEX (RANGE SCAN) OF 'EMP_IX2' (INDEX) (Cost=1 Card=1)
This time, the function based index is used to retrieve the data.

Incidentally, a Java equivalent is included as part of the Apache Commons project.
import org.apache.commons.codec.language.Soundex;
..
public boolean soundsLike(String firstWord, String secondWord) {
Soundex s = new Soundex();
return s.soundex(firstWord).equals(s.soundex(secondWord));
}

No comments: