com.ctc.wstx.util
Class WordSet

java.lang.Object
  extended by com.ctc.wstx.util.WordSet

public final class WordSet
extends java.lang.Object

An efficient (in both memory and time) implementation of a Set used to verify that a given word is contained within the set. The expected usage pattern is that most checks are positive, i.e. that the word is indeed contained in the set.

Performance of the set is comparable to that of TreeSet for Strings, i.e. 2-3x slower than HashSet when using pre-constructed Strings. This is generally a result of the algorithmic complexity of the structures: WordSet and TreeSet lookups are roughly logarithmic in the size of the whole data set, whereas a HashSet lookup is linear in the length of the key. However:

Although this is an efficient set for a specific set of usage patterns, one restriction is that the full set of words to include has to be known before constructing the set. Also, the size of the set is limited to a total word content of about 20k characters; the factory method verifies the limit and indicates if an instance cannot be created.
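The usage pattern described above (build once from a complete TreeSet, then test char[] slices without creating Strings) can be sketched with plain JDK classes. The class below is a hypothetical stand-in, not the WordSet implementation: the name SortedWordLookup, the MAX_TOTAL_CHARS constant, the exclusive 'end' index, and returning null when the limit is exceeded are all assumptions made for illustration. It uses a sorted array with binary search, so lookups are logarithmic in the number of words.

```java
import java.util.TreeSet;

/**
 * Hypothetical stand-in for illustration only; this is NOT the Woodstox
 * WordSet implementation, just a sorted-array lookup with a similar API shape.
 */
final class SortedWordLookup {
    // Assumption: mirrors the "about 20k characters" limit mentioned in the text.
    static final int MAX_TOTAL_CHARS = 20000;

    // Words in natural String order, as iterated from the TreeSet.
    private final char[][] words;

    private SortedWordLookup(char[][] words) {
        this.words = words;
    }

    /**
     * Builds the lookup from a complete word set (natural ordering assumed).
     * Returns null when the total content exceeds the limit; how the real
     * factory method signals this is not stated in the documentation.
     */
    static SortedWordLookup construct(TreeSet<String> wordSet) {
        char[][] sorted = new char[wordSet.size()][];
        int total = 0;
        int i = 0;
        for (String w : wordSet) { // TreeSet iterates in ascending order
            total += w.length();
            if (total > MAX_TOTAL_CHARS) {
                return null;
            }
            sorted[i++] = w.toCharArray();
        }
        return new SortedWordLookup(sorted);
    }

    /** Checks buf[start, end) against the set; 'end' is assumed to be exclusive. */
    boolean contains(char[] buf, int start, int end) {
        int lo = 0;
        int hi = words.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = compare(words[mid], buf, start, end);
            if (cmp == 0) {
                return true;
            }
            if (cmp < 0) {
                lo = mid + 1; // stored word sorts before the key
            } else {
                hi = mid - 1; // stored word sorts after the key
            }
        }
        return false;
    }

    /** Convenience overload for callers that already hold a String. */
    boolean contains(String str) {
        char[] chars = str.toCharArray();
        return contains(chars, 0, chars.length);
    }

    // Same ordering as String.compareTo: first differing char, then length.
    private static int compare(char[] word, char[] buf, int start, int end) {
        int len = end - start;
        int n = Math.min(word.length, len);
        for (int i = 0; i < n; i++) {
            int diff = word[i] - buf[start + i];
            if (diff != 0) {
                return diff;
            }
        }
        return word.length - len;
    }

    public static void main(String[] args) {
        TreeSet<String> names = new TreeSet<>();
        names.add("xml");
        names.add("xmlns");
        names.add("lang");
        SortedWordLookup set = SortedWordLookup.construct(names);

        char[] buf = "<xmlns:a>".toCharArray();
        System.out.println(set.contains(buf, 1, 6)); // slice is "xmlns"
        System.out.println(set.contains(buf, 1, 4)); // slice is "xml"
        System.out.println(set.contains("space"));   // not in the set
    }
}
```

The slice-based overload is the point of the design: a parser that already holds its input in a char[] buffer can test a name in place, without allocating a temporary String for each check.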


Method Summary
static char[] constructRaw(java.util.TreeSet wordSet)
           
static WordSet constructSet(java.util.TreeSet wordSet)
           
static boolean contains(char[] data, char[] str, int start, int end)
           
 boolean contains(char[] buf, int start, int end)
           
static boolean contains(char[] data, java.lang.String str)
           
 boolean contains(java.lang.String str)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

constructSet

public static WordSet constructSet(java.util.TreeSet wordSet)

constructRaw

public static char[] constructRaw(java.util.TreeSet wordSet)

contains

public boolean contains(char[] buf,
                        int start,
                        int end)

contains

public static boolean contains(char[] data,
                               char[] str,
                               int start,
                               int end)

contains

public boolean contains(java.lang.String str)

contains

public static boolean contains(char[] data,
                               java.lang.String str)