Class LCSubstringSolver.ConcurrentSuffixTreeImpl<V>
java.lang.Object
com.googlecode.concurrenttrees.radix.ConcurrentRadixTree<V>
com.googlecode.concurrenttrees.solver.LCSubstringSolver.ConcurrentSuffixTreeImpl<V>
- All Implemented Interfaces:
PrettyPrintable
,RadixTree<V>
,Serializable
- Enclosing class:
LCSubstringSolver
-
Nested Class Summary
Nested classes/interfaces inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
ConcurrentRadixTree.KeyValuePairImpl<O>, ConcurrentRadixTree.NodeKeyPair
-
Field Summary
Fields inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
root
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprotected void
(package private) CharSequence
The main algorithm to find the longest common substring.protected Iterable
<ConcurrentRadixTree.NodeKeyPair> lazyTraverseDescendants
(CharSequence startKey, Node startNode) Traverses the tree using depth-first, preordered traversal, starting at the given node, using lazy evaluation such that the next node is only determined when next() is called on the iterator returned.protected void
(package private) boolean
subTreeReferencesAllOriginalDocuments
(CharSequence startKey, Node startNode) Returns true if the given node and its descendants collectively reference all original documents added to the solver.Methods inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
getClosestKeys, getKeysStartingWith, getKeyValuePairsForClosestKeys, getKeyValuePairsForKeysStartingWith, getNode, getValueForExactKey, getValuesForClosestKeys, getValuesForKeysStartingWith, put, putIfAbsent, remove, size, transformKeyForResult
-
Constructor Details
-
ConcurrentSuffixTreeImpl
-
-
Method Details
-
acquireWriteLock
protected void acquireWriteLock()- Overrides:
acquireWriteLock
in classConcurrentRadixTree<V>
-
releaseWriteLock
protected void releaseWriteLock()- Overrides:
releaseWriteLock
in classConcurrentRadixTree<V>
-
lazyTraverseDescendants
protected Iterable<ConcurrentRadixTree.NodeKeyPair> lazyTraverseDescendants(CharSequence startKey, Node startNode) Description copied from class:ConcurrentRadixTree
Traverses the tree using depth-first, preordered traversal, starting at the given node, using lazy evaluation such that the next node is only determined when next() is called on the iterator returned. The traversal algorithm uses iteration instead of recursion to allow deep trees to be traversed without requiring large JVM stack sizes. Each node that is encountered is returned from the iterator along with a key associated with that node, in a NodeKeyPair object. The key will be prefixed by the given start key, and will be generated by appending to the start key the edges traversed along the path to that node from the start node.- Overrides:
lazyTraverseDescendants
in classConcurrentRadixTree<V>
- Parameters:
startKey
- The key which matches the given start nodestartNode
- The start node- Returns:
- An iterator which when iterated traverses the tree using depth-first, preordered traversal, starting at the given start node
-
getLongestCommonSubstring
CharSequence getLongestCommonSubstring()The main algorithm to find the longest common substring.- Traverses all nodes in the suffix tree
- For each node checks if the path from the root via edges to that node is longer than the longest common substring encountered so far (and so is a candidate)
-
Calls helper method
subTreeReferencesAllOriginalDocuments(CharSequence, Node)
, supplying the candidate node. That method returns true if nodes in the sub-tree descending from that node collectively references all of the original documents added to the solver - If the nodes in the sub-tree do collectively reference all documents, then the path from root to the current node is a substring of all documents
- Updates the longest common substring encountered so far if the conditions above hold for the current node
- Continues traversing the tree until all nodes have been checked
subTreeReferencesAllOriginalDocuments(CharSequence, Node)
will stop traversal early if it finds all original documents early. This method currently does not apply a similar optimization, and will actually descend into and apply the same tests to branches which the helper method already indicated are dead-ends(!). Future work might be to use this knowledge, skip dead-end branches, but it would involve not using any of the traversal logic from superclasses and overriding it all here for this one use case.- Returns:
- The longest common substring
-
subTreeReferencesAllOriginalDocuments
Returns true if the given node and its descendants collectively reference all original documents added to the solver. This method will traverse the entire sub-tree until it has encountered all of the original documents. If it encounters all of the original documents early, before exhausting all nodes, returns early.- Parameters:
startKey
- The key associated with the start node (concatenation of edges from root leading to it)startNode
- The root of the sub-tree to traverse- Returns:
- True if the given node and its descendants collectively reference all original documents added to the solver, false if the sub-tree does not reference all documents added to the solver
-