Search/Lucene/Index/SegmentInfo.php
LICENSE
This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.
- Category
- Zend
- Copyright
- Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)
- License
- New BSD License
- Package
- Zend_Search_Lucene
- Subpackage
- Index
- Version
- $Id: SegmentInfo.php 24593 2012-01-05 20:35:02Z matthew $
\Zend_Search_Lucene_Index_SegmentInfo
- Implements
- \Zend_Search_Lucene_Index_TermsStream_Interface
Constants

FULL_SCAN_VS_FETCH_BOUNDARY
= 5
If filter selectivity is less than this value, a full scan is performed, since fetching individual term entries has some additional overhead.
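
The rule above can be sketched as a single comparison. This is only an illustration: it assumes that "selectivity" means the ratio of the term's document frequency to the number of documents admitted by the filter, and the function `shouldFullScan` and its parameters are hypothetical names, not part of this class.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene_Index_SegmentInfo.
const FULL_SCAN_VS_FETCH_BOUNDARY = 5;

/**
 * Decide between scanning the whole postings list and fetching entries
 * only for the filtered documents.
 * Assumption: selectivity = $docFreq / $filterSize.
 */
function shouldFullScan(int $docFreq, int $filterSize): bool
{
    // Equivalent to ($docFreq / $filterSize) < boundary, without dividing by zero.
    return $docFreq < FULL_SCAN_VS_FETCH_BOUNDARY * $filterSize;
}
```

With a filter of 10 documents, a term found in 40 documents would be fully scanned, while a term found in 100 documents would be fetched per filtered document.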
Properties


integer $_delGen
-2 means autodetect latest delete generation
-1 means 'there is no delete file'
0 means pre-2.1 format delete file
X specifies the delete file generation in use
- Type
- integer
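
A minimal sketch of how these values could map to a delete file name. The helper `delFileName` is hypothetical, and the naming scheme (a plain `.del` file before 2.1, and a base-36 generation suffix from 2.1 on) is an assumption based on the Lucene index file format rather than something stated on this page.

```php
<?php
// Hypothetical helper illustrating the $_delGen convention.
// The file naming scheme is an assumption, not taken from this class.
function delFileName(string $segmentName, int $delGen): ?string
{
    if ($delGen === -2) {
        // The latest generation has to be detected before a name can be built.
        throw new InvalidArgumentException('Delete generation is not resolved yet');
    }
    if ($delGen === -1) {
        return null; // there is no delete file
    }
    if ($delGen === 0) {
        return $segmentName . '.del'; // pre-2.1 format
    }
    // 2.1+ format: generation number written in base 36
    return $segmentName . '_' . base_convert((string) $delGen, 10, 36) . '.del';
}
```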


mixed $_deleted = null
bitset if bitset extension is loaded or array otherwise.
- Type
- mixed


array|null $_docMap = null
It is not very efficient in terms of memory usage, but it is much faster than other methods.
- Type
- array | null


array $_fields
Array of Zend_Search_Lucene_Index_FieldInfo objects for this segment
- Type
- array


array $_fieldsDicPositions
(Term dictionary contains fields ordered by names)
- Type
- array


\Zend_Search_Lucene_Storage_File $_frqFile = null

boolean $_hasSingleNormFile
If true, one .nrm file is used for all fields. Otherwise, .fN files are used.
- Type
- boolean


boolean $_isCompound
- Type
- boolean


\Zend_Search_Lucene_Index_TermInfo $_lastTermInfo = null

array|null $_lastTermPositions
Array structure: array( docId => array( pos1, pos2, ...), ...)
Is set to null if term positions loading has to be skipped
- Type
- array | null


array $_norms = array()
An array fieldName => normVector. normVector is a binary string: each byte corresponds to an indexed document in the segment and encodes a normalization factor (a float value encoded by Zend_Search_Lucene_Search_Similarity::encodeNorm()).
- Type
- array
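
As an illustration of the one-byte-per-document layout, the sketch below reads the encoded byte for a single document. The helper `encodedNormByte` and the sample bytes are made up for this example; turning the byte back into a float is left to the Similarity class and is not shown.

```php
<?php
// Hypothetical helper: read the encoded norm byte for one document.
function encodedNormByte(string $normVector, int $docId): int
{
    if ($docId < 0 || $docId >= strlen($normVector)) {
        throw new OutOfRangeException("No norm byte for document $docId");
    }
    // Byte N of the vector belongs to document N of the segment.
    return ord($normVector[$docId]);
}

// Made-up data: a segment with three documents and one indexed field.
$norms = array('title' => "\x7C\x78\x00");
$byte  = encodedNormByte($norms['title'], 1);
```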


\Zend_Search_Lucene_Storage_File $_prxFile = null

array $_segFileSizes
- Type
- array


array $_segFiles
- Type
- array


array $_termDictionary
Array of arrays (Zend_Search_Lucene_Index_Term objects are represented as arrays for performance reasons):
[0] -> $termValue
[1] -> $termFieldNum
The corresponding Zend_Search_Lucene_Index_TermInfo object is stored in $_termDictionaryInfos.
- Type
- array


array $_termDictionaryInfos
Array of arrays (Zend_Search_Lucene_Index_TermInfo objects are represented as arrays because of performance considerations):
[0] -> $docFreq
[1] -> $freqPointer
[2] -> $proxPointer
[3] -> $skipOffset
[4] -> $indexPointer
- Type
- array
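
The two dictionaries can be pictured as parallel arrays. The sketch below uses made-up sample data and a hypothetical `describeEntry` helper, and it assumes that the entry at a given index in one array corresponds to the entry at the same index in the other.

```php
<?php
// Made-up sample data showing the plain-array layout of both dictionaries.
$termDictionary = array(
    array('apple', 0),   // [0] term value, [1] field number
    array('banana', 0),
);
$termDictionaryInfos = array(
    // docFreq, freqPointer, proxPointer, skipOffset, indexPointer
    array(3, 0, 0, 0, 0),
    array(1, 12, 20, 0, 128),
);

// Hypothetical helper: combine a term entry with its info entry.
// Assumption: both arrays use the same index for the same term.
function describeEntry(array $terms, array $infos, int $i): string
{
    list($value, $fieldNum) = $terms[$i];
    $docFreq = $infos[$i][0];
    return sprintf('%s (field %d): docFreq=%d', $value, $fieldNum, $docFreq);
}
```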


array $_termInfoCache = array()
Size is 1024. Numbers are used instead of class constants because of performance considerations
- Type
- array


integer $_termsScanMode
Values:
self::SM_TERMS_ONLY - terms are scanned, no additional info is retrieved
self::SM_FULL_INFO - terms are scanned, frequency and position info is retrieved
self::SM_MERGE_INFO - terms are scanned, frequency and position info is retrieved, document numbers are compacted (shifted if segment has deleted documents)
- Type
- integer


\Zend_Search_Lucene_Storage_File $_tisFile = null
Methods


__construct(\Zend_Search_Lucene_Storage_Directory $directory, string $name, integer $docCount, integer $delGen = 0, array | null $docStoreOptions = null, boolean $hasSingleNormFile = false, boolean $isCompound = null) : void
Zend_Search_Lucene_Index_SegmentInfo constructor
Name | Type | Description |
---|---|---|
$directory | \Zend_Search_Lucene_Storage_Directory | |
$name | string | |
$docCount | integer | |
$delGen | integer | |
$docStoreOptions | array\|null | |
$hasSingleNormFile | boolean | |
$isCompound | boolean |


_detectLatestDelGen() : integer
Detect latest delete generation
It is used from the writeChanges() method, or from the constructor when it is invoked by the index writer. In both cases the index write lock has already been obtained, so locking does not need to be handled here.
Type | Description |
---|---|
integer |


_getFieldPosition(integer $fieldNum) : integer
Get field position in a fields dictionary
Name | Type | Description |
---|---|---|
$fieldNum | integer |
Type | Description |
---|---|
integer |


_load21DelFile() : mixed
Load 2.1+ format deletions file
Returns bitset or an array depending on bitset extension availability
Type | Description |
---|---|
mixed |


_loadDelFile() : mixed
Load deletions file
Returns bitset or an array depending on bitset extension availability
Type | Description |
---|---|
mixed |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


_loadDictionaryIndex() : void
Load terms dictionary index
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


_loadNorm(integer $fieldNum) : void
Load normalization factors from an index file
Name | Type | Description |
---|---|---|
$fieldNum | integer |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


_loadPre21DelFile() : mixed
Load pre-2.1 deletions file
Returns bitset or an array depending on bitset extension availability
Type | Description |
---|---|
mixed |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


closeTermsStream() : void
Close terms stream
Should be used for resource cleanup if the stream is not read to the end


compoundFileLength(string $extension) : integer
Get compound file length
Name | Type | Description |
---|---|---|
$extension | string |
Type | Description |
---|---|
integer |


count() : integer
Returns the total number of documents in this segment (including deleted documents).
Type | Description |
---|---|
integer |


currentTerm() : \Zend_Search_Lucene_Index_Term | null
Returns term in current position
Type | Description |
---|---|
\Zend_Search_Lucene_Index_Term\|null |


currentTermPositions() : array
Returns an array of all term positions in the documents.
Return array structure: array( docId => array( pos1, pos2, ...), ...)
Type | Description |
---|---|
array |


delete(integer $id) : void
Deletes a document from the index segment.
$id is an internal document id
Name | Type | Description |
---|---|---|
$id | integer |


getField(integer $fieldNum) : \Zend_Search_Lucene_Index_FieldInfo
Returns field info for specified field
Name | Type | Description |
---|---|---|
$fieldNum | integer |
Type | Description |
---|---|
\Zend_Search_Lucene_Index_FieldInfo |


getFieldNum(string $fieldName) : integer
Returns field index or -1 if field is not found
Name | Type | Description |
---|---|---|
$fieldName | string |
Type | Description |
---|---|
integer |


getFields(boolean $indexed = false) : array
Returns array of fields.
If the $indexed parameter is true, only indexed fields are returned.
Name | Type | Description |
---|---|---|
$indexed | boolean |
Type | Description |
---|---|
array |


getTermInfo(\Zend_Search_Lucene_Index_Term $term) : \Zend_Search_Lucene_Index_TermInfo
Scans terms dictionary and returns term info
Name | Type | Description |
---|---|---|
$term | \Zend_Search_Lucene_Index_Term |
Type | Description |
---|---|
\Zend_Search_Lucene_Index_TermInfo |


hasDeletions() : boolean
Returns true if any documents have been deleted from this index segment.
Type | Description |
---|---|
boolean |


hasSingleNormFile() : boolean
Returns true if segment has single norms file.
Type | Description |
---|---|
boolean |


isCompound() : boolean
Returns true if segment is stored using compound segment file.
Type | Description |
---|---|
boolean |


isDeleted(integer $id) : boolean
Checks whether the document is deleted
Name | Type | Description |
---|---|---|
$id | integer |
Type | Description |
---|---|
boolean |


nextTerm() : \Zend_Search_Lucene_Index_Term | null
Scans terms dictionary and returns next term
Type | Description |
---|---|
\Zend_Search_Lucene_Index_Term\|null |


norm(integer $id, string $fieldName) : float
Returns normalization factor for the specified document
Name | Type | Description |
---|---|---|
$id | integer | |
$fieldName | string |
Type | Description |
---|---|
float |


normVector(string $fieldName) : string
Returns norm vector, encoded in a byte string
Name | Type | Description |
---|---|---|
$fieldName | string |
Type | Description |
---|---|
string |


numDocs() : integer
Returns the total number of non-deleted documents in this segment.
Type | Description |
---|---|
integer |
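
The difference between count() and numDocs() can be sketched as a subtraction. This assumes the array form of the deletions list (used when the bitset extension is not loaded) holds one entry per deleted document id; the function `liveDocCount` is a hypothetical name for illustration.

```php
<?php
// Hypothetical helper: number of non-deleted documents in a segment.
// Assumption: $deleted has exactly one entry per deleted document id.
function liveDocCount(int $docCount, array $deleted): int
{
    return $docCount - count($deleted);
}

// Made-up segment: 5 documents, of which ids 1 and 3 are deleted.
$live = liveDocCount(5, array(1 => 1, 3 => 1));
```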


openCompoundFile(string $extension, boolean $shareHandler = true) : \Zend_Search_Lucene_Storage_File
Opens index file stored within compound index file
Name | Type | Description |
---|---|---|
$extension | string | |
$shareHandler | boolean |
Type | Description |
---|---|
\Zend_Search_Lucene_Storage_File |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


resetTermsStream() : integer
Reset terms stream
$startId - id for the first document
$compact - remove deleted documents
Returns start document id for the next segment
Type | Description |
---|---|
integer |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


skipTo(\Zend_Search_Lucene_Index_Term $prefix) : void
Skip terms stream up to the specified term prefix.
The prefix contains fully specified field info and a portion of the searched term
Name | Type | Description |
---|---|---|
$prefix | \Zend_Search_Lucene_Index_Term |
Exception | Description |
---|---|
\Zend_Search_Lucene_Exception |


termDocs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array
Returns IDs of all the documents containing term.
Name | Type | Description |
---|---|---|
$term | \Zend_Search_Lucene_Index_Term | |
$shift | integer | |
$docsFilter | \Zend_Search_Lucene_Index_DocsFilter\|null |
Type | Description |
---|---|
array |


termFreqs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array
Returns term freqs array.
Result array structure: array(docId => freq, ...)
Name | Type | Description |
---|---|---|
$term | \Zend_Search_Lucene_Index_Term | |
$shift | integer | |
$docsFilter | \Zend_Search_Lucene_Index_DocsFilter\|null |
Type | Description |
---|---|
array |


termPositions(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array
Returns term positions array.
Result array structure: array(docId => array(pos1, pos2, ...), ...)
Name | Type | Description |
---|---|---|
$term | \Zend_Search_Lucene_Index_Term | |
$shift | integer | |
$docsFilter | \Zend_Search_Lucene_Index_DocsFilter\|null |
Type | Description |
---|---|
array |
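
The two result structures documented for termFreqs() and termPositions() are closely related: a term's frequency in a document is the number of positions at which it occurs. The sketch below shows this on made-up data, using a hypothetical helper `freqsFromPositions` that is not part of this class.

```php
<?php
// Hypothetical helper: derive array(docId => freq) from
// array(docId => array(pos1, pos2, ...)).
function freqsFromPositions(array $positions): array
{
    // array_map() keeps the docId keys when given a single array.
    return array_map('count', $positions);
}

// Made-up positions: the term occurs twice in document 2 and once in document 5.
$positions = array(2 => array(0, 7), 5 => array(3));
$freqs     = freqsFromPositions($positions);
```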