Module org.apache.lucene.core
Class Lucene99ScalarQuantizedVectorsWriter
java.lang.Object
org.apache.lucene.codecs.FlatVectorsWriter
org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsWriter
- All Implemented Interfaces: Closeable, AutoCloseable, Accountable
Writes quantized vector values and metadata to index segments.
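As a rough illustration of what "quantized vector values" means here, the sketch below maps float components to small integers using a lower and an upper quantile as clamping bounds. It is not the Lucene implementation: the class name `ScalarQuantizationSketch`, the 0..127 target range, and the rounding rule are all assumptions chosen for the example.

```java
/** Illustrative sketch only; this is not the Lucene implementation. */
final class ScalarQuantizationSketch {
    // Assumed for illustration: quantized values occupy the range 0..127.
    static final int MAX_QUANTIZED = 127;

    /** Clamps value to [lowerQuantile, upperQuantile] and maps it linearly to 0..127. */
    static byte quantize(float value, float lowerQuantile, float upperQuantile) {
        float clamped = Math.max(lowerQuantile, Math.min(upperQuantile, value));
        float scale = MAX_QUANTIZED / (upperQuantile - lowerQuantile);
        return (byte) Math.round((clamped - lowerQuantile) * scale);
    }

    /** Quantizes every component of a vector with the same bounds. */
    static byte[] quantize(float[] vector, float lowerQuantile, float upperQuantile) {
        byte[] out = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            out[i] = quantize(vector[i], lowerQuantile, upperQuantile);
        }
        return out;
    }
}
```

Values outside the quantile bounds are clamped, which is why the choice of bounds (and therefore the confidence interval passed to the constructor) affects how much precision is lost.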
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
(package private) static class
(package private) static class
(package private) static class
    Returns a merged view over all the segment's QuantizedByteVectorValues.
private static final class
private static class
private static class
(package private) static final class
-
Field Summary
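Fields
Modifier and Type | Field | Description
private final Float confidenceInterval
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
private boolean finished
private final IndexOutput meta
private static final float QUANTILE_RECOMPUTE_LIMIT
private final IndexOutput quantizedVectorData
private final FlatVectorsWriter rawVectorDelegate
private static final float REQUANTIZATION_LIMIT
private final SegmentWriteState segmentWriteState
private static final long SHALLOW_RAM_BYTES_USED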
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Constructors
Constructor | Description
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate)
-
Method Summary
Modifier and Type | Method | Description
FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter)
    Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
void close()
void finish()
    Called once at the end, before close.
void flush(int maxDoc, Sorter.DocMap sortMap)
    Flush all buffered data to disk.
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
(package private) static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, float confidenceInterval)
void mergeOneField(FieldInfo fieldInfo, MergeState mergeState)
    Write field for merging.
CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState)
    Write the field for merging, providing a scorer over the newly merged flat vectors.
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState)
(package private) static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, List<Integer> segmentSizes, float confidenceInterval)
private ScalarQuantizer mergeQuantiles(FieldInfo fieldInfo, MergeState mergeState)
long ramBytesUsed()
    Return the memory usage of this object in bytes.
(package private) static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
    Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
(package private) static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
    Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state.
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc)
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField)
private static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues)
    Writes the vector values to the output and returns the set of documents that contain vectors.
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData)
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap)
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
SHALLOW_RAM_BYTES_USED
private static final long SHALLOW_RAM_BYTES_USED
-
QUANTILE_RECOMPUTE_LIMIT
private static final float QUANTILE_RECOMPUTE_LIMIT
-
REQUANTIZATION_LIMIT
private static final float REQUANTIZATION_LIMIT
-
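segmentWriteState
private final SegmentWriteState segmentWriteState
-
fields
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
-
meta
private final IndexOutput meta
-
quantizedVectorData
private final IndexOutput quantizedVectorData
-
confidenceInterval
private final Float confidenceInterval
-
rawVectorDelegate
private final FlatVectorsWriter rawVectorDelegate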
-
finished
private boolean finished
-
-
Constructor Details
-
Lucene99ScalarQuantizedVectorsWriter
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate) throws IOException
- Throws: IOException
-
-
Method Details
-
addField
public FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter) throws IOException
Description copied from class: FlatVectorsWriter
Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
- Specified by: addField in class FlatVectorsWriter
- Parameters:
  fieldInfo - fieldInfo of the field to add
  indexWriter - the writer to delegate to, can be null
- Returns: a writer for the field
- Throws: IOException - if an I/O error occurs when adding the field
-
mergeOneField
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: FlatVectorsWriter
Write field for merging.
- Overrides: mergeOneField in class FlatVectorsWriter
- Throws: IOException
-
mergeOneFieldToIndex
public CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: FlatVectorsWriter
Write the field for merging, providing a scorer over the newly merged flat vectors. This way any additional merging logic can be implemented by the user of this class.
- Specified by: mergeOneFieldToIndex in class FlatVectorsWriter
- Parameters:
  fieldInfo - fieldInfo of the field to merge
  mergeState - mergeState of the segments to merge
- Returns: a scorer over the newly merged flat vectors, which should be closed as it holds a temporary file handle to read over the newly merged vectors
- Throws: IOException - if an I/O error occurs when merging
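Because the returned scorer supplier holds a temporary file handle, callers would normally scope it with try-with-resources. The sketch below shows that pattern with a stand-in type rather than the real Lucene class: `TempBackedSupplierSketch` and its methods are hypothetical, and deleting the temporary file on close is an assumption made for the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a closeable resource backed by a temporary file. */
final class TempBackedSupplierSketch implements Closeable {
    private final Path tempFile;

    TempBackedSupplierSketch() throws IOException {
        // Stands in for the temporary file that holds newly merged vectors.
        this.tempFile = Files.createTempFile("merged-vectors", ".tmp");
    }

    Path tempFile() {
        return tempFile;
    }

    @Override
    public void close() throws IOException {
        // Assumed behavior for this sketch: closing removes the temporary file.
        Files.deleteIfExists(tempFile);
    }

    /** Uses the resource inside try-with-resources, then reports whether the file is still there. */
    static boolean tempFileSurvivesClose() {
        try {
            Path path;
            try (TempBackedSupplierSketch supplier = new TempBackedSupplierSketch()) {
                path = supplier.tempFile();
                // ... build any additional index structures from the supplier here ...
            }
            return Files.exists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this pattern the resource is released even if the code inside the block throws.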
-
flush
public void flush(int maxDoc, Sorter.DocMap sortMap) throws IOException
Description copied from class: FlatVectorsWriter
Flush all buffered data to disk.
- Specified by: flush in class FlatVectorsWriter
- Throws: IOException
-
finish
public void finish() throws IOException
Description copied from class: FlatVectorsWriter
Called once at the end, before close.
- Specified by: finish in class FlatVectorsWriter
- Throws: IOException
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
writeField
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc) throws IOException
- Throws: IOException
-
writeMeta
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField) throws IOException
- Throws: IOException
-
writeQuantizedVectors
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData) throws IOException
- Throws: IOException
-
writeSortingField
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap) throws IOException
- Throws: IOException
-
writeSortedQuantizedVectors
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap) throws IOException
- Throws: IOException
-
mergeQuantiles
private ScalarQuantizer mergeQuantiles(FieldInfo fieldInfo, MergeState mergeState) throws IOException
- Throws: IOException
-
mergeOneFieldToIndex
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState) throws IOException
- Throws: IOException
-
mergeQuantiles
static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, List<Integer> segmentSizes, float confidenceInterval)
-
shouldRecomputeQuantiles
static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
- Parameters:
  mergedQuantizationState - The merged quantization state
  quantizationStates - The quantization states of the individual segments
- Returns: true if the quantiles should be recomputed
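One way to picture the "too far" test is a per-segment drift check against the merged bounds, sketched below. This is not the Lucene code: the `Bounds` type, the `limit` parameter, and the absolute-difference rule are assumptions; the page does not state the value of QUANTILE_RECOMPUTE_LIMIT or how it is applied.

```java
import java.util.List;

/** Illustrative drift check; the types and threshold rule are hypothetical. */
final class QuantileDriftSketch {
    /** Minimal stand-in for a quantization state: a lower and an upper quantile. */
    static final class Bounds {
        final float lower;
        final float upper;

        Bounds(float lower, float upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

    /** Returns true if any segment's bounds differ from the merged bounds by more than limit. */
    static boolean shouldRecompute(Bounds merged, List<Bounds> perSegment, float limit) {
        for (Bounds segment : perSegment) {
            if (Math.abs(segment.lower - merged.lower) > limit
                    || Math.abs(segment.upper - merged.upper) > limit) {
                return true;
            }
        }
        return false;
    }
}
```

A single outlying segment is enough to trigger recomputation in this sketch, which matches the wording "too far from the quantiles of the individual states" only loosely.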
-
getQuantizedKnnVectorsReader
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
-
getQuantizedState
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
-
mergeAndRecalculateQuantiles
static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, float confidenceInterval) throws IOException
- Throws: IOException
-
shouldRequantize
static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state. This implies that floating point values would shift quantization buckets slightly.
- Parameters:
  existingQuantiles - The existing quantiles for a segment
  newQuantiles - The new quantiles for a segment; these may be merged or fully recalculated
- Returns: true if the floating point values should be requantized
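To see why shifted quantiles matter, the sketch below quantizes the same float under two sets of bounds and shows it landing in different buckets, then applies a tolerance expressed in bucket widths. It is an illustration under stated assumptions: the class `RequantizeSketch`, the 127-bucket range, and the `maxShiftInBuckets` rule are hypothetical and are not taken from REQUANTIZATION_LIMIT.

```java
/** Illustrative sketch; bucket count and tolerance rule are assumptions. */
final class RequantizeSketch {
    // Assumed for illustration: values are mapped onto buckets 0..127.
    static final int BUCKETS = 127;

    /** Returns the bucket a value falls into for the given quantile bounds. */
    static int bucket(float value, float lower, float upper) {
        float clamped = Math.max(lower, Math.min(upper, value));
        return Math.round((clamped - lower) * BUCKETS / (upper - lower));
    }

    /**
     * Hypothetical rule: requantize when either bound moved by more than
     * maxShiftInBuckets bucket widths of the existing quantization.
     */
    static boolean shouldRequantize(
            float existingLower, float existingUpper,
            float newLower, float newUpper,
            float maxShiftInBuckets) {
        float bucketWidth = (existingUpper - existingLower) / BUCKETS;
        float tolerance = maxShiftInBuckets * bucketWidth;
        return Math.abs(existingLower - newLower) > tolerance
                || Math.abs(existingUpper - newUpper) > tolerance;
    }
}
```

For example, 0.5 falls in bucket 64 with bounds (0, 1) but in bucket 58 with bounds (0, 1.1), so bytes written under the old bounds would no longer agree with the new quantizer.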
-
writeQuantizedVectorData
private static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues) throws IOException
Writes the vector values to the output and returns the set of documents that contain vectors.
- Throws: IOException
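The write-and-track pattern described above can be sketched with plain JDK types. Everything in this example is a stand-in: `DataOutputStream` replaces IndexOutput, a `SortedMap` of doc id to bytes replaces QuantizedByteVectorValues, a `BitSet` replaces DocsWithFieldSet, and the on-disk layout (raw bytes only, in doc id order) is an assumption rather than the actual file format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative sketch using JDK stand-ins for the Lucene types. */
final class QuantizedVectorDataSketch {
    /** Writes each vector's bytes in doc id order and records which docs had a vector. */
    static BitSet write(DataOutputStream out, SortedMap<Integer, byte[]> vectorsByDoc)
            throws IOException {
        BitSet docsWithField = new BitSet();
        for (Map.Entry<Integer, byte[]> entry : vectorsByDoc.entrySet()) {
            out.write(entry.getValue());
            docsWithField.set(entry.getKey());
        }
        return docsWithField;
    }

    /** Writes two 2-byte vectors for docs 0 and 3; returns "bytesWritten:docsWithField". */
    static String demo() {
        SortedMap<Integer, byte[]> vectors = new TreeMap<>();
        vectors.put(0, new byte[] {1, 2});
        vectors.put(3, new byte[] {3, 4});
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            BitSet docs = write(out, vectors);
            out.flush();
            return bytes.size() + ":" + docs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Returning the document set from the same pass avoids iterating the vector values a second time just to learn which documents have a value.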
-
close
public void close() throws IOException
- Throws: IOException
-