A Quantizer converts a fine-grained or continuous value to a discrete or coarse-grained value.
This can be used to control the granularity of indexes, that is, the tradeoff between the storage space
required by indexes and the query processing time spent looking up objects. It can also be used to convert
continuous values to discrete values, given the inherent challenges in indexing continuous values.
Example uses:
- Index adjacent integers to fewer coarse-grained keys with a compression factor.
Store objects whose integer attributes have adjacent values against the same key in indexes, to reduce
the overall number of keys in the index.
For example, objects with an integer "price" attribute: store objects with price 0-4 against the same key,
objects with price 5-9 against the next key, etc., potentially reducing the number of keys
in the index by a factor of five. The compression factor (5 in this example) can be varied.
See: IntegerQuantizer, LongQuantizer, BigIntegerQuantizer
- Index attributes with continuous values by quantizing them to discrete values, optionally
applying compression as well.
For example, objects with "price" stored with arbitrary precision (Float, Double, BigDecimal etc.).
If one object has price 5.00, and another has price 5.0000001,
these objects would by default be stored against different keys, leading to an arbitrarily large number
of keys for potentially small ranges in price. Quantizing such values to a discrete value
(for example, their integer part) stores these objects against the same key.
See: FloatQuantizer, DoubleQuantizer, BigDecimalQuantizer
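The two uses above can be sketched in a few lines of plain Java. This is a minimal illustration of the arithmetic only, not the implementation of the quantizer classes named above: the class QuantizerSketch and its methods quantizeInt and quantizeDouble are hypothetical names, and the choice to round towards negative infinity (so that negative values also fall into equal-width buckets) is an assumption of this sketch.

```java
// Hypothetical helper for illustration; not the library's own quantizer classes.
final class QuantizerSketch {

    private QuantizerSketch() {
    }

    /**
     * Maps an integer to the lowest value of its bucket.
     * With compressionFactor 5: 0-4 map to 0, 5-9 map to 5, and so on.
     * Assumption: floorDiv is used so that negative values get equal-width buckets too.
     */
    static int quantizeInt(int value, int compressionFactor) {
        if (compressionFactor < 1) {
            throw new IllegalArgumentException("compressionFactor must be >= 1");
        }
        return Math.floorDiv(value, compressionFactor) * compressionFactor;
    }

    /**
     * Maps a continuous value to a discrete one by discarding the fractional part,
     * then optionally groups adjacent whole numbers with a compression factor.
     * A compressionFactor of 1 only discards the fractional part.
     */
    static double quantizeDouble(double value, int compressionFactor) {
        if (compressionFactor < 1) {
            throw new IllegalArgumentException("compressionFactor must be >= 1");
        }
        return Math.floor(value / compressionFactor) * compressionFactor;
    }

    public static void main(String[] args) {
        System.out.println(quantizeInt(3, 5));            // 0
        System.out.println(quantizeInt(7, 5));            // 5
        System.out.println(quantizeDouble(5.00, 1));      // 5.0
        System.out.println(quantizeDouble(5.0000001, 1)); // 5.0
        System.out.println(quantizeDouble(7.3, 5));       // 5.0
    }
}
```

In this sketch, prices 5.00 and 5.0000001 both quantize to 5.0 and would therefore share one key, while a compression factor of 5 additionally folds every value from 5.0 up to (but not including) 10.0 into that same key.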
When an index is configured with a Quantizer and added to a collection, the index will use the function
implemented by the Quantizer to generate the keys against which it will store objects in the index.
Subsequently, when the index receives a query, it will use the same quantizer to determine, from the
sought values in the query, the relevant keys against which it can find matching objects in the index.
Given that the set of objects stored against a quantized key can be larger than the set actually sought
in the query, the index will then filter the retrieved set on-the-fly down to the objects actually matching the query.
As such, quantization allows the size of indexes to be reduced, but trades this for additional CPU overhead
in filtering retrieved objects.
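The store, look up, and filter steps described above can be illustrated with a toy index. This is a hedged sketch under stated assumptions, not how the real index classes are written: Product and QuantizedPriceIndex are hypothetical names, the index supports only equality lookups on an integer price, and the quantizer is passed in as a plain IntUnaryOperator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical domain object for illustration.
final class Product {
    final String name;
    final int price;

    Product(String name, int price) {
        this.name = name;
        this.price = price;
    }
}

// Hypothetical toy index: stores products against quantized price keys.
final class QuantizedPriceIndex {
    private final IntUnaryOperator quantizer;
    private final Map<Integer, List<Product>> buckets = new HashMap<>();

    QuantizedPriceIndex(IntUnaryOperator quantizer) {
        this.quantizer = quantizer;
    }

    /** Stores the product against the key generated by the quantizer. */
    void add(Product product) {
        int key = quantizer.applyAsInt(product.price);
        buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(product);
    }

    /** Number of distinct keys currently held by the index. */
    int keyCount() {
        return buckets.size();
    }

    /** Finds products whose price equals soughtPrice. */
    List<Product> findEqual(int soughtPrice) {
        // Step 1: quantize the sought value with the same quantizer to find the key.
        int key = quantizer.applyAsInt(soughtPrice);
        List<Product> candidates = buckets.getOrDefault(key, Collections.<Product>emptyList());
        // Step 2: the bucket may hold other prices too, so filter to exact matches.
        List<Product> matches = new ArrayList<>();
        for (Product candidate : candidates) {
            if (candidate.price == soughtPrice) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Compression factor 5: prices 0-4 share one key, prices 5-9 share the next.
        QuantizedPriceIndex index = new QuantizedPriceIndex(p -> Math.floorDiv(p, 5) * 5);
        for (int price = 0; price < 10; price++) {
            index.add(new Product("item" + price, price));
        }
        System.out.println(index.keyCount());            // 2 keys instead of 10
        System.out.println(index.findEqual(7).size());   // 1 match after filtering
    }
}
```

The filtering loop in findEqual is where the extra CPU cost appears: the lookup for price 7 retrieves all five products stored under key 5 and discards four of them.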