OEFPHistogram

Attention

This is a preliminary API until 2019.Apr and may be improved based on user feedback. It is currently available in C++ and Python.

class OEFPHistogram

The OEFPHistogram class is used to hold a histogram of similarity scores.

This class is an efficient way to gather statistics on large databases for which keeping the full NxN similarity matrix would be infeasible.
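To make the scale argument concrete, the following back-of-the-envelope sketch compares the two storage costs. The database size of one million fingerprints, the 4-byte score, and the 8-byte counter are illustrative assumptions, not figures taken from the toolkit.

```python
# Rough memory comparison; every number here is an illustrative assumption.
n_fingerprints = 1_000_000   # hypothetical database size
bytes_per_score = 4          # one 32-bit float per similarity score
num_bins = 200               # default bin count of the constructor
bytes_per_count = 8          # assumes a 64-bit size_t counter per bin

full_matrix_bytes = n_fingerprints * n_fingerprints * bytes_per_score
histogram_bytes = num_bins * bytes_per_count

print(f"full NxN matrix: {full_matrix_bytes / 1e12:.1f} TB")  # 4.0 TB
print(f"histogram:       {histogram_bytes} bytes")            # 1600 bytes
```

The histogram cost depends only on the number of bins, so it stays constant as the database grows.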

The following classes derive from this class:

Constructors

OEFPHistogram(size_t numBins=200, double min=0.0, double max=1.0)

Creates an OEFPHistogram object with the given number of bins and range.

numBins
Number of bins in the histogram. (default=200)
min
Minimum bound (inclusive) of the histogram range. (default=0.0)
max
Maximum bound (inclusive) of the histogram range. (default=1.0)

Note

The default range of [0,1] covers all built-in similarity measures.

AddSample

size_t AddSample(const float sample)

Adds the sample to the histogram by incrementing the count of the relevant bin. Returns the index of the bin that received the sample.
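The binning step can be pictured with a short Python sketch. It assumes equal-width bins and places the inclusive upper bound in the last bin. The helper name bin_index and the way out-of-range samples are rejected are hypothetical and do not describe the toolkit's actual implementation.

```python
def bin_index(sample, num_bins=200, lo=0.0, hi=1.0):
    """Hypothetical equal-width binning; not the toolkit's actual code."""
    if not lo <= sample <= hi:
        # Assumption: this sketch simply rejects out-of-range samples.
        raise ValueError("sample outside histogram range")
    width = (hi - lo) / num_bins
    # A sample equal to the inclusive upper bound lands in the last bin.
    return min(int((sample - lo) / width), num_bins - 1)

counts = [0] * 200
for score in (0.0, 0.254, 1.0):
    counts[bin_index(score)] += 1   # mimics incrementing the relevant bin
```

With the default settings, a score of 0.254 falls into bin 50 and a score of 1.0 into bin 199.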

GetBinBoundaries

OESystem::OEIterBase<double> *GetBinBoundaries() const

Returns an iterator over the boundaries for each bin.

Note

The number of elements in this iterator is equal to OEFPHistogram::NumBins + 1.
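The extra element comes from counting both ends of the range. A minimal sketch, assuming equal-width bins (the function bin_boundaries is a hypothetical stand-in, not a toolkit call):

```python
def bin_boundaries(num_bins=200, lo=0.0, hi=1.0):
    """Hypothetical helper: edges of equal-width bins, both ends included."""
    width = (hi - lo) / num_bins
    return [lo + i * width for i in range(num_bins + 1)]

edges = bin_boundaries(num_bins=4)
print(edges)       # [0.0, 0.25, 0.5, 0.75, 1.0]
print(len(edges))  # 5, i.e. number of bins + 1
```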

GetBinCenters

OESystem::OEIterBase<double> *GetBinCenters() const

Returns an iterator over the center value for each bin.

Note

The number of elements in this iterator is equal to OEFPHistogram::NumBins.

Hint

It is generally reasonable to plot the results of OEFPHistogram::GetCounts or OEFPHistogram::GetDensity against the bin centers returned by this method.
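As an illustration of that pairing, the sketch below computes centers for equal-width bins and draws a crude text plot. The helper bin_centers and the counts are made up for the example; in real use the values would come from the iterators described on this page.

```python
def bin_centers(num_bins=200, lo=0.0, hi=1.0):
    """Hypothetical helper: midpoint of each equal-width bin."""
    width = (hi - lo) / num_bins
    return [lo + (i + 0.5) * width for i in range(num_bins)]

centers = bin_centers(num_bins=4)   # [0.125, 0.375, 0.625, 0.875]
counts = [3, 0, 5, 2]               # made-up bin counts

# One row per bin: the center on the x axis, the count as a bar.
for center, count in zip(centers, counts):
    print(f"{center:.3f} {'#' * count}")
```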

GetBinIdx

size_t GetBinIdx(const float sample) const

Returns the index of the bin that the sample would fall into, without actually adding the sample to the histogram.

GetBinWidth

double GetBinWidth() const

Returns the width of bins in the histogram.

GetCounts

OESystem::OEIterBase<unsigned int> *GetCounts() const

Returns an iterator over the bin counts.

Note

Even though counts are returned as unsigned integers, they are kept internally as size_t. If the counts overflow the system's unsigned int type, it may be more appropriate to use the normalized OEFPHistogram::GetDensity instead.
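A quick calculation shows that such an overflow is plausible for all-against-all comparisons. The sketch assumes a 32-bit unsigned int and a hypothetical database of 100,000 fingerprints; in the worst case, all pairwise scores would land in a single bin.

```python
UINT32_MAX = 2**32 - 1          # assumes a 32-bit unsigned int: 4294967295
n_fingerprints = 100_000        # hypothetical database size

# Number of distinct pairs in an all-against-all comparison.
pairs = n_fingerprints * (n_fingerprints - 1) // 2

print(pairs)               # 4999950000
print(pairs > UINT32_MAX)  # True
```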

GetDensity

OESystem::OEIterBase<double> *GetDensity() const

Returns an iterator over the probability density for each bin.

Note

The result is equivalent to OEFPHistogram::GetCounts normalized by OEFPHistogram::NumSamples.
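The normalization described in this note amounts to dividing each count by the total. A minimal sketch with made-up counts:

```python
counts = [3, 0, 5, 2]        # made-up bin counts
num_samples = sum(counts)    # plays the role of NumSamples

# Each count divided by the total number of samples.
density = [count / num_samples for count in counts]
print(density)               # [0.3, 0.0, 0.5, 0.2]
```

Normalized this way, the values sum to 1 regardless of how many samples were added.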

GetMax

double GetMax() const

Returns the upper bound (inclusive) of the histogram range.

GetMin

double GetMin() const

Returns the lower bound (inclusive) of the histogram range.

Mean

double Mean()

Approximates and returns the sample mean over the histogram.

Note

This approximation is affected by the number of bins in the histogram.
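The effect of the bin count can be seen in a small sketch. It assumes the mean is estimated by treating every sample as if it sat at the center of its bin; this is one common way to approximate a mean from a histogram and is not confirmed to be the toolkit's method. The function approx_mean and the sample values are hypothetical.

```python
def approx_mean(samples, num_bins, lo=0.0, hi=1.0):
    """Hypothetical estimate: bin the samples, then average the bin centers."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for s in samples:
        counts[min(int((s - lo) / width), num_bins - 1)] += 1
    weighted = sum((lo + (i + 0.5) * width) * c for i, c in enumerate(counts))
    return weighted / sum(counts)

samples = [0.021, 0.043, 0.062, 0.087, 0.601]   # made-up similarity scores
exact = sum(samples) / len(samples)
coarse = approx_mean(samples, num_bins=2)       # very wide bins
fine = approx_mean(samples, num_bins=200)       # default bin count

print(f"exact={exact:.4f} coarse={coarse:.4f} fine={fine:.4f}")
```

Under this assumption, each sample is off by at most half a bin width, so the estimate with 200 bins stays much closer to the exact mean than the estimate with 2 bins.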

NumBins

size_t NumBins() const

Returns the number of bins in the histogram.

NumSamples

size_t NumSamples() const

Returns the total number of samples in the histogram.

Std

double Std()

Approximates and returns the sample standard deviation over the histogram.

Note

This approximation is affected by the number of bins in the histogram.
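A standard deviation can be estimated from binned data in the same spirit. The sketch below assumes samples are represented by their bin centers and uses the n - 1 (Bessel) divisor; both choices are assumptions, since this page does not state how the toolkit computes the value. The function approx_std and the input data are hypothetical.

```python
import math

def approx_std(centers, counts):
    """Hypothetical estimate of the sample standard deviation from bins."""
    n = sum(counts)
    mean = sum(x * c for x, c in zip(centers, counts)) / n
    # Assumption: sample variance with the n - 1 divisor.
    variance = sum(c * (x - mean) ** 2 for x, c in zip(centers, counts)) / (n - 1)
    return math.sqrt(variance)

centers = [0.125, 0.375, 0.625, 0.875]   # centers of four equal-width bins
counts = [1, 2, 2, 1]                    # made-up bin counts
print(f"{approx_std(centers, counts):.4f}")
```

Because all samples within a bin collapse onto one value, wider bins discard more of the spread inside each bin, which is why the result depends on the bin count.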