# OEFPHistogram

Attention

This is a preliminary API until 2019.Apr and may be improved based on user feedback. It is currently available in C++ and Python.

class OEFPHistogram


The OEFPHistogram class is used to hold a histogram of similarity scores.

This class provides an efficient way to gather statistics on large databases for which storing the full N×N similarity matrix would be infeasible.
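To make the storage argument concrete, the following back-of-the-envelope sketch compares the two approaches. The 4-byte score and 8-byte counter sizes, as well as the helper names, are assumptions chosen for illustration and are not taken from the toolkit.

```python
def pairwise_matrix_bytes(n, bytes_per_score=4):
    """Bytes needed to store the upper triangle of an n x n similarity matrix."""
    return n * (n - 1) // 2 * bytes_per_score


def histogram_bytes(num_bins=200, bytes_per_count=8):
    """Bytes needed to store one counter per histogram bin."""
    return num_bins * bytes_per_count


n = 1_000_000  # hypothetical database size
matrix_bytes = pairwise_matrix_bytes(n)
hist_bytes = histogram_bytes()

print(f"pairwise scores: {matrix_bytes / 1e12:.1f} TB")
print(f"histogram:       {hist_bytes} bytes")
```

Under these assumptions the pairwise scores grow quadratically with the database size, while the histogram footprint stays fixed by the number of bins.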

The following classes derive from this class:

## Constructors

OEFPHistogram(size_t numBins=200, double min=0.0, double max=1.0)


Creates an OEFPHistogram object with the given number of bins and range.

numBins
Number of bins in the histogram. (default = 200)
min
Minimum bound (inclusive) of the histogram range. (default = 0.0)
max
Maximum bound (inclusive) of the histogram range. (default = 1.0)

Note

The default range of [0,1] covers all built-in similarity measures.

## AddSample

size_t AddSample(const float sample)


Adds the sample to the histogram by incrementing the count of the relevant bin. Returns the index of the bin that received the sample.
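The binning step can be pictured with the following plain-Python sketch. It assumes equal-width bins, places a sample equal to the inclusive maximum in the last bin, and rejects out-of-range samples. How the toolkit itself handles these edge cases is not stated here, and the names `bin_index` and `add_sample` are hypothetical.

```python
def bin_index(sample, num_bins=200, lo=0.0, hi=1.0):
    """Return the index of the equal-width bin that `sample` falls into."""
    if not lo <= sample <= hi:
        # Assumption: out-of-range samples are treated as an error.
        raise ValueError(f"sample {sample} outside [{lo}, {hi}]")
    idx = int((sample - lo) / (hi - lo) * num_bins)
    # The maximum is inclusive, so clamp it into the last bin.
    return min(idx, num_bins - 1)


def add_sample(counts, sample, lo=0.0, hi=1.0):
    """Increment the matching bin in `counts` and return its index."""
    idx = bin_index(sample, len(counts), lo, hi)
    counts[idx] += 1
    return idx


counts = [0] * 4  # a small 4-bin histogram over [0, 1]
indices = [add_sample(counts, s) for s in (0.1, 0.5, 1.0)]
print(indices, counts)
```

Separating `bin_index` from `add_sample` mirrors the split between looking up a bin and actually modifying the histogram.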

## GetBinBoundaries

OESystem::OEIterBase<double> *GetBinBoundaries() const


Returns an iterator over the boundaries for each bin.

Note

The number of elements in this iterator is equal to OEFPHistogram::NumBins + 1.
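As an illustration of why there is one more boundary than there are bins, here is a minimal sketch that assumes equal-width bins. The function name `bin_boundaries` is hypothetical and the code does not call the toolkit.

```python
def bin_boundaries(num_bins=200, lo=0.0, hi=1.0):
    """Return the num_bins + 1 edges of equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / num_bins
    return [lo + i * width for i in range(num_bins + 1)]


edges = bin_boundaries(4)
print(edges)                   # lower edge of each bin plus the final upper edge
print(len(bin_boundaries()))   # one more than the default number of bins
```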

## GetBinCenters

OESystem::OEIterBase<double> *GetBinCenters() const


Returns an iterator over the center value for each bin.

Note

The number of elements in this iterator is equal to OEFPHistogram::NumBins.

Hint

It is generally reasonable to plot the results of OEFPHistogram::GetCounts or OEFPHistogram::GetDensity against the bin centers returned by this method.
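The pairing described in the hint could be prepared along the following lines. This is a sketch that assumes equal-width bins; `bin_centers` is a hypothetical helper and the counts are made-up illustration data, not toolkit output.

```python
def bin_centers(num_bins=200, lo=0.0, hi=1.0):
    """Return the midpoint of each equal-width bin spanning [lo, hi]."""
    width = (hi - lo) / num_bins
    return [lo + (i + 0.5) * width for i in range(num_bins)]


centers = bin_centers(4)
counts = [5, 12, 7, 1]  # made-up counts, one per bin

# (x, y) pairs that can be handed to any plotting library.
points = list(zip(centers, counts))
for x, y in points:
    print(f"{x:.3f}  {'#' * y}")
```

Because both sequences have one element per bin, they can be zipped directly without trimming an extra edge.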

## GetBinIdx

size_t GetBinIdx(const float sample) const


Returns the index of the bin that the sample would fall into, without adding the sample to the histogram.

## GetBinWidth

double GetBinWidth() const


Returns the width of each bin in the histogram.

## GetCounts

OESystem::OEIterBase<unsigned int> *GetCounts() const


Returns an iterator over the bin counts.

Note

Even though counts are returned as unsigned integers, they are kept internally as size_t. If the counts may overflow the platform's unsigned int type, the normalized OEFPHistogram::GetDensity may be more appropriate.

## GetDensity

OESystem::OEIterBase<double> *GetDensity() const


Returns an iterator over the probability density for each bin.

Note

The result is equivalent to OEFPHistogram::GetCounts normalized by OEFPHistogram::NumSamples.
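The normalization described in the note can be sketched as follows. The function name `density` and the choice to return zeros for an empty histogram are assumptions made for this example.

```python
def density(counts):
    """Normalize bin counts by the total number of samples."""
    total = sum(counts)
    if total == 0:
        # Assumption: an empty histogram yields all-zero densities.
        return [0.0] * len(counts)
    return [c / total for c in counts]


counts = [1, 0, 1, 2]  # made-up bin counts
dens = density(counts)
print(dens)
print(sum(dens))  # normalized values add up to one
```

Normalized values stay in a fixed range no matter how many samples were added, which is why they avoid the overflow concern that applies to raw counts.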

## GetMax

double GetMax() const


Returns the upper bound (inclusive) of the histogram range.

## GetMin

double GetMin() const


Returns the lower bound (inclusive) of the histogram range.

## Mean

double Mean()


Approximates and returns the sample mean over the histogram.

Note

The accuracy of this approximation depends on the number of bins in the histogram.
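One plausible way to approximate a mean from binned data is to treat every sample as if it sat at the center of its bin. The sketch below assumes that approach and equal-width bins; it is not confirmed to be the toolkit's formula, and `approx_mean` and the sample values are hypothetical.

```python
def approx_mean(samples, num_bins, lo=0.0, hi=1.0):
    """Bin the samples, then average the bin centers weighted by bin counts."""
    counts = [0] * num_bins
    for s in samples:
        idx = min(int((s - lo) / (hi - lo) * num_bins), num_bins - 1)
        counts[idx] += 1
    width = (hi - lo) / num_bins
    weighted = sum(c * (lo + (i + 0.5) * width) for i, c in enumerate(counts))
    return weighted / len(samples)


samples = [0.112, 0.233, 0.421, 0.887]  # made-up similarity scores
exact = sum(samples) / len(samples)
coarse = approx_mean(samples, 2)    # two wide bins
fine = approx_mean(samples, 200)    # the default resolution

print(f"exact={exact:.5f} coarse={coarse:.5f} fine={fine:.5f}")
```

Each sample is displaced by at most half a bin width, so the error of the binned mean shrinks as the number of bins grows.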

## NumBins

size_t NumBins() const


Returns the number of bins in the histogram.

## NumSamples

size_t NumSamples() const


Returns the total number of samples in the histogram.

## Std

double Std()


Approximates and returns the sample standard deviation over the histogram.

Note

The accuracy of this approximation depends on the number of bins in the histogram.
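A standard deviation can be estimated from binned data in the same spirit, by treating each sample as if it sat at its bin center. The sketch below assumes equal-width bins and exposes a hypothetical `ddof` parameter because it is not stated here whether the toolkit divides by n or by n - 1; `approx_std` is likewise a hypothetical name.

```python
import math


def approx_std(counts, lo=0.0, hi=1.0, ddof=1):
    """Estimate the standard deviation from bin counts and bin centers."""
    n = sum(counts)
    if n <= ddof:
        raise ValueError("not enough samples")
    width = (hi - lo) / len(counts)
    centers = [lo + (i + 0.5) * width for i in range(len(counts))]
    mean = sum(c * x for c, x in zip(counts, centers)) / n
    # Assumption: ddof=1 gives the Bessel-corrected sample variance.
    var = sum(c * (x - mean) ** 2 for c, x in zip(counts, centers)) / (n - ddof)
    return math.sqrt(var)


counts = [1, 0, 0, 1]  # made-up counts: one sample in each outer bin
std_sample = approx_std(counts)               # divides by n - 1
std_population = approx_std(counts, ddof=0)   # divides by n
print(std_sample, std_population)
```

As with the mean, a coarser histogram moves samples further from their true values, so the estimate becomes less reliable with fewer bins.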