Kernels, kernel density estimation, kernel regression

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei. More...

Classes

class  nuklei::kernel::base
 Polymorphic kernel class. More...
 
class  nuklei::kernel::se3
 
class  nuklei::kernel::r3xs2_base< OriGrp >
 
class  nuklei::kernel::r3
 
class  nuklei::KernelCollection
 This class acts as a vector-like container for kernels. It also provides methods related to kernel density estimation. More...
 
struct  nuklei::KernelLogisticRegressor
 Implements kernel logistic regression. More...
 

Typedefs

typedef r3xs2_base< groupS::s2 > nuklei::kernel::r3xs2
 \( \mathbb R^3 \times S^2 \)
 
typedef r3xs2_base< groupS::s2p > nuklei::kernel::r3xs2p
 \( \mathbb R^3 \times S^2_p \)
 

Detailed Description

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei.

Before reading this page, make sure you are familiar with the material discussed in Background.

The KDE and KLR tools provided by Nuklei work in combination with the kernels defined in the nuklei::kernel namespace. These kernels are described below. If one wishes to work with different kernels, there are two options:

  1. Change the PositionKernel and OrientationKernel typedefs in one of the kernels (e.g., in kernel::se3),
  2. Re-implement KDE or KLR with Generic Kernels.

Kernels

The nuklei::kernel namespace provides kernels for elements that belong to

These kernels are defined as the product of a position kernel and an orientation kernel (except for the third one which is only a position kernel). An important difference between the kernels provided by nuklei::kernel and the kernels discussed in Background, is that the kernels provided by nuklei::kernel are unnormalized. Their value at their center point is equal to 1, and their integral varies with the bandwidth. The motivation behind this choice is that normalization factors are often expensive to compute, and often unnecessary (many algorithms will be happy with a KDE computed up to a multiplicative constant).

The class kernel::se3 implements an \( SE(3) \) kernel. The method kernel::se3::eval() returns an evaluation of

\[ \mathcal K_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]

where \( \mathcal T \) is a triangular position kernel, \( |\cdot| \) denotes an absolute value, and \( f_{SE(3)}(\sigma_r) = \frac{1}{1-\cos(0.5*\sigma_r)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). The position kernel \( \mathcal T \) is given by

\[ \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) = \begin{cases} 1 - \frac{\sqrt{(\lambda-\mu_t)^2}}{2\sigma_t} &\textrm{if } \sqrt{(\lambda-\mu_t)^2} \leq 2\sigma_t \\ 0 &\textrm{if } \sqrt{(\lambda-\mu_t)^2} > 2\sigma_t\end{cases} \]

The method kernel::se3::sample() returns samples \( (\lambda, \theta) \) that follow

\[ \mathcal K'_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_4\left( \theta ; \mu_r, f_{SE(3)}(\sigma_r) \right) + \mathcal F_4 \left(\theta; -\mu_r, f_{SE(3)}(\sigma_r)\right)}2. \]

We note that this expression uses a pair of von Mises-Fisher distributions instead of the approximation introduced in \( \mathcal K_{SE(3)} \). The reason behind this difference is that \( K_{SE(3)} \) is fast to evaluate, but I do not know of an algorithm to generate samples from \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \). An algorithm for sampling from a von Mises-Fisher distribution exists, which is why \( \mathcal K'_{SE(3)} \) has that form.

The class kernel::r3xs2p implements an \( \mathbb R^3 \times S^2 \) kernel for axial orientations. The method kernel::r3xs2p::eval() returns an evaluation of

\[ \mathcal K_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]

where \( f_{RSA}(\sigma_r) = \frac{1}{1-\cos(\sigma_r)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). Similar to the \( SE(3) \) kernel, the method kernel::r3xs2p::sample() returns samples \( (\lambda, \theta) \) that follow

\[ \mathcal K'_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_3\left( \theta ; \mu_r, f_{RSA}(\sigma_r) \right) + \mathcal F_3 \left(\theta; -\mu_r, f_{RSA}(\sigma_r)\right)}2. \]

The class kernel::r3 implements an \( \mathbb R^3 \) kernel. The method kernel::r3::eval() returns an evaluation of

\[ \mathcal K_{\mathbb R^3}\left(\lambda ; \mu_t, \sigma_t\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right). \]

The method kernel::r3::sample() returns samples \( \lambda \) that follow \( \mathcal K_{\mathbb R^3} \).

Kernel Density Estimation

The KernelCollection class provides access to a kernel density estimate of the density modeled by the kernels it contains. The KDE defined by KernelCollection is expressed as

\[ f(x) = \sum_{i=1}^n w_i \mathcal K\left(x ; \hat x_i, \sigma\right), \]

where \( n \) is the number of kernels held in a KernelCollection object, \( w_i \) and \( x_i \) denote the weight and center point of the \( i^\mathrm{th} \) kernel, respectively, \( \mathcal K \) denotes the kernel function implemented by the kernels, and \( \sigma \) denotes kernel bandwidths, which may correspond to a pair of values if the kernels have both a location and orientation component.

KernelCollection must contain kernels of identical domain of definition (for instance, \( SE(3) \) kernels cannot be mixed with \( \mathbb R^3 \) kernels). The methods of KernelCollection enforce this constraint (exceptions will be thrown if it is attempted to mix different kernels). KernelCollection holds kernel objects of a class that inherits from kernel::base. Those kernels are by default unnormalized, which impacts KDE in two ways:

Evaluation

Evaluation is provided by KernelCollection::evaluationAt(). In effect, this method computes

using namespace nuklei; // [for proper linking by doxygen]
{
double value = 0;
for (KernelCollection::const_iterator i = begin(); i != end(); i++)
value += i->getWeight() * i->polyEval(k);
return value;
}

For computational efficiency, the implementation of this function differs in two ways from this piece of code:

The Nuklei implementation follows the description given in Background and uses a \( kd \)-tree to quickly access kernels that are near the evaluation point. The definition of near is given by the cutPoint method that kernels of nuklei::kernel must implement. This method gives the distance to the kernel's origin at which the kernel value can be assume to be zero. For triangular kernels, this distance is \( 2\sigma_t \).

Calls to KernelCollection::evaluationAt() should be preceded a call to KernelCollection::computeKernelStatistics() (or KernelCollection::normalizeWeights()), which compute statistics necessary for evaluation, and a call to KernelCollection::buildKdTree(). Read Intermediary Results for an explanation of how to use these functions, and kde_evaluate.cpp for an example.

Sampling

Sampling is provided by KernelCollection::sample(). Random variates from the density are generated as follows:

Sampling is provided by KernelCollection::sample(). This method works as follows:

using namespace nuklei; // [for proper linking by doxygen]
{
for (const_sample_iterator i = sampleBegin(sampleSize); i != i.end(); ++i)
{
k->setWeight( 1.0/sampleSize );
s.add(*k);
}
return s;
}

The loop selects sampleSize kernels with probability \( P(i = \ell) \propto w_{\ell} \). Calling polySample then draws a sample from each of the selected kernels.

If a single sample is needed, kernelCollection.randomKernel().polySample() can be used. Note that this expression is linear in the number of kernels in kernelCollection.

Kernel Logistic Regression

KLR is implemented in KernelLogisticRegressor.

© Copyright 2007-2013 Renaud Detry.
Distributed under the terms of the GNU General Public License (GPL).
(See accompanying file LICENSE.txt or copy at http://www.gnu.org/copyleft/gpl.html.)
Revised Fri Feb 6 2015 13:51:51.
SourceForge.net Logo