Kernels, kernel density estimation, kernel regression

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei.

## Classes

| | |
|---|---|
| class | nuklei::kernel::base — Polymorphic kernel class. |
| class | nuklei::kernel::se3 |
| class | nuklei::kernel::r3xs2_base< OriGrp > |
| class | nuklei::kernel::r3 |
| class | nuklei::KernelCollection — A vector-like container for kernels, which also provides methods related to kernel density estimation. |
| struct | nuklei::KernelLogisticRegressor — Implements kernel logistic regression. |

## Typedefs

| | |
|---|---|
| typedef r3xs2_base< groupS::s2 > | nuklei::kernel::r3xs2 — \( \mathbb R^3 \times S^2 \) |
| typedef r3xs2_base< groupS::s2p > | nuklei::kernel::r3xs2p — \( \mathbb R^3 \times S^2_p \) |

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei.

Before reading this page, make sure you are familiar with the material discussed in Background.

The KDE and KLR tools provided by Nuklei work in combination with the kernels defined in the nuklei::kernel namespace. These kernels are described below. If one wishes to work with different kernels, there are two options:

- Change the `PositionKernel` and `OrientationKernel` typedefs in one of the kernels (e.g., in kernel::se3), or
- Re-implement KDE or KLR with Generic Kernels.

The nuklei::kernel namespace provides kernels for elements that belong to

- the Special Euclidean Group \( SE(3) = \mathbb R^3 \times SO(3) \) (i.e., 3D rigid body transformations),
- \( \mathbb R^3 \times S^2 \) (i.e., the product of \( \mathbb R^3 \) and the space of 2DOF orientations),
- \( \mathbb R^3 \).

These kernels are defined as the product of a position kernel and an orientation kernel (except for the third, which is a position kernel only). An important difference between the kernels provided by nuklei::kernel and the kernels discussed in Background is that the kernels provided by nuklei::kernel are *unnormalized*: their value at their center point is equal to 1, and their integral varies with the bandwidth. The motivation behind this choice is that normalization factors are often expensive to compute, and often unnecessary (many algorithms will be happy with a KDE computed up to a multiplicative constant).

The class kernel::se3 implements an \( SE(3) \) kernel. The method kernel::se3::eval() returns an evaluation of

\[ \mathcal K_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]

where \( \mathcal T \) is a triangular position kernel, \( |\cdot| \) denotes an absolute value, and \( f_{SE(3)}(\sigma_r) = \frac{1}{1-\cos(\sigma_r/2)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). The position kernel \( \mathcal T \) is given by

\[ \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) = \begin{cases} 1 - \frac{\sqrt{(\lambda-\mu_t)^2}}{2\sigma_t} &\textrm{if } \sqrt{(\lambda-\mu_t)^2} \leq 2\sigma_t \\ 0 &\textrm{if } \sqrt{(\lambda-\mu_t)^2} > 2\sigma_t\end{cases} \]
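The two formulas above can be sketched in a few lines of standalone C++, independent of the Nuklei API. This is an illustration only, not Nuklei's implementation: the triangular kernel is shown in 1D (Nuklei applies it to a Euclidean distance in \( \mathbb R^3 \)), and quaternions are represented as plain 4-vectors.

```cpp
#include <array>
#include <cmath>

// Unnormalized triangular position kernel T(lambda; mu_t, sigma_t),
// here in 1D for simplicity. Equals 1 at mu_t, 0 beyond 2*sigma_t.
double triangular(double lambda, double mu_t, double sigma_t) {
  double d = std::fabs(lambda - mu_t);
  return d <= 2 * sigma_t ? 1 - d / (2 * sigma_t) : 0.0;
}

// Orientation factor exp(f(sigma_r) * (|mu_r^T theta| - 1)), with
// f(sigma_r) = 1 / (1 - cos(sigma_r / 2)); quaternions as 4-vectors.
double orientationFactor(const std::array<double, 4>& theta,
                         const std::array<double, 4>& mu_r,
                         double sigma_r) {
  double dot = 0;
  for (int k = 0; k < 4; ++k) dot += mu_r[k] * theta[k];
  double f = 1 / (1 - std::cos(sigma_r / 2));
  // The absolute value makes theta and -theta equivalent (antipodal pair).
  return std::exp(f * (std::fabs(dot) - 1));
}
```

As stated above, the factor evaluates to 1 at \( \mu_r \), and, thanks to the absolute value, also at \( -\mu_r \).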

The method kernel::se3::sample() returns samples \( (\lambda, \theta) \) that follow

\[ \mathcal K'_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_4\left( \theta ; \mu_r, f_{SE(3)}(\sigma_r) \right) + \mathcal F_4 \left(\theta; -\mu_r, f_{SE(3)}(\sigma_r)\right)}2. \]

We note that this expression uses a pair of von Mises-Fisher distributions instead of the approximation introduced in \( \mathcal K_{SE(3)} \). The reason behind this difference is that \( \mathcal K_{SE(3)} \) is fast to evaluate, but I do not know of an algorithm to generate samples from \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \). An algorithm for sampling from a von Mises-Fisher distribution exists, which is why \( \mathcal K'_{SE(3)} \) has that form.

The class kernel::r3xs2p implements an \( \mathbb R^3 \times S^2_p \) kernel, i.e., an \( \mathbb R^3 \times S^2 \) kernel for axial orientations. The method kernel::r3xs2p::eval() returns an evaluation of

\[ \mathcal K_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]

where \( f_{RSA}(\sigma_r) = \frac{1}{1-\cos(\sigma_r)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). Similar to the \( SE(3) \) kernel, the method kernel::r3xs2p::sample() returns samples \( (\lambda, \theta) \) that follow

\[ \mathcal K'_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_3\left( \theta ; \mu_r, f_{RSA}(\sigma_r) \right) + \mathcal F_3 \left(\theta; -\mu_r, f_{RSA}(\sigma_r)\right)}2. \]

The class kernel::r3 implements an \( \mathbb R^3 \) kernel. The method kernel::r3::eval() returns an evaluation of

\[ \mathcal K_{\mathbb R^3}\left(\lambda ; \mu_t, \sigma_t\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right). \]

The method kernel::r3::sample() returns samples \( \lambda \) that follow \( \mathcal K_{\mathbb R^3} \).
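A triangular variate can be generated as the sum of two uniform variates, since the triangular density is the convolution of two uniform densities. The sketch below illustrates this in 1D; it is one standard way of sampling from \( \mathcal T \) (up to the normalization constant, which does not affect sampling), not necessarily the way Nuklei implements it.

```cpp
#include <random>

// Draw lambda following T(lambda; mu_t, sigma_t): the triangular density
// on [mu_t - 2*sigma_t, mu_t + 2*sigma_t] is the density of the sum of
// two independent U(-sigma_t, sigma_t) variates, shifted by mu_t.
double sampleTriangular(double mu_t, double sigma_t, std::mt19937& rng) {
  std::uniform_real_distribution<double> u(-sigma_t, sigma_t);
  return mu_t + u(rng) + u(rng);
}
```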

The KernelCollection class provides access to a kernel density estimate of the density modeled by the kernels it contains. The KDE defined by KernelCollection is expressed as

\[ f(x) = \sum_{i=1}^n w_i \mathcal K\left(x ; \hat x_i, \sigma\right), \]

where \( n \) is the number of kernels held in a KernelCollection object, \( w_i \) and \( x_i \) denote the weight and center point of the \( i^\mathrm{th} \) kernel, respectively, \( \mathcal K \) denotes the kernel function implemented by the kernels, and \( \sigma \) denotes kernel bandwidths, which may correspond to a pair of values if the kernels have both a location and orientation component.
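The weighted sum above can be sketched as follows. This is a standalone 1D illustration with unnormalized triangular kernels, not the Nuklei implementation; the struct and function names are made up.

```cpp
#include <cmath>
#include <vector>

struct R3Kernel { double mu, sigma, w; };  // center, bandwidth, weight

// Unnormalized triangular kernel (1D stand-in for the R^3 case).
double triangularK(double x, double mu, double sigma) {
  double d = std::fabs(x - mu);
  return d <= 2 * sigma ? 1 - d / (2 * sigma) : 0.0;
}

// f(x) = sum_i w_i K(x; x_i, sigma): the KDE, up to a multiplicative
// constant since the kernels are unnormalized.
double kdeEval(const std::vector<R3Kernel>& kernels, double x) {
  double value = 0;
  for (const R3Kernel& k : kernels)
    value += k.w * triangularK(x, k.mu, k.sigma);
  return value;
}
```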

KernelCollection must contain kernels with an identical domain of definition (for instance, \( SE(3) \) kernels cannot be mixed with \( \mathbb R^3 \) kernels). The methods of KernelCollection enforce this constraint (an exception is thrown if one attempts to mix different kernel types). KernelCollection holds kernel objects of a class that inherits from kernel::base. Those kernels are by default unnormalized, which impacts KDE in two ways:

- KDE (KernelCollection::evaluationAt()) provides a density estimate up to a multiplicative constant, i.e.,
  \[ \int f(x) \mathrm{d}x = C(\sigma), \]
  where \( C \) is positive and depends on \( \sigma \).
- All kernels in a KernelCollection must have the same bandwidth. This constraint is *not* enforced by the methods of KernelCollection. It is the user's responsibility to make sure it is verified prior to calling KernelCollection::evaluationAt().

Evaluation is provided by KernelCollection::evaluationAt(). In effect, this method computes

```cpp
using namespace nuklei; // [for proper linking by doxygen]
{
  double value = 0;
  for (KernelCollection::const_iterator i = begin(); i != end(); i++)
    value += i->getWeight() * i->polyEval(k);
  return value;
}
```

For computational efficiency, the implementation of this function differs in two ways from this piece of code:

- If the number of kernels is larger than 1000, a \( kd \)-tree is used to discard the kernels that are too far from `k` to be relevant (see below).
- Instead of calling `polyEval`, evaluationAt first figures out the type of the kernels in this KernelCollection, and calls the static equivalent of `polyEval`. This trick avoids repeated VTABLE lookups and allows the compiler to inline the kernel evaluation function.

The Nuklei implementation follows the description given in Background and uses a \( kd \)-tree to quickly access kernels that are near the evaluation point. The definition of *near* is given by the `cutPoint` method that kernels of nuklei::kernel must implement. This method gives the distance from the kernel's origin at which the kernel value can be assumed to be zero. For triangular kernels, this distance is \( 2\sigma_t \).

Calls to KernelCollection::evaluationAt() should be preceded by a call to KernelCollection::computeKernelStatistics() (or KernelCollection::normalizeWeights()), which compute statistics necessary for evaluation, and by a call to KernelCollection::buildKdTree(). Read Intermediary Results for an explanation of how to use these functions, and kde_evaluate.cpp for an example.

Sampling is provided by KernelCollection::sample(). Random variates from the density are generated as follows:

- First, a kernel index \( i \) is selected by drawing \( i \) from
  \[ P(i = \ell) \propto w_{\ell}, \]
  where \( w_{\ell} \) is the weight of the \( \ell^\mathrm{th} \) kernel. (This effectively gives a higher chance to kernels with a larger weight.)
- Then, a random variate \( x \) is generated by sampling from the kernel \( \mathcal K\left(x ; \hat x_i, \sigma\right) \).
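The first step, drawing an index with probability proportional to the weights, can be sketched with the standard library's `std::discrete_distribution`. This is a generic illustration, not Nuklei's own implementation.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw a kernel index i with P(i = l) proportional to weights[l].
std::size_t drawKernelIndex(const std::vector<double>& weights,
                            std::mt19937& rng) {
  std::discrete_distribution<std::size_t> d(weights.begin(), weights.end());
  return d(rng);
}
```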

In effect, KernelCollection::sample() works as follows:

```cpp
using namespace nuklei; // [for proper linking by doxygen]
{
  KernelCollection s; // collection receiving the samples
  for (const_sample_iterator i = sampleBegin(sampleSize); i != i.end(); ++i)
  {
    kernel::base::ptr k = i->polySample();
    k->setWeight( 1.0/sampleSize );
    s.add(*k);
  }
  return s;
}
```

The loop selects `sampleSize` kernels with probability \( P(i = \ell) \propto w_{\ell} \). Calling `polySample` then draws a sample from each of the selected kernels.

If a single sample is needed, `kernelCollection.randomKernel().polySample()` can be used. Note that this expression is linear in the number of kernels in `kernelCollection`.

KLR is implemented in KernelLogisticRegressor.

© Copyright 2007-2013 Renaud Detry.

Distributed under the terms of the GNU General Public License (GPL).

(See accompanying file LICENSE.txt or copy at http://www.gnu.org/copyleft/gpl.html.)

Revised Fri Feb 6 2015 13:51:51.
