Sunday, March 17, 2013

C++11 Random Numbers

Memo

If you want to generate a different seed each time the program runs, use std::random_device. std::discrete_distribution is used as in the sketch below: it generates random numbers according to the list of weights passed to its constructor.
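Since the original code and its output are not shown here, the following is a minimal sketch of the usage described above; the engine choice (std::mt19937) and the weights {1, 3, 6} are illustrative.

#include <iostream>
#include <random>

int main() {
    // std::random_device yields a different seed on every run.
    std::random_device seed_gen;
    std::mt19937 engine(seed_gen());

    // std::discrete_distribution draws index i with probability
    // proportional to the i-th weight passed to the constructor.
    std::discrete_distribution<int> dist({1.0, 3.0, 6.0});

    for (int i = 0; i < 10; ++i) {
        std::cout << dist(engine) << ' ';
    }
    std::cout << std::endl;
    return 0;
}

Over many draws, index 2 should appear roughly 60% of the time, matching its weight.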

References
  1. http://isocpp.org/files/papers/n3551.pdf
  2. http://www.cplusplus.com/reference/random/discrete_distribution/

EM Algorithm 〜Practice by OpenCV〜

in Japanese

Introduction

In the previous page, I gave a brief explanation of the theory of the EM Algorithm. In this page, I use the implementation of the EM Algorithm that OpenCV provides to perform a simple image segmentation. I also relate the quantities OpenCV computes to the expressions shown in that explanation.

Expressions in Previous Page

For reference, I repeat some of the expressions used in the explanation of the EM Algorithm.

The likelihood function to maximize is defined as

$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\},$$

where $\mathcal{N}(x_n \mid \mu_k, \Sigma_k)$ is the normalized Gaussian, $x_n$ is the observed point, $\mu_k$ is the mean vector, and $\Sigma_k$ is the covariance matrix. $\pi_k$ indicates the weight of each Gaussian. Moreover, the following expressions also appeared during the process of the explanation:

$$\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)},$$

$$N_k = \sum_{n=1}^{N} \gamma(z_{nk}),$$

$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n, \qquad \Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) (x_n - \mu_k)(x_n - \mu_k)^T, \qquad \pi_k = \frac{N_k}{N}.$$
Code

function main:
As I consider the RGB color space, an observed point corresponds to a pixel on an image, i.e., $x_n = (R_n, G_n, B_n)^T$, and $N$ is the total number of pixels. A sketch of code along these lines is given after this list.
  1. the 3rd-6th lines : Read an input image (image). Confirm the type of the image, and define variables image_rows and image_cols.
  2. the 10th-13th lines : Reshape image to make reshaped_image. Confirm its type and its number of rows and columns.
  3. the 16th-20th lines : Convert the type of reshaped_image to make samples. Confirm its type and its number of rows and columns. samples is one of the inputs passed to the EM Algorithm.
  4. the 22nd-23rd lines : Create an instance of the EM Algorithm (cv::EM). Arguments passed to the constructor of cv::EM are as follows:
    • the first argument : the number of the Gaussians
    • the second argument : the type of the covariance matrix
    • the third argument : the condition to quit the algorithm
    The first argument corresponds to $K$. I used the default values for the second and the third arguments. The default value of the second argument is EM::COV_MAT_DIAGONAL, which means a diagonal matrix with positive diagonal elements. The number of free parameters is D for each matrix, where D is the space dimension. In this sample code, D equals 3.
  5. the 26th-28th lines : Prepare a matrix for an output.
  6. the 31st line : Execute the EM Algorithm. The function train internally uses the K-means clustering to create initial parameters.
  7. the 33rd-35th lines : Confirm the type, the rows, and the columns of the output log_likelihoods. In each row, the following quantity is stored:

    $$\ln p(x_n \mid \pi, \mu, \Sigma) = \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}.$$
  8. the 37th-39th lines : Confirm the type, the rows, and the columns of the output labels. labels holds the following quantities:
    (I modified the following statements. 2013/3/21)

    $$k_n^{*} = \arg\max_{k} \gamma(z_{nk}),$$

    where

    $$\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{p(x_n)}.$$

    $p(x_n)$ is the probability of the observed point:

    $$p(x_n) = \sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j).$$
  9. the 41st-44th lines : Confirm the type, the rows, and the columns of the output probs. probs holds the quantities $\gamma(z_{nk})$. I will discuss the function observe_probs later.
  10. the 46th-50th lines : Confirm the type, the rows, and the columns of the output means, which holds $\mu_k$. As we consider the RGB space, each $\mu_k$ is a 3D vector. I will discuss the function observe_labels_and_means later.
  11. the 52nd-56th lines : Confirm the type, the rows, and the columns of the output weights. In the output, $\pi_k$ is stored. I will discuss the function observe_weights later.
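The original source is not reproduced on this page, so the line numbers above refer to code that is not shown here. The following is a minimal sketch of a main function following the steps described above, written against the OpenCV 2.4 C++ API; the input file name, the value of cluster_num, and the variable names are illustrative, and its line numbers do not match the references above.

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/ml/ml.hpp>
#include <iostream>

int main() {
    // Read an input image and record its size.
    cv::Mat image = cv::imread("input.png");
    const int image_rows = image.rows;
    const int image_cols = image.cols;

    // Reshape the image so that each pixel becomes one row,
    // then convert it to CV_64FC1: samples is (rows*cols) x 3.
    cv::Mat reshaped_image = image.reshape(1, image_rows * image_cols);
    cv::Mat samples;
    reshaped_image.convertTo(samples, CV_64FC1);

    // Create the EM instance: K Gaussians with the default
    // diagonal covariance type and termination criterion.
    const int cluster_num = 3;
    cv::EM em(cluster_num, cv::EM::COV_MAT_DIAGONAL);

    // Outputs filled by the training.
    cv::Mat log_likelihoods, labels, probs;
    em.train(samples, log_likelihoods, labels, probs);

    // Parameters estimated by the EM Algorithm.
    const cv::Mat means = em.get<cv::Mat>("means");     // cluster_num x 3
    const cv::Mat weights = em.get<cv::Mat>("weights"); // 1 x cluster_num

    std::cout << means << std::endl << weights << std::endl;
    return 0;
}

train() fills log_likelihoods, labels, and probs, while the estimated means and weights are retrieved through cv::Algorithm::get.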
function observe_probs:
  1. the 13th line : Confirm the following equation:

    $$\sum_{k=1}^{K} \gamma(z_{nk}) = 1.$$
  2. the 16th line : Confirm the following equation:
function observe_weights:
  1. the 10th line : Confirm the following equation:

    $$\sum_{k=1}^{K} \pi_k = 1.$$

    A sketch of such checks is given below.
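The bodies of observe_probs and observe_weights are not shown on this page; the following is a minimal sketch of checks of this kind, assuming probs and weights are the CV_64FC1 matrices produced by cv::EM, and using an illustrative tolerance.

#include <opencv2/core/core.hpp>
#include <cassert>
#include <cmath>

// Each row of probs holds the responsibilities of one observed point,
// so the row must sum to 1.
void observe_probs(const cv::Mat& probs) {
    for (int n = 0; n < probs.rows; ++n) {
        double row_sum = 0.0;
        for (int k = 0; k < probs.cols; ++k) {
            row_sum += probs.at<double>(n, k);
        }
        assert(std::abs(row_sum - 1.0) < 1e-6);
    }
}

// The mixture weights pi_k must also sum to 1.
void observe_weights(const cv::Mat& weights) {
    double sum = 0.0;
    for (int k = 0; k < weights.cols; ++k) {
        sum += weights.at<double>(0, k);
    }
    assert(std::abs(sum - 1.0) < 1e-6);
}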
function observe_labels_and_means:
This function replaces the label k assigned to a certain pixel by the corresponding mean vector $\mu_k$; a sketch of such a function is given after the following list.
  1. the 4th line : Create a matrix in which a final result (rgb_image) is stored.
  2. the 9th-11th lines : means is a cluster_number x 3 matrix, where 3 is the space dimension. In each row, the center coordinate (mean vector) of a Gaussian is stored. The elements of means are converted from double to unsigned char, and the number of channels is converted to 3.
  3. the 13th-18th lines : Replace the label by the corresponding (R,G,B) to make the final image rgb_image.
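As above, the original code is not reproduced; the sketch below maps each pixel's label to the mean color of its cluster, assuming labels is the N x 1 CV_32SC1 output of cv::EM::train and means is the cluster_num x 3 CV_64FC1 matrix of cluster centers (all names here are illustrative).

#include <opencv2/core/core.hpp>

cv::Mat observe_labels_and_means(const cv::Mat& labels, const cv::Mat& means,
                                 int image_rows, int image_cols) {
    // Final result: a color image of the original size.
    cv::Mat rgb_image(image_rows, image_cols, CV_8UC3);

    // Convert the cluster centers from double to unsigned char,
    // then view each row as one 3-channel color.
    cv::Mat means_u8;
    means.convertTo(means_u8, CV_8UC1);
    cv::Mat colors = means_u8.reshape(3, means.rows); // cluster_num x 1, CV_8UC3

    // Replace each label by the (B, G, R) color of its cluster.
    cv::MatIterator_<cv::Vec3b> it = rgb_image.begin<cv::Vec3b>();
    for (int n = 0; it != rgb_image.end<cv::Vec3b>(); ++it, ++n) {
        const int k = labels.at<int>(n, 0);
        *it = colors.at<cv::Vec3b>(k, 0);
    }
    return rgb_image;
}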

Results

[Figures: segmentation results for cluster_num = 2, 3, 5, and 10, together with the original image]


Development Environment

  1. Mac OS X 10.8.2
  2. Processor: 3.06 GHz Intel Core 2 Duo
  3. Memory: 4 GB
  4. Xcode 4.6 with Apple LLVM 4.2 (C++ Language Dialect → GNU++11, C++ Standard Library → libc++)
  5. boost-1.51.0 built with Apple LLVM 4.1. See here.
  6. opencv-2.4.3 built with Apple LLVM 4.2. See here.

References

  1. Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer
  2. OpenCV Documentation

Saturday, March 16, 2013

Gaussian Mixture and EM Algorithm

in Japanese

Introduction

In this page, I give a brief explanation of the Gaussian Mixture and the EM Algorithm. (Here, I show how to use the implementation of the EM Algorithm that OpenCV provides.)

Gaussian Mixture

Suppose we have a probability $p(x)$ of an observed data point $x$. We can represent the probability as a superposition of $K$ Gaussian distributions of the form

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),$$

where $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ is a Gaussian density, and $\mu_k$ and $\Sigma_k$ indicate a mean vector and a covariance matrix, respectively. $\pi_k$ is a weight for each Gaussian density. The superposition is called a Gaussian Mixture or a Mixture of Gaussians.

When we have a data set consisting of N observations $X = \{x_1, \dots, x_N\}$, we can write the probability of the data set in the form

$$p(X) = \prod_{n=1}^{N} p(x_n).$$

By substituting the Gaussian mixture representation into the equation, we obtain

$$p(X) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k).$$

An elegant and powerful method to decide the unknown parameters on the right-hand side, $\{\pi_k, \mu_k, \Sigma_k\}$, is called the EM Algorithm.

EM Algorithm

To make notations simple, we introduce the following variables:

$$\pi \equiv \{\pi_1, \dots, \pi_K\},$$

$$\mu \equiv \{\mu_1, \dots, \mu_K\},$$

$$\Sigma \equiv \{\Sigma_1, \dots, \Sigma_K\}.$$

Then, $p(X)$ is rewritten as $p(X \mid \pi, \mu, \Sigma)$. This expresses how probable the observed data $X$ is under the condition of the given parameters $\pi$, $\mu$, and $\Sigma$. The quantity is called the likelihood function. The method of maximum likelihood selects a set of values of the parameters that maximizes the likelihood function, so that the true distribution is approximated by our Gaussian mixture model.
All we have to do is to maximize the following quantity:

$$p(X \mid \pi, \mu, \Sigma) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k).$$

For simplicity, we maximize the log of the likelihood function given by

$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}.$$
Before going any further, we introduce a constraint on $\pi_k$.
The probability $p(x)$ is given by

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k).$$

If we integrate both sides of the equation with respect to $x$, we obtain

$$\sum_{k=1}^{K} \pi_k = 1.$$

It should be noticed that the individual Gaussian components are normalized. Using a Lagrange multiplier $\lambda$, the quantity to maximize taking the constraint into account is written as

$$\tilde{L} = \ln p(X \mid \pi, \mu, \Sigma) + \lambda \left( \sum_{k=1}^{K} \pi_k - 1 \right).$$
The rest of our work is setting the derivatives of $\tilde{L}$ with respect to $\mu_k$, $\Sigma_k$, and $\pi_k$ to zero. We show only the results.
  1. The derivative with respect to $\mu_k$ yields the following equation:

    $$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n,$$

    where

    $$\gamma(z_{nk}) \equiv \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}, \qquad N_k \equiv \sum_{n=1}^{N} \gamma(z_{nk}).$$

  2. The derivative with respect to $\Sigma_k$ yields the following equation:

    $$\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) (x_n - \mu_k)(x_n - \mu_k)^T,$$

    where T indicates transposition.
  3. The derivative with respect to $\pi_k$ yields the following equation:

    $$\pi_k = \frac{N_k}{N}.$$
We obtain three equations to decide the parameters. These equations depend on each other. To satisfy all of them, we can use a simple iterative scheme. The scheme is called the EM Algorithm. The procedure is as follows:
  1. Set initial values to $\pi_k$, $\mu_k$, and $\Sigma_k$.
  2. Calculate $\gamma(z_{nk})$ using the current parameter values.
  3. Using $\gamma(z_{nk})$, calculate the three parameters:

    $$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n,$$

    $$\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) (x_n - \mu_k)(x_n - \mu_k)^T,$$

    $$\pi_k = \frac{N_k}{N}.$$

  4. Calculate $\ln p(X \mid \pi, \mu, \Sigma)$.
  5. Until the parameters or the log likelihood converges, repeat the loop from step 2 to step 4.
The steps 2 and 3 are called the expectation and the maximization steps, respectively. In step 1, the K-means clustering is often used for initialization of the parameters. We consider the center of the k-th cluster as the mean $\mu_k$. The covariance matrix can be calculated as

$$\Sigma_k = \frac{1}{N_k} \sum_{n \in C_k} (x_n - \mu_k)(x_n - \mu_k)^T,$$

where $C_k$ is the set of data points assigned to the k-th cluster. The weight is given by

$$\pi_k = \frac{N_k}{N},$$

where $N_k$ is the number of data points assigned to the k-th cluster.
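As a concrete illustration of steps 2 and 3, the following is a minimal sketch of one EM iteration for a Gaussian mixture with diagonal covariance matrices; the diagonal restriction, the structure names, and the function names are assumptions made here for brevity, not part of the derivation above.

#include <cmath>
#include <vector>

// One Gaussian component with a diagonal covariance matrix.
struct Gaussian {
    std::vector<double> mean;   // mu_k, size D
    std::vector<double> var;    // diagonal elements of Sigma_k, size D
    double weight;              // pi_k
};

// N(x | mu_k, Sigma_k) for a diagonal Sigma_k.
double gauss(const std::vector<double>& x, const Gaussian& g) {
    const double kPi = 3.14159265358979323846;
    double log_p = 0.0;
    for (std::size_t d = 0; d < x.size(); ++d) {
        const double diff = x[d] - g.mean[d];
        log_p += -0.5 * std::log(2.0 * kPi * g.var[d]) - 0.5 * diff * diff / g.var[d];
    }
    return std::exp(log_p);
}

// One expectation step followed by one maximization step.
// Returns the log likelihood computed during the expectation step.
double em_iteration(const std::vector<std::vector<double>>& X, std::vector<Gaussian>& mix) {
    const std::size_t N = X.size(), K = mix.size(), D = X[0].size();
    std::vector<std::vector<double>> gamma(N, std::vector<double>(K));
    double log_likelihood = 0.0;

    // E step: gamma(z_nk) = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j).
    for (std::size_t n = 0; n < N; ++n) {
        double p_x = 0.0;
        for (std::size_t k = 0; k < K; ++k) {
            gamma[n][k] = mix[k].weight * gauss(X[n], mix[k]);
            p_x += gamma[n][k];
        }
        for (std::size_t k = 0; k < K; ++k) gamma[n][k] /= p_x;
        log_likelihood += std::log(p_x);
    }

    // M step: update mu_k, Sigma_k, and pi_k from the responsibilities.
    for (std::size_t k = 0; k < K; ++k) {
        double N_k = 0.0;
        for (std::size_t n = 0; n < N; ++n) N_k += gamma[n][k];
        for (std::size_t d = 0; d < D; ++d) {
            double mu = 0.0;
            for (std::size_t n = 0; n < N; ++n) mu += gamma[n][k] * X[n][d];
            mix[k].mean[d] = mu / N_k;
            double var = 0.0;
            for (std::size_t n = 0; n < N; ++n) {
                const double diff = X[n][d] - mix[k].mean[d];
                var += gamma[n][k] * diff * diff;
            }
            mix[k].var[d] = var / N_k;
        }
        mix[k].weight = N_k / static_cast<double>(N);
    }
    return log_likelihood;
}

Step 5 then amounts to calling em_iteration repeatedly until the returned log likelihood stops increasing by more than a small tolerance.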

References

  1. Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer