2013年3月7日木曜日

K-means Clustering

in Japanese

Introduction

In this page, I will describe a brief explanation on the theory of the K-means clustering and implement a simple image segmentation by means of a function cv::kmeans the OpenCV provides to us.

Theory

Suppose we have a data set consisting of N points each of which is defined in the D-dimensional Euclidean space as
.
Our goal is to partition the data set into some number K of clusters, where the value of K is given.
When we introduce a vector to represent a center of each cluster, an objective function we should minimize is defined as

where, is a variable that equals 1 if the point belongs to the cluster k, otherwise 0.
The procedures to minimize J are as follows:
  1. Initialize the cluster centers in some way.
  2. Minimize J with respect to the , keeping the fixed.
  3. Minimize J with respect to the , keeping the fixed.
  4. Repeat these optimizations until both of the and the converge.
It's easy to realize the 2nd procedure. It's only necessary to choose which has the minimum distance to the point we now consider and set to 1.
In order to consider the 3rd procedure, we set J's derivative with respect to to zero,

It results in the following equation,

The denominator in this expression is equal to the number of points assigned to the cluster k. Namely, indicates the mean of all of the data points assigned to the cluster k. On the basis of these discussion, the procedures to minimize J are rewritten as follows:
  1. Decide the initial centers of the clusters in some way.
  2. Assign points to the closest clusters.
  3. Calculate the mean of the points assigned to the cluster, and replace the center by the mean.
  4. Repeat the 2nd and the 3rd procedures until the centers of the clusters converge.
A direct implementation of the 2nd procedure is to compute the distance between every center and every point and to find the minimum distance. As it is relatively slow, we have to introduce a data structure such as a tree.

Practice in OpenCV

A function to execute the K-means clustering is cv::kmeans. In the following program, the 3 dimensional space (RGB) is considered. A pixel on an image corresponds to a point in 3D space. The K-means clustering yields the K clusters each of which has a set of points with similar color.
  1. 3rd-11th lines : Display an input image. A variable image indicates a H x W matrix with 3 channels. H and W are the height and the width of the image, respectively.
  2. 13th-15th lines : The function cv::kmeans requires the matrix in which each row corresponds to a coordinate of a point. In this sample program, the coordinate of the point is (B,G,R). To meet the requirement, the matrix image must be modified by using a function reshape. After the modification, image translates into a T x 3 matrix with 1 channel (reshaped_image), where T=W * H.
  3. 18th-21th lines : As the function cv::kmeans also requires the matrix consisting of floating-point elements, reshaped_image must be converted to the floating-point matrix (reshaped_image32f) by applying a function convertTo.
  4. 23th-27th lines : After preparing two matrices for outputs (labels and centers) and creating an instance of cv::TermCriteria which represents conditions to quit the algorithm, we execute the function cv::kmeans. In this case, I chose the maximum number of iteration (100) as the condition to quit the algorithm. The final argument of the constructor of cv::TermCriteria is not used. The 6th argument of the function cv::kmeans is set to cv::KMEANS_RANDOM_CENTERS. This means that the initial centers of the clusters are randomly decided.
  5. 29th line : Display result. A function show_result is as follows:
  1. 3rd-7th lines : A variable labels indicates a T x 1 matrix. The each row of it has the value of the label. If cluster_number is 3, the value is either 0, 1, or 2. A variable centers is a cluster_number x 3 matrix. The each row of it has a coordinate of a center of a cluster. In this case, the coordinate corresponds to the vector specified in the RGB space.
  2. 9th line : Prepare a matrix rgb_image in which the final image is stored.
  3. 14th-16th lines : The matrix centers has floating-point values. These values are converted into std::uint8_t-type ones with the range [0,255]. Moreover, the number of channels is also translated into 3.
  4. 18th-23th lines:The label is replaced by the corresponding (B,G,R) to make the final image rgb_image.
The results are as follows:
cluster_number=2

cluster_number=3
cluster_number=5
cluster_number=10
original image

Development Environment

  1. Mac OS X 10.8.2
  2. Processor:3.06 GHz Intel Core 2 Duo
  3. Memory:4GB
  4. Xcode4.6 with Apple LLVM 4.2(C++ Language Dialect → GNU++11, C++ Standard Library → libc++)
  5. boost-1.51.0 built by the Apple LLVM 4.1. See here
  6. opencv-2.4.3 built by the Apple LLVM 4.2. See here.

Reference

  1. Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer, p424-430
  2. OpenCV Document

12 件のコメント:

  1. 本当にありがとう。。。

    返信削除
  2. unidentified ......... width ???

    返信削除
    返信
    1. Hello Sakaz,

      Thank for your pointing.
      I fixed it.

      削除
  3. hello sir,
    i could not understand the line
    const cv::Vec3b& rgb = centers_u8c3.ptr(*label_first)[0];
    pls tell me

    返信削除
  4. このコメントは投稿者によって削除されました。

    返信削除
    返信
    1. Hello reddydmm

      The following statements hold:
      assert(centers_u8c3.type() == CV_8UC3);
      assert(centers_u8c3.rows == cluster_number);
      assert(centers_u8c3.cols == 1);

      "*label_first" is a value of type int, which represents a cluster index.
      So, "centers_u8c3.ptr<cv::Vec3b>(*label_first)" means "*label_first"-th row of "centers_u8c3."
      We need 0-th column of it to get the color corresponding to the index.
      Finally, the color is obtained by "centers_u8c3.ptr<cv::Vec3b>(*label_first)[0]."

      削除
  5. hello seiya sir,
    Am working on Mias dataset(breast cancer) and i have given the column value more than 0 and still am getting the output. My doubt is after mutiplying 512*512 image i get only 1 column and 262144 rows. In that case then i should get a error if i give more than 0 as input to the 'label_first' i.e.

    centers_u8c3.ptr(*label_first)[>0]

    i dont know wheather i have wrongly interpreted the code. so help me out in this issue.

    返信削除
    返信
    1. Hello reddydmm

      I tried accessing "centers_u8c3.ptr<cv::Vec3b>(*label_first)[1]" in my program.
      Though it worked without error, the color in the image is obviously wrong. I think that the above access picks up the garbage on the memory.

      削除
  6. Hello sir,
    I will pass my code to you along with the breast image so that you get a clear picture of my problem. when i give the column value as 0 i get a good image, but if i go for '1' or '55' or any other number the color of the clustering changes. My main doubt is since there is only only one column which is accessed as '(*label_first)[0]' and other columns are empty. how can i get an output if i enter values greater than '0'. Since 1 st column and greater are either empty or filled with junk values

    返信削除
  7. int main( int argc, const char** argv )
    {
    Mat imgMat,im,im1,im2,im3,im4,im5,im6,temp,im11,im12,im13,im14,imi,imd,imr;
    float a,b,c,d;
    Mat im_gray = imread("mdb012.pgm");//im_gray = cvLoadImage("mdb012.pgm",CV_LOAD_IMAGE_GRAYSCALE);
    imshow("image", im_gray);
    resize(im_gray, imgMat, Size(512, 512), 0, 0, INTER_CUBIC);
    //cvSetImageROI(im_gray, cvRect(0, 0, 512, 512));
    //cvShowImage("image_1", im_gray);
    //int height, width;
    //height = im_gray->height;
    //width = im_gray->width;


    for(int i=0;i<(imgMat.rows);i++)
    {
    for(int j=0;j<(imgMat.cols);j++)
    {
    if( imgMat.at(i,j)>176)
    {
    imgMat.at(i,j)=1;
    }
    }
    }
    imshow("image_2",imgMat);
    namedWindow( "Display window", WINDOW_AUTOSIZE );// Create a window for display.
    std::cout << "image: " << imgMat.rows << ", " << imgMat.cols << std::endl;
    assert(imgMat.type() == CV_8UC3);
    cv::imshow("image", imgMat);
    cv::Mat reshaped_image = imgMat.reshape(1, imgMat.cols * imgMat.rows);
    std::cout << "reshaped image: " << reshaped_image.rows << ", " << reshaped_image.cols << std::endl;
    assert(reshaped_image.type() == CV_8UC1);

    cv::Mat reshaped_image32f;
    reshaped_image.convertTo(reshaped_image32f, CV_32FC1, 1.0 / 255.0);
    std::cout << "reshaped image 32f: " << reshaped_image32f.rows << ", " << reshaped_image32f.cols << std::endl;
    assert(reshaped_image32f.type() == CV_32FC1);

    cv::Mat labels;
    int cluster_number = 10;
    cv::TermCriteria criteria (cv::TermCriteria::COUNT, 512, 1);
    cv::Mat centers;
    cv::kmeans(reshaped_image32f, cluster_number, labels, criteria, 1, cv::KMEANS_RANDOM_CENTERS, centers);

    std::cout << "===\n";
    std::cout << "labels: " << labels.rows << " " << labels.cols << std::endl;
    std::cout << "centers: " << centers.rows << " " << centers.cols << std::endl;
    assert(labels.type() == CV_32SC1);
    assert(centers.type() == CV_32FC1);

    //cv::Mat rgb_image(imgMat.rows, imgMat.cols, CV_8UC3);
    cv::MatIterator_ rgb_first = imgMat.begin();
    cv::MatIterator_ rgb_last = imgMat.end();
    cv::MatConstIterator_ label_first = labels.begin();

    cv::Mat centers_u8;
    centers.convertTo(centers_u8, CV_8UC1, 255.0);
    cv::Mat centers_u8c3 = centers_u8.reshape(3);

    while ( rgb_first != rgb_last )
    {
    const cv::Vec3b rgb = centers_u8c3.ptr(*label_first)[4];

    *rgb_first = rgb;
    ++rgb_first;
    ++label_first;
    }

    cv::imshow("tmp", imgMat);

    waitKey(0);
    return 0;
    }


    hello sir , this is the code and plese give me your mail id so that i can mail the image specified above

    返信削除