An approach to counting dominant colors in an image
This is a short post that gives a brief idea of how to count the dominant colors in an image. Before going into further detail, I want to explain the background of my problem. The project I am currently working on needs to count the colors in an image. The count should not treat every distinct color value in the image as a separate color; similar colors should be merged and counted as one. For example, if the image contains red and light red, the count should be one, i.e. red, just as the human eye would count them.
To solve this I tried several approaches, including histogram quantization, but none of them worked well. Then I thought of applying two techniques I had come across while doing some research on collective intelligence: Euclidean distance and k-means clustering.
This is how I solved the problem.
Step 1: Scan through the image and read every pixel. Group identical or similar pixels and keep a count for each group. I use a hash table for the grouping. After the scan I have a hash table containing every color and the number of pixels that have it.
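A minimal sketch of Step 1, assuming the pixels are already available as a list of (r, g, b) tuples (an image library would normally supply them). The function name count_colors and the optional bucket parameter are my own illustration, not part of the original method; bucket simply rounds each channel down so that near-identical shades share one hash key.

```python
from collections import Counter

def count_colors(pixels, bucket=1):
    """Return a hash table (Counter) mapping each colour to its pixel count.

    bucket is a hypothetical knob: with bucket > 1 every channel is rounded
    down to a multiple of bucket, so very close shades land on the same key.
    """
    counts = Counter()
    for r, g, b in pixels:
        key = (r // bucket * bucket, g // bucket * bucket, b // bucket * bucket)
        counts[key] += 1
    return counts

# Toy "image": 5 red pixels, 3 slightly different red pixels, 2 blue pixels.
pixels = [(255, 0, 0)] * 5 + [(250, 5, 5)] * 3 + [(0, 0, 255)] * 2

exact = count_colors(pixels)              # three distinct keys
coarse = count_colors(pixels, bucket=8)   # the two reds share one key
```

With bucket=1 the table keeps every exact color; a larger bucket trades precision for a smaller table, which matters on big images.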
Step 2: Find the dominant colors using the pixel counts and remove the colors that lie close to them. By dominant color I mean a color with a higher pixel count. To decide which colors are close to each other I use the Euclidean distance between them. When removing a color, always remove the one with the lower count, never the dominant one.
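One way Step 2 could be sketched: walk the colors from highest to lowest pixel count and keep a color only if it is farther than some cutoff from every color already kept. The name pick_dominant and the cutoff of 60 are assumptions chosen for illustration; the distance is the plain Euclidean distance in RGB, computed with math.dist (Python 3.8+).

```python
import math

def pick_dominant(histogram, cutoff=60.0):
    """Greedy filter: keep high-count colours, drop their near neighbours.

    histogram maps (r, g, b) -> pixel count. cutoff is an assumed threshold
    on the Euclidean distance between two RGB colours.
    """
    dominant = []
    # Visit colours from most to fewest pixels so the dominant one survives.
    by_count = sorted(histogram.items(), key=lambda kv: kv[1], reverse=True)
    for color, count in by_count:
        if all(math.dist(color, kept) > cutoff for kept, _ in dominant):
            dominant.append((color, count))
    return dominant

histogram = {(255, 0, 0): 5, (250, 5, 5): 3, (0, 0, 255): 2}
dominant = pick_dominant(histogram)
```

In this toy table the second red is about 8.7 units away from the first, so it is dropped, while blue is far from red and stays.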
Step 3: Step 2 alone will not give accurate results; it only gives a starting point for the color count. I use the result of Step 2 as the initial clusters for the k-means clustering algorithm. For the clusters, take only the top n colors by pixel count from Step 2. Then apply the clustering to the pixel data collected in Step 1.
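A rough sketch of Step 3, under the assumption that k-means is run directly on the color table from Step 1, with each color weighted by its pixel count, and that the seeds are the top n colors from Step 2. The function name kmeans_colors and the iteration limit are hypothetical.

```python
import math

def kmeans_colors(histogram, seeds, iterations=10):
    """Weighted k-means over a colour histogram, seeded with given colours.

    Returns the final cluster centres and the pixel count assigned to each
    centre during the last assignment pass.
    """
    centers = [tuple(float(ch) for ch in seed) for seed in seeds]
    weights = [0] * len(centers)
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0.0] for _ in centers]
        weights = [0] * len(centers)
        # Assignment: each colour goes to its nearest centre.
        for color, count in histogram.items():
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(color, centers[i]))
            for ch in range(3):
                sums[nearest][ch] += color[ch] * count
            weights[nearest] += count
        # Update: move each centre to the weighted mean of its colours.
        new_centers = [
            tuple(s / w for s in sums[i]) if w else centers[i]
            for i, w in enumerate(weights)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, weights

histogram = {(255, 0, 0): 5, (250, 5, 5): 3, (0, 0, 255): 2}
seeds = [(255, 0, 0), (0, 0, 255)]  # e.g. the top n colours from Step 2
centers, weights = kmeans_colors(histogram, seeds)
```

Working on the table instead of on every pixel keeps the loop small, since an image usually has far fewer distinct colors than pixels.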
Step 4: Apply the Euclidean distance filter again, this time to the result of Step 3. The number of colors that remain will be close to the number of colors a person would see in the image. You can tune the result by increasing or decreasing the distance cutoff value.
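To illustrate how the cutoff changes the final count in Step 4, here is a small sketch that merges cluster centers lying within the cutoff of each other and counts what is left. The name count_distinct, the sample centers, and both cutoff values are made up for the example; the centers are assumed to be ordered from most to fewest pixels.

```python
import math

def count_distinct(centers, cutoff):
    """Count cluster centres that are farther than cutoff from each other."""
    kept = []
    for center in centers:
        if all(math.dist(center, other) > cutoff for other in kept):
            kept.append(center)
    return len(kept), kept

# Hypothetical Step 3 output: two reddish centres and one blue centre.
centers = [(253.1, 1.9, 1.9), (230.0, 20.0, 20.0), (0.0, 0.0, 255.0)]

loose_count, _ = count_distinct(centers, cutoff=60.0)   # reds merge
strict_count, _ = count_distinct(centers, cutoff=20.0)  # reds stay apart
```

The two reddish centers are roughly 34 units apart, so a cutoff of 60 counts them as one color and a cutoff of 20 counts them as two.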
I have tested this method only on small images; it may perform poorly on large ones. There may well be other, more accurate methods. My knowledge of image processing is limited, so this is as far as I could take it.
I would be very happy if anyone can suggest a better approach. Please add it in the comments section.