<properties title="Azure Machine Learning Sample: Color quantization using K-Means clustering" pageTitle="Machine Learning Sample: Color quantization using K-Means clustering | Azure" description="A sample Azure Machine Learning experiment that evaluates using different K-Means clustering values for quantizing a color image." metaKeywords="" services="" solutions="" documentationCenter="" authors="garye" videoId="" scriptId="" />

<tags ms.service="machine-learning" ms.workload="tbd" ms.tgt_pltfrm="na" ms.devlang="na" ms.topic="article" ms.date="01/01/1900" ms.author="garye" />

#Azure Machine Learning Sample: Color quantization using K-Means clustering

*You can find the sample experiment associated with this model in ML Studio in the **EXPERIMENTS** section under the **SAMPLES** tab. The experiment name is:*

	Sample Experiment - Color Based Image Compression using K-Means Clustering - Development

##Problem description

[Color quantization](http://en.wikipedia.org/wiki/Color_quantization "Color quantization") is the process of reducing the number of distinct colors in an image, and hence compressing it. Normally, the intent is to preserve the color appearance of the image as much as possible while reducing the number of colors, whether because of memory limitations or for compression.

##Data

In this sample experiment, we assume that any given 24-bit RGB image has 256 x 256 x 256 possible colors. We could build standard color histograms based on these intensity values, but another approach is to explicitly quantize the image and *reduce* the number of colors to, say, 16 or 64. This creates a substantially smaller color space and (ideally) less noise and variance. To do this, we passed the pixel data (each pixel as a dataset row) to the K-Means clustering module.
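
To make the size of the reduction concrete, here is a minimal sketch of the arithmetic. The class and method names are illustrative and are not part of the sample experiment, and the figures ignore the palette table itself and any compression applied by the image file format.

```csharp
using System;

static class PaletteSizeSketch
{
    // Number of bits needed to index a palette of k colors (k >= 1).
    public static int BitsPerPixel(int k)
    {
        int bits = 0;
        while ((1 << bits) < k)
            bits++;
        return bits;
    }

    static void Main()
    {
        // Possible colors in a 24-bit RGB image: 256 x 256 x 256
        Console.WriteLine(256 * 256 * 256); // prints 16777216

        // A quantized image only needs to store a palette index per pixel.
        foreach (int k in new[] { 16, 64 })
            Console.WriteLine(k + " colors: " + BitsPerPixel(k) + " bits per pixel instead of 24");
        // prints "16 colors: 4 bits per pixel instead of 24"
        // prints "64 colors: 6 bits per pixel instead of 24"
    }
}
```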

##Model

The model is created as shown in the image below:

![Model][image1]

We ran K-Means clustering with five values of K from 10 to 500 (10, 20, 50, 100, and 500), each in its own branch. First we calculated the clusters, and then we aggregated each cluster to the mean color of its pixels (using an R script).
Finally, we associated each pixel with the aggregated color of its cluster and sent the new image out in CSV format. We also calculated the Root Mean Squared Difference between the new pixel colors and the original image and showed the results in an R plot (using the Execute R Script module).
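
The steps above can be sketched in plain C#. This is not the ML Studio K-Means module or the R script used in the experiment: it is a basic Lloyd-style K-Means with a random initialization, all class and method names are hypothetical, and it assumes the Root Mean Squared Difference is averaged over every pixel and every color channel. The four pixels in `Main` are made up for illustration.

```csharp
using System;
using System.Linq;

static class KMeansQuantizeSketch
{
    // Index of the centroid closest to pixel p (squared Euclidean distance in RGB).
    static int Nearest(double[] p, double[][] centroids)
    {
        int best = 0;
        double bestDist = double.MaxValue;
        for (int c = 0; c < centroids.Length; c++)
        {
            double d = 0;
            for (int j = 0; j < 3; j++)
                d += (p[j] - centroids[c][j]) * (p[j] - centroids[c][j]);
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }

    // Plain Lloyd-style K-Means (iterations >= 1). Returns one color per input
    // pixel: the mean color of the cluster that the pixel was assigned to.
    public static double[][] Quantize(double[][] pixels, int k, int iterations, int seed)
    {
        var rng = new Random(seed);
        // Assumed initialization: use k randomly chosen pixels as the centroids.
        double[][] centroids = pixels.OrderBy(p => rng.Next()).Take(k)
                                     .Select(p => (double[])p.Clone()).ToArray();
        int[] label = new int[pixels.Length];

        for (int it = 0; it < iterations; it++)
        {
            // Assignment step: attach every pixel to its nearest centroid.
            for (int i = 0; i < pixels.Length; i++)
                label[i] = Nearest(pixels[i], centroids);

            // Update step: move each centroid to the mean color of its pixels.
            for (int c = 0; c < centroids.Length; c++)
            {
                double[][] members = pixels.Where((p, i) => label[i] == c).ToArray();
                if (members.Length == 0) continue; // an empty cluster keeps its centroid
                for (int j = 0; j < 3; j++)
                    centroids[c][j] = members.Average(p => p[j]);
            }
        }

        // Replace each pixel with the mean color of its cluster.
        return label.Select(l => (double[])centroids[l].Clone()).ToArray();
    }

    // Root mean squared difference over all pixels and all three channels.
    public static double Rmsd(double[][] original, double[][] quantized)
    {
        double sum = 0;
        for (int i = 0; i < original.Length; i++)
            for (int j = 0; j < 3; j++)
                sum += Math.Pow(original[i][j] - quantized[i][j], 2);
        return Math.Sqrt(sum / (original.Length * 3));
    }

    static void Main()
    {
        // Four made-up pixels: two dark, two bright.
        double[][] pixels =
        {
            new double[] { 10, 10, 10 }, new double[] { 12, 12, 12 },
            new double[] { 200, 200, 200 }, new double[] { 202, 202, 202 }
        };
        foreach (int k in new[] { 1, 2, 4 })
            Console.WriteLine("K=" + k + " RMSD=" + Rmsd(pixels, Quantize(pixels, k, 10, 0)));
    }
}
```

On this toy input the difference shrinks as K grows: with K=2 each pixel is one intensity level away from its cluster mean, and with K=4 every pixel keeps its own color, so the difference is zero.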

##Results

We tested the outcome with different numbers of clusters (colors), as shown in the experiment model above. As the images below show, more clusters produce higher-quality images with less compression:

<table>
<tr><th>Original</th>
<td><img alt="Original" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2a.jpg"></td>
</tr>
<tr><th>K=10</th>
<td><img alt="K=10" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2b.png"></td>
</tr>
<tr><th>K=20</th>
<td><img alt="K=20" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2c.png"></td>
</tr>
<tr><th>K=50</th>
<td><img alt="K=50" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2d.png"></td>
</tr>
<tr><th>K=100</th>
<td><img alt="K=100" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2e.png"></td>
</tr>
<tr><th>K=500</th>
<td><img alt="K=500" src="./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2f.png"></td>
</tr>
</table>

We also measured the accuracy using the Root Mean Squared Difference from the original image colors, which can be viewed from the second output port of the "Execute R Script" module:

![Output of Execute R Script module][image3]

As the plot shows, the more color clusters there are, the more closely the colors match the original image (better quality).

##Code to convert images to CSV and reverse

To feed images into ML Studio, we wrote a simple converter that turns an image file into a CSV format that ML Studio can use, and another that converts the CSV back into an image. Please feel free to use the following code. In the future, we plan to add a module for reading in images as well.

	using System;
	using System.Collections.Generic;
	using System.Drawing;
	using System.Drawing.Imaging;
	using System.Globalization;
	using System.IO;
	
	namespace Text2Image
	{
	    class Program
	    {
	        // Convert an image file to a CSV file with one row per pixel (X,Y,R,G,B).
	        static void img2csv(string img_path)
	        {
	            // Write the CSV next to the source image, with the same base name
	            string destination_file_path = Path.ChangeExtension(img_path, ".csv");
	
	            // Read the image, then create the CSV file and write the header values
	            using (Bitmap img = new Bitmap(img_path))
	            using (StreamWriter file = new StreamWriter(destination_file_path))
	            {
	                file.WriteLine("X,Y,R,G,B");
	
	                // Write the pixel values
	                for (int x = 0; x < img.Width; x++)
	                    for (int y = 0; y < img.Height; y++)
	                    {
	                        Color c = img.GetPixel(x, y);
	                        file.WriteLine(x + "," + y + "," + c.R + "," + c.G + "," + c.B);
	                    }
	            }
	        }
	
	        // Parse a CSV field as an integer; decimal values such as "12.0" are rounded.
	        static int parse_int(string field)
	        {
	            return (int)Math.Round(double.Parse(field.Trim(), CultureInfo.InvariantCulture));
	        }
	
	        // Convert a CSV file with X,Y,R,G,B rows back to a PNG image.
	        static void csv2img(string csv_path)
	        {
	            string destination_file_path = Path.ChangeExtension(csv_path, ".png");
	
	            // Read all the lines in the CSV file
	            string[] lines = File.ReadAllLines(csv_path);
	
	            // Parse the pixel rows, skipping the header (line 0) and incomplete lines.
	            // The image size is taken from the largest X and Y values in the file.
	            List<int[]> pixels = new List<int[]>();
	            int img_width = 0, img_height = 0;
	            for (int i = 1; i < lines.Length; i++)
	            {
	                string[] values = lines[i].Split(new char[] { ',', '\t' });
	                if (values.Length < 5)
	                    continue;
	
	                int[] p = new int[5];
	                for (int j = 0; j < 5; j++)
	                    p[j] = parse_int(values[j]);
	
	                pixels.Add(p);
	                img_width = Math.Max(img_width, p[0] + 1);
	                img_height = Math.Max(img_height, p[1] + 1);
	            }
	
	            // Set the pixels of a new bitmap and save it as a PNG file
	            using (Bitmap bmp_img = new Bitmap(img_width, img_height))
	            {
	                foreach (int[] p in pixels)
	                    bmp_img.SetPixel(p[0], p[1], Color.FromArgb(p[2], p[3], p[4]));
	
	                bmp_img.Save(destination_file_path, ImageFormat.Png);
	            }
	        }
	
	        static void Main(string[] args)
	        {
	            if (args.Length < 1)
	            {
	                Console.WriteLine("Usage: <program> <path to an image or .csv file>");
	                return;
	            }
	
	            // args[0] is the first command-line argument (the program name is not included)
	            string source_path = args[0];
	
	            if (string.Equals(Path.GetExtension(source_path), ".csv", StringComparison.OrdinalIgnoreCase))
	                csv2img(source_path);
	            else
	                img2csv(source_path);
	        }
	    }
	}

[image1]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image1.png
[image2a]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2a.jpg
[image2b]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2b.png
[image2c]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2c.png
[image2d]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2d.png
[image2e]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2e.png
[image2f]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image2f.png
[image3]:./media/machine-learning-sample-color-quantization-using-k-means-clustering/image3.png