1. Overview
The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. To do this, I apply three methods in sequence, each giving a more accurate and refined picture: single-scale search, multiscale pyramid search, and edge alignment.
2. Details
2.1 Single-scale search
Single-scale search exhaustively searches over a window of possible displacements, which I set to [-10,10]. I use Normalized Cross-Correlation (NCC) to score each candidate alignment. Because the brightness of the r, g, b pictures may differ, I first compute (im - np.mean(im)), which keeps only the relative relationship among pixels. I then compute NCC for every displacement and keep the one with the best score. In addition, the border of the picture always contains a lot of noise, and np.roll wraps the shifted columns or rows around to the opposite side, so I compute NCC only over the central 90% of the picture.
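The search described above can be sketched roughly as follows. This is a minimal illustration and not my exact code: the function names (ncc, crop_center, align_single_scale) and the border parameter are made up for the example, and the 5% margin per side is simply one way to keep the central 90%.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - np.mean(a)          # remove per-channel brightness offset
    b = b - np.mean(b)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def crop_center(im, border=0.05):
    """Drop `border` of the height/width on each side (keeps the central 90%)."""
    h, w = im.shape
    dh, dw = int(h * border), int(w * border)
    return im[dh:h - dh, dw:w - dw]

def align_single_scale(channel, reference, window=10):
    """Try every shift in [-window, window]^2 and return the best (dy, dx)."""
    ref = crop_center(reference)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around, which is why the border is cropped
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop_center(shifted), ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Here the returned (dy, dx) is the shift to apply to the channel with np.roll so that it lines up with the reference, for example aligning g and r to b.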
2.2 Multiscale pyramid search
The exhaustive search only works well when the picture is small or needs only a small displacement. When the picture is large, it takes a long time to align and the result is not good enough. Therefore, I use an image pyramid to shrink the picture until its width/height is about 200-300 pixels. I run the exhaustive search at that coarsest level, then, starting from the displacement found there, run the exhaustive search again on the next level with a smaller displacement window, for example [-5,5]. I repeat this until I am back at the original size, and the resulting displacement is the one I want. In practice, this method is much faster than single-scale search: it takes only about 20-30 seconds for a .tif picture to be correctly aligned.
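A rough sketch of the coarse-to-fine idea is given below. It is only an illustration under some assumptions: the helper names (downscale2, ncc_center, search, align_pyramid) are invented for the example, each level halves the image by averaging 2x2 blocks (a simple stand-in for whatever resize function is actually used), and the displacement from the coarser level is doubled before refining.

```python
import numpy as np

def downscale2(im):
    """Halve an image by averaging 2x2 blocks (stand-in for a proper resize)."""
    h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
    return im[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ncc_center(a, b, border=0.05):
    """NCC over the central part of two images (borders are noisy/wrapped)."""
    dh, dw = int(a.shape[0] * border), int(a.shape[1] * border)
    a = a[dh:a.shape[0] - dh, dw:a.shape[1] - dw]
    b = b[dh:b.shape[0] - dh, dw:b.shape[1] - dw]
    a, b = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, window):
    """Exhaustive search in a (2*window+1)^2 neighbourhood around `center`."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            score = ncc_center(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, min_size=300, coarse_window=10, fine_window=5):
    """Recursive coarse-to-fine alignment; returns (dy, dx) at full resolution."""
    if max(channel.shape) <= min_size:
        # Coarsest level: full search around zero displacement
        return search(channel, reference, (0, 0), coarse_window)
    dy, dx = align_pyramid(downscale2(channel), downscale2(reference),
                           min_size, coarse_window, fine_window)
    # One coarse pixel equals two pixels here, so double the estimate and refine
    return search(channel, reference, (2 * dy, 2 * dx), fine_window)
```

With this structure a displacement such as (114, 11) can be reached even though no single level searches more than [-10,10], because every level doubles the estimate from the level above it.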
Picture Show
Single-scale
g: (5, 2) r: (12, 3)
g: (-3, 2) r: (3, 2)
g: (3, 3) r: (6, 3)
g: (10, 10) r: (10, -10)
g: (10, 10) r: (10, 10)
g: (10, 10) r: (10, 10)
g: (10, 10) r: (10, 10)
g: (10, 10) r: (10, 10)
g: (10, -2) r: (10, -10)
g: (10, 10) r: (10, 10)
g: (10, 10) r: (10, 10)
g: (10, 2) r: (10, 1)
g: (10, 9) r: (10, 10)
Pyramid
g: (5, 2) r: (12, 3)
g: (-3, 2) r: (3, 2)
g: (3, 3) r: (6, 3)
g: (55, 9) r: (114, 11)
g: (49, 24) r: (104, 58)
g: (40, 17) r: (89, 23)
g: (33, -11) r: (140, -26)
g: (60, 16) r: (124, 13)
g: (82, 9) r: (178, 12)
g: (51, 26) r: (108, 36)
g: (80, 29) r: (175, 34)
g: (55, 12) r: (112, 10)
g: (43, 6) r: (87, 32)
B&W: 1. Edge alignment
However, multiscale search alone does not solve every alignment. Some pictures, such as emir.tif, still align badly. I think the reason is probably that the emir is in the center of the picture and the brightness of the r, g, b channels varies a lot. Therefore, I try a new approach: I use cv2.Canny to find the edges, then align the edge maps using the method from 2.2. Once I have the displacements, I apply them to the original r, g, b pictures. It turns out that this works really well.
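The idea of aligning on edges instead of raw intensities can be illustrated with the sketch below. It deliberately differs from what I actually ran: to keep the example dependent on NumPy only, it swaps cv2.Canny for a plain gradient-magnitude edge map, and it uses a single-scale search instead of the pyramid from 2.2. All function names here are made up for the example.

```python
import numpy as np

def edge_map(im):
    """Gradient magnitude as a simple edge map (stand-in for cv2.Canny)."""
    gy, gx = np.gradient(im.astype(np.float64))
    return np.hypot(gx, gy)

def ncc_center(a, b, border=0.05):
    """NCC over the central part of two images."""
    dh, dw = int(a.shape[0] * border), int(a.shape[1] * border)
    a = a[dh:a.shape[0] - dh, dw:a.shape[1] - dw]
    b = b[dh:b.shape[0] - dh, dw:b.shape[1] - dw]
    a, b = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(channel, reference, window=10):
    """Exhaustive search for the (dy, dx) that maximizes NCC."""
    best_score, best = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc_center(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def align_on_edges(channel, reference, window=10):
    """Find the displacement on edge maps; apply it to the original channel."""
    return best_shift(edge_map(channel), edge_map(reference), window)
```

Edges depend on where the intensity changes rather than on how bright a region is, so an object that is bright in one channel and dark in another still produces edges in the same place. The displacement returned by align_on_edges is then applied to the original channel with np.roll.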
Picture Show
Pyramid
g: (49, 24) r: (104, 58)
Edge alignment
g: (49, 24) r: (107, 40)
B&W: 2. Automatic contrasting
I use histogram equalization to achieve this. I first convert the RGB image to an HSV image, because the histograms of the three RGB channels differ: if I applied histogram equalization to each channel separately, their ratios would change, which would change the colours of the RGB image. Once I have the HSV image, I apply histogram equalization to it and then convert it back to an RGB image.
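As a hedged sketch of this step, the code below assumes the equalization is applied to the V (value) channel only, which the text above does not state explicitly. It also avoids an explicit RGB/HSV conversion: since V is the maximum of R, G, B, and scaling all three by the same factor leaves hue and saturation unchanged, equalizing V and rescaling the pixel by V_new / V_old has the same effect. The function names are invented for the example.

```python
import numpy as np

def equalize(channel_u8):
    """Histogram-equalize a uint8 array using its cumulative distribution."""
    hist = np.bincount(channel_u8.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    if cdf[-1] == cdf_min:            # constant image: nothing to stretch
        return channel_u8.copy()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[channel_u8]

def auto_contrast(rgb_u8):
    """Equalize the HSV value channel while keeping the R:G:B ratios."""
    rgb = rgb_u8.astype(np.float64)
    v = rgb.max(axis=2)                               # HSV value on a 0..255 scale
    v_eq = equalize(v.astype(np.uint8)).astype(np.float64)
    # Same scale factor for all three channels, so hue and saturation are kept
    scale = np.divide(v_eq, v, out=np.zeros_like(v), where=v > 0)
    out = rgb * scale[..., None]
    return np.clip(np.round(out), 0, 255).astype(np.uint8)
```

Equalizing each of R, G, B independently would apply three different lookup tables to one pixel, which is exactly the colour shift this approach avoids.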
Picture Show
Original
Automatic contrasting
B&W: 3. Aligning and processing data from other sources
Finally, I downloaded some images from the Internet to validate my code, and it worked well on them.