Project Description
The goal of this project is to reproduce a color image by aligning its B, G, and R color channels. The core of the project is comparing different metrics for scoring alignments and implementing techniques that make the alignment search efficient.
Naive Procedure
For my naive approach, I found that a metric called mutual information worked best. In image processing, mutual information (MI)
is used to align two images by measuring the amount of information they share. It is computed from the joint probability distribution
of pixel intensities in the two images, which shows how well one image's intensities predict the other's. MI is robust for aligning
images from different modalities, such as grayscale and infrared, because it does not rely on direct pixel-to-pixel comparisons.
The goal is to maximize the mutual information: the images are considered aligned at the displacement where they share the most
information. This makes MI well suited to tasks like medical imaging and multi-modal image registration.
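As a rough illustration of the idea, MI can be estimated from a joint histogram of the two images' intensities. This is a minimal sketch, not necessarily the implementation used in this project: the function name `mutual_information` and the choice of 32 histogram bins are assumptions made for the example.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Estimate mutual information (in nats) between two same-shaped images.

    `bins` is an illustrative choice, not a value taken from the project.
    """
    # Joint histogram of intensity pairs, normalized into a joint probability table.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    # Marginal distributions of each image's intensities.
    p_a = p_ab.sum(axis=1, keepdims=True)   # shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # shape (1, bins)
    outer = p_a @ p_b                       # what the joint would be if independent
    nz = p_ab > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / outer[nz])))
```

A higher value means one image's intensities predict the other's better, so an image scored against itself gets a much higher value than two unrelated images.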
I searched displacements in the range [-15, 15] pixels in both the X and Y directions for the green and red channels, computed the mutual
information score against the blue channel for each displacement, and chose the displacement with the highest score.
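The exhaustive search described above could look roughly like the sketch below. It assumes the channels are 2-D NumPy arrays of the same shape and uses `np.roll` to shift, which wraps pixels around the edges (a simplification). The names `exhaustive_align` and `mutual_information` are hypothetical, and the 32-bin histogram is an assumed setting.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    # Histogram-based MI estimate; the bin count is an assumption for this sketch.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    outer = p_ab.sum(axis=1, keepdims=True) @ p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / outer[nz])))

def exhaustive_align(channel, reference, radius=15, score=mutual_information):
    """Try every (dy, dx) in [-radius, radius]^2 and return the best-scoring shift."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the border; a real pipeline might crop instead.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = score(shifted, reference)
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift
```

With `radius=15` this evaluates 31 x 31 = 961 candidate displacements per channel, which is manageable for small images but grows quickly with image size and search range.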
Optimized Procedure - Pyramids
To deal with much larger images, where an exhaustive search over displacements would be far too
inefficient, I implemented an image pyramid: I iterate over versions of the image downscaled by factors of 8, 4, 2, and 1.
Starting at the coarsest level (factor 8), I search a fixed set of displacements, keep the one with the best alignment score, and move
to the next level. This is faster because the best displacement from the previous level, scaled up to the current resolution,
gives an approximate location of the answer, so the range of displacements checked at each finer level can be small. We can afford
larger search ranges on the more heavily downscaled images because computing the alignment metric is faster at lower resolutions.
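The coarse-to-fine search can be sketched as follows. This is an illustration under several assumptions rather than the project's actual code: downscaling is done by block averaging, the search radii (4 at the coarsest level, 2 afterwards) are made-up values, shifts wrap around via `np.roll`, and a negative mean squared error stands in for the real alignment metric. All function names are hypothetical.

```python
import numpy as np

def neg_mse(a, b):
    # Placeholder score (higher is better); swap in the metric of your choice.
    return -float(np.mean((a - b) ** 2))

def downscale(img, factor):
    # Block-average downscaling; crops so both sides are multiples of `factor`.
    if factor == 1:
        return img
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    return img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def best_shift(channel, reference, center, radius, score):
    # Search a (2*radius+1)^2 window of displacements around `center`.
    best_s, best = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = score(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if s > best_s:
                best_s, best = s, (dy, dx)
    return best

def pyramid_align(channel, reference, factors=(8, 4, 2, 1),
                  coarse_radius=4, refine_radius=2, score=neg_mse):
    dy, dx = 0, 0
    for i, f in enumerate(factors):
        if i > 0:
            # Rescale the previous estimate to the current level's resolution.
            ratio = factors[i - 1] // f
            dy, dx = dy * ratio, dx * ratio
        radius = coarse_radius if i == 0 else refine_radius
        dy, dx = best_shift(downscale(channel, f), downscale(reference, f),
                            (dy, dx), radius, score)
    return dy, dx
```

With these illustrative radii, the search evaluates 81 candidates at the coarsest level and 25 at each of the three finer levels, yet it can recover displacements of roughly 30 pixels or more at full resolution, since a 4-pixel shift at factor 8 corresponds to 32 full-resolution pixels.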
The metric I used for this case is called structural similarity. The Structural Similarity Index (SSIM) is a powerful metric for
image alignment because it evaluates the perceived structural content of images (luminance, contrast, and structure), rather than
just pixel-wise differences. It is widely used in applications such as image registration, compression, and quality assessment.
In my experience, computing the SSIM score was significantly faster than computing mutual information, and it gave more accurate
alignments than metrics such as normalized cross-correlation (NCC) and Euclidean distance.
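To give a feel for what SSIM measures, here is a simplified sketch that applies the standard SSIM formula once over the whole image. Library implementations such as scikit-image's `skimage.metrics.structural_similarity` instead average the same formula over local windows, so this global version is a simplification and not the implementation used in this project. The name `global_ssim` and the `data_range=1.0` default (images scaled to [0, 1]) are assumptions.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification of windowed SSIM)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()              # luminance terms
    var_x, var_y = x.var(), y.var()              # contrast terms
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()    # structure term
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)
```

The score is 1 for identical images and drops as their means, variances, or covariance diverge, so an alignment search looks for the displacement that maximizes it.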
Another important point is that image borders can distort the alignment scores, since their content does not line up across channels.
By computing the metrics only on the interior of each image, excluding a 20-pixel margin along the borders, I was able to get more accurate displacements.
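One simple way to ignore the borders is to crop both channels before scoring them. The helper below is a hypothetical sketch that assumes 2-D NumPy arrays. The name `crop_interior` is made up for the example, and only the 20-pixel default comes from the text above.

```python
import numpy as np

def crop_interior(img, margin=20):
    """Return the image with `margin` pixels removed from every side."""
    h, w = img.shape[:2]
    if h <= 2 * margin or w <= 2 * margin:
        raise ValueError("image is too small for the requested margin")
    # Explicit end indices so that margin=0 returns the full image.
    return img[margin:h - margin, margin:w - margin]
```

For example, scoring `crop_interior(shifted)` against `crop_interior(reference)` keeps border pixels out of the metric while leaving the displacement search itself unchanged.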