First, grab the new basecode and input images from Canvas.
FloatImage denoiseSeq(const vector<FloatImage> &imSeq)
: this function takes an image sequence as input and returns a denoised image computed by averaging all the frames. At this point, you may assume that the images are perfectly aligned and all the same size.
Try it on the sequence in the directory aligned-ISO3200 using the first part of testDenoiseSeq()
in a3_main.cpp. We suggest testing with at least 16 images, and experimenting with other images to see how well the method converges.
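The averaging step can be sketched as follows, using a flat std::vector<float> per frame as a hypothetical stand-in for FloatImage (the real class has its own size and accessor API):

```cpp
#include <cstddef>
#include <vector>

// Minimal sketch of sequence denoising: each output pixel is the mean of
// that pixel across the whole (assumed aligned, same-size) sequence.
// A flat std::vector<float> stands in for FloatImage here.
std::vector<float> denoiseSeq(const std::vector<std::vector<float>> &seq) {
    std::vector<float> out(seq[0].size(), 0.0f);
    for (const auto &im : seq)
        for (std::size_t i = 0; i < out.size(); ++i)
            out[i] += im[i] / seq.size();
    return out;
}
```

Averaging n independent noisy frames reduces the noise standard deviation by a factor of sqrt(n), which is why more frames converge to a cleaner result.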
FloatImage logSNR(const vector<FloatImage> &imSeq, float scale)
: that returns an image visualizing, per pixel and per channel, \(10 \times \log_{10}\) of the squared signal-to-noise ratio, multiplied by scale
. Make sure to use the unbiased estimator for variance in your calculation (i.e. division by \((n-1)\)).
Compare the signal-to-noise ratio of the ISO 3200 and ISO 400 sequences. Which ISO gives the better SNR? Answer this question in your README.txt file. As in the previous part, use at least 16 images; more will give you better estimates. Visualize the variance of the images in aligned-ISO3200 using the second half of testDenoiseSeq()
in a3_main.cpp.
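A sketch of the per-pixel computation, again on hypothetical flat float buffers rather than the real FloatImage: the mean over the sequence estimates the signal, and the unbiased variance estimates the noise power.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the per-pixel SNR estimate. The mean over the sequence
// estimates the signal; the unbiased variance (division by n - 1)
// estimates the noise power. Each output pixel is
// scale * 10 * log10(mean^2 / variance).
std::vector<float> logSNR(const std::vector<std::vector<float>> &seq, float scale) {
    const std::size_t n = seq.size();
    std::vector<float> out(seq[0].size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        float mean = 0.0f, var = 0.0f;
        for (const auto &im : seq) mean += im[i] / n;
        for (const auto &im : seq) var += (im[i] - mean) * (im[i] - mean) / (n - 1);
        var = std::max(var, 1e-12f);   // guard against log10 of zero on constant pixels
        out[i] = scale * 10.0f * std::log10(mean * mean / var);
    }
    return out;
}
```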
The image sequences you have looked at so far have been perfectly aligned. Sometimes, the camera might move, so we need to align the images before denoising.
vector<int> align(const FloatImage &im1, const FloatImage &im2, int maxOffset)
: this function returns the [x, y] offset that best aligns im1
to match im2
. When comparing the two images, ignore all pixels that are within maxOffset
of the image edges. Use a brute force approach that tries every possible integer translation and evaluates the quality of a match using the squared error norm (the sum of squared pixel differences). The function roll()
in align.cpp might come in handy: it circularly shifts an image, wrapping pixel values around the borders. Since you will be ignoring boundary pixels, this wrapping is not a problem. Make sure you test your procedure before moving forward.
FloatImage alignAndDenoise(const vector<FloatImage> &imSeq, int maxOffset)
: use align
to implement this function. It should allow you to produce a denoised image even when the input sequence is not perfectly aligned. More specifically, it registers all images to the first image in the image sequence and then outputs a denoised image.
Use testDenoiseShiftSeq()
and testOffset()
in a3_main.cpp to help test your functions on the images from the green sequence in Input/green.
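The brute-force search can be sketched like this, on a tiny hypothetical grayscale struct rather than the real FloatImage and roll(); the circular shift and the interior-only error sum mirror the description above.

```cpp
#include <limits>
#include <utility>
#include <vector>

// Hypothetical tiny image type: width-major flat buffer.
struct Gray {
    int w, h;
    std::vector<float> px;
    float at(int x, int y) const { return px[y * w + x]; }
};

// Try every offset in [-maxOffset, maxOffset]^2; shift im1 circularly
// (like roll()) and score by summed squared differences over pixels more
// than maxOffset away from every edge. Returns the best (dx, dy).
std::pair<int, int> align(const Gray &im1, const Gray &im2, int maxOffset) {
    float best = std::numeric_limits<float>::max();
    std::pair<int, int> bestOff{0, 0};
    for (int dy = -maxOffset; dy <= maxOffset; ++dy)
        for (int dx = -maxOffset; dx <= maxOffset; ++dx) {
            float err = 0.0f;
            for (int y = maxOffset; y < im1.h - maxOffset; ++y)
                for (int x = maxOffset; x < im1.w - maxOffset; ++x) {
                    // circular shift of im1 by (dx, dy)
                    int sx = ((x - dx) % im1.w + im1.w) % im1.w;
                    int sy = ((y - dy) % im1.h + im1.h) % im1.h;
                    float d = im1.at(sx, sy) - im2.at(x, y);
                    err += d * d;
                }
            if (err < best) { best = err; bestOff = {dx, dy}; }
        }
    return bestOff;
}

// Tiny self-check on made-up data: im2 is im1 shifted right by one pixel.
std::pair<int, int> demoAlign() {
    Gray a{5, 5, std::vector<float>(25, 0.0f)};
    Gray b{5, 5, std::vector<float>(25, 0.0f)};
    a.px[2 * 5 + 2] = 1.0f;   // bright pixel at (2, 2)
    b.px[2 * 5 + 3] = 1.0f;   // same pixel moved to (3, 2)
    return align(a, b, 1);
}
```

The search is O(maxOffset^2 * w * h), which is fine for the small offsets used here.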
Most digital sensors record color through a Bayer mosaic, where each pixel captures only one of the three color channels, and software interpolation is then needed to reconstruct all three channels at every pixel. The green channel is sampled twice as densely as red and blue, as shown below.
We represent raw images as a grayscale image (red, green, and blue channels are all the same). The images are encoded linearly so you do not have to account for gamma. You can open these images in your favorite image viewer and zoom in to see the pattern of the Bayer mosaic.
We provide you with a number of raw images and your task is to write functions that demosaic them. We encourage you to debug your code using raw/signs-small.png because it is not too big and exhibits the interesting challenges of demosaicing.
For simplicity, we ignore pixels near the boundary of the image. That is, the first and last two rows and columns of pixels don't need to be reconstructed. This lets you focus on the general case without worrying about unavailable neighboring values. It's actually not uncommon for cameras and software to return a slightly cropped image for similar reasons. For the border pixels that you do not calculate, copy the pixel values from the same location in the original raw image to your output image. See http://www.luminous-landscape.com/contents/DNG-Recover-Edges.shtml
FloatImage basicGreen(const FloatImage &raw, int offset)
: that takes as input a raw image and returns a single-channel 2D image corresponding to the interpolated green channel. The offset encodes whether the top-left pixel or its right neighbor is the first green pixel. In the case of Figure 2, the second pixel is green, so offset=1
. For the image raw/signs-small.png offset=1
. Make your code general for either offset since different cameras use different conventions.
Try your function on the images included in Input/raw and verify that you get a nice smooth interpolation. You can also try your own raw images by converting them with the program dcraw.
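A sketch of the green interpolation on a hypothetical flat buffer, assuming one possible convention: green sits wherever (x + y) % 2 == offset (the real FloatImage API and your chosen convention may differ).

```cpp
#include <vector>

// Sketch of basic green interpolation. Known green values are copied;
// missing ones become the average of the four direct neighbours (which
// are all green sites on a Bayer checkerboard). A one-pixel border is
// simply copied from the raw input, as the handout suggests.
std::vector<float> basicGreen(const std::vector<float> &raw, int w, int h, int offset) {
    std::vector<float> out = raw;   // border pixels stay as in the raw image
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x)
            if ((x + y) % 2 != offset)
                out[y * w + x] = 0.25f * (raw[y * w + x - 1] + raw[y * w + x + 1] +
                                          raw[(y - 1) * w + x] + raw[(y + 1) * w + x]);
    return out;
}
```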
FloatImage basicRorB(const FloatImage &raw, int offsetX, int offsetY)
: Like basicGreen()
this function takes a raw image and returns a 2D single-channel image as output. However, it deals with the sparser red and blue channels. offsetX
, offsetY
are the coordinates of the first pixel that is red or blue. Figure 2 shows that 0,0 is blue while 1,1 is red. The function will be called twice:
FloatImage red = basicRorB(raw, 1, 1);
FloatImage blue = basicRorB(raw, 0, 0);
Similar to the green-channel case, copy the values where they are available. For interpolated pixels with two known direct neighbors (left-right or up-down), simply take the linear interpolation of the two values. For the remaining case, average the four diagonal neighbors. You can ignore the first and last two rows and columns to make sure you always have all the neighbors you need.
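The three cases above can be sketched like this on a hypothetical flat buffer, assuming the red/blue sites sit where the x and y parities match offsetX and offsetY (again a stand-in, not the real FloatImage API):

```cpp
#include <vector>

// Sketch of basic red/blue interpolation. Measured sites are copied;
// pixels with two known direct neighbours are linearly interpolated;
// the remaining pixels average their four diagonal neighbours.
std::vector<float> basicRorB(const std::vector<float> &raw, int w, int h,
                             int offsetX, int offsetY) {
    std::vector<float> out = raw;   // borders copied from the raw image
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            bool knownX = (x % 2) == (offsetX % 2);
            bool knownY = (y % 2) == (offsetY % 2);
            if (knownX && knownY) continue;             // value measured here
            if (knownX)                                  // known above and below
                out[y * w + x] = 0.5f * (raw[(y - 1) * w + x] + raw[(y + 1) * w + x]);
            else if (knownY)                             // known left and right
                out[y * w + x] = 0.5f * (raw[y * w + x - 1] + raw[y * w + x + 1]);
            else                                         // four diagonal neighbours
                out[y * w + x] = 0.25f * (raw[(y - 1) * w + x - 1] + raw[(y - 1) * w + x + 1] +
                                          raw[(y + 1) * w + x - 1] + raw[(y + 1) * w + x + 1]);
        }
    return out;
}
```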
FloatImage basicDemosaic(const FloatImage &raw, int offsetGreen, int offsetRedX, int offsetRedY, int offsetBlueX, int offsetBlueY)
: that takes a raw image and returns a full RGB image demosaiced with the above functions. You might observe some checkerboard artifacts around strong edges. This is expected from such a naive approach.
Use testBasicDemosaic()
in a3_main.cpp to help test your basic demosaicing functions.
One central idea to improve demosaicing is to exploit structures and patterns in natural images. In particular, 1D structures like edges can be exploited to gain more resolution. We will implement the simplest version of this principle to improve the interpolation of the green channel. We focus on green because it has a denser sampling rate and usually a better SNR.
For each pixel, we will decide to adaptively interpolate either in the vertical or horizontal direction. That is, the final value will be the average of only two pixels, either up and down or left and right. We will base our decision on the comparison between the variation up-down and left-right. It is up to you to think or experiment and decide if you should interpolate along the direction of biggest or smallest difference.
FloatImage edgeBasedGreen(const FloatImage &raw, int offset)
: that takes a raw image and outputs an adaptively interpolated single-channel image corresponding to the green channel. This function should give perfect results for horizontal and vertical edges.
FloatImage edgeBasedGreenDemosaic(const FloatImage &raw, int offsetGreen, int offsetRedX, int offsetRedY, int offsetBlueX, int offsetBlueY)
: that takes a raw image and returns a full RGB image with the green channel demosaiced with edgeBasedGreen()
and the red and blue channels demosaiced with basicRorB()
.
A number of demosaicing techniques work in two steps: they first compute a high-resolution interpolation of the green channel using a technique such as edgeBasedGreen()
, and then use this high-quality green channel to guide the interpolation of red and blue.
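One possible sketch of the adaptive interpolation, on a hypothetical flat buffer with green assumed wherever (x + y) % 2 == offset. The handout leaves the largest-vs-smallest choice to you; this sketch simply picks one of the two options, so treat it as a starting point for your own experiments.

```cpp
#include <cmath>
#include <vector>

// Sketch of edge-aware green interpolation: at each missing pixel,
// compare the vertical and horizontal variation and average the pair
// with the smaller difference, so interpolation runs along an edge
// rather than across it. (Whether this is the right choice is exactly
// the question the assignment asks you to answer.)
std::vector<float> edgeBasedGreen(const std::vector<float> &raw, int w, int h, int offset) {
    std::vector<float> out = raw;   // borders copied from the raw image
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            if ((x + y) % 2 == offset) continue;   // measured green value
            float up = raw[(y - 1) * w + x], down = raw[(y + 1) * w + x];
            float left = raw[y * w + x - 1], right = raw[y * w + x + 1];
            if (std::fabs(up - down) < std::fabs(left - right))
                out[y * w + x] = 0.5f * (up + down);
            else
                out[y * w + x] = 0.5f * (left + right);
        }
    return out;
}
```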
One simple such approach is to interpolate the difference between red and green (resp. blue and green). Adapt your code above to interpolate the red or blue channel based not only on a raw input image, but also on a reconstructed green channel.
FloatImage greenBasedRorB(const FloatImage &raw, FloatImage &green, int offsetX, int offsetY)
: that proceeds pretty much as your basic version, except that it is the difference R-G or B-G that gets interpolated. In this case, we are not trying to be smart about 1D structures because we assume that this has been taken care of by the green channel. The demosaicing pipeline is then as follows:
FloatImage green = edgeBasedGreen(raw, 1);
FloatImage red = greenBasedRorB(raw, green, 1, 1);
FloatImage blue = greenBasedRorB(raw, green, 0, 0);
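The difference-interpolation idea can be sketched as follows on hypothetical flat buffers (red/blue sites assumed where the x and y parities match the offsets); the basic red/blue interpolator is inlined as a helper so the sketch is self-contained.

```cpp
#include <cstddef>
#include <vector>

// Helper: basic red/blue interpolation (measured sites copied, two-neighbour
// cases linearly interpolated, remaining pixels from four diagonals).
std::vector<float> basicRorB(const std::vector<float> &raw, int w, int h,
                             int offsetX, int offsetY) {
    std::vector<float> out = raw;
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            bool knownX = (x % 2) == (offsetX % 2);
            bool knownY = (y % 2) == (offsetY % 2);
            if (knownX && knownY) continue;
            if (knownX)
                out[y * w + x] = 0.5f * (raw[(y - 1) * w + x] + raw[(y + 1) * w + x]);
            else if (knownY)
                out[y * w + x] = 0.5f * (raw[y * w + x - 1] + raw[y * w + x + 1]);
            else
                out[y * w + x] = 0.25f * (raw[(y - 1) * w + x - 1] + raw[(y - 1) * w + x + 1] +
                                          raw[(y + 1) * w + x - 1] + raw[(y + 1) * w + x + 1]);
        }
    return out;
}

// Green-guided version: interpolate the smooth difference raw - green with
// the basic scheme, then add the full-resolution green channel back.
std::vector<float> greenBasedRorB(const std::vector<float> &raw,
                                  const std::vector<float> &green,
                                  int w, int h, int offsetX, int offsetY) {
    std::vector<float> diff(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) diff[i] = raw[i] - green[i];
    std::vector<float> out = basicRorB(diff, w, h, offsetX, offsetY);
    for (std::size_t i = 0; i < out.size(); ++i) out[i] += green[i];
    return out;
}
```

Because color differences vary much more slowly than raw intensities in natural images, interpolating R-G and B-G produces far fewer color fringes than interpolating R and B directly.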
FloatImage improvedDemosaic(const FloatImage &raw, int offsetGreen, int offsetRedX, int offsetRedY, int offsetBlueX, int offsetBlueY)
: that takes a raw image and returns a full RGB image with the green channel demosaiced with edgeBasedGreen()
and the red and blue channels demosaiced with greenBasedRorB()
.
Try this new improved demosaicing pipeline on raw/signs-small.png using testGreenEdgeDemosaic()
in a3_main.cpp and notice that most (but not all) artifacts are gone.
The Russian photographer Sergey Prokudin-Gorsky took beautiful color photographs in the early 1900s by sequentially exposing three plates with three different filters.
We include a number of these triplets of images in Input/Sergey (courtesy of Alyosha Efros). Your task is to reconstruct RGB images given these inputs.
FloatImage split(const FloatImage &sergeyImg)
: that splits the input image into thirds and turns it into one 3-channel image. We have cropped the original images so that the plate boundaries lie approximately at 1/3 and 2/3 along the y dimension. Use floor
to compute the height of your final output image from the height of your input image.
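The geometry of the split can be sketched like this on a hypothetical flat buffer; which third corresponds to which color is a convention the sketch deliberately leaves open, and the struct is a stand-in for the real 3-channel FloatImage.

```cpp
#include <vector>

// Hypothetical 3-channel result of the split.
struct Channels {
    int w, h;
    std::vector<float> ch[3];
};

// The output height is floor(inH / 3), so any leftover rows at the bottom
// of the plate are dropped; channel c is read starting at row c * outH.
Channels split(const std::vector<float> &plate, int w, int inH) {
    Channels out;
    out.w = w;
    out.h = inH / 3;   // integer division == floor for non-negative sizes
    for (int c = 0; c < 3; ++c)
        out.ch[c].assign(plate.begin() + c * out.h * w,
                         plate.begin() + (c + 1) * out.h * w);
    return out;
}
```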
The image that you get out of your split function will have its 3 channels misaligned.
FloatImage sergeyRGB(const FloatImage &sergeyImg, int maxOffset)
: that first calls your split()
function, but then aligns the green and blue channels of your rgb image to the red channel. Your function should return a beautifully aligned color image.
Use testSergey()
in a3_main.cpp to help test your functions.
Here are some ideas for extra credit for both graduate and undergraduate students:
Turn in your files using Canvas and make sure all your files are in the a3 directory under the root of the zip file. Include all source (.cpp and .h) files, any of your images, and the output of your program. Don't include your executable (we don't need your _build directory), and remove any superfluous files before submission.
For this assignment, due to size, please do not include the original Input image directory in the zip file that you upload to Canvas. You can use make zip to zip the contents of your folder in this way from the terminal.
In your readme.txt file, you should also answer the following questions: