TU Berlin

Scene Completion using millions of images

CG
seperator
seperator
Algorithm
Our algorithm can be divided into four parts:

I. Finding similar images
II. Computing the best translation for every image
III. Computing the best cut for every Image
IV. Blending
I. Finding similar images
Due to the huge size of an Image DB it is quite obvious that you have to speed up your search. To do this we used descriptors - smaller dataarrays computed from our original Images - and just searched over the Descriptor Database.
a)Our first attempt to do this, we used the Tiny LAB Descriptor which scaled all Images down to 16x16 and stored the Colorinformation.
b)Although the LAB attempt allready gave us some acceptable results, it is easy to imagine that this Descriptor is kind of suboptimal when we want to find Images with similar Content. Therefore we used the matlab implementation of the Gist Descriptor (to be found here) that works on orientations in different parts of the Images and therefore finds scenes with similar Content.
c) The gist descriptor uses absolutley no colorinformation, and works just on orientations. But similar scenes often have similar colors and so the last attempt to find "near" images was a weighted merge of gist and LAB descriptor, which in fact gives in average the best results.
II. Computing the best translation for every image
Now that we have our query Image, the query mask and similar images we have to process the images, so that the border becomes invisible to the unknowing beholder. The first step to do this is to find the translation with the highest correspondence of the two images in some "borderzone". Therefore we define our "borderzone" near the seam in the input mask that spreads from the original seam up to 80px distance to it.
[mask] [arrow] [border]
This borderzone is then moved all over the similar image and we compute the SSD in L a b colorspace for every translation. Only translations that completely cover the hole are considered. Unfotunately this is quit time intensive for input masks that allow many translations of the border.
III. Computing the best cut for every Image
Having found the best translation we start adjusting the seam using a graphcut algorithm. To compute the costs for each pixel we at first convert the images to the gradient domain and then compute the difference between the two images in gradient domain. Afterwards we add some costs depending on the pixels distance to the original border.
[query] [plus] [mask] [plus] [patch]
[arrow] [graphcut]
IV. Blending
After computing the optimal cut we use a Poissonblending to merge the two images.
[query] [plus] [grapcut] [plus] [patch]
[arrow] [result]
For more information on the algorithm have a look at the Scene Completion paper by Efros and Hays.
seperator