I was in my 4th year in university. The course was on digital image processing, touching on both theory and application in equal measure. There were only 3 students, including me.

The course was interesting, albeit mind-numbing when some of the equations marched into the lecture. The programming assignments were more fun, since we got to apply the theories. One of them was a rotating-an-image assignment, which formed the basis of my bilinear interpolation code. That was fun.

There’s this assignment where the professor gave us a set of texture images as samples. I can’t remember how many there were, so let’s say there were 200 of them. Then he gave us, say, 50 images. **The assignment was to match those 50 images with the controlled set of textures**. All textures were greyscale to simplify the assignment.

The 50 unknowns didn’t match pixel for pixel with the controlled samples, but they were of the same textures. For example, the controlled samples included one of a marble floor. One of the unknown images was taken of that same marble floor, but from a different position. Of course, the professor could have given us red herrings to match, but he said all 50 were taken from the sample set.

Then there’s the fact that he wanted to play with his new camera back then (he admitted to it), and took lots of pictures to give us as assignments… There was an assignment with a picture of a rubber ducky…

I can’t remember exactly all the tests I used to match the textures. What I did was come up with a theory/test, and compute that test for all the samples. Then I did the same for the unknown textures, and matched the unknowns against the knowns. If an unknown was within some threshold of acceptance, it was deemed matched to the respective sample texture.

Basically, I’m matching the textures using heuristics.

One of the tests used histograms. Basically, I charted, from 0 to 255, the number of pixels with each specific greyscale value. Pure white pixels have a value of 255, and pure black pixels a value of 0. Then I matched the unknowns with the samples using mean squared error. If the sample with the least error was within some threshold I set, then that sample was the matched texture.
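I no longer have the original MATLAB code, but the histogram test might be sketched roughly like this in Python/NumPy (the function names and the threshold value are mine, purely for illustration):

```python
import numpy as np

def grey_histogram(img):
    # Count pixels at each greyscale value 0..255, normalised by image size
    # so that images of different dimensions are comparable.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    return hist / img.size

def match_by_histogram(unknown, samples, threshold=0.001):
    # samples: dict of name -> sample image. Return the best-matching name,
    # or None if even the closest sample exceeds the acceptance threshold.
    target = grey_histogram(unknown)
    best_name, best_err = None, float("inf")
    for name, img in samples.items():
        err = np.mean((grey_histogram(img) - target) ** 2)  # mean squared error
        if err < best_err:
            best_name, best_err = name, err
    return best_name if best_err < threshold else None
```

The threshold is exactly the subjective knob discussed later: set it too loose and red herrings match, too tight and true matches get rejected.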

I had another test involving Fast Fourier Transforms (FFT). I think I discarded the complex values and matched the unknowns using only the real part.
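Again, this is a reconstruction from memory rather than the original MATLAB, but the FFT test described above could look something like this sketch:

```python
import numpy as np

def fft_feature(img):
    # 2-D FFT of the image, keeping only the real part as described above.
    # fftshift moves the zero frequency to the centre, which makes the
    # resulting feature maps easier to compare visually and numerically.
    return np.real(np.fft.fftshift(np.fft.fft2(img)))

def fft_distance(a, b):
    # Mean squared error between the two real-part spectra.
    return np.mean((fft_feature(a) - fft_feature(b)) ** 2)
```

In hindsight, comparing the magnitude spectrum (`np.abs`) instead of the real part alone would be less sensitive to the texture's position in the frame, which matters since the unknowns were shot from different positions.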

There’s another test involving median filtering. The idea was to capture the groups of neighbouring pixels as some usable data. So instead of a 128 by 128 pixel sample, I reduced it to a 16 by 16 matrix. You know, this one’s a bit iffy… I can’t remember whether I actually did it, or I just came up with it writing this…
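Whether or not I actually did it, the reduction described (128×128 down to 16×16 via block medians) is easy to sketch; this Python version is hypothetical, assuming the image dimensions divide evenly:

```python
import numpy as np

def block_median(img, out_size=16):
    # Reduce e.g. a 128x128 image to a 16x16 matrix by taking the median
    # of each 8x8 block, keeping coarse positional information that a
    # global histogram would throw away.
    h, w = img.shape
    bh, bw = h // out_size, w // out_size
    blocks = img[:bh * out_size, :bw * out_size].reshape(out_size, bh, out_size, bw)
    return np.median(blocks, axis=(1, 3))  # median over each block
```

Two such 16×16 matrices could then be compared with mean squared error, just like the histograms.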

Anyway, there’s a test to capture “pattern” data. The histogram test involves all pixels. The median filter test (if I actually did one) clusters pixel information in groups. Let me see if I can explain this better…

In the image above, the top right corner has more black swirly thingies close together than the other parts of the image. The histogram test cannot detect that the top right corner has more black; it can only detect how much black there is in the image in total. Positional information is lost. Hence the need for a pattern test.

The histogram test is objective. Test results are verifiable and repeatable. However, matching the unknown textures requires that I set a threshold. This is where the tests become subjective. Who’s to say a particular threshold value is more accurate than another?

In the end, I think I had 5 or 6 tests, and got a 94 (or was it 96?) percent accuracy. I was tweaking my threshold values so I could yield higher accuracy rates. See how subjective those tests of mine were? *smile*

The programming language of choice was MATLAB (yes, Will?), as dictated by the professor. So everything was coded in MATLAB. Which was good, because I’d hate to implement FFT on my own…

There’s something else too. I weighted those test results. Say test A was supposedly more accurate than test B. Then I gave the results of test A more weight in my final calculation. Thus, roughly speaking, if 3 tests out of 6 say texture A was the one, then that’s the one. It could also mean 2 tests had more sway if both carried high weights, and the other 4 tests weren’t conclusive enough.
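The weighting scheme above can be sketched as a simple weighted vote; this is an illustration in Python (not the original MATLAB), with made-up test names and weights:

```python
def weighted_vote(votes, weights):
    # votes: {test_name: predicted texture name, or None if inconclusive}
    # weights: {test_name: how much that test's opinion counts}
    # Sum the weight behind each candidate texture; the heaviest wins.
    tally = {}
    for test, candidate in votes.items():
        if candidate is not None:  # inconclusive tests contribute nothing
            tally[candidate] = tally.get(candidate, 0.0) + weights[test]
    return max(tally, key=tally.get) if tally else None
```

This captures both cases described: a plain majority of equally weighted tests wins, but two heavily weighted tests can also outvote several lighter ones.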

One of my classmates got higher accuracy rates (97 or 98 percent) than I did, no matter how much I tweaked threshold values and weights, no matter how many kinds of tests I added (or took out).

But here’s the thing, and I want you to note this. Given a larger sample size, and a different set of unknown textures to match, my set of tests might actually yield *better* results than those of that irritatingly smug classmate of mine.

Here’s another takeaway. **No one test can conclusively confirm and match the unknowns** (even with some error margin). It took a few tests working in concert to obtain a relatively high accuracy rate. Think about *that*.

Setting thresholds experimentally is always cumbersome and inconsistent, with no scientific/mathematical basis. Machine learning methods have recently become very popular, and what most of the algorithms do is learn the best parameters for a certain model given some training data. Those parameters include thresholds, to put it informally.
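To make that concrete, here is a toy sketch (mine, not from the course) of learning a threshold from labelled data instead of eyeballing it: given matching errors for pairs we know are same/different textures, pick the cutoff that classifies the most training pairs correctly.

```python
def tune_threshold(scores, labels, candidates):
    # scores: matching errors for labelled pairs.
    # labels: True if the pair really is the same texture.
    # candidates: threshold values to try.
    # Predict "same texture" when the error falls below the threshold,
    # and keep whichever candidate gets the most training pairs right.
    best_t, best_acc = None, -1.0
    for t in candidates:
        acc = sum((s < t) == y for s, y in zip(scores, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```

A grid search like this is crude, but it replaces hand-tweaking with a repeatable, data-driven procedure.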

In fact, the method of giving weights to different “tests” (or what I shall call classifiers) is the first step in a meta-learning algorithm known as boosting. In boosting, you have a set of relatively weak classifiers, and by combining them (for example, in a weighted linear combination) you can improve the classification rate. And of course, there is a training stage where you feed in training data sets so that the algorithm finds the best weight for each classifier. Inevitably, the quantity and quality of the training data matters, and this is still a topic of research. So you’re not the only one with this problem :p
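To illustrate Will's point, here is a toy AdaBoost over precomputed weak-classifier outputs (a Python sketch of the general technique, nothing to do with the original assignment; a real texture matcher would need multi-class handling):

```python
import math

def adaboost(predictions, labels, rounds):
    # predictions: {name: list of +/-1 outputs, one per training example}
    # labels: list of +/-1 true labels. Returns the learned classifier weights.
    n = len(labels)
    w = [1.0 / n] * n                      # per-example weights, updated each round
    alphas = {name: 0.0 for name in predictions}
    for _ in range(rounds):
        # Pick the weak classifier with the lowest weighted error this round.
        name, err = min(
            ((c, sum(wi for wi, p, y in zip(w, preds, labels) if p != y))
             for c, preds in predictions.items()),
            key=lambda t: t[1])
        err = max(min(err, 1 - 1e-10), 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        alphas[name] += alpha
        preds = predictions[name]
        # Re-weight the examples: mistakes get heavier, correct ones lighter.
        w = [wi * math.exp(-alpha * y * p) for wi, p, y in zip(w, preds, labels)]
        s = sum(w)
        w = [wi / s for wi in w]
    return alphas

def boosted_predict(predictions, alphas, i):
    # Weighted vote of all weak classifiers on example i.
    score = sum(a * predictions[c][i] for c, a in alphas.items())
    return 1 if score >= 0 else -1
```

Note how the final combination is exactly a weighted linear vote over the weak classifiers, with the weights learned from training data rather than hand-tuned.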

This problem can go as far and as sophisticated as you want to. So which method did your classmate with higher classification rate use?

It’s like Newton’s method; you iteratively refine the parameters and thresholds to give better results. Well, I didn’t know much about machine learning when I took the course…

And uh, my classmate… he’s like the sworn enemy… hahahaha… there were 3 students including me. One student’s my friend. The other is him (sworn enemy… :p). He didn’t reveal any of his work to us. He didn’t discuss any scholastic matters with us. He just kept to himself and only talked with his project professor (I think he co-wrote something with the professor) and the graduate students (studying for their master’s or doctorate degrees).

So yeah, I didn’t know *anything* about what he used…

Though machine learning is my specialty, I like problems related to image processing. Your texture matching problem is similar to image segmentation and object recognition problems. A dizzying array of feature extraction methods have been devised to squeeze useful information from image regions. You are right, of course, about features including (or not) positional information, although sometimes surprisingly simple features can do a very good job.

-Will

(Yes, “Will of MATLAB fame”!)

Hi Will of MATLAB fame! 🙂 I think my professor was just throwing the problem at us. If we don’t know how difficult (or unsolvable) a problem is, we might actually hit upon some usable solution. (was this Einstein’s story?)