Matching v. Searching
A blog on matching, and why it's better than searching, with a slight bias toward iXmatch, by Bret A. Busse

8.11.2003
 
Hunting for information

A government project describes search like this:

"Searching for information with available search technology is analogous to thinking of one town where an enemy might be hiding, dropping down on it, then, if empty, trying another town of the same or similar name or size, no matter how far away. The reason for this awkward hit-and-miss approach is that all current search engines use one of three methods. They either: (1) try to match one or a few words in a query with the same few words in the quarry (keywords) or (2) look in the places where greatest number of other people have looked (Probabilities), or (3) have humans categorize the kinds of places that targets of interest are most likely to be (Bayesian). However, in the physical world a skilled hunter, whenever possible, tracks the quarry, follows its spoor from place to place, paying close attention to the direction of movement, tracing by multiple clues from cooler leads to hotter.

To do the same sort of hunting in the information world, a search technology needs to be able to relate documents or messages to each other not just by a few selected words, but by anything that ties them together; that is, all of their contained "meanings", no matter what words they are expressed in. One could then examine particular relations between the current neighborhood and surrounding areas in "meaning space" and follow trails based on any clues that come to light."


Couldn't have written it better ourselves.
 
How close is close enough?

Finding exact matches is easy. Unfortunately, very few things in life, or in data, are perfect.

Matching starts when you treat each thing as a bundle of features, decide how to score a feature that comes in under or over your target, and set your own threshold for "close enough".
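
To make that concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how iXmatch or any other product works: the `Target` class, the linear penalties, the weighted average, and the laptop data are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical spec for one feature: the value wanted and how misses are scored."""
    value: float
    under_penalty: float  # score lost per unit below the target
    over_penalty: float   # score lost per unit above the target
    weight: float = 1.0


def feature_score(actual, target):
    """Score one feature from 0.0 (far off) to 1.0 (on target)."""
    if actual < target.value:
        loss = (target.value - actual) * target.under_penalty
    else:
        loss = (actual - target.value) * target.over_penalty
    return max(0.0, 1.0 - loss)


def match_score(item, targets):
    """Weighted average of the per-feature scores."""
    total_weight = sum(t.weight for t in targets.values())
    weighted = sum(t.weight * feature_score(item[name], t) for name, t in targets.items())
    return weighted / total_weight


def matches(items, targets, threshold):
    """Return (score, item) pairs at or above the threshold, best first."""
    scored = [(match_score(item, targets), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Made-up example: about $1000 and 16 GB wanted; cheaper or more memory costs nothing.
targets = {
    "price": Target(value=1000, under_penalty=0.0, over_penalty=0.002),
    "ram_gb": Target(value=16, under_penalty=0.1, over_penalty=0.0),
}
laptops = [
    {"name": "A", "price": 1000, "ram_gb": 16},
    {"name": "B", "price": 1100, "ram_gb": 16},
    {"name": "C", "price": 900, "ram_gb": 8},
]
results = matches(laptops, targets, threshold=0.75)
for score, laptop in results:
    print(laptop["name"], round(score, 2))
```

With these made-up numbers, laptop A is an exact match, B is a bit over budget but still clears the threshold, and C falls short because it has half the memory asked for. Change the penalties or the threshold and the same data gives a different answer, which is the point: you decide what "close enough" means.
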
8.4.2003
 
Collapsing hierarchies

When most companies built their product catalogs or Internet/intranet sites, they spent a lot of time on the hierarchy - the categories and subcategories, and the products and documents that fall into them. I've heard enough horror stories about "debates" over where things should go to know the process isn't much fun. When people in your own organization have a hard time deciding how to categorize your products and documents, imagine how hard it is for your customers to figure out how you've organized them.

The main reason for building a hierarchy is that there's too much stuff to keep it all at the same level. We think keeping it all at the same level is the way to go. The most important thing remains helping a customer find exactly what s/he's looking for. We offer two options: (1) identify groups (clusters) of things that are similar; and (2) identify what makes the products/docs different. Either option gives the customer access to the entire collection and complete control over what to select - without relying on a hierarchy.
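
Both options can be sketched over a flat list of items, with no categories anywhere. The Python below is a rough illustration under stated assumptions, not a description of any product: the function names, the "fraction of shared attribute values" similarity measure, the greedy grouping, and the tool catalog are all invented for the example.

```python
def distinguishing_attributes(items):
    """Option 2: attributes whose values differ across the collection."""
    seen = {}
    for item in items:
        for name, value in item.items():
            seen.setdefault(name, set()).add(value)
    return {name: values for name, values in seen.items() if len(values) > 1}


def similarity(a, b):
    """Fraction of attributes on which two items agree (assumed measure)."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    same = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return same / len(keys)


def cluster(items, min_similarity):
    """Option 1: greedy grouping; an item joins the first group whose
    first member is similar enough, otherwise it starts a new group."""
    groups = []
    for item in items:
        for group in groups:
            if similarity(item, group[0]) >= min_similarity:
                group.append(item)
                break
        else:
            groups.append([item])
    return groups


# Made-up flat catalog: no categories, just attributes.
catalog = [
    {"type": "drill", "power": "cordless", "brand": "Acme", "warranty": "1yr"},
    {"type": "drill", "power": "corded", "brand": "Acme", "warranty": "1yr"},
    {"type": "saw", "power": "corded", "brand": "Bolt", "warranty": "1yr"},
    {"type": "saw", "power": "cordless", "brand": "Bolt", "warranty": "1yr"},
]
groups = cluster(catalog, min_similarity=0.7)
differences = distinguishing_attributes(catalog)
print(len(groups), sorted(differences))
```

In this toy catalog the two drills end up in one group and the two saws in another, and the warranty drops out of the list of differences because every item has the same one. Nobody had to decide up front whether "cordless" belongs under "drills" or the other way around.
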
