On the Efficient Determination of Most Near Neighbors: Horseshoes, Hand Grenades, Web Search and Other Situations When Close is Close Enough

Author:   Mark S. Manasse ,  Gary Marchionini
Publisher:   Morgan & Claypool Publishers
ISBN:   9781608450886
Pages:   88
Publication Date:   30 November 2012
Format:   Paperback
Availability:   Awaiting stock
The supplier is currently out of stock of this item. It will be ordered for you and placed on backorder. Once it does come back in stock, we will ship it out for you.

Our Price $92.40

Overview

The time-worn aphorism "close only counts in horseshoes and hand-grenades" is clearly inadequate. Close also counts in golf, shuffleboard, archery, darts, curling, and other games of accuracy in which hitting the precise center of the target isn't to be expected every time, or in which we can expect to be driven from the target by skilled opponents. This lecture is not devoted to sports discussions, but to efficient algorithms for determining pairs of closely related web pages -- and a few other situations in which we have found that inexact matching is good enough; where proximity suffices. We will not, however, attempt to be comprehensive in the investigation of probabilistic algorithms, approximation algorithms, or even techniques for organizing the discovery of nearest neighbors. We are more concerned with finding nearby neighbors; if they are not particularly close by, we are not particularly interested.

In thinking of when approximation is sufficient, remember the oft-told joke about two campers sitting around after dinner. They hear noises coming towards them. One of them reaches for a pair of running shoes, and starts to don them. The second then notes that even with running shoes, they cannot hope to outrun a bear, to which the first notes that most likely the bear will be satiated after catching the slower of them. We seek problems in which we don't need to be faster than the bear, just faster than the others fleeing the bear.

Full Product Details

Author:   Mark S. Manasse ,  Gary Marchionini
Publisher:   Morgan & Claypool Publishers
Imprint:   Morgan & Claypool Publishers
Dimensions:   Width: 19.10cm , Height: 0.40cm , Length: 23.50cm
Weight:   0.500kg
ISBN:   9781608450886
ISBN 10:   1608450880
Pages:   88
Publication Date:   30 November 2012
Audience:   Professional and scholarly ,  Professional & Vocational
Format:   Paperback
Publisher's Status:   Active

Table of Contents

Introduction
Comparing Web Pages for Similarity: An Overview
A Personal History of Web Search
Uniform Sampling after Alta Vista
Why Weight (and How)?
A Few Applications

Reviews

The material in this book grew from a simple question: "We know how to easily determine whether two files are identical, but what do we know about determining whether two files are similar?" The answer was "Not much," but when a theorist gives this answer, good things often happen. Such was the case here. This book will be important to practitioners interested in this and similar questions. It contains two intertwined threads: a mathematical treatment of the problem and an engineering thread that provides extremely efficient code for obtaining the solution at scale. I recommend it highly.
Charles P. (Chuck) Thacker, Microsoft Research, 2009 Turing Award Winner

From de-duplication to search, billion dollar industries rely on the ability to search for keys that are close to a specified key. The book by Mark Manasse provides a beautiful exposition of the field. Manasse is a well-known expert who has written some of the fundamental theoretical papers in the field; better still, he has worked on real products such as AltaVista and Windows file de-duplication. Mark has the rare ability to take theoretical ideas and convert them to sound engineering. The book will appeal to developers working in the web milieu because it illuminates the details that are often missing using code snippets. It will also appeal to researchers and students because of the uniform and insightful exposition of an important area.
George Varghese, Professor, University of California, San Diego

Mark Manasse, the father of micropayments, provides insight, techniques, and theory behind search on getting not too large, not too small, but just right results. This horseshoes mini-treatise comes right from the horse's mouth; as an Alta Vistan he shows how the game was constructed by high dimensionality mapping into tractable space and time to find ringers and good outliers.
Gordon Bell, Microsoft Research




Countries Available

All regions