Nearest Neighbor 2017 – Nearest Neighbors for Modern Applications with Massive Data

This workshop will take place in room 201B on Friday, December 8!

Many modern methods for prediction leverage nearest neighbor (NN) search to find past training examples most similar to a test example, an idea that dates back in writing to at least the 11th century, in the “Book of Optics” by Alhazen. Today, NN methods remain popular, often as a cog in a bigger prediction machine, used for instance in recommendation systems, forecasting baseball player performance and election outcomes, survival analysis in healthcare, image in-painting, crowdsourcing, graphon estimation, and more. The popularity of NN methods is due in no small part to the proliferation of high-quality fast approximate NN search methods that scale to the high-dimensional massive datasets typical of contemporary applications. Moreover, NN prediction readily pairs with methods that learn similarities, such as metric learning methods or Siamese networks. In fact, well-known examples of such pairings, which yield nearest neighbor predictors that learn similarities, include random forests and many boosting methods. NN methods are an exciting field at the intersection of classical statistics, machine learning, data structures, and domain-specific expertise.

Despite the popularity, success, and age of nearest neighbor methods, our theoretical understanding of them is still surprisingly incomplete and can also be disconnected from what practitioners actually want or care about. Many approximate nearest neighbor methods that are successful in practice do not have known theoretical guarantees, and many of the guarantees for exact nearest neighbor methods do not readily handle approximation. Meanwhile, many applications use variations on NN methods to which existing theory may not extend, or for which existing theory is not easily usable by a practitioner. The aim of this workshop is to bring together theoreticians and practitioners from a diverse range of backgrounds and perspectives and get everyone up to speed on:

Best known statistical/computational guarantees (especially recent non-asymptotic results)
Latest methods/systems that have been developed especially for fast approximate NN search that scale to massive datasets
Various applications in which NN methods are heavily used as a critical component in prediction or inference

By gathering a diverse crowd, we hope attendees will share their perspectives, identify ways to bridge theory and practice, and discuss avenues of future research.

Location and Dates

This workshop will take place 9am-5:30pm on Friday, December 8, 2017 at the Long Beach Convention Center room 201B, co-located with the NIPS 2017 conference.

Organizers

George Chen (Carnegie Mellon University)
Christina Lee (Microsoft Research)
Devavrat Shah (Massachusetts Institute of Technology)

Feel free to email any questions about the workshop and poster submissions to nn2017organizers@gmail.com.