**Session 1.** (9am-10:30am)

**9.00am-9.20am: George Chen (CMU)**

Title: A millennium of nearest neighbor methods – an introduction to the NIPS nearest neighbor workshop 2017

I give a brief history of nearest neighbor (NN) methods, starting from Alhazen’s “Book of Optics” in the 11th century and leading up to the present. Surprisingly, the most general non-asymptotic results on NN prediction emerged only recently, in the 2014 work of Chaudhuri and Dasgupta. Turning toward “user-friendly” theory with an eye toward practitioners, I mention recent guarantees (2013-2015) for the contemporary applications of time series forecasting, online collaborative filtering, and medical image segmentation; in all three cases, clustering structure enables successful NN prediction.
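As a concrete reminder of the basic method this talk surveys, here is a minimal 1-nearest-neighbor classifier; the toy data points are my own illustration, not from the talk:

```python
import numpy as np

def nn_predict(X_train, y_train, x):
    """Predict the label of x as the label of its nearest training point."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each training point
    return y_train[np.argmin(dists)]

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 1, 1])
print(nn_predict(X_train, y_train, np.array([0.8, 1.0])))  # nearest point is class 1, prints 1
```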

**9.20am-10.05am: Kamalika Chaudhuri (UCSD)**

Title: Analyzing Robustness of Nearest Neighbors to Adversarial Examples

Motivated by applications such as autonomous vehicles, test-time attacks via adversarial examples have received a great deal of recent attention. In this setting, an adversary can query a classifier and perturb a test example by a small amount in order to force the classifier to report an incorrect label. While a long line of work has explored a number of attacks, few reliable defenses are known, and there is an overall lack of general understanding of the foundations of designing machine learning algorithms robust to adversarial examples.

In this talk, we take a step towards addressing this challenging question by introducing a new theoretical framework, analogous to bias-variance theory, which we can use to tease out the causes of vulnerability. We apply our framework to nearest neighbors, a simple classification algorithm, and analyze its robustness to adversarial examples. Motivated by our analysis, we propose a modified version of the nearest neighbor algorithm and demonstrate, both theoretically and empirically, that it has superior robustness to standard nearest neighbors.
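To make the threat model concrete, here is a toy sketch, entirely my own illustration rather than the speaker's method, of a minimal test-time attack on a 1-NN classifier: slide the query toward the nearest differently-labeled training point until the prediction flips.

```python
import numpy as np

def nn_label(X, y, q):
    """1-NN prediction: label of the closest training point."""
    return y[np.argmin(np.linalg.norm(X - q, axis=1))]

def attack(X, y, q, steps=200):
    """Slide q toward the nearest differently-labeled training point
    until the 1-NN prediction flips; return the perturbed query."""
    orig = nn_label(X, y, q)
    other = X[y != orig]
    target = other[np.argmin(np.linalg.norm(other - q, axis=1))]
    for t in np.linspace(0.0, 1.0, steps):
        q_adv = (1 - t) * q + t * target
        if nn_label(X, y, q_adv) != orig:
            return q_adv
    return target

X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0]])
y = np.array([0, 0, 1])
q = np.array([0.1, 0.0])
q_adv = attack(X, y, q)
# q_adv is misclassified as class 1 even though q sits deep in class 0
```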

**10.05am-10.25am: Sewoong Oh (UIUC)**

Title: New perspective from Blackwell’s “comparisons of experiments” on generative adversarial networks

We bring the tools from Blackwell’s seminal result on comparing two stochastic experiments to shed new light on a modern application of great interest: generative adversarial networks (GANs). Binary hypothesis testing is at the center of GANs, and we propose new data processing inequalities that allow us to discover new algorithms to combat mode collapse, provide sharper analyses, and give simpler proofs. This leads to a new architecture that handles one of the major challenges in GANs, known as “mode collapse”: the lack of diversity in the samples generated by the learned generators. The hypothesis testing view of GANs allows us to analyze the new architecture and show that it encourages generators with no mode collapse. Experimental results show that the proposed architecture can learn to generate samples whose diversity is orders of magnitude better than that of competing approaches, while being simpler. For this talk, I will assume no prior background on GANs. The full paper is available online at https://arxiv.org/abs/1712.04086.

**Coffee Break 1.** (10:30am-11.00am)

**Session 2.** (11am-12:10pm)

**11.00am-11.45am: Piotr Indyk (MIT)**

Title: Data-dependent methods for similarity search in high dimensions

I will give an overview of recent research on designing provable data-dependent approximate similarity search methods for high-dimensional data. Unlike many approaches proposed in the algorithms literature, the new methods lead to data structures that adapt to the data while retaining correctness and running time guarantees. The high-level principle behind those methods is that every data set (even a completely random one) has some structure that can be exploited algorithmically. This interpolates between basic “data-oblivious” approaches (e.g., those based on simple random projections) that are analyzable but do not adapt easily to the structure of the data, and basic “data-dependent” methods that exploit specific structure of the data, but might not perform well if this structure is not present.
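For reference, the kind of data-oblivious baseline the abstract contrasts against can be sketched as random-hyperplane (SimHash-style) LSH: the hash is drawn independently of the data, which is exactly what the data-dependent methods improve upon. This is a toy sketch of my own, not code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(X, planes):
    """Sign pattern of random projections: points at a small angle from
    each other tend to fall on the same side of each random hyperplane."""
    return (X @ planes.T > 0).astype(int)

d, n_planes = 8, 16
planes = rng.standard_normal((n_planes, d))  # data-oblivious: drawn independently of the data
x = rng.standard_normal(d)
x_near = x + 0.01 * rng.standard_normal(d)   # a tiny perturbation of x
x_far = -x                                   # the antipode of x
h, h_near, h_far = (lsh_hash(v[None, :], planes)[0] for v in (x, x_near, x_far))
# h and h_near agree on almost all bits; h and h_far disagree on every bit
```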

Concretely, I will cover two recent lines of research:

– Data-dependent Locality-Sensitive Hashing: approximate nearest neighbor search methods that provably improve over the best possible data-oblivious LSH algorithms (covers work from SODA’14, STOC’15 and NIPS’15).

– Data-dependent metric compression: algorithms that represent sets of points using a small number of bits while approximately preserving the distances up to higher accuracy than previously known (covers work from SODA’17 and NIPS’17).

**11.45am-12.05pm: Ke Li (UC Berkeley)**

Title: Fast k-Nearest Neighbor Search via Prioritized DCI

Most exact methods for k-nearest neighbour search suffer from the curse of dimensionality; that is, their query times exhibit exponential dependence on either the ambient or the intrinsic dimensionality. Dynamic Continuous Indexing (DCI) offers a promising way of circumventing the curse and successfully reduces the dependence of query time on intrinsic dimensionality from exponential to sublinear. In this talk, we present a variant of DCI, which we call Prioritized DCI, and show a remarkable improvement in the dependence of query time on intrinsic dimensionality. In particular, a linear increase in intrinsic dimensionality, or equivalently, an exponential increase in the number of points near a query, can be mostly counteracted with just a linear increase in space. We also demonstrate empirically that Prioritized DCI significantly outperforms prior methods. In particular, relative to Locality-Sensitive Hashing (LSH), Prioritized DCI reduces the number of distance evaluations by a factor of 14 to 116 and the memory consumption by a factor of 21.
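The core idea can be caricatured as follows: project the data onto a few random directions, use projected distances (which lower-bound true distances) to prioritize which points to visit, and compute true distances only for the visited candidates. The sketch below is a heavily simplified illustration of mine, not the paper's algorithm: it scans all projected distances for clarity, whereas the real index keeps each projection sorted and uses priority queues so that no full scan is needed.

```python
import numpy as np

rng = np.random.default_rng(1)

def build_index(X, m=10):
    """Project the data onto m random unit directions. (The real index
    also keeps each projection sorted so candidates can be enumerated
    without touching every point.)"""
    dirs = rng.standard_normal((m, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    return dirs, X @ dirs.T                     # (m, d) and (n, m)

def query(X, index, q, n_visit=20):
    """Visit the points whose projections sit closest to the query's,
    then return the true nearest neighbor among those visited."""
    dirs, proj = index
    pd = np.abs(proj - q @ dirs.T)              # per-direction projected distances, (n, m)
    # each projected distance lower-bounds the true distance, so the max
    # over directions is the tightest available lower bound per point
    cand = np.argsort(pd.max(axis=1))[:n_visit]
    return cand[np.argmin(np.linalg.norm(X[cand] - q, axis=1))]

X = rng.standard_normal((200, 5))
q = rng.standard_normal(5)
index = build_index(X)
approx_nn = query(X, index, q)                  # id of an (approximate) nearest neighbor
```

Visiting every candidate (`n_visit=len(X)`) degenerates to exact brute-force search; the interesting regime is small `n_visit`, where the projected-distance priority usually surfaces the true neighbor early.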

**Lunch Break.** (12:10pm-1:45pm)

**Session 3.** (1:45pm-2:50pm)

**1.45pm-2.30pm: Alexei Efros (UC Berkeley)**

Title: How to stop worrying and learn to love Nearest Neighbors

Nearest neighbors is an algorithm everyone loves to hate: it’s too trivial, too brute-force, and doesn’t offer any insights. In this talk, I will try to make you fall in love with the humble nearest neighbor. First, I will discuss the use of nearest neighbors as an exceptionally useful baseline to guard against our field’s natural bias in favor of elegant algorithms over data. Then, I will discuss some scenarios where nearest neighbors is particularly useful in practice.

**2.30pm-2.50pm: Vishal Doshi (Celect)**

Title: A Platform for Data Science

We describe an extensible data science platform based on non-parametric statistical methods such as nearest neighbors. This platform was born out of the need to provide scalable and flexible predictive analytics solutions for retailers and the federal government. In this talk, I will describe the formalism underlying the platform, discuss its out-of-the-box performance on benchmark datasets, and show a live demo of building a recommendation system and detecting geo-spatial anomalies in the maritime domain.

**Poster Session.** (2.50pm-4.15pm; includes Coffee Break 2, 3-3.30pm)

Learning Supervised Binary Hashing without Binary Code Optimization (*Miguel Carreira-Perpinan; Ramin Raziperchikolaei*)

Sub-linear Privacy-preserving Search with Unsecured Server and Semi-honest Parties (*Beidi Chen*)

On Nearest Neighbors in Non Local Means Denoising (*Iuri Frosio; Jan Kautz*)

Fast k-Nearest Neighbour Search via Prioritized DCI (*Ke Li; Jitendra Malik*)

Generative Local Metric Learning for Nearest Neighbor Methods (*Yung-Kyun Noh; Masashi Sugiyama; Daniel D Lee*)

Private Document Classification in Federated Databases (*Phillipp Schoppmann; Adria Gascon; Borja Balle*)

Optimizing Revenue Over Data-Driven Assortments (*Deeksha Sinha; Theja Tulabandhula*)

Fast Distance Metric Learning for Multi-Task Large Margin Nearest Neighbor Classification (*Adams Yu*)

**Session 4.** (4.15pm-5.30pm)

**4.15pm-5.00pm: Jonathan Yedidia (Analog Devices)**

Title: Boundary Forest Algorithm

I explain the Boundary Forest algorithm, a simple, fast, incremental online algorithm that can be used in both supervised and unsupervised settings for classification, regression, or retrieval problems. As a classification algorithm, its generalization performance is at least as good as that of k-nearest neighbors, while it responds to queries very quickly. It maintains trees of examples that let it train and answer test queries in time logarithmic in the number of stored examples, which is typically far smaller than the number of training examples.
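A minimal single-tree sketch of the idea, under my own reading of the algorithm as described above (the forest trains several such trees and aggregates their answers; this is an illustration, not the speaker's implementation):

```python
import numpy as np

class BoundaryTree:
    """Toy single boundary tree: a tree of stored examples traversed
    greedily by distance, grown only when it mislabels a new example."""
    def __init__(self, x, y):
        self.x, self.y, self.children = np.asarray(x, float), y, []

    def _descend(self, q):
        node = self
        while True:
            # compare the query against the current node and its children
            best = min([node] + node.children,
                       key=lambda c: np.linalg.norm(c.x - q))
            if best is node:
                return node
            node = best

    def predict(self, q):
        return self._descend(q).y

    def train(self, x, y):
        node = self._descend(x)
        if node.y != y:              # store only examples the tree mislabels
            node.children.append(BoundaryTree(x, y))

tree = BoundaryTree([0.0, 0.0], 0)
tree.train([1.0, 1.0], 1)   # mislabeled as 0, so it is stored as a child
tree.train([0.1, 0.0], 0)   # correctly labeled, so it is not stored
```

Because only misclassified examples are stored, the tree keeps points near class boundaries, which is why the number of stored examples is typically far smaller than the training set.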

**5.00pm-5.20pm: Christina Lee (Microsoft Research)**

Title: Iterative Collaborative Filtering for Sparse Matrix Estimation

The sparse matrix estimation problem consists of estimating the distribution of an n-by-n matrix Y from a sparsely observed single instance of this matrix, where the entries of Y are independent random variables. This captures a wide array of problems; special instances include matrix completion in the context of recommendation systems, graphon estimation, and community detection in (mixed membership) stochastic block models. Inspired by classical collaborative filtering for recommendation systems, we propose a novel iterative, collaborative-filtering-style algorithm for matrix estimation in this generic setting. Under model assumptions of uniform sampling, bounded entries, and finite spectrum, we provide bounds on the mean squared error (MSE) of our estimator and show improved sample complexity.
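As background, one classical collaborative-filtering step of the kind this algorithm builds on can be sketched as follows; the similarity measure and averaging rule here are illustrative simplifications of mine, not the paper's estimator:

```python
import numpy as np

def estimate_entry(Y, mask, i, j, k=2):
    """Estimate the unobserved entry (i, j): average the column-j values
    of the k observed rows most similar to row i on the columns they
    share with it."""
    sims = np.full(Y.shape[0], -np.inf)
    for u in range(Y.shape[0]):
        if u == i or not mask[u, j]:
            continue
        common = mask[i] & mask[u]              # columns observed in both rows
        if common.sum() == 0:
            continue
        # similarity = negative mean squared difference on shared columns
        sims[u] = -np.mean((Y[i, common] - Y[u, common]) ** 2)
    nbrs = np.argsort(sims)[::-1][:k]
    nbrs = nbrs[np.isfinite(sims[nbrs])]
    if nbrs.size == 0:
        return float("nan")
    return float(np.mean(Y[nbrs, j]))

Y = np.array([[1., 2., 3.], [1., 2., 3.], [1., 2., 3.], [4., 5., 6.]])
mask = np.ones(Y.shape, dtype=bool)
mask[0, 2] = False                              # treat Y[0, 2] as unobserved
est = estimate_entry(Y, mask, 0, 2)
# rows 1 and 2 match row 0 exactly on the observed columns, so est is 3.0
```

When the matrix is very sparse, two rows rarely share observed columns, which is why the paper's iterative variant compares rows through longer paths of observations rather than direct overlap.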