Prototype selection for interpretable classification

Jacob Bien, Robert Tibshirani

Abstract

Prototype methods seek a minimal subset of samples that can serve as a distillation or condensed view of a data set. As the size of modern data sets grows, being able to present a domain specialist with a short list of "representative" samples chosen from the data set is of increasing interpretative value. While much recent statistical research has been focused on producing sparse-in-the-variables methods, this paper aims at achieving sparsity in the samples. We discuss a method for selecting prototypes in the classification setting (in which the samples fall into known discrete categories). Our method of focus is derived from three basic properties that we believe a good prototype set should satisfy. This intuition is translated into a set cover optimization problem, which we solve approximately using standard approaches. While prototype selection is usually viewed as purely a means toward building an efficient classifier, in this paper we emphasize the inherent value of having a set of prototypical elements. That said, by using the nearest-neighbor rule on the set of prototypes, we can of course discuss our method as a classifier as well.
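The abstract does not state the three properties or the exact set cover formulation, so the following is only a rough sketch under stated assumptions. It assumes a prototype "covers" every training point within a radius `eps` of it, that a good prototype covers many same-class points and few other-class points, and that the cover is built with a generic greedy heuristic. The names `greedy_prototypes`, `nearest_prototype_label`, and `eps` are hypothetical and do not come from the paper.

```python
import math


def greedy_prototypes(points, labels, eps):
    """Greedy set-cover heuristic (an illustrative assumption, not the paper's algorithm).

    Repeatedly picks the training point whose eps-ball covers the most
    still-uncovered same-class points, penalised by the number of
    other-class points that fall inside the same ball.
    Returns the indices of the selected prototypes.
    """
    n = len(points)
    # ball[j] holds the indices of all points within eps of candidate j.
    ball = [
        [i for i in range(n) if math.dist(points[i], points[j]) <= eps]
        for j in range(n)
    ]
    uncovered = set(range(n))
    prototypes = []
    while uncovered:
        best_j, best_gain = None, 0
        for j in range(n):
            same = sum(1 for i in ball[j] if labels[i] == labels[j] and i in uncovered)
            wrong = sum(1 for i in ball[j] if labels[i] != labels[j])
            gain = same - wrong
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            # No candidate improves the objective; remaining points stay uncovered.
            break
        prototypes.append(best_j)
        uncovered -= {i for i in ball[best_j] if labels[i] == labels[best_j]}
    return prototypes


def nearest_prototype_label(x, points, labels, prototypes):
    """Classify x with the nearest-neighbor rule restricted to the prototypes."""
    j = min(prototypes, key=lambda p: math.dist(x, points[p]))
    return labels[j]


# Toy data: two well-separated clusters, one per class.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
protos = greedy_prototypes(points, labels, eps=0.5)
print(protos)
print(nearest_prototype_label((4.5, 4.8), points, labels, protos))
```

On this toy data one prototype per cluster is enough, which illustrates the "sparsity in the samples" idea: the short list of selected points can be shown to a domain specialist and also used as a nearest-neighbor classifier. The radius `eps` controls the trade-off between the number of prototypes and how tightly they fit the classes.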

Keywords

Artificial Intelligence & Data Science

Paper Details

Authors Jacob Bien, Robert Tibshirani
Published 2012-02-27
Category Artificial Intelligence And Data Science
Status Non-peer-reviewed Preprint
DOI 10.1214/11-AOAS495
Language English
Word Count 184
