How To Compare One Million Images?

Co-authors: Jeremy Douglass, Tara Zepel.


In this article we use the challenge of working with a set of one million manga pages to motivate the need for a computational approach to exploring collections of this size, and to explain our particular method, which combines digital image analysis with a novel visualization technique.

Any reflection on culture begins with an informal comparison between multiple objects in order to understand their similarities and differences. For instance, if we want our analysis to properly reflect the range of graphical techniques used today by manga artists across thousands of manga books, millions of pages in these books, and tens of millions of individual panels, we need to be able to examine details of individual images and to find patterns of difference and similarity across large numbers of images.

To do this, we need a mechanism that allows us to precisely compare sets of images of any size – from a few dozen to millions. We discuss how our method, which combines automatic digital image analysis and media visualization, addresses these requirements.

