Computer Vision News - January 2018

A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen to review "A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs". We are indebted to the authors (Dileep George, Wolfgang Lehrach, Ken Kansky, Miguel Lázaro-Gredilla, Christopher Laan, Bhaskara Marthi, Xinghua Lou, Zhaoshi Meng, Yi Liu, Huayan Wang, Alex Lavin and D. Scott Phoenix) for allowing us to use their images to illustrate this review. Their work is here and their source code is here.

We found this paper particularly interesting for a number of reasons:

1) It goes against the current: it does not use deep learning methods and still achieves impressive results. All the methods and articles the paper is based on are from before 2006.
2) The paper develops a method called the recursive cortical network (RCN), whose structure attempts to mimic actual neural structures, drawing on insights from neuroscience research. The model's internal logic can be justified and understood. This contrasts with most current deep learning methods, which behave as black boxes: they achieve impressive results, but it is rarely (if ever) possible to understand why.
3) The supplemental materials that come with the paper are explained in impressive detail and include ready-to-run code that you can try yourself.
4) Finally, and most importantly, the paper demonstrates impressive results in breaking CAPTCHAs, with a method that achieves higher precision than deep learning methods while requiring only a relatively small dataset to train on.

Introduction:

Recent deep learning methods developed to break CAPTCHAs required millions of images to train on, while a human being does not need a single image beyond the one presented to understand a given CAPTCHA.
Research by Assaf Spanier

Using insights gleaned from neuroscience, the authors propose their recursive cortical network (RCN) model: a probabilistic generative model for vision, in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified manner.

How does perception in the human brain function? Big question! Some of the perception mechanisms' capacity derives from our ability to use past experience