INTEGRATING CO-TRAINING AND RECOGNITION FOR TEXT DETECTION (WedPmPO1)
Author(s) :
Wen Wu (Carnegie Mellon University, United States of America)
Datong Chen (Carnegie Mellon University, United States of America)
Jie Yang (Carnegie Mellon University, United States of America)
Abstract : This paper proposes a new co-training scheme that uses two additional information resources, unlabeled data and optical character recognition (OCR), to detect text in scenes. Training a good text detector requires a large amount of labeled data, which is very expensive to obtain. Co-training has been demonstrated to be a powerful tool for many semi-supervised learning problems that use unlabeled data. However, most previous work on co-training focuses only on the interactive labeling of unlabeled instances by two trained classifiers, without considering third-party knowledge about the unlabeled data. The approach proposed in this paper integrates the co-training algorithm with text recognition to improve text detection with unlabeled data. Experimental results on the ICDAR03 text locating competition dataset show that the proposed approach works as effectively as a supervised trained classifier even with a reduced amount of training data.
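
The idea of letting a recognizer vet co-training labels can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features, classifiers, or OCR engine, so everything here is an assumption made for illustration. The two views (`view_a`, `view_b`), the toy `CentroidClassifier`, the stored `ocr_text` field, and the acceptance rule in `ocr_says_text` are all hypothetical. Each round, a classifier trained on one view proposes its most confident unlabeled candidates, and a proposal is added to the labeled set only when the OCR check agrees with the predicted label.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    view_a: float  # hypothetical feature from view A (e.g. edge density)
    view_b: float  # hypothetical feature from view B (e.g. color contrast)
    ocr_text: str  # output of a hypothetical OCR engine on this region


class CentroidClassifier:
    """Toy 1-D nearest-centroid classifier; a stand-in for a real detector."""

    def fit(self, xs: List[float], ys: List[int]) -> "CentroidClassifier":
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.pos = sum(pos) / len(pos)
        self.neg = sum(neg) / len(neg)
        return self

    def predict(self, x: float) -> Tuple[int, float]:
        d_pos, d_neg = abs(x - self.pos), abs(x - self.neg)
        label = 1 if d_pos < d_neg else 0
        return label, abs(d_pos - d_neg)  # larger margin = more confident


def ocr_says_text(sample: Sample) -> bool:
    # Assumed rule: a region is text if OCR returns >= 2 alphanumeric chars.
    return sum(c.isalnum() for c in sample.ocr_text) >= 2


def co_train(labeled, unlabeled, rounds=3, per_round=1):
    labeled, pool = list(labeled), list(unlabeled)
    views = [lambda s: s.view_a, lambda s: s.view_b]
    for _ in range(rounds):
        if not pool:
            break
        accepted = {}  # pool index -> label confirmed by OCR
        for view in views:
            clf = CentroidClassifier().fit(
                [view(s) for s, _ in labeled], [y for _, y in labeled]
            )
            preds = [(i, *clf.predict(view(s))) for i, s in enumerate(pool)]
            preds.sort(key=lambda p: p[2], reverse=True)
            for i, label, _conf in preds[:per_round]:
                # Third-party check: keep the label only if OCR agrees.
                if (label == 1) == ocr_says_text(pool[i]):
                    accepted[i] = label
        if not accepted:
            break  # nothing passed the OCR check this round
        labeled += [(pool[i], y) for i, y in accepted.items()]
        pool = [s for i, s in enumerate(pool) if i not in accepted]
    return labeled, pool


# Made-up toy data: 1 = text region, 0 = non-text region.
seed = [(Sample(0.9, 0.8, "EXIT"), 1), (Sample(0.1, 0.2, ""), 0)]
unlabeled = [
    Sample(1.0, 0.7, "STOP"),  # clear text
    Sample(0.2, 0.1, ""),      # clear background
    Sample(0.8, 0.3, ""),      # textured background that view A mistakes for text
]
final_labeled, remaining = co_train(seed, unlabeled)
print([y for _, y in final_labeled], len(remaining))
```

In this toy run the third unlabeled region is proposed as text by the view-A classifier, but the OCR check rejects that label because nothing is recognized; the region is added only once the view-B classifier proposes it as non-text. A real system would replace the stored `ocr_text` field with a call to an actual recognizer and the centroid classifier with a trained detector.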