Introduction to Object Recognition
Jan Flusser
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Tomáš Suk
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Barbara Zitová
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Search for more papers by this authorJan Flusser
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Search for more papers by this authorTomáš Suk
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Search for more papers by this authorBarbara Zitová
Institute of Information Theory and Automation, Czech Academy of Sciences, Prague, Czech Republic
Summary
This chapter is a brief introduction to the principles of automatic object recognition. It introduces the basic terms, concepts, and approaches to feature-based classification. It also introduces the term invariant and provides an overview of the invariants that have been proposed for visual object description and recognition. The chapter presents eight basic categories: simple shape features, complete visual features, transformation coefficient features, wavelet-based features, textural features, differential invariants, point set invariants, and moment invariants. It briefly reviews the basic classifier types and two popular techniques used for improving classification: classifier fusion and dimensionality reduction. The nearest-neighbor (NN) classifier, sometimes also called the minimum distance classifier, is the most intuitive classifier. Support vector machines (SVMs) are generalizations of the classical notion of linear classifiers. Artificial neural networks (ANNs) are ‘biologically inspired’ classifiers. The chapter explains how to increase classifier performance and shows how that performance should be evaluated.
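To illustrate why the nearest-neighbor classifier is described as intuitive, the following minimal Python sketch assigns a feature vector to the class of its closest training sample. The function name `nearest_neighbor_classify`, the choice of Euclidean distance, and the toy feature values are assumptions made for this example only; they are not taken from the chapter.

```python
import math


def nearest_neighbor_classify(sample, training_set):
    """Return the label of the training vector closest to `sample`.

    `training_set` is a list of (feature_vector, label) pairs.
    Euclidean distance is an assumption here; other metrics are possible.
    """
    best_label = None
    best_dist = math.inf
    for features, label in training_set:
        dist = math.dist(sample, features)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_dist = dist
            best_label = label
    return best_label


# Hypothetical two-dimensional feature vectors for two classes.
training = [
    ((0.0, 0.0), "A"),
    ((1.0, 0.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.0), "B"),
]

print(nearest_neighbor_classify((0.5, 0.2), training))
print(nearest_neighbor_classify((5.5, 4.0), training))
```

In this sketch the whole training set is scanned for every query, which is simple but scales linearly with the number of training samples.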