Semantic Analysis for Multimedia Security Application

This chapter presents the current state of the art in semantic analysis of video data and its application to surveillance and infrastructure security. It elaborates the numerous semantic computing challenges encountered at different stages of video data processing, including low-level image processing, object identification, motion detection, tracking, event modeling and representation, and event classification. Most surveillance applications of multimedia deal with low-resolution, noisy video data in which events of interest are generally characterized by motion activities. The chapter discusses the technical challenges in video processing and presents a broad review of existing techniques for motion-based semantic analysis of multimedia data.

Controlled Vocabulary Terms

multimedia databases; security; semantic Web

Semantic Computing