Indexing and retrieval of digital video is a very active research area. Temporal video segmentation is an important step in many video processing applications. The growing amount of digital video footage is driving the need for more effective methods for shot classification, summarization, efficient access, retrieval, and browsing of large video databases. Shot boundary detection is the first step towards further analysis of the video content.
Two methods for shot boundary detection have been developed. The first, which operates in the uncompressed image domain, is based on the mutual information and the joint entropy between two consecutive video frames.
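As a rough illustration of the quantities involved, the sketch below estimates the joint entropy and mutual information of two grayscale frames from their joint intensity histogram, and flags a cut wherever the mutual information between consecutive frames drops below a threshold. The function names, the use of 8-bit grayscale intensities, and the fixed threshold `mi_threshold` are assumptions made for this example; they are not taken from the published method.

```python
import numpy as np

def joint_entropy_and_mi(frame_a, frame_b, levels=256):
    """Return (joint entropy, mutual information) in bits for two
    equally sized grayscale frames with integer values in [0, levels)."""
    a = np.asarray(frame_a).ravel()
    b = np.asarray(frame_b).ravel()
    # Joint histogram of co-located pixel intensities, normalised to a
    # joint probability table.
    joint, _, _ = np.histogram2d(
        a, b, bins=levels, range=[[0, levels], [0, levels]]
    )
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)  # marginal distribution of frame_a
    p_b = p_ab.sum(axis=0)  # marginal distribution of frame_b

    def entropy(p):
        p = p[p > 0]  # skip empty bins, treating 0 * log(0) as 0
        return float(-(p * np.log2(p)).sum())

    h_ab = entropy(p_ab)
    mi = entropy(p_a) + entropy(p_b) - h_ab  # I(A;B) = H(A) + H(B) - H(A,B)
    return h_ab, mi

def detect_cuts(frames, mi_threshold):
    """Return indices t where the mutual information between frames t and
    t + 1 falls below mi_threshold (a hypothetical, hand-picked value)."""
    cuts = []
    for t in range(len(frames) - 1):
        _, mi = joint_entropy_and_mi(frames[t], frames[t + 1])
        if mi < mi_threshold:
            cuts.append(t)
    return cuts
```

Within a shot, consecutive frames share most of their content, so their mutual information is high; across a cut the two frames are close to statistically independent and the value drops sharply.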
The entropy-based detection technique was tested on the TRECVID2003 video test set, which contains different types of shots with significant object and camera motion inside the shots. The entropy-based measures proved experimentally to be very effective for shot cut detection, producing false acceptance rates very close to zero.
The second approach to automated shot boundary detection uses singular value decomposition (SVD). We chose SVD for its ability to derive a refined low-dimensional feature space from a high-dimensional raw feature space, in which pattern similarity can easily be detected.
The method can detect cuts and gradual transitions, such as dissolves, fades and wipes. It was tested on TV video sequences containing various types of shots and significant object and camera motion inside the shots. The experiments demonstrated that the projected feature space makes it possible to differentiate efficiently between gradual transitions, cuts, pans, and object or camera motion, whereas most histogram-based methods fail to characterize these types of video transitions.
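The idea of projecting raw frame features into a low-dimensional space with SVD can be sketched as follows. This is a minimal illustration, not the published algorithm: it assumes each frame is described by a raw feature vector (for instance a color histogram) stored as one column of a matrix, keeps the top `k` singular directions, and compares consecutive frames by cosine similarity. The function names, the choice of `k`, and the cosine measure are all assumptions of this example.

```python
import numpy as np

def project_frames(features, k):
    """features: array of shape (n_features, n_frames), one column per frame.
    Returns the (k, n_frames) coordinates of the frames in the basis of the
    top-k left singular vectors."""
    u, s, vt = np.linalg.svd(features, full_matrices=False)
    k = min(k, len(s))
    # Since features = u @ diag(s) @ vt, the coordinates u[:, :k].T @ features
    # equal diag(s[:k]) @ vt[:k, :].
    return s[:k, None] * vt[:k, :]

def consecutive_similarity(projected):
    """Cosine similarity between each pair of consecutive projected frames."""
    a, b = projected[:, :-1], projected[:, 1:]
    num = (a * b).sum(axis=0)
    den = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    return num / np.maximum(den, 1e-12)  # guard against zero-length vectors
```

In such a sketch, a cut would appear as an isolated dip in the similarity sequence, while a gradual transition would appear as a sustained run of moderately reduced values; how those patterns are thresholded is left open here.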
Relevant Publications
Z. Cernekova, I. Pitas and C. Nikou, "Information theory-based shot cut/fade detection and video summarization", IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 1, pp. 82-91, January 2006.
Z. Cernekova, C. Kotropoulos and I. Pitas, "Video Shot Segmentation using Singular Value Decomposition", in Proc. of 2003 IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. III, pp. 181-184, Hong Kong, April 2003 (appears also in Proc. IEEE Multimedia and Expo 2003 (ICME), pp. 301-304, Baltimore, July 2003).
Z. Cernekova, C. Kotropoulos and I. Pitas, "Video Shot Boundary Detection using Singular Value Decomposition", in Proc. of 4th European Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS-2003), London, April 2003.
MOUMIR - "Models for Unified Multimedia Information Retrieval", RTN, EC
MUSCLE - "Multimedia Understanding through Semantics, Computation and LEarning" (FP6-507752)
VISNET - European Network of Excellence, funded under the European Commission IST FP6 programme
COST211 - "Redundancy Reduction Techniques and Content Analysis for Multimedia Services"
© 2006 |
One of the most fundamental technologies necessary for the management of digital video is the retrieval (from a video database) of one or more video segments that the user is interested in. The methods used for approaching video retrieval are similar to those used for the retrieval of other types of multimedia objects, such as images. Retrieval usually follows one of two paradigms:
Another technology that is useful for video management, particularly for rights protection, is fingerprinting: the identification of a video segment using a representation called a fingerprint, which is extracted from the video content. The fingerprint must uniquely identify a video segment and must be invariant to manipulation of the video.
The general idea of our approach is that the existence of faces of specific individuals can be used to characterize a video segment. Assuming that the faces in the video have been detected and identified, the video signature (or fingerprint) consists of quartets of the following:
Given the above representation, we compute the similarity of two videos, for a certain displacement d:
where Fi(n,m) is the certainty that person m appears in frame n of video segment i. Using this representation, our algorithm for retrieval is as follows:
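To make the role of the displacement d concrete, here is a hypothetical sketch of a displacement-dependent similarity between two face certainty functions. It assumes, purely for illustration, that the similarity is the mean over overlapping frames of the sum over persons m of F1(n,m) * F2(n+d,m); this is not necessarily the measure used in the publications listed below, and the function names and the exhaustive search over `max_shift` are likewise assumptions.

```python
import numpy as np

def similarity_at_displacement(f1, f2, d):
    """f1, f2: arrays of shape (frames, persons) holding F_i(n, m).
    Assumed measure: average over overlapping frames n of
    sum_m F1(n, m) * F2(n + d, m)."""
    n1, n2 = f1.shape[0], f2.shape[0]
    start = max(0, -d)       # first n with n + d >= 0
    stop = min(n1, n2 - d)   # one past the last n with n + d < n2
    if stop <= start:
        return 0.0           # the two segments do not overlap at this shift
    a = f1[start:stop]
    b = f2[start + d:stop + d]
    return float((a * b).sum() / (stop - start))

def best_displacement(f1, f2, max_shift):
    """Try every displacement in [-max_shift, max_shift] and return the
    pair (d, score) with the highest assumed similarity."""
    scores = {
        d: similarity_at_displacement(f1, f2, d)
        for d in range(-max_shift, max_shift + 1)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Under this assumed measure, a query segment cut out of a longer video scores highest at the displacement where the same people appear in the same frames of both segments.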
Relevant Publications
C. Cotsaces, N. Nikolaidis and I. Pitas, "The use of face indicator functions for video indexing and fingerprinting", in Proc. of Int. Workshop on Content-Based Multimedia Indexing (CBMI 2005), Riga, Latvia, 21-23 June 2005.
C. Cotsaces, N. Nikolaidis and I. Pitas, "Video Indexing by Face Occurrence-based Signatures", in Proc. of 2006 IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Toulouse, France, 14-19 May 2006.
MUSCLE - "Multimedia Understanding through Semantics, Computation and LEarning" (FP6-507752)
The ever-growing amount of digital information has created a critical need for algorithms that assist data management. Scene change detection is employed to manage large volumes of audio-visual data. Typically, it is a tool for grouping audio-visual data into meaningful categories, thereby providing fast browsing and retrieval capabilities. Video shot and scene detection is essential to automatic content-based video segmentation. A video shot is a collection of video frames obtained through a continuous camera recording. Similar background and motion patterns typify the set of frames within a shot. Video shots alone usually give far too fine a segmentation for a semantic representation of audio-visual data. To provide effective non-linear access to video information, the data are grouped into scenes, where a scene is defined as a sequence of related shots chosen according to certain semantic rules.
A novel scene change detection method has been developed where
In order to integrate audio and video information
The method has been tested on the well-established TRECVID2003 database. The results are very promising: the method attained higher recall and precision rates than all of the contemporary algorithms it was compared against.
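For reference, recall is the fraction of true scene changes that are detected, and precision is the fraction of detections that are true scene changes. The sketch below computes both for lists of boundary frame indices. The greedy one-to-one matching and the `tolerance` parameter (in frames) are assumptions of this example; the matching rule used in the actual evaluation is not stated here.

```python
def recall_precision(detected, ground_truth, tolerance=0):
    """Greedily match each ground-truth boundary to at most one detection
    that lies within `tolerance` frames, then return (recall, precision)."""
    unmatched = sorted(detected)
    correct = 0
    for gt in sorted(ground_truth):
        # First still-unmatched detection close enough to this boundary.
        hit = next((d for d in unmatched if abs(d - gt) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)
            correct += 1
    recall = correct / len(ground_truth) if ground_truth else 0.0
    precision = correct / len(detected) if detected else 0.0
    return recall, precision
```

Each detection can be credited to only one ground-truth boundary, so several detections clustered around a single scene change lower precision instead of inflating recall.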
Relevant Publications
M. Kyperountas, Z. Cernekova, C. Kotropoulos, M. Gavrielides and I. Pitas, "Audio PCA in a novel multimedia scheme for scene change detection", in Proc. of ICASSP 2004, Montreal, May 2004.
M. Kyperountas, Z. Cernekova, C. Kotropoulos, M. Gavrielides and I. Pitas, "Scene change detection using audiovisual clues", in Proc. of Norwegian Conference on Image Processing and Pattern Recognition (NOBIM 2004), Stavanger, Norway, 27-28 May 2004.
M. Kyperountas, C. Kotropoulos and I. Pitas, "Enhanced eigen-audioframes for audiovisual scene change detection", IEEE Transactions on Multimedia, accepted in 2006.
MOUMIR - "Models for Unified Multimedia Information Retrieval", RTN, EC
MUSCLE - "Multimedia Understanding through Semantics, Computation and LEarning" (FP6-507752)
VISNET - European Network of Excellence, funded under the European Commission IST FP6 programme