Anthropocentric Video Content Description

MPEG-7 has emerged as the standard for multimedia content description. Because the standard is still at an early stage, it is evolving toward support for semantic content description. Although many of the descriptors (Ds) and description schemes (DSs) that MPEG-7 provides can help express the semantics of a medium, grouping several MPEG-7 classes together can give better results in video production and video analysis tasks.


Our Method

We propose a set of classes that extend the MPEG-7 standard so that it can handle video data in a more uniform way, and we show that schemes of this kind can provide more flexible tools.

With these new descriptors we achieve:

  • An anthropocentric perspective for movies.
  • Descriptors and Description Schemes that handle low level information more effectively and thus provide semantic entities.
  • A representation of the relations between objects within a movie, which are very informative for high level information extraction.
  • Information like “This actor is in this shot and smiling” can be ingested into the proposed profile, and post-processing of this information can extract semantics for the shot.
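
As a rough illustration of the last point, the sketch below stores observations such as “this actor is in this shot and smiling” and derives a shot-level summary in a post-processing step. The names `ActorObservation`, `Shot` and `summarize_shot`, the expression labels and the majority-vote rule are assumptions made for this example only; they are not part of the proposed profile or of MPEG-7.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ActorObservation:
    """One low level fact: an actor is visible in a shot with an expression."""
    actor: str
    expression: str  # hypothetical label set, e.g. "smiling" or "neutral"


@dataclass
class Shot:
    """Container for the observations gathered for a single shot."""
    shot_id: int
    observations: list[ActorObservation] = field(default_factory=list)


def summarize_shot(shot: Shot) -> dict:
    """Post-process the stored observations into a simple semantic summary.

    The dominant expression is picked by majority vote, which is only one
    possible (assumed) rule for turning low level facts into a label.
    """
    actors = sorted({obs.actor for obs in shot.observations})
    counts = Counter(obs.expression for obs in shot.observations)
    dominant = counts.most_common(1)[0][0] if counts else None
    return {
        "shot_id": shot.shot_id,
        "actors": actors,
        "dominant_expression": dominant,
    }


# Usage: ingest three observations, then extract the shot-level summary.
shot = Shot(shot_id=12)
shot.observations.append(ActorObservation("Actor A", "smiling"))
shot.observations.append(ActorObservation("Actor B", "smiling"))
shot.observations.append(ActorObservation("Actor B", "neutral"))
summary = summarize_shot(shot)
```

Keeping the raw observations separate from the summary means that a different post-processing rule can be applied later without ingesting the data again.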

Main Characteristics:

  • Descriptors (Ds) and Description Schemes (DSs) that gather low level information within their tags.
  • Tags selected to suit research areas such as face detection, object tracking, motion detection and facial expression extraction.
  • An organization of this low level information that produces meaningful objects, from which high level information can be extracted.
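
One way to picture how organizing low level information yields meaningful objects is to merge per-frame detector output into appearance intervals. The following is a minimal sketch under stated assumptions: the `Appearance` record, the `group_appearances` function and the `max_gap` parameter are hypothetical, and the detector output is assumed to be a list of frame numbers for each actor.

```python
from dataclasses import dataclass


@dataclass
class Appearance:
    """A run of frames in which one actor was detected (hypothetical record)."""
    actor: str
    first_frame: int
    last_frame: int


def group_appearances(detections: dict[str, list[int]], max_gap: int = 1) -> list[Appearance]:
    """Merge per-frame detections into appearance intervals.

    `detections` maps an actor name to the frame numbers where a detector
    reported that actor. Frames that are at most `max_gap` apart are merged
    into the same interval.
    """
    appearances = []
    for actor, frames in detections.items():
        run_start = previous = None
        for frame in sorted(set(frames)):
            if previous is not None and frame - previous > max_gap:
                # The gap is too large: close the current run, start a new one.
                appearances.append(Appearance(actor, run_start, previous))
                run_start = frame
            if run_start is None:
                run_start = frame
            previous = frame
        if previous is not None:
            appearances.append(Appearance(actor, run_start, previous))
    return appearances


# Usage: two separate runs of frames give two appearance intervals.
result = group_appearances({"Actor A": [10, 11, 12, 20, 21]})
```

Raising `max_gap` tolerates frames that the detector missed, at the cost of merging appearances that are really distinct.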

The Descriptors and Description Schemes

Class Name                    Class Type

Movie Class                   Container Class
Version Class                 Container Class
Scene Class                   Container Class
Shot Class                    Container Class
Take Class                    Container Class
Frame Class                   Object Class
Sound Class                   Container Class
Actor Class                   Object Class
Object Appearance Class       Event Class
High Order Semantic Class     Container Class
Camera Class                  Object Class
Camera Use Class              Event Class
Lens Class                    Object Class
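
The class listing above can also be written down as a small lookup structure, which makes it easy to query the classes by kind. This is only an illustrative sketch: the `ClassType` enum, the `PROFILE_CLASSES` mapping and the `classes_of_type` helper are names assumed for this example, and the pairing of each class with its kind is transcribed from the listing.

```python
from enum import Enum


class ClassType(Enum):
    """The three kinds of class that appear in the listing."""
    CONTAINER = "Container Class"
    OBJECT = "Object Class"
    EVENT = "Event Class"


# Class name -> kind, transcribed from the listing above.
PROFILE_CLASSES = {
    "Movie": ClassType.CONTAINER,
    "Version": ClassType.CONTAINER,
    "Scene": ClassType.CONTAINER,
    "Shot": ClassType.CONTAINER,
    "Take": ClassType.CONTAINER,
    "Frame": ClassType.OBJECT,
    "Sound": ClassType.CONTAINER,
    "Actor": ClassType.OBJECT,
    "Object Appearance": ClassType.EVENT,
    "High Order Semantic": ClassType.CONTAINER,
    "Camera": ClassType.OBJECT,
    "Camera Use": ClassType.EVENT,
    "Lens": ClassType.OBJECT,
}


def classes_of_type(class_type: ClassType) -> list[str]:
    """Return the class names of one kind, in listing order."""
    return [name for name, kind in PROFILE_CLASSES.items() if kind is class_type]


# Usage: list the classes that describe events.
event_classes = classes_of_type(ClassType.EVENT)
```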

Anthropocentric View

  • Simple annotation can provide information about everything, but it cannot be generated automatically.
  • Like all manual annotation, it is therefore subjective and labor-intensive.
  • The proposed profile aims to support the combination of computed low level features into higher level semantic entities.





Relevant Publications

N. Vretos, V. Solachidis and I. Pitas, "An Anthropocentric Description Scheme For Movies Content Classification And Indexing", in Proc. of European Signal Processing Conf. (EUSIPCO 2005), Antalya, Turkey, 4-8 September, 2005.

N. Vretos, V. Solachidis and I. Pitas, "An MPEG-7 Based Description Scheme For Video Analysis using Anthropocentric Video Content Descriptors", in Lecture Notes in Computer Science, Advances in Informatics: 10th Panhellenic Conf. on Informatics, PCI 2005, vol. 3746/2005, pp. 725-734, Volos, Greece, 11-13 November, 2005.


Research Projects

NM2 - “New media for a new millennium” (IST-004124), FP6S


© 2006