
 

Facial Biometrics - Face Representation
Automatic Face Verification using Elastic Graph Matching (EGM)
Mobile Multimedia Processing for Emergency Handling
Face Detection - Facial Feature Extraction and Tracking
Face Verification
Compensation for Variable Recording Conditions
Speaker Verification

Facial Biometrics - Face Representation

AIIA Lab has developed a series of techniques for face representation for use in facial biometrics, and in particular in face verification problems, which are the main area of the lab's expertise.


Our Methods

A series of techniques for exploiting the discriminant information in part-based face decompositions has been developed. The idea of these decompositions, which are based on Non-negative Matrix Factorization (NMF), is to represent facial images as a linear combination of basis images that are intuitively related to facial parts. The NMF algorithm, like Principal Component Analysis (PCA), represents a facial image as a linear combination of basis images. The difference from PCA is that NMF does not allow negative elements either in the basis vectors or in the representation weights used in the linear combination of the basis images. This constraint results in bases that are radically different from those of PCA. On the one hand, the bases of PCA are the Eigenfaces, some of which resemble distorted versions of the entire face. On the other hand, the bases of NMF are localized features that correspond better to the intuitive notion of face parts.
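As a minimal sketch of the non-negativity constraint described above, the following uses the widely known multiplicative update rules for the Frobenius-norm NMF objective. It illustrates plain NMF only, not the lab's DNMF or CSDNMF; the function name `nmf`, the toy data and the chosen rank are assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (pixels x images) as W @ H.

    Columns of W play the role of basis images and H holds the
    representation weights. Both stay non-negative because each update
    only multiplies the current values by a non-negative ratio.
    """
    rng = np.random.default_rng(seed)
    n_pixels, n_images = V.shape
    W = rng.random((n_pixels, rank)) + eps
    H = rng.random((rank, n_images)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data standing in for vectorized face images (hypothetical sizes).
V = np.random.default_rng(1).random((64, 20))
W, H = nmf(V, rank=5)
```

In contrast, a PCA basis computed on the same data would generally contain both positive and negative entries, which is why its basis images cancel each other rather than add up like parts.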

Two classes of techniques have been developed:

  • In the first class of techniques, we incorporate discriminant constraints inside the NMF decomposition. We have proposed two such techniques, both motivated by the aim of obtaining a part-based decomposition with enhanced discriminant power. The first method, called Discriminant NMF (DNMF), gives basis images that are the same for all facial classes, while the second, called Class-Specific DNMF (CSDNMF), results in a class-specific decomposition that is unique for each facial (person) class. The intuitive motivation behind the class-specific method is to find for every face a unique decomposition into its own discriminant parts. These approaches are consistent with the image representation paradigms of neuroscience, which involve sparseness, non-negativity constraints, minimization of redundant information and enhanced discriminant power. The various part-based decompositions tested are presented below.

  

Decompositions of the XM2VTS database into basis images that look like facial parts: a) Non-negative Matrix Factorization, b) Local Non-negative Matrix Factorization, c) Discriminant Non-negative Matrix Factorization, d) Class-Specific Discriminant Non-negative Matrix Factorization.

  • The second class of techniques uses the NMF basis images to discover a low-dimensional space and then searches for discriminant projections in this space. This is similar to Fisherfaces, where an initial PCA-based dimensionality reduction step is applied before using LDA in the new space to find discriminant projections. This technique is called NMFfaces. A series of bases for holistic techniques are given below.

Holistic basis images from a) Principal Component Analysis, b) Linear Discriminant Analysis, c) Non-negative Matrix Factorization (NMF) plus Linear Discriminant Analysis (LDA), the so-called NMFfaces.

The experiments were conducted on 'Configuration I' of the XM2VTS database. The tested methods are separated into those that produce local, part-based bases, like NMF, DNMF and CSDNMF, and those whose bases are distorted versions of faces, like Eigenfaces, Fisherfaces and NMFfaces. The best Equal Error Rate (EER) among the part-based decompositions (NMF, LNMF, CSDNMF and DNMF) was 3.0%, achieved by CSDNMF. The best EER among Fisherfaces, Eigenfaces and NMFfaces was 0.8%, achieved by NMFfaces. The advantage of subspace methods like NMFfaces is that they are very fast (less than 1 msec per matching), but they require perfect alignment of the training and test images in order to perform well. The proposed part-based discriminant techniques outperform other part-based techniques like NMF and LNMF, while the proposed NMFfaces outperform the well-known Eigenfaces and Fisherfaces.
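The EER figures quoted above refer to the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how such a value can be approximated from matching scores is given below; the function name, the toy scores and the convention that a higher score means a better match are assumptions for illustration, not the evaluation code used in the experiments.

```python
import numpy as np

def equal_error_rate(client_scores, impostor_scores):
    """Approximate the EER by sweeping the threshold over all observed scores.

    A claim is accepted when its score is >= threshold (assumed convention:
    higher score = more similar).
    """
    clients = np.asarray(client_scores, dtype=float)
    impostors = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([clients, impostors]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        far = np.mean(impostors >= t)  # impostors wrongly accepted
        frr = np.mean(clients < t)     # clients wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return float(eer)

# Toy scores, invented for illustration only.
clients = [0.9, 0.8, 0.7, 0.4]
impostors = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(clients, impostors)  # 0.25 for this toy data
```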

ROC curves for the various tested approaches in the XM2VTS database a) for holistic face representations b) for part-based representations.


Downloads

-


Relevant Publications

S. Zafeiriou, A. Tefas, I. Buciu and I. Pitas, "Exploiting Discriminant Information in Non-negative Matrix Factorization with application to Frontal Face Verification", IEEE Transactions on Neural Networks, vol. 17, no. 3, pp. 683-695, May, 2006.

S. Zafeiriou, A. Tefas, I. Buciu and I. Pitas, "Class-Specific Discriminant Non-negative Matrix Factorization for Frontal Face Verification", in Proc. of Int. Conf. on Advances in Pattern Recognition (ICAPR 2005), Bath, United Kingdom, 22-25 August, 2005.

S. Zafeiriou, A. Tefas and I. Pitas, "Discriminant NMFfaces for Frontal Face Verification", in Proc. of IEEE Int. Workshop on Machine Learning for Signal Processing (MLSP 2005), Mystic, Connecticut, 28-30 September, 2005.


Research Projects

BioSec - Biometrics and Security, IST - IP, FP6

BioSecure - Biometrics for Secure Authentication, NOE, IST-FP6, EC.


© 2006

Automatic Face Verification using Elastic Graph Matching (EGM)

Face verification is the procedure of establishing the veracity of an identity claim using one or more face images. It is a very difficult problem that spans various disciplines such as computer vision, pattern recognition and computational intelligence.

A few major problems concerning face verification are:

  • Aging: As a person's face changes continuously over the years, the data contained in the training database of a system may become useless. Aging is probably the most challenging problem not only for face verification systems, but for any kind of biometric modality.
  • Illumination: Face verification is also difficult under variations in illumination and lighting conditions, mainly because the appearance of a face is dramatically altered by such variations. It is worth noting that the best face verification results reported in the literature come not just from very efficient algorithms but from a combination of a robust algorithm and databases created under controlled environments.
  • Head pose: This is another problem that face verification algorithms usually have to tackle. So far, most face verification techniques have addressed the frontal facial pose.
  • Facial expression, image size and quality, as well as facial hair, cosmetics and accessories (such as eyeglasses), may seriously hamper the process, as they can easily change the characteristic features that an algorithm is trying to distinguish.


Our Methods

AIIA lab has developed a series of techniques for boosting the performance of EGM for face verification.

Discriminant Elastic Graph Matching using morphological features

A series of variants of the Elastic Graph Matching (EGM) algorithms have been developed. In EGM, the reference object graph is created by overlaying a rectangular elastic sparse graph on the object image and calculating a Gabor wavelet bank response or the output of morphological dilation-erosion at each graph node. The graph matching process is implemented by a stochastic optimization of a cost function which takes into account both jet similarities and node deformation. Recently we have proposed a generalized framework that uses discriminant techniques at all the phases of EGM for face verification, the so-called discriminant elastic graph matching (DEGM) algorithm. The general framework of DEGM that was developed in order to enhance the face verification performance of an arbitrary elastic graph matching algorithm can be summarized in the following three steps:

  • Employ discriminant techniques for feature selection at every node.
  • Combine both the node jet similarity measure and the node deformation exploiting discriminant information to form the new local similarity measure at each node.
  • Weigh the new similarity values at the nodes of the elastic graphs by coefficients that derive through a discriminant analysis scheme.
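The matching cost described above, which takes into account both jet similarities and node deformation and can be weighted per node, can be sketched as follows. This is only an illustrative form under stated assumptions: feature vectors are compared with a Euclidean distance (lower is better) rather than a similarity, deformation is measured after removing the global translation of the graph, and the name `egm_cost`, the parameter `lam` and the optional `weights` are hypothetical. The actual DEGM measures and discriminant weights are defined in the publications listed below.

```python
import numpy as np

def egm_cost(ref_jets, test_jets, ref_pos, test_pos, lam=1.0, weights=None):
    """Weighted sum over nodes of a feature distance plus a deformation penalty."""
    ref_jets = np.asarray(ref_jets, dtype=float)
    test_jets = np.asarray(test_jets, dtype=float)
    ref_pos = np.asarray(ref_pos, dtype=float)
    test_pos = np.asarray(test_pos, dtype=float)

    # Distance between the feature vectors (jets) at corresponding nodes.
    jet_dist = np.linalg.norm(ref_jets - test_jets, axis=1)

    # Deformation: node displacement left after removing the mean shift,
    # so a rigid translation of the whole graph is not penalized.
    shift = test_pos - ref_pos
    deform = np.linalg.norm(shift - shift.mean(axis=0), axis=1)

    node_cost = jet_dist + lam * deform
    if weights is None:
        weights = np.ones(len(node_cost))
    return float(np.dot(weights, node_cost))

# A 2x2 toy graph with identical features at every node.
pos = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
jets = np.zeros((4, 3))
translated = pos + np.array([2.0, 3.0])
cost_translated = egm_cost(jets, jets, pos, translated)  # rigid shift costs nothing
```

A stochastic optimizer, as mentioned in the text, would repeatedly perturb the test node positions and keep the perturbations that lower this cost.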

In our DEGM we have used a novel morphological multiscale analysis that is more robust against noise and illumination changes than the typical morphological multiscale analysis. We call the resulting algorithm Discriminant Normalized Morphological Elastic Graph Matching (DNMEGM).

The XM2VTS database was used in our experiments since it is the state-of-the-art database for comparing face verification technologies. The XM2VTS database contains 295 subjects, 4 recording sessions and two shots (repetitions) per recording session. The data is divided into three different sets: the training set, the evaluation set and the test set. The training set is used to create client and impostor models for each person. The evaluation set is used to learn the discriminant weights and the thresholds. To evaluate the DNMEGM algorithm, the 'Configuration I' experimental setup of the XM2VTS database was used, and we achieved an Equal Error Rate (EER) of 1.4% using fully automatic alignment.

Unlike most subspace techniques, which require perfect alignment to perform well, the EGM algorithm can be combined with fully automatic alignment techniques. The drawback of the EGM algorithm is the computational complexity of building the required multiscale analysis and of the matching. For example, at a resolution of 720x576 the algorithm takes 7 seconds for feature extraction and matching. The abbreviations of the tested techniques are shown in Table 1. The error rates according to the XM2VTS protocol are shown in Table 2. Tables 3 and 4 show the performance of the proposed method in comparison with other methods tested in a fully automatic manner. As can be verified, the proposed technique outperforms all the other approaches in Configurations I and II of the XM2VTS database.

Tested approaches

Error Rates according to XM2VTS protocol for Configuration I

 

A comparison of the achieved Total Error Rates for Configuration I in XM2VTS database

 
A comparison of the achieved Total Error Rates for Configuration II in XM2VTS database

 

Discriminant Graph Structure for Face Verification

A novel algorithm for finding discriminant person-specific facial models is proposed and tested for frontal face verification. The most discriminant features of a person's face are found, and a deformable model is placed at the spatial coordinates that correspond to these discriminant features. The discriminant deformable models learned through this procedure for verifying the person's identity are elastic graphs that are dense in the facial areas considered discriminant for a specific person and sparse in other, less significant facial areas. In order to find such graphs, we have introduced a heuristic cost optimization algorithm whose outcome is the graph that optimizes a pre-selected discriminant cost. The cost is formed by calculating the significance of each node using discriminant values like the ones proposed. We assume that nodes with high discriminant values correspond to facial points with high discriminant capability. Then, we try to represent the corresponding neighbourhood better by adding more nodes around the original one. In practice, this means that we expand the nodes that are considered to be discriminant.

Some graphs that have been learned by the proposed algorithm with nodes placed at discriminant facial landmarks of each face

The proposed graphs have been applied to face verification on the XM2VTS database, and the results were very close to those obtained with DNMEGM (i.e., a TER of 2.814%), without using all the discriminant steps of DNMEGM.


Downloads

-


Relevant Publications

C. Kotropoulos, A. Tefas and I. Pitas, "Morphological elastic graph matching applied to frontal face authentication under well-controlled and real conditions", Pattern Recognition, vol. 33, no. 12, pp. 1935-1947, October, 2000.

C. Kotropoulos, A. Tefas and I. Pitas, "Frontal face authentication using discriminating grids with morphological feature vectors", IEEE Transactions on Multimedia, vol. 2, no. 1, pp. 14-26, March, 2000.

C. Kotropoulos, A. Tefas and I. Pitas, "Frontal face authentication using morphological elastic graph matching", IEEE Transactions on Image Processing, vol. 9, no. 4, pp. 555-560, April, 2000.

A. Tefas, C. Kotropoulos and I. Pitas, "Using Support vector machines to enhance the performance of elastic graph matching for frontal face authentication", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 7, pp. 735-746, July, 2001.

A. Tefas, C. Kotropoulos and I. Pitas, "Face verification using elastic graph matching based on morphological signal decomposition", Signal Processing, vol. 82, no. 6, pp. 833-851, June, 2002.

S. Zafeiriou, A. Tefas and I. Pitas, "Elastic Graph Matching versus Linear Subspace Methods for Frontal Face Verification", in Proc. of IEEE Int. Workshop on Nonlinear Signal and Image Processing (NSIP 2005), Sapporo, Japan, 18-20 May, 2005.


Research Projects

BioSec - Biometrics and Security, IST - IP, FP6

BioSecure - Biometrics for Secure Authentication, NOE, IST-FP6, EC.


Mobile Multimedia Processing for Emergency Handling

Thermal video and image processing is a fundamental component of an advanced mobile service that will provide critical multimodal communication support for emergency teams during rescue operations. After the introduction of the new service, rescue operations will benefit enormously from sophisticated multimodal interaction and on-line, on-site access to data services providing up-to-date operation status information, as well as details concerning aspects of the emergency, such as location and environment.

Thermal camera recordings are reliable and accurate in showing the exact temperature values in the scene. As these sensors have only recently started to spread, little image processing analysis has been performed on the data they provide.


Our Method

Our Tools
The thermal video processing component constructed by the AIIA laboratory runs as a standalone executable in the application server cluster. It listens on a localhost port to receive the XML message package sent by the remote client. The package is saved as a local XML file and parsed by the service to extract the data server location and file name, as well as the list of processing tasks together with the regions of interest on which they are to be performed. The tasks offered by the application are presented below, along with illustrative image processing outcomes. A proper metadata description of the analysis outcome is sent back to the main server.
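The parsing step described above can be sketched as follows. The actual message schema is not given here, so the element and attribute names (`request`, `server`, `file`, `task`, `roi`), the sample values and the function name `parse_request` are all hypothetical; only the kind of information extracted (server location, file name, tasks and their regions of interest) follows the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical message layout; the real schema is not specified in the text.
MESSAGE = """\
<request>
  <server>data.example.org</server>
  <file>thermal_clip.avi</file>
  <tasks>
    <task name="hotspot_detection">
      <roi x="10" y="20" width="100" height="80"/>
    </task>
    <task name="fire_detection">
      <roi x="0" y="0" width="320" height="240"/>
    </task>
  </tasks>
</request>
"""

def parse_request(xml_text):
    """Extract server, file name and the task list with regions of interest."""
    root = ET.fromstring(xml_text)
    tasks = []
    for task in root.iter("task"):
        # Each region of interest becomes a dict of integer coordinates.
        rois = [{key: int(value) for key, value in roi.attrib.items()}
                for roi in task.findall("roi")]
        tasks.append({"name": task.get("name"), "rois": rois})
    return {
        "server": root.findtext("server"),
        "file": root.findtext("file"),
        "tasks": tasks,
    }

request = parse_request(MESSAGE)
```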

 

Colorization Tool

Hotspot detection tool

Fire detection tool

Boundary detection tool

Boundary object recognition tool

Boundary tracking tool

Automatic object detection


Our Research

  • Fire detection with Fourier analysis
  • Content description of thermal videos
  • Supervised object extraction by snakes
    • Tracing quadtrees for better concavity performance
    • Content adaptive heterogeneous snakes
  • Automatic object detection and recognition
    • Creating human pose database
    • Hierarchical clustering for template databases
    • Textual description of database for faster matching
    • Divide and conquer strategy for affine distortions
    • Object simplification for faster matching


Downloads

-


Relevant Publications

A. Hajdu, A. Roubies and I. Pitas, "Optimised chamfer matching for snake-based image contour representations", in IEEE International Conference on Multimedia & Expo (ICME2006), Toronto, Canada, 2006.

A. Roubies, A. Hajdu and I. Pitas, "Improving Concavity Performance of Snake Algorithms", in Proc. of Int. Symposium Communications, Control and Signal Processing (ISCCSP 2006), Marrakech, Morocco, 13-15 March, 2006.


Research Projects

SHARE - “Mobile Support for Rescue Forces, Integrating Multiple Modes of Interaction” (FP6- 004218)


Face Detection - Facial Feature Extraction and Tracking

Mosaic Images

Abstract face model at the resolution level of the quartet image

Horizontal and vertical image profiles

 

Outcome of face detection

Two frontal face images from M2VTS database. Their quartet and octet images are shown in the second and third column, respectively. The outcome of face detection algorithm is depicted in the last column. Key facial features are detected inside the overlaid rectangles.

 

Downloads

-


Relevant Publications

K. Sobottka and I. Pitas, "A Fully Automatic Approach to Facial Feature Detection and Tracking", in Proc. of 1st Int. Conf. on Audio- and Video-based Biometric Person Authentication (AVBPA'97), pp. 77-84, Crans-Montana, Switzerland, 12-14 March, 1997.

K. Sobottka and I. Pitas, "A novel method for automatic face segmentation, facial feature extraction and tracking", Image Communication, Elsevier, vol. 12, no. 3, pp. 263-281, June, 1998.


Research Projects

M2VTS - "Multi-modal Verification Techniques for Teleservices and Security Applications"


Face Verification

Our approaches

MDLA

Dilations and Erosions
Dilated and eroded images for scales 1-9 with a scaled hemisphere

Grid matching procedure

 
The model grids are shown overlaid on the images in the first column. The deformed grids obtained when the test person is a client (i.e., the correct person) or an impostor (i.e., a person pretending to be somebody else) are shown in the second and third columns.

Discriminatory power coefficients for the grid nodes in MDLA

 
 
 

The brighter a node is, the greater the discriminatory power it possesses. The intensity of the nodes is normalized for visualization purposes.

 

MSD-MDLA

Image Analysis


The stages of shape decomposition used to extract the feature vectors for the nodes of the grid

 

Grid matching procedure

 
The model grids are shown overlaid on the images in the first column. The deformed grids obtained when the test person is a client (i.e., the correct person) or an impostor (i.e., a person pretending to be somebody else) are shown in the second and third columns.

Discriminatory power coefficients for the grid nodes in MDLA

 
 
 

The brighter a node is, the greater the discriminatory power it possesses. The intensity of the nodes is normalized for visualization purposes.

Morphological Pyramids

Image Analysis


Three morphological pyramids used to extract the feature vectors for each node of the grid

Grid matching procedure

The grid matching when the test person is the same as the reference one
 
The grid matching when the test person is different from the reference one

 

Downloads

-


Relevant Publications

 


Research Projects

M2VTS - "Multi-modal Verification Techniques for Teleservices and Security Applications"


Compensation for Variable Recording Conditions

Sample images from the MATRA-NORTEL database

They have been recorded under variable lighting conditions. The faces appear in different sizes and at different positions. The subjects' facial expressions are not neutral.

 
 
 
Images before normalization
 
 
 
Images after normalization

 

 

Downloads

-


Relevant Publications

-


Research Projects

-


Robust Feature Coefficients with Applications to Automatic Speaker Recognition

Presentation of feature extraction techniques

The selection of the best parameter representation of acoustic data is an important task in designing a speaker recognition
system. The usual objectives in selecting a representation of acoustic data are to compress the speech data by eliminating information not
pertinent to the phonetic analysis and to enhance those aspects of the signal that contribute significantly to the detection of the
phonetic differences.

In these experiments we use two different kinds of signal representation: cepstral and mel-frequency cepstral coefficients. The experiments
were conducted on the M2VTS database, which is made up of 37 adult male and female speakers and provides 5 shots for each
person. During each shot, speakers were asked to count from '0' to '9' in their native language (most of them are French-speaking).

This waveform represents an example of a speech signal versus time for shot 01 of speaker BS. The speech sequence is split into 10 voiced segments of variable duration.

The sequence of processing steps used to extract the cepstral coefficients is shown in the block diagram.

The sequence of processing in order to extract the mel-frequency cepstral features includes the following steps:

  • Window the data with a Hamming window
  • Take the Fast Fourier Transform
  • Find the magnitude of the FFT
  • Convert the FFT data into filterbank outputs
  • Calculate the decimal logarithm
  • Find the Discrete Cosine Transform
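The steps above can be sketched for a single frame as follows. The numeric defaults are taken from the parameter table in this section (12000 Hz processing frequency, 512 FFT bins, 40 triangular filters, 12 coefficients). The mel-scale formula, the filter layout, the small constant added before the logarithm, and the choice to keep coefficients 1 to 12 while dropping the zeroth are assumptions, since these details are not specified here.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (assumed; others exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge
            bank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            bank[i - 1, k] = (right - k) / (right - center)
    return bank

def mfcc(frame, sample_rate=12000, n_fft=512, n_filters=40, n_coeffs=12):
    windowed = frame * np.hamming(len(frame))            # Hamming window
    spectrum = np.fft.rfft(windowed, n=n_fft)            # FFT (zero-padded)
    magnitude = np.abs(spectrum)                         # magnitude
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ magnitude
    log_energies = np.log10(energies + 1e-10)            # decimal logarithm
    # DCT-II written out explicitly; coefficient 0 is dropped (assumption).
    n = np.arange(n_filters)
    k = np.arange(1, n_coeffs + 1)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return basis @ log_energies

# A 30 msec frame at 12000 Hz holds 360 samples; a 440 Hz tone is used as toy input.
t = np.arange(360) / 12000.0
coeffs = mfcc(np.sin(2 * np.pi * 440.0 * t))
```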

Performance Results

In order to estimate the performance of the feature extraction techniques, the experimental parameters shown in the table below were used:

Parameter                             Value
Sampling Frequency                    48000 Hz
Processing Frequency                  12000 Hz
Frame Length                          30 msec
Overlap                               20 msec
Number of FFT bins                    512
Number of Cepstral Coefficients       12
Number of Mel-Cepstral Coefficients   12
Number of triangular filters          40
Size of Codebooks                     12
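To make the framing parameters concrete, the short sketch below converts them to sample counts. It assumes that the processing frequency is the rate at which frames are analyzed after downsampling the 48000 Hz recordings; the helper `split_frames` and the one-second test signal are illustrative only.

```python
SAMPLING_HZ = 48000
PROCESSING_HZ = 12000   # assumed to be the rate after downsampling
FRAME_MS = 30
OVERLAP_MS = 20

decimation = SAMPLING_HZ // PROCESSING_HZ      # 4
frame_len = PROCESSING_HZ * FRAME_MS // 1000   # 360 samples per frame
overlap = PROCESSING_HZ * OVERLAP_MS // 1000   # 240 samples shared by neighbours
hop = frame_len - overlap                      # 120 samples, i.e. a 10 msec step

def split_frames(signal, frame_len, hop):
    """Cut a signal into overlapping frames, dropping any incomplete tail."""
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [signal[i * hop: i * hop + frame_len] for i in range(n_frames)]

one_second = [0.0] * PROCESSING_HZ
frames = split_frames(one_second, frame_len, hop)  # 98 frames of 360 samples
```

Since each frame of 360 samples is shorter than the 512 FFT bins, the frames would be zero-padded before the transform under this reading.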

During the experiments, two different kinds of classification error rates were measured. The first kind refers to the identification error rate in a closed set of speakers. The table below shows the identification error percentage for 8 different shot combinations.

Recognition Error (%)
Training Shots   Testing Shot   Cepstral Coefficients   Mel-Cepstral Coefficients
1,2,3            4              13.513514               8.108108
1,2,3            5              5.405405                5.405405
1,3,4            2              0.000000                5.405405
1,3,4            5              5.405405                2.702703
2,3,4            1              5.405405                5.405405
2,3,4            5              8.108108                2.702703
1,2,4            3              0.000000                2.702703
1,2,4            5              8.108108                2.702703
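The percentages in this table are all multiples of 1/37, which matches the 37 speakers of the M2VTS database if each speaker contributes one test utterance per shot combination (an assumption; the number of trials is not stated explicitly). Under that assumption, the rates can be converted back to numbers of misidentified speakers, as in the sketch below; the function name is illustrative.

```python
N_SPEAKERS = 37  # number of M2VTS speakers stated in the text

def errors_from_rate(rate_percent, n_trials=N_SPEAKERS):
    """Convert an error percentage to a number of misidentified speakers."""
    return round(rate_percent * n_trials / 100.0)

# Rows of the table, in the same order.
cepstral = [13.513514, 5.405405, 0.0, 5.405405, 5.405405, 8.108108, 0.0, 8.108108]
mel_cepstral = [8.108108, 5.405405, 5.405405, 2.702703, 5.405405, 2.702703, 2.702703, 2.702703]

cepstral_errors = [errors_from_rate(r) for r in cepstral]          # [5, 2, 0, 2, 2, 3, 0, 3]
mel_cepstral_errors = [errors_from_rate(r) for r in mel_cepstral]  # [3, 2, 2, 1, 2, 1, 1, 1]
```

Read this way, the closed-set results differ by only a few speakers per shot combination, so small differences between the two representations should be interpreted with care.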

The second kind of recognition error refers to the False Acceptance (FA) and False Rejection (FR) rates in an open set of speakers, based on the Brussels protocol training and testing procedures. For both cepstral and mel-cepstral parameters, the Receiver Operating Characteristic (ROC) curves are plotted in Figures I and II.

 


Observing both performance results in the Brussels protocol case leads to the following concluding remarks:

  • The cepstral coefficients perform better under normal conditions. In fact, if the objective is to minimize the false acceptance and the false rejection rates equally, the cepstral representation proves more efficient (it provides a smaller Equal Error Rate, EER).
  • The mel-frequency cepstral coefficients maintain higher verification performance under especially demanding conditions, where an extra level of security must be provided. In particular, in access control applications, where the false acceptance rate should be minimized (even if the false rejection rate increases), the mel-cepstral coefficients perform better.
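The access control setting in the second remark corresponds to fixing a maximum tolerated false acceptance rate and reading off the resulting false rejection rate. A minimal sketch of that operating-point selection is given below; the function name, the toy scores and the convention that a claim is accepted when its score reaches the threshold are assumptions for illustration.

```python
import numpy as np

def frr_at_far(client_scores, impostor_scores, max_far):
    """Return (threshold, far, frr) for the lowest threshold with FAR <= max_far.

    Raising the threshold can only lower the FAR and raise the FRR, so the
    lowest admissible threshold gives the smallest FRR for the given limit.
    """
    clients = np.asarray(client_scores, dtype=float)
    impostors = np.asarray(impostor_scores, dtype=float)
    # Appending infinity guarantees a threshold that rejects every impostor.
    candidates = np.sort(np.concatenate([clients, impostors, [np.inf]]))
    for t in candidates:
        far = float(np.mean(impostors >= t))
        if far <= max_far:
            frr = float(np.mean(clients < t))
            return float(t), far, frr

# Toy scores, invented for illustration only.
clients = [0.9, 0.8, 0.7, 0.4]
impostors = [0.1, 0.2, 0.3, 0.6]
strict = frr_at_far(clients, impostors, max_far=0.0)     # (0.7, 0.0, 0.25)
relaxed = frr_at_far(clients, impostors, max_far=0.25)   # (0.4, 0.25, 0.0)
```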

Downloads

-


Relevant Publications

K. Sobottka and I. Pitas, "A Fully Automatic Approach to Facial Feature Detection and Tracking", in Proc. of 1st Int. Conf. on Audio- and Video-based Biometric Person Authentication (AVBPA'97), pp. 77-84, Crans-Montana, Switzerland, 12-14 March, 1997.

K. Sobottka and I. Pitas, "A novel method for automatic face segmentation, facial feature extraction and tracking", Image Communication, Elsevier, vol. 12, no. 3, pp. 263-281, June, 1998.


Research Projects

M2VTS - "Multi-modal Verification Techniques for Teleservices and Security Applications"
