Monday, June 1, 2009

DSP Projects Continued...

Face Detection

Description
In today's world, biometric identification has many applications. Though many biometric methods such as fingerprint recognition and iris recognition are in use, face recognition is probably the easiest and most user-friendly method.

To recognise a face, we first need to identify the part of the image in which the face is located. Simply put, Face Detection means to "identify and locate human faces in an image regardless of their position, scale, in-plane rotation, orientation, pose (out-of-plane rotation) and illumination."

There are various methods to detect faces in a given image. These are broadly classified as:

Knowledge Based Methods
A face is represented using human-coded rules. For example, a face in frontal pose is always supposed to have two eyes and a nose, so we could look for two dark spots separated by a small distance to decide whether the image contains a face or not.

Feature Invariant Methods
Here a face is represented by features such as edges, intensity, shape, texture and colour, which are expected to stay the same even when the pose or the lighting changes.

Template Matching Methods
In this case, a given image is matched to a template face image to locate the face in the image. The matching technique could be as simple as a correlation between the two.
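
To make the correlation idea concrete, here is a minimal sketch in plain Python. It slides a template over a grayscale image stored as a list of lists and keeps the position with the highest normalized correlation. The function names and the tiny test image are made up for illustration; a real project would use a proper face template and an image library.

```python
import math

def normalized_correlation(patch, template):
    """Normalized cross-correlation of two equal-sized 2-D lists (range -1 to 1)."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat patch (or flat template) has zero variance; treat it as "no match".
    return num / den if den else 0.0

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))  # any real score is >= -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_correlation(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Hypothetical toy data: the 2x2 template is hidden at row 1, column 2.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, position = match_template(image, template)
print(score, position)
```

Normalizing by the mean and variance makes the score less sensitive to overall brightness. Note that this sketch handles neither scale nor rotation, which is one reason plain template matching struggles on real photographs.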

Appearance Based Methods
These involve training a classifier on labelled images, some containing faces and some not, so that when a test image is presented the classifier can decide whether it is a face or not.

There are many algorithms for face detection, the most popular being the Viola-Jones Face Detection algorithm.
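
One building block of the Viola-Jones detector is the integral image, which lets the sum of any rectangle of pixels be computed with four lookups. Below is a minimal sketch of that one idea in plain Python, together with a simple two-rectangle (Haar-like) feature. It is not the full detector (there is no boosting or cascade here), and the function names are my own choice for illustration.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[r][c] is the sum of img[:r][:c]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: sum of the left half minus sum of the right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 5 + 6 + 8 + 9 = 28
print(two_rect_feature(ii, 0, 0, 3, 2))  # (1 + 4 + 7) - (2 + 5 + 8) = -3
```

Because every rectangle sum costs the same four lookups regardless of its size, a detector can evaluate a very large number of such features per image window.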

References

Face Detection homepage
This site contains information on Face Detection, such as publications, datasets, and publicly available source code and software.
A really good starting point for your project on Face Detection.



Tuesday, May 5, 2009

Digital Signal Processing (DSP) Projects

Automatic Speech Recognition (ASR)


Description

Automatic Speech Recognition, as the name suggests, is the use of artificial (non-human) intelligence, such as a computer, to recognise our speech. A project in this field would mainly involve two different things:

  • the study of how we as humans produce and recognise speech, and of the characteristics of a speech signal that can be exploited to build an artificially intelligent system;

  • the study of how, given a speech signal, an artificially intelligent system (again say a computer) would be able to recognise it.

The first point mainly deals with signal processing, whereas the second deals with pattern recognition and machine learning.

The above-mentioned approach is just one of the many approaches currently being pursued. I felt this approach is simpler and easier to start off with. From a project point of view, a speech recognition system (in fact, any pattern recognition system) needs some amount of training before we can start testing. The training and testing data together constitute the database you are working with, and this database can be a custom, application-specific database or one that is publicly available.

There are various kinds of speech recognition systems for different applications. Here I am listing the two major kinds: Keyword (Isolated Word) Recognition and Continuous Speech Recognition.

For more details, I have listed some references available on the web, along with a brief write-up for each.

References

Automatic Speech Recognition - A Brief History of the Technology Development; B.H. Juang and Lawrence R. Rabiner
An introduction to speech recognition and its applications.

MIT Open CourseWare - Automatic Speech Recognition
A course on ASR from MIT. All the lecture slides are available for download.

Speech Recognition on Wikipedia
The usual Wikipedia reference.

SpeechLinks - Speech Recognition; Speech Technology Hyperlinks Page from CMU
A large collection of links to speech recognition related material.

Keyword (Isolated Word) Recognition

Description

Here again, as the name suggests, we are looking to recognise just a single spoken (isolated) word at a time. The problem is a lot simpler in this case, as we know that the user has spoken a single isolated word and only that word has to be recognised.

Once the speech signal is recorded, we have to endpoint it suitably, that is, find where the word starts and ends. Next comes feature extraction (Mel-Frequency Cepstral Coefficients (MFCCs) are the most commonly used features). Finally, the test features are compared with the training features using machine learning techniques (Dynamic Time Warping (DTW) is one such technique which is easy to implement). The references listed below will be useful in learning more about feature extraction and machine learning techniques.
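
To show how small the comparison step can be, here is a minimal sketch of DTW in plain Python. It uses the basic unconstrained recursion (no slope constraints or weighting, which the Sakoe and Chiba paper listed below discusses), and the one-dimensional "features" are made-up numbers standing in for real MFCC frames. The function names are my own.

```python
import math

def euclidean(u, v):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_distance(test, ref):
    """Accumulated cost of the best time alignment between two sequences."""
    n, m = len(test), len(ref)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(test[i - 1], ref[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],      # test frame repeated
                                 cost[i][j - 1],      # reference frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognise(test, templates):
    """Return the word whose training template is closest to the test sequence."""
    return min(templates, key=lambda word: dtw_distance(test, templates[word]))

# Hypothetical training templates: one feature sequence per word.
templates = {
    "yes": [[0.0], [1.0], [2.0]],
    "no": [[5.0], [5.0], [5.0]],
}
# The same "yes" pattern, spoken more slowly (frames repeated).
test = [[0.0], [0.0], [1.0], [2.0], [2.0]]
print(recognise(test, templates))  # yes
```

The slow utterance still matches "yes" with zero cost because DTW is allowed to stretch the time axis, which is exactly what makes it useful for words spoken at different speeds.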

References

Speech Technology: A Practical Introduction; Topic: Spectrogram, Cepstrum and Mel-Frequency Analysis; Kishore Prahallad
A really nice presentation on some important topics like spectrograms and MFCCs.

PLP and RASTA (and MFCC, and inversion) in Matlab using melfcc.m and invmelfcc.m
MATLAB code for extracting spectrograms and MFCCs from a speech signal.

VOICEBOX: Speech Processing Toolbox for MATLAB
MATLAB code for various speech processing applications, including MFCC extraction.


H. Sakoe and S. Chiba, "Dynamic Programming Algorithm Optimization for Spoken Word Recognition", IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-26, No. 1, pp. 43-49, Feb. 1978.
This paper explains the concept of Dynamic Time Warping (DTW) for isolated word recognition.


L.R. Rabiner and M.R. Sambur, "An algorithm for determining the endpoints of isolated utterances," Bell Syst. Tech. J., vol.54, pp.297-315, Feb.1975.
A simple algorithm for endpoint detection using energy and zero crossing rate (ZCR) as features.
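
As a rough illustration of the two features mentioned in the last reference, here is a minimal sketch in plain Python that computes short-time energy and zero crossing rate per frame and then marks the word boundaries with a single energy threshold. This is a simplification and not the algorithm from the Rabiner and Sambur paper; the frame size, hop and threshold values are arbitrary assumptions.

```python
def frames(signal, size, hop):
    """Split a list of samples into (possibly overlapping) frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def find_endpoints(signal, size=4, hop=4, energy_threshold=1.0):
    """Return (start, end) sample indices of the high-energy region, or None."""
    active = [i for i, f in enumerate(frames(signal, size, hop))
              if short_time_energy(f) > energy_threshold]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + size

# Hypothetical signal: silence, a burst of "speech", then silence again.
signal = [0.0] * 8 + [1.0, -1.0, 1.0, -1.0] * 2 + [0.0] * 8
print(find_endpoints(signal))                        # (8, 16)
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))    # 1.0
```

Energy alone tends to miss weak sounds at the start or end of a word, which is where the zero crossing rate becomes useful as a second feature.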








Saturday, May 2, 2009

And so it begins...

Having just finished my B.Tech Course, I wanted to do something that would be of help to engineering students like me. Hence I have decided to write this blog.


As a first step, I am putting up a list of ideas/topics for interesting projects for Engineering Students studying Electronics and Communication, Telecommunication, Computer Science and Information Science.


I have been personally involved in some of these projects while others are just what I have noticed or felt interested in during the four years of my engineering. While some ideas are very broad, some are specific enough to be a project title themselves.


Although a lot of websites are available which list out engineering projects, I feel this blog will be different in the sense that it will give you a perspective on which projects are tough/easy, small/big, research/implementations etc. and will help you develop your own ideas by helping you find resources available on the web.


In the future, I am hoping to put in details (like description, resources available on the web, reference papers etc.) about each project so that it is really helpful for engineering students searching for project ideas.


I am listing the ideas by broad subject-based categories:


Digital Signal Processing (DSP)

Automatic Speech Recognition

Keyword Recognition

Continuous Speech Recognition

Emotion Detection in Speech

Speaker Identification

Speech-Music Separation

Automatic Music Transcription

Music Synthesis

Image Compression using DWT and EZW

Voice Morphing

Speech Synthesis

Dual-Tone Multifrequency Signalling (DTMF)

Face Detection


Pattern Recognition and Artificial Intelligence

Face Recognition

Object Recognition

A Simple Optical Character Recognition Framework using Artificial Neural Networks (ANNs)


Robotics

Line Following Robot (Grid-Solver)

Wall Following Robot (Maze-Solver)

Voice Controlled Robot


Communication

Software Defined Radio based on QPSK modulation


Logic Synthesis

Creating a Binary Decision Diagram (BDD) Toolkit


Digital System Design

Design and Implementation of a PID Controller on FPGA

Implementation of the Viterbi Decoding Algorithm


Analog Design

Design of a Musical Fountain Controller


Microcontroller Based Implementations

MP3 Player using Microcontroller


VLSI Testing

Design of an ATPG (Automatic Test Pattern Generator) for digital circuits


PS: Do put in your valuable comments and if you have something you would like to share, you are always welcome!