Visual Speech Recognition using 3D Lip Shape from Stereo Video
- 이연주; 고혜승; 최귀원; 윤인찬
- visual speech recognition; 3D lip shape model; 3D reconstruction
- The 7th Asian Pacific Conference on Biomechanics
- Visual speech recognition has been actively researched in recent years because visual information such as
lip movement can improve the performance of automatic speech recognition in noisy environments.
In addition, visual speech recognition can serve as a communication tool for people with
voice impairments or hearing loss. Most current visual speech recognition methods have focused on
two-dimensional (2D) lip features obtained from a single lip image. This paper presents a novel method for
visual speech recognition using 3D lip shape from stereo video. A calibrated stereo camera was used to obtain 3D
information about lip shape. The proposed 3D lip shape feature is extracted by a model-based method to minimize
the effects of head movements and of the correspondence problem in stereo-based 3D reconstruction. To
build the 3D lip shape model, ground-truth 3D motion marker data were acquired with multiple motion
cameras, and Principal Component Analysis was applied to the aligned 3D motion data. Figure 1 shows the
process of the proposed 3D lip shape feature extraction. Lip feature points (LFPs) were extracted separately from
the left and right images using a point extraction algorithm. From the corresponding LFPs, the 3D lip
shape was reconstructed by triangulation. Finally, the 3D lip shape feature was extracted by fitting the 3D shape
model. For word recognition, a Hidden Markov Model was used. In the experiments, stereo video data
from two subjects were used, and the spoken words were the five consecutive digits 0 to 4, pronounced
in Korean. In the experimental results, the proposed 3D shape feature achieved better word recognition
performance than a 2D shape feature.
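
The reconstruction and model-fitting steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption: linear (DLT) triangulation stands in for the unspecified triangulation method, the PCA model is built with a plain SVD, and fitting is a simple projection onto the PCA basis. The function names (`triangulate`, `build_shape_model`, `fit_shape_model`), the 3x4 projection matrices, and the array shapes are hypothetical and not taken from the paper.

```python
import numpy as np


def triangulate(P_left, P_right, x_left, x_right):
    """Linear (DLT) triangulation of one point seen in both views.

    P_left, P_right: assumed 3x4 projection matrices from stereo calibration.
    x_left, x_right: matching 2D lip feature points (x, y) in each image.
    """
    # Each view contributes two linear constraints on the homogeneous 3D point.
    A = np.stack([
        x_left[0] * P_left[2] - P_left[0],
        x_left[1] * P_left[2] - P_left[1],
        x_right[0] * P_right[2] - P_right[0],
        x_right[1] * P_right[2] - P_right[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def build_shape_model(shapes, n_components):
    """PCA over aligned training shapes of shape (n_samples, n_points, 3)."""
    data = shapes.reshape(len(shapes), -1)   # flatten each shape to a 3N vector
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]           # rows of the basis are orthonormal


def fit_shape_model(mean, basis, shape):
    """Least-squares model coefficients for one reconstructed 3D lip shape."""
    return basis @ (shape.reshape(-1) - mean)
```

In such a setup, the coefficient vector returned by `fit_shape_model` for each video frame would be a natural candidate for the per-frame feature passed to the word recognizer, although the paper may define its feature differently.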
- Appears in Collections:
- KIST Publication > Conference Paper