INDEX. List of Figures...XII List of Tables...XV 1. INTRODUCTION TO RECOGNITION OF FOR TEXT TO SPEECH CONVERSION



INDEX

List of Figures
List of Tables

1. INTRODUCTION TO RECOGNITION OF FOR TEXT TO SPEECH CONVERSION
  1.1 Introduction
  Statement of the problem
  Objective of the study
  Rationale of the study
  Scope of the study
  Limitations of the study
  Literature review for the study
    A glance over related literature
    Some empirical study
    Some generic TTS frameworks
      MBROLA SYNTHESIZER
      FESTIVAL
      FLITE
    Speech Synthesis Markup Languages: An Overview
      Spoken Text Markup Language (STML)
      Java Speech API Markup Language (JSML)
      SABLE
      W3C Speech Synthesis Markup Language (SSML)
      Apple Speech Synthesis Manager
      Microsoft Speech API (SAPI)
      Microsoft Speech Application Software Development Kit (SASDK)
      VoiceXML (VXML)
      SML in VHML
    Short summary of some speech engines
  Linguistic studies in India
    C-DAC BANGALORE MATRUBHASHA API
    IIT MUMBAI VANI FRAMEWORK
    HP LABS HINDI TTS
    IIT KHARAGPUR - SHRUTI TTS
    SIMPUTER TRUST DHVANI TTS
    OTHER INSTITUTIONS
      IIT MADRAS
      IIIT HYDERABAD
      HYDERABAD CENTRAL UNIVERSITY (HCU) VAANI
      IISC BANGALORE - THIRUKKURAL & VAACHAKA
      UTKAL UNIVERSITY, ORISSA
      TATA INSTITUTE OF FUNDAMENTAL RESEARCH (TIFR), MUMBAI
      C-DAC, NOIDA
      COLLEGE OF ENGINEERING, GUINDY, CHENNAI
  Salient features of the present study
  Glossary of terms
  Organization of the Thesis
  REFERENCES

2. TEXT TO SPEECH CONVERSION TECHNOLOGY
  2.1 Introduction
  Text to speech conversion - Basic methodology
    Naturalness
    Intelligibility
  Issues and approaches in text-to-speech synthesis
  Natural Language Processing (NLP) Module
    Text Analysis
    Text Normalization
    Phonetic Analysis
    Prosodic Analysis
    Meaning of Prosody
    Types of prosodic structures
    Rule based prediction
    Data-driven or stochastic methods
    ARCHITECTURE FOR PROSODY GENERATION
  Digital Signal Processing (DSP) module
    Human Speech Production Mechanism
  Types of modern synthesis Technologies
    Articulatory Synthesis
    Formant Synthesis
      Formant Synthesis methodology
      Challenges in Formant Synthesis
    Concatenative synthesis Approach
      Unit selection synthesis
      Diphone synthesis
      Domain-specific synthesis
      Database preparation
  Text to Speech Projects and Products
  REFERENCE

3. DESIGNING & DEVELOPMENT OF TEXT TO SPEECH CONVERSION MODEL
  3.1 Introduction
  Concatenative Synthesis Technique
  Gujarati character feature
    The Basics
    Framework of a Gujarati symbol
    Gujarati Consonants / Vowels
  Concatenative Synthesis Model
  Base Tables and Master Database preparation Model
    Creating base tables
    Making master database empty
    Master Table creation
  Phoneme corpus recording Model
    Phoneme Selection
    Phoneme Recording
    Silence Removal
    Testing Correctness
    Saving audio file
  Synthesis Engine Creation Model
    Text Editor
    Phoneme separation and searching
    Concatenation
    Playing converted audio file
  TTS Testing Model
  REFERENCE

4. PROTOTYPE AND COMPONENTS DEVELOPMENT FOR THE TEXT TO SPEECH CONVERSION MODEL
  4.1 Introduction
  Gujarati Text-to-speech Architecture
    Text Normalization
    Text Segmentation
    Wav Concatenation
  Software and hardware requirement
    Hardware requirement
    Software requirement
      Microsoft Visual Studio
      SQL Database
      C# .NET
      Free Audio Editor
      NAudio
  mansi.ttf Font
    True Type Font (TTF)
    Font development programs and its utility
    The Font Creator Program
    Need to create Gujarati font
    Character list of developed Gujarati font named mansi.ttf
      List of Consonants
      List of vowels
      List of Digits
      List of special characters
  Database and sound file preparation
    Base table and master table management module
      Entry
      Empty
      Merge
      Add half consonants
      Add General consonants Barakhadi
      Add digits and special single consonants
      Add special consonants Barakhadi
    Sound recording
      Pre-recording process
      Speaker Selection
      Sound file format
      Sound files naming and storage criteria
      Recording process
  Text to speech conversion Logical Development (Algorithm)
  Text to speech synthesis Engine module
    Text area
    Button panel
  Related Microsoft Visual C# code / programs used in Text to Speech Synthesizer development / testing process
    Class creation for database connection
    Base Table data entry and master table creation
    Base Table data entry sub module
    Sound recording module
    Text to speech engine module
    Listening test module
  4.7 Annexure I
  References

5. RESULTS, DISCUSSION, CONCLUSION AND FUTURE SCOPE FOR EXTENSION OF THE RESEARCH WORK
  5.1 Introduction
  Performance analysis criteria for Text to speech engine model for Gujarati text recognition
  Performance analysis of Categorical Rating Test
    Clearness
    Speed
    Sound Quality
    Pronunciation
    Concentration
    Intonation
    Stress
    Pronunciation mistakes
  Performance analysis of listening test
  Results and discussion
  Conclusion
  Future scope
  Reference

Publications by the candidate

TEXT TO SPEECH SYSTEM FOR KONKANI ( GOAN ) LANGUAGE

Sangam P. Borkar, M.E. (Electronics) Dissertation. Guided by Prof. S. P. Patil, Head of Electronics Department, Rajarambapu Institute of Technology, Sakharale,