Engin Erzin: the use of speech processing applications is expected to surge in multimedia-rich scenarios
Engin Erzin
Associate Professor
Department of Computer Engineering
Ph.D. Bilkent University
eerzin

Engin Erzin's research interests include speech processing, multimodal signal processing, pattern recognition and human-computer interfaces. Prof. Erzin is a member of the Multimedia, Vision and Graphics Laboratory (MVGL), where he takes an active part in many national and international research projects.

E. Erzin. Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings. IEEE Transactions on Audio, Speech and Language Processing,

The speech processing research area, which covers the analysis, synthesis and recognition of speech signals, plays a key role in state-of-the-art digital speech communication and multimedia services. While Internet and wireless telephony is expected to remain one of the most important applications for several years to come, the use of speech processing applications, such as automatic speech recognition (ASR), text-to-speech synthesis (TTS), speaker identification/verification, and emotion and mood analysis from speech, is expected to increase in multimedia-rich scenarios.

M. E. Sargın, Y. Yemez, E. Erzin, and A. M. Tekalp. Analysis of Head Gesture and Prosody Patterns for Prosody-Driven Head-Gesture Animation. IEEE Transactions on Pattern Analysis and Machine Intelligence,

Multimodal signal processing refers to the combined processing of signals from multiple modalities such as speech, still images, video, and other sources. It plays a key role in the design of future human-computer interfaces and intelligent systems, such as intelligent vehicles. The ultimate goal of human-computer interface research is to develop a machine that can identify humans, analyze and understand them from biometric input signals, and synthesize a human-like output in response, in a way similar to human-to-human communication. The study of relations and correlations between signals from different modalities plays an important role in the effective use of multimodal information. Prof. Erzin's active research in the area of multimodal signal processing includes speech/speaker recognition, body motion analysis, speech-driven face gesture analysis and synthesis, speaker animation, audio-driven body animation and driver behavior modeling. More details on Prof. Erzin's research activities and current research projects are available under:

U. Bağcı and E. Erzin. Automatic classification of musical genres using inter-genre similarity. IEEE Signal Processing Letters,

H. E. Çetingül, E. Erzin, Y. Yemez, and A. M. Tekalp. Multimodal speaker/speech recognition using lip motion, lip texture and audio. Signal Processing,

E. Erzin, Y. Yemez, and A. M. Tekalp. Multimodal speaker identification using an adaptive classifier cascade based on modality reliability. IEEE Transactions on Multimedia,

[Figure: Multimodal recognition system]
Graduate Students
Can Yagli
M.S. Koç University, 2010

Can Yagli. Artificial bandwidth extension of speech using temporal clustering. Master's thesis, Koç University,

In this thesis, we investigate the artificial bandwidth extension problem, which aims to reconstruct the missing frequency content of wideband speech from narrowband speech. To solve the problem, we utilize the well-known source-filter model of the human voice production system.

C. Yagli and E. Erzin. Artificial bandwidth extension using linear prediction within temporal clusters. Submitted to ICASSP '11,

Ferda Ofli
Ph.D. Koç University, 2010
Advisors: Murat Tekalp, Yücel Yemez, Engin Erzin

Ferda Ofli. Learning Statistical Music-to-Dance Mappings for Choreography Synthesis. PhD thesis, Koç University,

We propose many-to-many statistical mappings from music measures (music segments) to dance figures (dance segments) towards generating plausible music-driven dance choreographies. We assume that dance figures (dance segment boundaries) coincide with music measures (music segment boundaries).

F. Ofli, E. Erzin, Y. Yemez, and A.M. Tekalp. Multi-modal analysis of dance performances for music-driven choreography synthesis. In ICASSP '10, Dallas, USA,

F. Ofli, E. Erzin, Y. Yemez, A.M. Tekalp, A.T. Erdem, C. Erdem, T. Abaci, and M. Ozkan. Unsupervised dance figure analysis from video for dancing avatar animation. In ICIP '08, San Diego, USA,

F. Ofli, C. Canton-Ferrer, J. Tilmanne, Y. Demir, E. Bozkurt, Y. Yemez, E. Erzin, and A.M. Tekalp. Audio-driven human body motion analysis and synthesis. In ICASSP '08, Las Vegas, USA,

F. Ofli, Y. Demir, E. Erzin, Y. Yemez, and A. M. Tekalp. Multicamera audio-visual analysis of dance figures. In IEEE Int. Conf. on Multimedia Expo, ICME-2007,

F. Ofli, Y. Demir, C. Canton-Ferrer, J. Tilmanne, K. Balcı, E. Bozkurt, I. Kızıloğlu, Y. Yemez, E. Erzin, A.M. Tekalp, L. Akarun, and A.T. Erdem. Çok bakışlı işitsel-görsel dans verilerinin analizi ve sentezi (Analysis and synthesis of multiview audio-visual dance figures). In SIU '08, Didim, Turkey, 2008.
Elif Bozkurt
M.S. Koç University,

Elif Bozkurt. Emotion recognition from speech. Master's thesis, Koç University,

We present formant-position-based weighted Mel Frequency Cepstral Coefficient (WMFCC) features for the emotion recognition problem and compare performance results with commonly used feature sets. Since the Line Spectral Frequency (LSF) features are positioned close to each other around formant frequencies, we propose a normalized inverse harmonic mean function to weight critical band energies for the extraction of MFCC features.

E. Bozkurt, C. Eroglu Erdem, T. Erdem, and E. Erzin. Formant position based weighted spectral features for emotion recognition. Submitted to Speech Communication,

E. Bozkurt, E. Erzin, C. Eroglu Erdem, and T. Erdem. Improving automatic emotion recognition from speech signals. In INTERSPEECH '09, UK,

F. Ofli, Y. Demir, C. Canton-Ferrer, J. Tilmanne, K. Balcı, E. Bozkurt, I. Kızıloğlu, Y. Yemez, E. Erzin, A.M. Tekalp, L. Akarun, and A.T. Erdem. Çok bakışlı işitsel-görsel dans verilerinin analizi ve sentezi (Analysis and synthesis of multiview audio-visual dance figures). In SIU '08, Didim, Turkey,

Emre Öztürk
M.S. Koç University, 2010

Emre Öztürk. Driver status identification from driving behavior signals. Master's thesis, Koç University,

Driving behavior signals differ in how and under which conditions the driver uses vehicle control units, such as the pedals and the steering wheel. In this study we investigate how the driving behavior signals differ among drivers and among different driving tasks.

E. Ozturk and E. Erzin. Driving status identification under different distraction conditions from driving behaviour signals. In 4th Biennial Workshop on DSP for In-Vehicle Systems and Safety, UTD, TX, USA, 2009.
Yasemin Demir
Ph.D. student at University of California, Berkeley
M.S. Koç University, 2008

Yasemin Demir. Music-driven dance synthesis by multimodal dance performance analysis. Master's thesis, Koç University,

We present a framework for evaluating the correlation between audio features and dance figures, for audio-visual analysis and synthesis of dance figures. Dance figures are performed synchronously with the musical rhythm.

Y. Demir, E. Erzin, Y. Yemez, and A. M. Tekalp. Evaluation of audio features for audio-visual analysis of dance figures. In EUSIPCO '08, Lausanne, Switzerland,

F. Ofli, C. Canton-Ferrer, J. Tilmanne, Y. Demir, E. Bozkurt, Y. Yemez, E. Erzin, and A.M. Tekalp. Audio-driven human body motion analysis and synthesis. In ICASSP '08, Las Vegas, USA,

F. Ofli, Y. Demir, E. Erzin, Y. Yemez, and A. M. Tekalp. Multicamera audio-visual analysis of dance figures. In IEEE Int. Conf. on Multimedia Expo, ICME-2007,

F. Ofli, Y. Demir, C. Canton-Ferrer, J. Tilmanne, K. Balcı, E. Bozkurt, I. Kızıloğlu, Y. Yemez, E. Erzin, A.M. Tekalp, L. Akarun, and A.T. Erdem. Çok bakışlı işitsel-görsel dans verilerinin analizi ve sentezi (Analysis and synthesis of multiview audio-visual dance figures). In SIU '08, Didim, Turkey,

Emre Sargın
MTS at Google
Ph.D. student at University of California, Santa Barbara
M.S. Koç University, 2006
Advisors: Murat Tekalp, Yücel Yemez, Engin Erzin

Emre Sargın. Audio-visual correlation modeling for speaker identification and synthesis. Master's thesis, Koç University,

This thesis addresses two major problems of multimodal signal processing using audiovisual correlation modeling: speaker recognition and speaker synthesis. We address the first problem, i.e., the audiovisual speaker recognition problem, within an open-set identification framework, where audio (speech) and lip texture (intensity) modalities are fused employing a combination of early and late integration techniques.

M. E. Sargın, Y. Yemez, E. Erzin, and A. M. Tekalp. Analysis of head gesture and prosody patterns for prosody-driven head-gesture animation. IEEE Transactions on Pattern Analysis and Machine Intelligence,

M. E. Sargın, Y. Yemez, and A.M. Tekalp. Audio-visual synchronization and fusion using canonical correlation analysis. IEEE Transactions on Multimedia, 9(7): , November

M. E. Sargın, Y. Yemez, E. Erzin, and A. M. Tekalp. Prosody-driven head-gesture animation. In IEEE Int. Conf. on Acoustic, Speech, Signal Proc. (ICASSP '07), 2007.
Ulaş Bağcı
Ph.D. student at University of Nottingham, UK
M.S. Koç University,

Ulaş Bağcı. Boosting classifiers for automatic music genre classification. Master's thesis, Koç University,

Music genre classification is an important tool for music information retrieval systems and has been finding important applications in various media platforms. Two important problems in automatic music genre classification are feature extraction and classifier design.

U. Bağcı and E. Erzin. Automatic classification of musical genres using inter-genre similarity. IEEE Signal Processing Letters, Vol. 14, No. 8, pp , August

U. Bağcı and E. Erzin. Boosting classifiers for music genre classification. In 20th International Symposium on Computer and Information Sciences (ISCIS 2005), Berlin,

U. Bağcı and E. Erzin. Müzik türlerinin sınıflanmasında benzer kesişim bilgileri uygulamaları. In SIU 2006, Antalya,

Ertan Çetingul
Ph.D. student at Johns Hopkins University, Baltimore
M.S. Koç University, 2005
Advisors: Murat Tekalp, Engin Erzin, Yücel Yemez

Ertan Çetingul. Discrimination analysis of lip motion features for multimodal speaker identification and speech-reading. Master's thesis, Koç University,

In this thesis a new multimodal speaker/speech recognition system that integrates audio, lip texture, lip geometry, and lip motion modalities is presented. There have been several studies that jointly use audio, lip intensity and/or lip geometry information for speaker identification and speech recognition applications.

H.E. Cetingul, E. Erzin, Y. Yemez, and A.M. Tekalp. Multimodal speaker/speech recognition using lip motion, lip texture and audio. Signal Processing, Special Section: Multimodal Human-Computer Interfaces, 86: , December

H.E. Cetingul, E. Erzin, Y. Yemez, and A.M. Tekalp. Discriminative analysis of lip motion features for speaker identification and speech-reading. IEEE Transactions on Image Processing, 15: , October

H.E. Cetingul, E. Erzin, Y. Yemez, and A.M. Tekalp. Robust lip-motion features for speaker identification. In IEEE Int. Conf. on Acoustic, Speech and Signal Processing, Philadelphia, March 2005.
Alper Kanak
TUBITAK-UEKAE
M.S. Koç University, 2004
Advisors: Murat Tekalp, Engin Erzin, Yücel Yemez

Alper Kanak. Multimodal speaker identification with audio-video processing. Master's thesis, Koç University,

In this thesis we present a multimodal text-dependent speaker identification system. The objective is to improve the recognition performance over conventional unimodal or bimodal schemes.

A. Kanak, E. Erzin, Y. Yemez, and A.M. Tekalp. Speaker identification using multimodal audio-video processing. IEEE Int. Conf. on Image Processing,

A. Kanak, E. Erzin, Y. Yemez, and A.M. Tekalp. Joint audio-video processing for biometric speaker identification. IEEE Int. Conf. on Acoustic, Speech and Signal Processing,
1992-1997 : PhD, Doctor of Philosophy in Electrical and Electronic Engineering
Yücel YEMEZ (Assoc. Prof.) Department of Computer Engineering Koç University Rumelifeneri Yolu 34450 Sarıyer, Istanbul, Turkey Tel: +90 (212) 338-1585 Fax: +90 (212) 338-1548 E-Mail: yyemez@ku.edu.tr Website:
More informationCV - Arif Tanju Erdem. Date of birth: 20.02.1965 EDUCATION
CV - Arif Tanju Erdem Date of birth: 20.02.1965 EDUCATION Ph.D. Electrical Engineering, University of Rochester, Rochester, New York, USA, 1990. M.S. Electrical Engineering, University of Rochester, Rochester,
More informationEmotion Detection from Speech
Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction
More informationbiometric person recognition recognizing individuals according to their physical and behavioral characteristics has
Feature Article Multimodal Person Recognition for Human Vehicle Interaction Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience.
More informationSpeech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction
: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)
More informationRecent advances in Digital Music Processing and Indexing
Recent advances in Digital Music Processing and Indexing Acoustics 08 warm-up TELECOM ParisTech Gaël RICHARD Telecom ParisTech (ENST) www.enst.fr/~grichard/ Content Introduction and Applications Components
More informationAnnotated bibliographies for presentations in MUMT 611, Winter 2006
Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.
More informationSpeech Signal Processing: An Overview
Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech
More informationEstablishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
More informationDR AYŞE KÜÇÜKYILMAZ. Imperial College London Personal Robotics Laboratory Department of Electrical and Electronic Engineering SW7 2BT London UK
DR AYŞE KÜÇÜKYILMAZ Imperial College London Personal Robotics Laboratory Department of Electrical and Electronic Engineering SW7 2BT London UK http://home.ku.edu.tr/~akucukyilmaz a.kucukyilmaz@imperial.ac.uk
More informationEvent Detection in Basketball Video Using Multiple Modalities
Event Detection in Basketball Video Using Multiple Modalities Min Xu, Ling-Yu Duan, Changsheng Xu, *Mohan Kankanhalli, Qi Tian Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore, 119613
More informationMobile Multimedia Application for Deaf Users
Mobile Multimedia Application for Deaf Users Attila Tihanyi Pázmány Péter Catholic University, Faculty of Information Technology 1083 Budapest, Práter u. 50/a. Hungary E-mail: tihanyia@itk.ppke.hu Abstract
More informationPROFESSIONAL EXPERIENCE
RESUME - A. Murat Tekalp, Professor College of Engineering, Koç University Rumeli Feneri Yolu, 34450 Sarıyer, Istanbul, Turkey Phone: +90-212-338-1593 FAX: +90-212-338-1548 E-Mail: mtekalp@ku.edu.tr Web:
More informationTurgut Ozal University. Computer Engineering Department. TR-06010 Ankara, Turkey
Dr. YILDIRAY YALMAN Associate Professor CONTACT INFORMATION Turgut Ozal University Computer Engineering Department TR-06010 Ankara, Turkey Phone: +90 (0)312-5515437 E-mail: yyalman@turgutozal.edu.tr RESEARCH
More informationCHANWOO KIM (BIRTH: APR. 9, 1976) Language Technologies Institute School of Computer Science Aug. 8, 2005 present
CHANWOO KIM (BIRTH: APR. 9, 1976) 2602E NSH Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 Phone: +1-412-726-3996 Email: chanwook@cs.cmu.edu RESEARCH INTERESTS Speech recognition system,
More informationSeparation and Classification of Harmonic Sounds for Singing Voice Detection
Separation and Classification of Harmonic Sounds for Singing Voice Detection Martín Rocamora and Alvaro Pardo Institute of Electrical Engineering - School of Engineering Universidad de la República, Uruguay
More informationAPPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer
More informationQMeter Tools for Quality Measurement in Telecommunication Network
QMeter Tools for Measurement in Telecommunication Network Akram Aburas 1 and Prof. Khalid Al-Mashouq 2 1 Advanced Communications & Electronics Systems, Riyadh, Saudi Arabia akram@aces-co.com 2 Electrical
More informationAn Arabic Text-To-Speech System Based on Artificial Neural Networks
Journal of Computer Science 5 (3): 207-213, 2009 ISSN 1549-3636 2009 Science Publications An Arabic Text-To-Speech System Based on Artificial Neural Networks Ghadeer Al-Said and Moussa Abdallah Department
More informationHabilitation. Bonn University. Information Retrieval. Dec. 2007. PhD students. General Goals. Music Synchronization: Audio-Audio
Perspektivenvorlesung Information Retrieval Music and Motion Bonn University Prof. Dr. Michael Clausen PD Dr. Frank Kurth Dipl.-Inform. Christian Fremerey Dipl.-Inform. David Damm Dipl.-Inform. Sebastian
More informationAdvanced Speech-Audio Processing in Mobile Phones and Hearing Aids
Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids Synergies and Distinctions Peter Vary RWTH Aachen University Institute of Communication Systems WASPAA, October 23, 2013 Mohonk Mountain
More informationBLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium Peter.Vanroose@esat.kuleuven.ac.be
More informationA TOOL FOR TEACHING LINEAR PREDICTIVE CODING
A TOOL FOR TEACHING LINEAR PREDICTIVE CODING Branislav Gerazov 1, Venceslav Kafedziski 2, Goce Shutinoski 1 1) Department of Electronics, 2) Department of Telecommunications Faculty of Electrical Engineering
More informationSemantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
More informationTeaching in School of Electronic, Information and Electrical Engineering
Introduction to Teaching in School of Electronic, Information and Electrical Engineering Shanghai Jiao Tong University Outline Organization of SEIEE Faculty Enrollments Undergraduate Programs Sample Curricula
More informationA secure face tracking system
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 10 (2014), pp. 959-964 International Research Publications House http://www. irphouse.com A secure face tracking
More informationAUTOMATIC VIDEO STRUCTURING BASED ON HMMS AND AUDIO VISUAL INTEGRATION
AUTOMATIC VIDEO STRUCTURING BASED ON HMMS AND AUDIO VISUAL INTEGRATION P. Gros (1), E. Kijak (2) and G. Gravier (1) (1) IRISA CNRS (2) IRISA Université de Rennes 1 Campus Universitaire de Beaulieu 35042
More informationImmersive Audio Rendering Algorithms
1. Research Team Immersive Audio Rendering Algorithms Project Leader: Other Faculty: Graduate Students: Industrial Partner(s): Prof. Chris Kyriakakis, Electrical Engineering Prof. Tomlinson Holman, CNTV
More informationDR AYŞE KÜÇÜKYILMAZ. Yeditepe University Department of Computer Engineering Kayışdağı Caddesi 34755 Istanbul Turkey
DR AYŞE KÜÇÜKYILMAZ Yeditepe University Department of Computer Engineering Kayışdağı Caddesi 34755 Istanbul Turkey http://cse.yeditepe.edu.tr/~akucukyilmaz akucukyilmaz@cse.yeditepe.edu.tr QUALIFICATIONS
More informationGraduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina
Graduate Co-op Students Information Manual Department of Computer Science Faculty of Science University of Regina 2014 1 Table of Contents 1. Department Description..3 2. Program Requirements and Procedures
More informationMultimedia Technology Bachelor of Science
Multimedia Technology Bachelor of Science 1. Program s Name Thai Name : ว ทยาศาสตรบ ณฑ ต สาขาว ชาเทคโนโลย ม ลต ม เด ย English Name : Bachelor of Science Program in Multimedia Technology 2. Degree Full
More informationA Multi-User 3-D Virtual Environment with Interactive Collaboration and Shared Whiteboard Technologies
Abstract A Multi-User 3-D Virtual Environment with Interactive Collaboration and Shared Whiteboard Technologies Wing Ho Leung and Tsuhan Chen Carnegie Mellon University A multi-user 3-D virtual environment
More informationSPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,
More informationHow do non-expert users exploit simultaneous inputs in multimodal interaction?
How do non-expert users exploit simultaneous inputs in multimodal interaction? Knut Kvale, John Rugelbak and Ingunn Amdal 1 Telenor R&D, Norway knut.kvale@telenor.com, john.rugelbak@telenor.com, ingunn.amdal@tele.ntnu.no
More informationGiuseppe Riccardi, Marco Ronchetti. University of Trento
Giuseppe Riccardi, Marco Ronchetti University of Trento 1 Outline Searching Information Next Generation Search Interfaces Needle E-learning Application Multimedia Docs Indexing, Search and Presentation
More informationA Sound Analysis and Synthesis System for Generating an Instrumental Piri Song
, pp.347-354 http://dx.doi.org/10.14257/ijmue.2014.9.8.32 A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song Myeongsu Kang and Jong-Myon Kim School of Electrical Engineering,
More informationColorado School of Mines Computer Vision Professor William Hoff
Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 Introduction to 2 What is? A process that produces from images of the external world a description
More informationModeling and Design of Intelligent Agent System
International Journal of Control, Automation, and Systems Vol. 1, No. 2, June 2003 257 Modeling and Design of Intelligent Agent System Dae Su Kim, Chang Suk Kim, and Kee Wook Rim Abstract: In this study,
More informationMaking Machines Understand Facial Motion & Expressions Like Humans Do
Making Machines Understand Facial Motion & Expressions Like Humans Do Ana C. Andrés del Valle & Jean-Luc Dugelay Multimedia Communications Dpt. Institut Eurécom 2229 route des Crêtes. BP 193. Sophia Antipolis.
More informationHow to Improve the Sound Quality of Your Microphone
An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies Andreas Maier, Julian Exner, Stefan Steidl, Anton Batliner, Tino Haderlein, and Elmar Nöth Universität Erlangen-Nürnberg,
More informationMULTIMODAL VIRTUAL ASSISTANTS FOR CONSUMER AND ENTERPRISE
MULTIMODAL VIRTUAL ASSISTANTS FOR CONSUMER AND ENTERPRISE Michael Johnston Lead Inventive Scientist 1 mjohnston@interactions.net VIRTUAL ASSISTANTS: CONSUMER Calling Local Search Messaging Web Search Calendar
More informationVEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James
More informationCONTROL, COMMUNICATION & SIGNAL PROCESSING (CCSP)
CONTROL, COMMUNICATION & SIGNAL PROCESSING (CCSP) KEY RESEARCH AREAS Data compression for speech, audio, images, and video Digital and analog signal processing Image and video processing Computer vision
More informationFRANCESCO BELLOCCHIO S CURRICULUM VITAE ET STUDIORUM
FRANCESCO BELLOCCHIO S CURRICULUM VITAE ET STUDIORUM April 2011 Index Personal details and education 1 Research activities 2 Teaching and tutorial activities 3 Conference organization and review activities
More informationCourse overview Processamento de sinais 2009/10 LEA
Course overview Processamento de sinais 2009/10 LEA João Pedro Gomes jpg@isr.ist.utl.pt Instituto Superior Técnico Processamento de sinais MEAer (IST) Course overview 1 / 19 Course overview Motivation:
More informationTracking and Recognition in Sports Videos
Tracking and Recognition in Sports Videos Mustafa Teke a, Masoud Sattari b a Graduate School of Informatics, Middle East Technical University, Ankara, Turkey mustafa.teke@gmail.com b Department of Computer
More informationDiscriminative Multimodal Biometric. Authentication Based on Quality Measures
Discriminative Multimodal Biometric Authentication Based on Quality Measures Julian Fierrez-Aguilar a,, Javier Ortega-Garcia a, Joaquin Gonzalez-Rodriguez a, Josef Bigun b a Escuela Politecnica Superior,
More informationThirukkural - A Text-to-Speech Synthesis System
Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,
More informationSchool Class Monitoring System Based on Audio Signal Processing
C. R. Rashmi 1,,C.P.Shantala 2 andt.r.yashavanth 3 1 Department of CSE, PG Student, CIT, Gubbi, Tumkur, Karnataka, India. 2 Department of CSE, Vice Principal & HOD, CIT, Gubbi, Tumkur, Karnataka, India.
More informationINTELLIGENT AGENTS AND SUPPORT FOR BROWSING AND NAVIGATION IN COMPLEX SHOPPING SCENARIOS
ACTS GUIDELINE GAM-G7 INTELLIGENT AGENTS AND SUPPORT FOR BROWSING AND NAVIGATION IN COMPLEX SHOPPING SCENARIOS Editor: Martin G. Steer (martin@eurovoice.co.uk) Contributors: TeleShoppe ACTS Guideline GAM-G7
More informationHow To Get A Computer Engineering Degree
COMPUTER ENGINEERING GRADUTE PROGRAM FOR MASTER S DEGREE (With Thesis) PREPARATORY PROGRAM* COME 27 Advanced Object Oriented Programming 5 COME 21 Data Structures and Algorithms COME 22 COME 1 COME 1 COME
More informationDevelopment of a Service Robot System for a Remote Child Monitoring Platform
, pp.153-162 http://dx.doi.org/10.14257/ijsh.2014.8.5.14 Development of a Service Robot System for a Remote Child Monitoring Platform Taewoo Han 1 and Yong-Ho Seo 2, * 1 Department of Game and Multimedia,
More informationMultisensor Data Fusion and Applications
Multisensor Data Fusion and Applications Pramod K. Varshney Department of Electrical Engineering and Computer Science Syracuse University 121 Link Hall Syracuse, New York 13244 USA E-mail: varshney@syr.edu
More informationJournal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition
IWNEST PUBLISHER Journal of Industrial Engineering Research (ISSN: 2077-4559) Journal home page: http://www.iwnest.com/aace/ Adaptive sequence of Key Pose Detection for Human Action Recognition 1 T. Sindhu
More informationFacial Expression Analysis and Synthesis
1. Research Team Facial Expression Analysis and Synthesis Project Leader: Other Faculty: Post Doc(s): Graduate Students: Undergraduate Students: Industrial Partner(s): Prof. Ulrich Neumann, IMSC and Computer
More informationGender Identification using MFCC for Telephone Applications A Comparative Study
Gender Identification using MFCC for Telephone Applications A Comparative Study Jamil Ahmad, Mustansar Fiaz, Soon-il Kwon, Maleerat Sodanil, Bay Vo, and * Sung Wook Baik Abstract Gender recognition is
More informationSPEAKER IDENTITY INDEXING IN AUDIO-VISUAL DOCUMENTS
SPEAKER IDENTITY INDEXING IN AUDIO-VISUAL DOCUMENTS Mbarek Charhad, Daniel Moraru, Stéphane Ayache and Georges Quénot CLIPS-IMAG BP 53, 38041 Grenoble cedex 9, France Georges.Quenot@imag.fr ABSTRACT The
More informationRate control algorithms for video coding. Citation. Issued Date 2000. http://hdl.handle.net/10722/31094
Title Rate control algorithms for video coding Author(s) Ng, Cheuk-yan; 吳 卓 恩 Citation Issued Date 2000 URL http://hdl.handle.net/10722/31094 Rights The author retains all proprietary rights, (such as
More informationVoice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification
Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification (Revision 1.0, May 2012) General VCP information Voice Communication
More informationWelcome to KU Engineering A. Murat Tekalp, Dean
Welcome to KU Engineering A. Murat Tekalp, Dean http://eng.ku.edu.tr www.facebook.com/kocuniveng 45 Faculty Members Computer Engineering Electrical and Electronics Engineering Mechanical Engineering Industrial
More informationHow To Filter Spam Image From A Picture By Color Or Color
Image Content-Based Email Spam Image Filtering Jianyi Wang and Kazuki Katagishi Abstract With the population of Internet around the world, email has become one of the main methods of communication among
More informationPakistan-U.S. Science and Technology Cooperation Program Annual Technical Report Form
Pakistan-U.S. Science and Technology Cooperation Program Annual Technical Report Form Reports should be prepared jointly by the Pakistani and U.S. principal investigators and should cover all project-related
More informationDESIGN OF CLUSTER OF SIP SERVER BY LOAD BALANCER
INTERNATIONAL JOURNAL OF REVIEWS ON RECENT ELECTRONICS AND COMPUTER SCIENCE DESIGN OF CLUSTER OF SIP SERVER BY LOAD BALANCER M.Vishwashanthi 1, S.Ravi Kumar 2 1 M.Tech Student, Dept of CSE, Anurag Group
More informationGLOVE-BASED GESTURE RECOGNITION SYSTEM
CLAWAR 2012 Proceedings of the Fifteenth International Conference on Climbing and Walking Robots and the Support Technologies for Mobile Machines, Baltimore, MD, USA, 23 26 July 2012 747 GLOVE-BASED GESTURE
More informationMPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music
ISO/IEC MPEG USAC Unified Speech and Audio Coding MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music The standardization of MPEG USAC in ISO/IEC is now in its final
More informationMobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
More informationJune Zhang (Zhong-Ju Zhang)
(Zhong-Ju Zhang) Carnegie Mellon University Dept. Electrical and Computer Engineering, 5000 Forbes Ave. Pittsburgh, PA 15213 Phone: 678-899-2492 E-Mail: junez@andrew.cmu.edu http://users.ece.cmu.edu/~junez
More informationInternet Video Streaming and Cloud-based Multimedia Applications. Outline
Internet Video Streaming and Cloud-based Multimedia Applications Yifeng He, yhe@ee.ryerson.ca Ling Guan, lguan@ee.ryerson.ca 1 Outline Internet video streaming Overview Video coding Approaches for video
Social Signal Processing: Understanding Nonverbal Behavior in Human-Human Interactions
Social Signal Processing: Understanding Nonverbal Behavior in Human-Human Interactions A. Vinciarelli University of Glasgow and Idiap Research Institute http://www.dcs.gla.ac.uk/~vincia e-mail: vincia@dcs.gla.ac.uk
Hybrid Lossless Compression Method For Binary Images
M.F. TALU AND İ. TÜRKOĞLU/ IU-JEEE Vol. 11(2), (2011), 1399-1405 Hybrid Lossless Compression Method For Binary Images M. Fatih TALU, İbrahim TÜRKOĞLU Inonu University, Dept. of Computer Engineering, Engineering
ISSN: 2348-9510. A Review: Image Retrieval Using Web Multimedia Mining
A Review: Image Retrieval Using Web Multimedia Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,
A Voice and Ink XML Multimodal Architecture for Mobile e-commerce Systems
A Voice and Ink XML Multimodal Architecture for Mobile e-commerce Systems Zouheir Trabelsi, Sung-Hyuk Cha, Darshan Desai, Charles Tappert CSIS, Pace University, 861 Bedford Road, Pleasantville NY 10570-9913
Face Locating and Tracking for Human-Computer Interaction. Carnegie Mellon University. Pittsburgh, PA 15213
Face Locating and Tracking for Human-Computer Interaction Martin Hunke Alex Waibel School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract Effective Human-to-Human communication
The Department of Electrical and Computer Engineering (ECE) offers the following graduate degree programs:
Note that these pages are extracted from the full Graduate Catalog, please refer to it for complete details. College of 1 ELECTRICAL AND COMPUTER ENGINEERING www.ece.neu.edu SHEILA S. HEMAMI, PHD Professor
L9: Cepstral analysis
L9: Cepstral analysis The cepstrum Homomorphic filtering The cepstrum and voicing/pitch detection Linear prediction cepstral coefficients Mel frequency cepstral coefficients This lecture is based on [Taylor,
Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web
Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web. By C.Moreno, A. Antolin and F.Diaz-de-Maria. Summary By Maheshwar Jayaraman 1 1. Introduction Voice Over IP is
Lecture 1-10: Spectrograms
Lecture 1-10: Spectrograms Overview 1. Spectra of dynamic signals: like many real world signals, speech changes in quality with time. But so far the only spectral analysis we have performed has assumed
Ericsson T18s Voice Dialing Simulator
Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of
A very brief introduction to Electronic Engineering & Computer Science. Geraint A. Wiggins Professor of Computational Creativity & Head of School
A very brief introduction to Electronic Engineering & Computer Science Geraint A. Wiggins Professor of Computational Creativity & Head of School Example Careers Engineering and Computer Science solves
Multi-Modal Acoustic Echo Canceller for Video Conferencing Systems
Multi-Modal Acoustic Echo Canceller for Video Conferencing Systems Mario Gazziro,Guilherme Almeida,Paulo Matias, Hirokazu Tanaka and Shigenobu Minami ICMC/USP, Brazil Email: mariogazziro@usp.br Wernher
Cedat 85. Innovating services and technologies for speech content management
Cedat 85 Innovating services and technologies for speech content management Company profile 25 years experience in the market of transcription/reporting services; Cedat 85 Group: Cedat 85 srl Subtitle
Victoria Kostina Curriculum Vitae - September 6, 2015
Victoria Kostina Curriculum Vitae - September 6, 2015 Victoria Kostina Department of Electrical Engineering www.caltech.edu/~vkostina California Institute of Technology, CA 91125 vkostina@caltech.edu
Curriculum Vitae. 1 Person Dr. Horst O. Bunke, Prof. Em. Date of birth July 30, 1949 Place of birth Langenzenn, Germany Citizenship Swiss and German
Curriculum Vitae 1 Person Name Dr. Horst O. Bunke, Prof. Em. Date of birth July 30, 1949 Place of birth Langenzenn, Germany Citizenship Swiss and German 2 Education 1974 Dipl.-Inf. Degree from the University
Speech Recognition of a Voice-Access Automotive Telematics System using VoiceXML
Speech Recognition of a Voice-Access Automotive Telematics System using VoiceXML Ing-Yi Chen Tsung-Chi Huang ichen@csie.ntut.edu.tw rick@ilab.csie.ntut.edu.tw Department of Computer Science and Information
ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS
18th European Signal Processing Conference (EUSIPCO-2010) Aalborg, Denmark, August 23-27, 2010 ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS Christos Tzagkarakis and
Template-based Eye and Mouth Detection for 3D Video Conferencing
Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer
Big Data: Image & Video Analytics
Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)
Faculté Polytechnique. Computational Attention and Social Signal Processing for Video Surveillance. Matei Mancas matei.mancas@umons.ac.
Faculté Polytechnique Computational Attention and Social Signal Processing for Video Surveillance Matei Mancas matei.mancas@umons.ac.be Overview Attention? «Social» features Top-down models Camera embodiment
Blender in Research & Education
Blender in Research & Education 1 Overview The RWTH Aachen University The Research Projects Blender in Research Modeling and scripting Video editing Blender in Education Modeling Simulation Rendering 2
Objective Intelligibility Assessment of Text-to-Speech Systems Through Utterance Verification
Objective Intelligibility Assessment of Text-to-Speech Systems Through Utterance Verification Raphael Ullmann 1,2, Ramya Rasipuram 1, Mathew Magimai.-Doss 1, and Hervé Bourlard 1,2 1 Idiap Research Institute,
Basic Theory of Intermedia Composing with Sounds and Images
(This article was written originally in 1997 as part of a larger text to accompany a Ph.D. thesis at the Music Department of the University of California, San Diego. It was published in "Monochord. De
Talking Head: Synthetic Video Facial Animation in MPEG-4.
Talking Head: Synthetic Video Facial Animation in MPEG-4. A. Fedorov, T. Firsova, V. Kuriakin, E. Martinova, K. Rodyushkin and V. Zhislina Intel Russian Research Center, Nizhni Novgorod, Russia Abstract
Integrating Multi-Modal Messages across Heterogeneous Networks.
Integrating Multi-Modal Messages across Heterogeneous Networks. Ramiro Liscano, Roger Impey, Qinxin Yu * and Suhayya Abu-Hakima Institute for Information Technology, National Research Council Canada, Montreal
Yusuf Sinan Akgül
Yusuf Sinan Akgul Assoc. Prof. Department Of Computer Engineering Gebze Institute of Technology Turkey akgul{at}bilmuh.gyte.edu.tr +90 262 605 2221 Education Ph.D. 2000, Department of Computer and
Limitations of Human Vision. What is computer vision? What is computer vision (cont'd)?
What is computer vision? Limitations of Human Vision Slide 1 Computer vision (image understanding) is a discipline that studies how to reconstruct, interpret and understand a 3D scene from its 2D images
Introduction. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr
Introduction Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr What is computer vision? What does it mean, to see? The plain man's answer (and Aristotle's, too)
EMOTION IN SPEECH: RECOGNITION AND APPLICATION TO CALL CENTERS
EMOTION IN SPEECH: RECOGNITION AND APPLICATION TO CALL CENTERS VALERY A. PETRUSHIN Andersen Consulting 3773 Willow Rd. Northbrook, IL 60062 petr@cstar.ac.com ABSTRACT The paper describes two experimental
Database-Centered Architecture for Traffic Incident Detection, Management, and Analysis
Database-Centered Architecture for Traffic Incident Detection, Management, and Analysis Shailendra Bhonsle, Mohan Trivedi, and Amarnath Gupta* Department of Electrical and Computer Engineering, *San Diego
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
Machine Learning. CS494/594, Fall 2007 11:10 AM-12:25 PM Claxton 205. Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004
CS494/594, Fall 2007 11:10 AM-12:25 PM Claxton 205 Machine Learning Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml What