A Speech-Driven Automatic Receptionist Written in VoiceXML

Katarina Matzon

Master's thesis in Computational Linguistics
Språkteknologiprogrammet (Language Technology Programme)
Department of Linguistics and Philology
10th June 2005

Supervisors:
Beáta Megyesi, Uppsala University
Tobias Öhman, Voxway AB

Abstract

This thesis describes the implementation of a speech-driven receptionist for Voxway AB. The receptionist was designed to be used by smaller Swedish companies. It answers calls coming into the company and directs each call to an employee based on speech input from the caller. It also handles unrecognized names and unanswered phone calls. It was programmed in VoiceXML and ColdFusion. A database was designed and implemented to store the data needed to make the receptionist dynamic and to log call statistics. The telephony application was evaluated by test users and a user survey. A website (programmed in HTML and ColdFusion) was designed to administer the telephony application, allowing companies to customize the application and to view statistics about their usage of it.

Contents

Abstract
Contents
List of Figures
List of Tables
Acknowledgements

1 Introduction
    Purpose
    Outline
2 Dialogue Systems
    Speech Recognition
    Dialogue Management
    Design Methods
    Human Communication
    Design of Dialogue
    Generator
    VoiceXML
    ColdFusion
3 Programming the Receptionist
    Static Receptionist
    Design of Dialogue
    Basic Code
    Building Grammars for Use
    Integrating Error Handling in the Code
    Integrating Dynamics
    Building the Database
    Using ColdFusion to Integrate Dynamics
    Organizing the Code for Dynamics
    Dynamic Queries and Output
    Dynamic Grammars
    Dynamic Prompts
    Implementing Statistical Element
4 Evaluation
    Evaluation Method
    Testing
    Test Users
    Evaluation of Results
5 Designing the Web Interface
6 Concluding Remarks
    Future Improvements
A Database
Bibliography

List of Figures

2.1 The three modules of a dialogue system
2.2 The relationship between SGML, HTML, XML and VoiceXML
2.3 A simple VoiceXML example
2.4 The seven subsystems of VoiceXML
3.1 Stages of Development of Receptionist
3.2 Example Dialogue
Receptionist application's chain of events
Example of different types of VoiceXML grammars
Example of error handling in a dialogue
Static event handling for an unanswered call
Example of a possible conversation
Query to find company name and ID
Example of ColdFusion output
Dynamic Grammar
Dynamic Prompt
Task example
Home Page
Employee List
Blank form for new employees

List of Tables

4.1 User Satisfaction Survey with Average Scores
User Satisfaction Scores

Acknowledgements

I would like to thank the people without whom this thesis would not be what it is today. Thank you to both my supervisors, Beáta Megyesi and Tobias Öhman. Thank you, Bea, for your encouragement and advice in writing this thesis, and thank you, Tobias, for all your encouragement and help with the programming of the receptionist. I would like to thank Botond Pakucs at KTH for contributing advice on the evaluation of dialogue systems. I would also like to thank my friend Jens Bergqvist for helping me record incredible sound for the receptionist so that it sounds more professional. Thank you to all my friends and family who have supported me this semester and were always around to talk when I needed a break. And lastly, I especially want to thank my boyfriend, Johan, for being such an incredible support and help throughout this process. Thank you for being my bollplank (sounding board)!

1 Introduction

Natural language processing, a field that combines linguistics and computer science, is growing every day. Computers that understand and interpret human language are now found almost everywhere people go. One branch of computational linguistics is speech technology, in which computers understand and produce speech. More and more companies are using speech technology. If you call the Swedish railway company, you speak to a computer to book your tickets, and if you call the postal service in the United States, you speak to a computer to find the postal code you need. Soon enough we may not need to type on keyboards, because talking to our home computers will be standard. People are already speaking to their mini-computers. For example, when a person calls a friend on a mobile phone, they can simply say the friend's name and the call is connected (Dobler, 2000). Likewise, a car navigation system recites directions for the driver to follow to the next destination (Wikipedia). These features improve our lives at home and at work.

One branch of speech technology is spoken dialogue systems. Spoken dialogue systems use speech technology to let humans and computers interact by means of human speech. Here both aspects of speech technology, speech recognition and speech synthesis, are combined to interact with humans in the form of a dialogue. The Merriam-Webster Online Dictionary defines dialogue as follows:

    Dialogue: a : a conversation between two or more persons; also : a similar exchange between a person and something else (as a computer) b : an exchange of ideas and opinions c : a discussion between representatives of parties to a conflict that is aimed at resolution

A spoken dialogue system can then be defined as a system designed to carry out a spoken conversation between a person and a computer. One area where these systems are increasingly popular is the telephone industry. At the end of the 1990s, telephone companies wanted to develop a common language to voice-enable the web, in other words, to build dialogue systems that work over the web and over the telephone. The result of this effort was VoiceXML (Voice Extensible Markup Language) (W3C, 2003). VoiceXML made it much simpler for companies to build web-enabled applications that include speech over the telephone, and it expanded the possibilities for voice applications.

1.1 Purpose

The purpose of this thesis is to develop a speech-driven receptionist for Voxway AB. Voxway AB is a company specializing in developing and hosting IVR (Interactive Voice Response) applications with speech technology. The task involves developing an automatic receptionist for small companies, where the goal is a comfortable and efficient dialogue between the caller and the automated service. The dialogue system is programmed in VoiceXML. The receptionist is designed to expect the name or position of a person at the company. If the requested person can be reached at several numbers, the application asks which number it should connect to (mobile, home, work). Once the system knows the correct number, it connects the call. It also handles problems such as unrecognized names, busy signals, and unavailability.

Besides the dialogue itself, the application involves designing a database and a web interface that each company can access in order to customize the application to its needs. Each company has its own application content, which is stored in a database and retrieved using the telephone number that receives the call as a key. The information in the database is managed through the website, which lets each company log in with a password and enter the information about each employee that the receptionist needs in order to connect a call. The website also lets companies see statistics about the calls coming into the company and the calls transferred within the company.

1.2 Outline

This thesis describes the implementation of an automatic receptionist. Chapter two gives a background on dialogue systems in order to prepare the reader for chapter three, which discusses the implementation of the receptionist, from the static receptionist to the dynamic receptionist. Chapter four describes the evaluation of the implemented receptionist. Chapter five describes the design and implementation of the website. The thesis ends with concluding remarks and suggestions for future improvements.

2 Dialogue Systems

Spoken dialogue systems are systems built to handle human-computer interaction in the form of speech. A system normally consists of several modules that handle different aspects of the dialogue. A simple system consists of three modules: a speech recognizer, a dialogue manager and an output generator, as seen in Figure 2.1 (Gustafson, 2002).

[Figure 2.1: The three modules of a dialogue system: Speech Recognizer -> Dialogue Manager -> Output Generator]

The first module is the automatic speech recognizer, which converts the spoken input into text that the computer can parse. Once the text is parsed, it is sent to the dialogue manager, which decides how the system should react to the input. Often, the reaction is to send output to the output component, or generator. The output component consists of recorded prompts or text-to-speech (TTS), which converts a given output into speech to be played to the user. Together these components form a dialogue system that can accept input as speech, parse this input, decide how to handle it, and send output via the generator.

This is how a general dialogue system works, but systems are designed with different goals in mind, and each component is shaped by the goal. For example, the CU Communicator is an interactive dialogue system for travel information over the phone (Pellom and Ward, 2000). In contrast, a system with an entirely different goal is August, a multimodal dialogue system that was used to interact with people at the cultural center in Stockholm (Gustafson et al., 1999). Since dialogue systems can differ so greatly, they are divided into three categories. The first is the task-oriented dialogue. This kind of dialogue has well-defined goals and is usually simple. Examples include simple question-and-answer systems such as the CU Communicator mentioned above.
Another example is a system that gives train times over the telephone, such as the Philips automatic train timetable information system (Aust et al., 1995). The second type of dialogue is the explorative dialogue, where the goals are not as well defined; instead, the aim is to acquire knowledge about complex tasks or to browse information (Gustafson, 2002). An example is an information browsing system such as AdApt, which allows users to find information about available apartments in the Stockholm area (Gustafson et al., 2000). Although the interaction has a goal, the goal is not easily defined. With AdApt, it may be to find an apartment to buy or simply to browse available apartments out of curiosity. The third type of dialogue is context-oriented. These dialogues are focused on the actual dialogue situation, and the primary goal for the user is to be entertained (Gustafson, 2002). The dialogue is based on the system, its location, or its surroundings. An example is a museum guide system that talks about the exhibition it is stationed in, such as August, the system described earlier (Gustafson et al., 1999). August has no goal other than conversing.

Today, task-oriented dialogue systems are the most common, mostly because the clear goals make it easy to measure the errors and effectiveness of the systems (Gustafson, 2002). The other two types are possible, however, and greatly expand what dialogue systems can be used for. Each of the components of a dialogue system is examined in more depth below.

2.1 Speech Recognition

Automatic speech recognition (ASR) is the task of converting speech to text that can then be parsed by the computer. Determining what type of recognizer to build is one of the first steps, and many types of recognizers exist. One distinction is whether or not the system has prior knowledge about the user's speech characteristics. Speaker-dependent (SD) systems are designed to understand speakers that the system has previously been trained on, whereas speaker-independent (SI) systems are trained to respond to a large group of people, where training for each individual would be impossible (O'Shaughnessy, 2000).
SD systems exist, for example, in mobile phones, where the speech recognizer recognizes only its owner's way of pronouncing the names in the phone book. SI systems are much harder to make successful because of the large variations in speech that need to be taken into account. Inter-speaker variability is the difference in speech between individuals. These differences include dialect, emotion in speech, and the sex and age of the speaker. For example, the accent of a person from the south of Sweden is very different from the accent of a person from the north of Sweden. An SI recognizer needs to account for these differences in order to understand a broader range of people. Emotion also varies between speakers: the level of excitement in a voice, for example, differs from one speaker to another. All of these differences and more need to be considered when building an SI system.

Besides inter-speaker variability, intra-speaker variability exists. Intra-speaker variability is the variability of speech within one person. A person is unlikely to utter exactly the same thing in exactly the same way more than once, because the combination of intonation, pauses and emphasis is difficult to repeat. This affects both SI and SD systems. A speech recognizer needs to be broad enough to handle these subtle differences and still recognize the words that are spoken, but narrow enough that it does not confuse similar words.

Besides the aspects of speech, the non-speech aspects are important to consider as well. Background noise is a major factor in recognition. A person sitting in a crowded restaurant is harder for the recognizer to understand than a person in an empty room, because of all the noise in the background. Channel distortion also needs to be considered. If a person is interacting with a system via a telephone, the connection can worsen the recognition because of bandwidth limitations in the telephone network. Mobile phone connections can be bad, and if a person calls from overseas, the connection can be degraded, which makes it more difficult for the recognizer to understand the caller. The perfect conditions for a speech recognizer are one person in a silent room interacting with the computer without a medium such as a telephone. These conditions are, of course, not that common.

Once speech is recognized and the actual text is extracted, the computer parses the input, which can be done in a couple of ways. Each speech recognizer is equipped with a linguistic component that parses the text before it is sent to the dialogue manager. The simplest parser is a static grammar, which means that the parser has an unchanging grammar against which the input is matched in order to find the best match. Several candidate matches can be similar to one another, so the system can produce a list ranking the candidates from the most similar to the least similar. In more complex recognizers, a lexicon or corpus with a much larger number of words interacts with a grammar to parse the meaning of the input (Gustafson, 2002). This allows for more possibilities when it is impossible to know exactly what will be said. A more complex linguistic component allows for a more robust system. Once speech is recognized and parsed so that the system can interpret it, it is sent to the next component, the dialogue manager.
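To make the idea of a static grammar concrete, the fragment below is a minimal sketch of one, written in the SRGS XML format that VoiceXML 2.0 platforms accept. It is only an illustration, not the grammar used in the receptionist, and the names and the position listed in it are hypothetical placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative static grammar; all entries are hypothetical placeholders. -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="sv-SE" mode="voice" root="employee">
  <!-- The recognizer matches each utterance against this fixed set of alternatives. -->
  <rule id="employee" scope="public">
    <one-of>
      <item>Anna Matzon</item>
      <item>Erik Lindberg</item>
      <item>kundservice</item>
    </one-of>
  </rule>
</grammar>
```

Because the set of alternatives never changes, anything the caller says is compared only against these entries, which is what makes the grammar static. The recognizer can still rank the entries by how well each one fits the utterance.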
2.2 Dialogue Management

The dialogue manager is the backbone of a dialogue system. Once a text is parsed by the recognizer, the dialogue manager has to decide what to do with the input it has received. Several aspects must be considered in the design of the dialogue manager so that it handles input correctly and a successful dialogue can be programmed. The first and most basic is the design method that the designer chooses.

Design Methods

There are a few different ways to design a dialogue system: design by inspiration, design by observation and design by simulation (Gustafson, 2002). Design by inspiration is when a designer decides how to design the dialogue without consulting any external party. This is somewhat risky, since one person cannot think of all the possibilities in a conversation and the result relies solely on the linguistic competence of the designer (Gustafson, 2002). It can be an option in simple systems where the purpose is for the user to reach a goal. There it works because the user can be trained in how to reach the goal, and the dialogue system can then be considered a success. In more complex systems, it will most likely not give a good result. Design by observation is when the designer observes communication between humans in the situation that the system is meant to emulate and tries to incorporate aspects of that communication into the system. Lastly, design by simulation (the Wizard-of-Oz technique) is when some or all parts of a system are simulated so that different aspects of the dialogue can be tested (Gustafson, 2002). This is a useful strategy because the observed dialogue is more realistic: a human is speaking to a simulated interface instead of to another human. The type of system and the resources available to the designer decide which design strategy is best suited for the dialogue system. Once a design method is chosen, it is important to consider certain principles of human communication.

Human Communication

To design a successful dialogue, the designer needs to observe human dialogue and account for the unwritten rules of human conversation. Only by following these rules and principles can the designer build a dialogue system that people find as natural as speaking to a human. These principles and rules are discussed below.

Humans make certain assumptions when they communicate, and these must hold for a conversation to be satisfactory to all parties. Such principles have been studied and defined so that communication can be analyzed more easily. Grice (1975) formulated four well-known maxims that govern conversation; when they are not followed, a conversation can be considered unsatisfactory. The four maxims are listed below.

Quality. In a conversation, a person should always be sincere. People expect to hear the truth and will be surprised if this maxim is not followed.
Quantity. A person should say neither too little nor too much. Saying too little can lead to confusion, and so can saying too much.
Relevance. What a person says should always be relevant to the conversation. If a person starts speaking of something unrelated to the current subject, it will confuse the listeners.
Manner. Avoid ambiguity. Be clear and to the point; otherwise confusion can result.

All of these maxims need to be upheld in a dialogue system if the user is to feel comfortable with the conversation.

Besides these underlying principles, the structure of a conversation is important to follow. Conversations between humans are structured in turn construction units (TCUs). Each speech act by each partner is considered a TCU, and the TCUs are surrounded by turn relevance places (TRPs) (Norrby, 1996). For example, if one person directs a question to another person, the question is a TCU. The answer the other person gives is another TCU, and the time between the question and the answer is a TRP. TRPs are extremely important because they signal when another party can take a turn. They are the natural places to take a turn when participating in a conversation. They can be signalled by a longer pause, by the intonation at the end of a TCU, and by other signals that humans perceive automatically. It is important for the dialogue system to determine whether or not a pause is a TRP; otherwise the conversation can be frustrating for the user.

TRPs can be easier to find if the role of initiative in the dialogue is clear. The person who starts a dialogue has the initiative. The initiative can switch between the parties as the conversation moves along, to keep it going forward. A conversation is considered single-initiative if one party always takes the initiative (Gustafson, 2002). For example, the Danish flight ticket reservation system is a mainly system-directed, task-oriented dialogue (Bernsen et al., 1997). A mixed-initiative conversation is one in which either party can take the initiative (Gustafson, 2002). This can be seen in a system where the user can prompt the system for an answer to a question and the system can do the same with the user. An example is the Waxholm system, which gives boat information for the Stockholm archipelago and was designed to allow user initiative as well as system initiative (Carlson et al., 1995).

These assumptions and underlying rules of conversation need to be taken into consideration when designing a dialogue manager. Otherwise the dialogue will most likely be unpleasant for the human user. The next step is programming the actual dialogue.

Design of Dialogue

Once the design method is decided and the principles of conversation have been considered, the designer is ready to program the type of dialogue that the manager will understand and interpret. To help in the design process, the designer can gather example dialogues to base the design on or, if this is not possible, use scenarios (Gustafson, 2002). With scenarios, the designer considers all the different types of dialogues that can occur with the system in order to form a successful design. Scenarios are helpful because they take the system through as many different dialogues as possible. With the help of the gathered examples or scenarios, a dialogue is designed.
The dialogue manager can then be programmed to interact with human users in the limited way that the system was designed for. For the system to reach a wider range of information, however, the dialogue manager may interact with a database. The database stores all the information that could be relevant to the dialogue. For example, in a train booking system, where people call to book tickets, the dialogue manager must query the database to find information about the relevant trains. The contents of the database may also determine what output is acceptable. Once the dialogue manager has processed the input, the appropriate output is sent to the next component, the output generator.

2.3 Generator

Output can be generated in a few ways in a dialogue system. One way is through recorded prompts that are played back to the user. Another is speech generated by a TTS system. Recorded prompts can be used for messages that are played in every dialogue. They are chosen because a real voice can be more pleasing to human listeners than a computer-generated voice.

TTS is used when the output cannot be foreseen. TTS does not sound as natural as a human voice, so recorded prompts are sometimes preferred, but in many systems the output is often unique, which makes TTS extremely powerful. TTS systems generally synthesize speech from text using linguistic processing and the concatenation of small speech units. They convert input text into speech waveforms using algorithms and previously coded speech data (O'Shaughnessy, 2000). Speech synthesizers can be characterized by the size of the speech units they concatenate and by the method used to synthesize the speech (O'Shaughnessy, 2000). Large speech units produce high-quality speech but require a lot of memory, while efficient coding reduces memory use but also reduces speech quality. Most commercial synthesizers have been based on word or phone concatenation (O'Shaughnessy, 2000). Speech synthesizers have two kinds of commercial application: voice-response systems, which handle input text of limited vocabulary and syntax, and TTS systems, which accept all input text (O'Shaughnessy, 2000). TTS systems construct speech from text using small speech units and much linguistic processing, whereas voice-response systems simply concatenate speech from the large units that the system has stored. TTS systems are the ones of interest for most spoken dialogue systems. Several methods of synthesis exist for TTS systems, including formant synthesis, articulatory synthesis, linear predictive coding synthesis, and waveform synthesis. The highest-quality synthesized speech uses waveform coders and large memories (O'Shaughnessy, 2000). Such synthesizers can be more advanced than certain systems require. Two other types of synthesizers are terminal-analogue synthesizers and articulatory synthesizers (O'Shaughnessy, 2000). With articulatory synthesis, the sound is created by modelling the actual vocal tract shapes and movements.
In terminal-analogue synthesis, only the acoustic results of speech are modelled, without taking the vocal tract into account. The choice of synthesizer is greatly influenced by the size of the vocabulary. For example, a system whose synthesizer must be able to speak unlimited text will generally produce lower-quality speech than a system with limited output. The generator is the last of the three components that make up a dialogue system.

The next section discusses one way to implement a dialogue system, which is the implementation used in this thesis. For more on speech synthesis or speech recognition, see O'Shaughnessy (2000). For more information on dialogue systems, see Gustafson (2002).

2.4 VoiceXML

VoiceXML (Voice Extensible Markup Language) is a powerful markup language that descends from SGML (Standard Generalized Markup Language). VoiceXML has two older siblings, HTML and XML, which were developed as children of SGML (see Figure 2.2). Whereas HTML is a single SGML application, XML is a metalanguage, just like SGML. A metalanguage is a language that is used to define other languages (Abbott, 2002). All the descendants of SGML are markup languages, which means that information content is stored with tags that describe the meaning of the content (Abbott, 2002). XML was designed to generalize the success of HTML and to allow for a broader user base than SGML by taking away some of the complexities of its mother language (Abbott, 2002). VoiceXML can be considered a young sibling to HTML.

[Figure 2.2: The relationship between SGML, HTML, XML and VoiceXML]

Although it is a sibling, it interacts differently with its users than HTML does: in VoiceXML applications the user speaks to the computer, whereas with HTML the user communicates visually with the computer using a mouse or keyboard (Abbott, 2002). VoiceXML was developed after discussions between telephone companies about creating a common language to voice-enable the web. The first version was released in August. A simple example is seen in Figure 2.3. Running this example produces a TTS rendering of the text "Hello World!".

    <?xml version="1.0"?>
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form>
        <block>Hello World!</block>
      </form>
    </vxml>

Figure 2.3: A simple VoiceXML example

VoiceXML can be seen as a complete dialogue system for telephony applications, in which the designer only has to program the dialogue manager and build grammars for the system. This can be seen in its seven subsystems, which are listed below and illustrated in Figure 2.4.

Network Interface: Allows communication with a web server over HTTP.
VoiceXML Interpreter: Software that can be considered the dialogue manager. This is where the programming and construction of the dialogue takes place.
TTS: As discussed above, translates text to speech.
Audio: Allows audio prompts to be played or recorded.
Speech Recognition: As discussed above, translates user utterances into text. VoiceXML uses speaker-independent speech recognition in structured dialogues where the user is limited to a finite vocabulary.
DTMF (dual tone multi-frequency): Translates keypad input into characters.
Telephony Interface: Enables communication with telephone networks.

[Figure 2.4: The seven subsystems of VoiceXML (Abbott, 2002): Telephony Interface, Speech Recognition, VoiceXML Interpreter, DTMF, Audio, TTS, Network Interface]

By putting together speech recognition, speech synthesis, XML and the web in one powerful language, VoiceXML extends the reach of the web, since it allows the web to be accessed from any telephone. It makes the web easier to use, especially for people who are blind or who cannot read. In addition, it increases the options for human-computer interfaces, since it is inexpensive compared to other voice applications (Abbott, 2002). VoiceXML has taken expensive, high-end speech technology and combined it with a markup language, making speech technology available even for low-end systems.

VoiceXML works by mediating between the user and the web server. The VoiceXML code resides on a server and is accessed over the web or through a telephone number. The code is processed and forms a dialogue with the caller. Although this is powerful in and of itself, it is not very exciting. It can be compared to a static web page: the results never change. To make it dynamic, it can be integrated with a web application server, which allows it to connect to a database. One such application server is ColdFusion.

ColdFusion

ColdFusion was created in 1995 to introduce dynamics onto the internet (Danesh and Motlagh, 2000). ColdFusion interprets commands given by the web and connects to the database to retrieve the necessary information.
For example, a website that contains many articles uses an application server such as ColdFusion to access the articles in the database. Otherwise each article would need its own web page. This is what makes the web dynamic. When ColdFusion is integrated with VoiceXML, it allows telephony applications to become dynamic as well. ColdFusion is responsible for getting information to and from the database in the same way as it does for regular web pages, but with voice applications its output is interpreted by the VoiceXML gateway, so that the information found in the database can be processed in the dialogue. ColdFusion code can be embedded directly in VoiceXML applications, which makes the combination simple and easy to learn. Simple SQL statements are used to retrieve the necessary information from the database, and this information is then processed further by the VoiceXML code.
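The integration described above can be sketched as a ColdFusion template that emits a VoiceXML page. This is an illustration under stated assumptions, not code from the receptionist: the data source name `receptionist`, the table `companies`, the columns `company_name` and `called_number`, and the URL parameter `dnis` are all hypothetical.

```xml
<!--- Hypothetical ColdFusion template that produces a VoiceXML page. --->
<!--- Look up the company by the number that was called (all names are assumptions). --->
<cfquery name="getCompany" datasource="receptionist">
  SELECT company_name
  FROM companies
  WHERE called_number = <cfqueryparam value="#url.dnis#" cfsqltype="cf_sql_varchar">
</cfquery>
<!--- Reset the output buffer so that the XML declaration is the first thing sent. --->
<cfcontent type="text/xml" reset="true"><?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <!-- The greeting is built from the query result instead of being hard-coded. -->
      <cfoutput query="getCompany">
        <prompt>Välkommen till #XmlFormat(company_name)#.</prompt>
      </cfoutput>
    </block>
  </form>
</vxml>
```

The ColdFusion tags are evaluated on the server, so the VoiceXML gateway receives only ordinary VoiceXML with the company name already filled in. The same template can therefore serve any company whose number is stored in the table.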

3 Programming the Receptionist

The receptionist is programmed using VoiceXML and ColdFusion. Since the other parts of a dialogue system are included in the VoiceXML system (see section 2.4), the implementation focuses on the design and implementation of the program code. The receptionist is developed in several stages (as seen in Figure 3.1).

[Figure 3.1: Stages of Development of Receptionist: Static Code, Event Handlers, Database Design, Dynamic Code, Statistics]

The first stage involves designing a static receptionist, with no dynamic information, to make sure that the program can run with hard-coded information. The next stage involves integrating event handlers that deal with misrecognitions and other events. Once these two pieces are working, a database is developed that allows the information used by the receptionist to be dynamic. After the database is done, the static receptionist is reprogrammed to include ColdFusion Markup Language (CFML), which enables communication with the database. Once the dynamics are in place, statistical elements that are important for administrative purposes are added, such as call length, the time the call started, the phone number the user called from, and the number the user called. After this, a website is designed that allows companies to submit, change, or delete information in the database. Each of these developments is discussed below.
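The first two stages, static code followed by event handlers, might look roughly like the fragment below. It is a sketch rather than the receptionist's actual code: the grammar file name `employees.grxml`, the field name, and the prompt wording are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="receptionist">
    <field name="employee">
      <prompt>Who would you like to speak to?</prompt>
      <!-- Stage 1: a hard-coded (static) grammar file; the file name is hypothetical. -->
      <grammar src="employees.grxml" type="application/srgs+xml"/>
      <!-- Stage 2: event handlers for silence and for misrecognition. -->
      <noinput>
        <prompt>I did not hear anything.</prompt>
        <reprompt/>
      </noinput>
      <nomatch>
        <prompt>I did not recognize that name.</prompt>
        <reprompt/>
      </nomatch>
      <!-- Runs once the caller's answer has matched the grammar. -->
      <filled>
        <prompt>Please wait while I transfer your call.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Here `<noinput>` and `<nomatch>` are the standard VoiceXML events for a silent caller and an unrecognized utterance, and `<reprompt/>` asks the question again. In the later stages, the hard-coded grammar and prompts would be replaced by content generated from the database.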

3.1 Static Receptionist

3.1.1 Design of Dialogue

Before programming the receptionist, the dialogue is designed. Since it is a simple dialogue, it is designed by inspiration and some observation of receptionist situations. The dialogue needs to uphold Grice's four maxims as discussed above, make the turn relevance places (TRPs) obvious to the caller, and keep the system's dialogue simple so that the user will model their dialogue on the system's. The best approach is to be direct and to the point in as few words as possible. The dialogue is designed to be single-initiative, where the system always directs the caller. More experienced users can, however, barge in, which interrupts the computer while it is speaking and makes the dialogue more efficient. An example dialogue can be seen in Figure 3.2.

(1) Computer: Välkommen till företaget. Vem vill du prata med?
    Caller: Anna Matzon.
    Computer: Vill du prata med kundservice Anna Matzon?
    Caller: Ja.
    Computer: Vill du bli kopplad till jobbtelefon, mobilen eller hemtelefon?
    Caller: Jobbtelefon.
    Computer: Varsågod. Snälla vänta medans jag kopplar samtalet.
    (samtalet kopplas)

(2) Translated into English
    Computer: Welcome to the Company! Who would you like to speak to?
    Caller: Anna Matzon.
    Computer: Would you like to speak to customer service Anna Matzon?
    Caller: Yes.
    Computer: Would you like to be connected to work, mobile, or home phone?
    Caller: Work phone.
    Computer: One moment. Please wait while I transfer your call.
    (call transfers)

Figure 3.2: Example Dialogue

In this conversation, quality is upheld since there is no false statement in the conversation and the system is therefore sincere. Quantity is also upheld since the questions are simple but informative, so that the user knows what response is necessary. The conversation upholds the relevance maxim since all the questions directed by the system are related to the goal of connecting the caller to a callee. Since the questions are unambiguous, the manner maxim is also upheld. In this way, all four maxims are satisfied. Since the system mostly asks questions, the TRPs are also clear to the user, since an obvious TRP is the end of a question. The user is placed in a single-initiative situation since the questions are always directed to the user, and the user should not feel a need to ask questions in return. The goal of the receptionist is not to have a long conversation, but to connect the caller to a callee as simply and quickly as possible. This dialogue succeeds in that respect while upholding the rules of human conversation. The implementation of this design is discussed below.

3.1.2 Basic Code

The static receptionist, where all values are hard-coded, is programmed solely with VoiceXML. In the static version, the program code consists of one document that is followed linearly to connect the caller to a fixed destination. This chain of events can be seen in Figure 3.3.

Figure 3.3: Receptionist application's chain of events (stages: Callee Name, Confirm Callee, Callee Number, Transfer Call; endpoints: Caller, Callee)

In the first part of the code, speech synthesis is used to ask who the caller would like to speak to. The response the caller gives has to be part of the active grammar in order to be accepted. The grammars are discussed in more detail below. If the user gives a response recognized by the system, the system asks the caller to confirm the recognized person. If the recognized person is incorrect, the code starts from the beginning. If the person is confirmed, the user is asked by a speech synthesis prompt which telephone number she would like to be connected to. This response is also directed by a grammar. In the static version, the computer asks every caller whether they want to be connected to the home, work, or mobile phone, since no database exists with information on whether an employee has more than one number. Once the number is retrieved, the code goes to the next section, which is the transfer section. In this section the call is transferred to the phone number that the caller wants to be connected to. If the number is busy, the caller is told to call back later, and a similar message is given if no one answers. After the call has been transferred and has returned, the system plays a simple last message before the call disconnects. In the static code, however, the telephone number is always the same since it is hard-coded. Therefore, the static code is of little practical use except as a base to build on.
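The chain of events described above can be sketched as a single VoiceXML 2.0 document. This is a minimal illustration and not the thesis code: the grammar file names (names.grxml, number.grxml), the field names, the busy and no-answer wording, the timeout, and the telephone number are placeholders assumed for the example, while the remaining prompts are taken from Figures 3.2 and 3.6.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="sv-SE">
  <form id="receptionist">

    <!-- Callee Name: the answer must match the active name grammar.
         names.grxml is a hypothetical external grammar file. -->
    <field name="callee">
      <grammar src="names.grxml" type="application/srgs+xml"/>
      <prompt bargein="true">Välkommen till företaget. Vem vill du prata med?</prompt>
    </field>

    <!-- Confirm Callee: if the caller says "nej", both answers are cleared,
         so the form starts again from the first field. -->
    <field name="confirm">
      <prompt>Vill du prata med <value expr="callee"/>?</prompt>
      <option value="yes">ja</option>
      <option value="no">nej</option>
      <filled>
        <if cond="confirm == 'no'">
          <clear namelist="callee confirm"/>
        </if>
      </filled>
    </field>

    <!-- Callee Number: asked of every caller in the static version.
         number.grxml is a hypothetical external grammar file. -->
    <field name="phone">
      <grammar src="number.grxml" type="application/srgs+xml"/>
      <prompt>Vill du bli kopplad till jobbtelefon, mobilen eller hemtelefon?</prompt>
    </field>

    <!-- Transfer Call: the destination is hard-coded (placeholder number),
         so the answer stored in "phone" is not used yet. -->
    <transfer name="call" dest="tel:+46000000000" bridge="true" connecttimeout="20s">
      <prompt>Varsågod. Snälla vänta medans jag kopplar samtalet.</prompt>
      <filled>
        <if cond="call == 'busy'">
          <prompt>Det är upptaget. Prova gärna igen senare.</prompt>
        <elseif cond="call == 'noanswer'"/>
          <prompt>Ingen svarar. Prova gärna igen senare.</prompt>
        </if>
        <!-- Simple last message before the call disconnects. -->
        <prompt>Tack för samtalet.</prompt>
      </filled>
    </transfer>

  </form>
</vxml>
```

The confirmation field uses an option list rather than a built-in yes/no grammar in this sketch, since support for built-in grammars in Swedish may vary between VoiceXML platforms.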
How this static code turns into useful dynamic code is discussed later in this chapter, but first grammars and event handlers are discussed.

3.1.3 Building Grammars for Use

In building the grammar for the receptionist, the goal is to keep the accepted responses short and simple so that the dialogue is efficient and, at the same time, the speech recognizer can work easily with short phrases. As discussed earlier, VoiceXML is built up of seven subsystems. One of these subsystems is the speech recognizer. In order for the recognizer to recognize user input, it needs to be told what the accepted responses are so that it can try to match them with the user input. This is done with grammars. A grammar can be built in several ways in VoiceXML. It can be a simple list of options, an inline grammar that is placed where it is used, or an external grammar that is placed in another document. Examples of these three are found in Figure 3.4. For the static code, an external grammar is used for both grammars. The first grammar contains all the acceptable names a user can ask for (name grammar) and the second contains the different types of telephone numbers they could be connected to (number grammar).

    <!--Options List, here the grammar consists of three options, red, blue and green-->
    <option value="röd">röd</option><!-- red -->
    <option value="blå">blå</option><!-- blue -->
    <option value="grön">grön</option><!-- green -->

    <!--Grammar (external and internal), here the grammar consists of three items
        that are part of a rule which defines the grammar-->
    <rule id="number" scope="public">
      <one-of>
        <item>jobbet</item> <!-- job -->
        <item>mobilen</item> <!-- mobile -->
        <item>hemma</item> <!-- home -->
      </one-of>
    </rule>

Figure 3.4: Example of different types of VoiceXML grammars

As seen in Figure 3.4, the external grammar is identical to the inline grammar, the only difference being that an external grammar is placed in another document instead of in the code. Both are composed of rules that are defined by listing the possibilities. The options grammar is a bit different since there are no rules; instead, a field has a set of options that defines the grammar. An external grammar is chosen for both grammars in the static code since it is neater and does not clutter the code. Since it is an external grammar, the rules can be more expansive as well. Since these grammars are what the speech recognizer will try to match to the user input, the text is written as say-as text, which is similar to orthographic transcription. For example, Matzon is written matson since the z is pronounced as an s when spoken. Although it is written as it sounds, it is not phonetically transcribed. Once the grammars are implemented, the system recognizes an accepted name and connects the caller to the static phone number. But what happens with input that is not included in the grammar? Event handling is discussed in the next section.
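To make the external grammar described in this section concrete, the rule from Figure 3.4 can be wrapped in a complete grammar document. The following is a sketch under assumptions: it uses the XML form of the W3C Speech Recognition Grammar Specification (SRGS), and the file name number.grxml and the language tag are hypothetical, since the thesis does not state them here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- number.grxml (hypothetical file name): external number grammar -->
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="sv-SE" mode="voice" root="number">
  <!-- The public root rule lists every accepted response. -->
  <rule id="number" scope="public">
    <one-of>
      <item>jobbet</item>  <!-- job -->
      <item>mobilen</item> <!-- mobile -->
      <item>hemma</item>   <!-- home -->
    </one-of>
  </rule>
</grammar>
```

A field in the VoiceXML document would then refer to this file with an element such as `<grammar src="number.grxml" type="application/srgs+xml"/>`. A name grammar would follow the same pattern, with items written as say-as text, for example `<item>anna matson</item>`.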

3.1.4 Integrating Error Handling in the Code

Error handling is necessary in order to handle exceptions in a way that is pleasing to the user. Errors introduced by imperfect recognition are a large problem facing dialogue systems (Choularton, 2004). Two general approaches exist to tackle this problem: error avoidance and error handling (Choularton, 2004). VoiceXML has built-in error handling for certain exceptions such as nomatch and noinput. Nomatch occurs when a person's response does not match any item in the specified grammars, whereas noinput occurs when the user gives no audible response. In VoiceXML, by default, both of these are handled by playing a simple error message with a TTS voice and then reprompting the user for a response. This is a potentially frustrating scenario for users since they would hear the same error message every time they give an unacceptable response. It is important that the exceptions are handled differently depending on the number of times the user has given an unacceptable response. Since the system should be natural, repeating the same question again and again is not desirable. According to Shin et al. (2002), user behavior when met with an error is to rephrase or repeat the response. This user behavior can be modelled in dialogue systems to manage dialogue when errors are introduced (Choularton, 2004). In this way, the user is prompted once to repeat their answer, and the second time they are given more specific instructions to rephrase their response. This approach follows the most normal way of handling errors, even if it is not the most desirable, since the information from the user's first response is discarded (Gorrell, 2003). For example, if the user gives an unrecognized response for the first time, the message to the user is different than if it is the third time. An example conversation with error handling is seen in Figure 3.5.

(3) Computer: Välkommen till företaget. Vem vill du prata med?
    Caller: ehm, jag vet inte.
    Computer: Jag är ledsen. Jag förstod inte. Vem vill du prata med?
    Caller: ehm, jag vet inte.
    Computer: Jag känner inte igen det namnet. Du kan säga namnet eller funktionen av personen du vill prata med.
    Caller: Jag kommer inte ihåg.
    Computer: Tyvärr så förstod jag inte. Jag kopplar dig till kundtjänst.

(4) Translated to English
    Computer: Welcome to the company. Who do you want to speak to?
    Caller: Ummm, I don't know.
    Computer: I'm sorry, I did not understand you. Who would you like to speak to?
    Caller: Umm, I don't know.
    Computer: I don't recognize that name. You can say the name or position of the person you would like to speak to.
    Caller: I don't remember.
    Computer: Unfortunately I did not understand. I will connect you to customer service.

Figure 3.5: Example of error handling in a dialogue
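The escalating behaviour shown in Figure 3.5 can be expressed in VoiceXML 2.0 with the count attribute on nomatch handlers. The sketch below is an illustration rather than the thesis code: the prompts are taken from Figure 3.5, while the form and field names, the grammar file names.grxml, and the customer service number are placeholders assumed for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="sv-SE">
  <form id="ask_name">
    <!-- The welcome is played once, outside the field, so that a
         reprompt repeats only the question. -->
    <block>
      <prompt>Välkommen till företaget.</prompt>
    </block>

    <field name="callee">
      <grammar src="names.grxml" type="application/srgs+xml"/>
      <prompt>Vem vill du prata med?</prompt>

      <!-- First unrecognized response: apologize and repeat the question. -->
      <nomatch count="1">
        <prompt>Jag är ledsen. Jag förstod inte.</prompt>
        <reprompt/>
      </nomatch>

      <!-- Second time: give more specific instructions. -->
      <nomatch count="2">
        <prompt>Jag känner inte igen det namnet. Du kan säga namnet eller funktionen av personen du vill prata med.</prompt>
      </nomatch>

      <!-- Third time: give up and hand over to customer service. -->
      <nomatch count="3">
        <prompt>Tyvärr så förstod jag inte. Jag kopplar dig till kundtjänst.</prompt>
        <goto next="#customer_service"/>
      </nomatch>
    </field>
  </form>

  <form id="customer_service">
    <!-- Placeholder number for customer service. -->
    <transfer name="cs" dest="tel:+46000000000" bridge="false"/>
  </form>
</vxml>
```

Silence from the caller could be handled in the same way with noinput handlers that carry the same count values.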

Strategies that take longer but produce fewer errors and corrections are preferred by users (Hirschberg et al., 2000). As seen in the example above, if the system is unable to recognize an accepted answer three times in a row, the system connects the caller to customer service, which can help them. This is a simple way of handling errors where, after three attempts, general help is given to the user (Gorrell, 2003). I choose to do this after three attempts since it gives the caller three opportunities to reach their desired person, each time with slightly more specific instructions. If they are still unsuccessful after the third attempt, there is obviously a problem. More advanced techniques in error handling exist which take many aspects of the conversation into consideration, as seen in Higgins - a dialogue system for investigating error handling techniques (Carlson et al., 2004). I have not implemented unique error handling for the number grammar, where the user can respond with one of three options (mobile, home, or work phone), since the options are listed for the user in the question. It is unnecessary since the error handling would simply be reprompting the user again. The number grammar and the name grammar are the only two grammars where error handling for the user response is necessary.

Error handling is also necessary for events pertaining to the phone call. For example, error handling is necessary if the call is transferred to a number that is busy or has no answer. This is handled in the static version by simply stating that the person is busy or is not answering and thanking the caller for the call, as seen in Figure 3.6. Once the dynamics are built in, the user is given the option of trying another number or another person.

(5) Computer: Anna Matzon svarar inte. Tack för samtalet, prova gärna igen senare.
    Computer: Anna Matzon is not answering. Thank you for your call, please try again later.

Figure 3.6: Static event handling for an unanswered call

To summarize, the static code is written in VoiceXML. A person calls in, asks for a person that is in the grammar, responds with the type of number they want to call, and is connected to a static number. If their responses are unacceptable, special event handlers exist. If the number is busy or there is no answer, the caller is informed. It is quite obvious that this code is not very powerful. Its strength comes when the code becomes dynamic. In order for it to be dynamic, it needs a database to hold all the necessary information.

3.2 Integrating Dynamics

The first step in integrating dynamics into the static code is building a functional database. Once the database is in place, ColdFusion can be integrated with VoiceXML to connect the database to the program.

3.2.1 Building the Database

An efficient database is necessary to build an acceptable system. Without a working database, the system is not functional, which is why the database design is so important and central to the entire system. The database can be viewed in Appendix A. It consists of five tables, which are listed below.

    Company
    Employee
    Tilltal
    InCall
    TransferCall

The Company table holds information about each company. Each company has a unique ID which is used to separate the information in the other tables between companies. The Employee table holds information about each individual employee, including their telephone numbers and position at the company. Each employee has their own unique ID, which separates the employees in the Tilltal table as well. The Tilltal table is the source of the grammar for all the names. Here, each name that can be used to reach a person is registered with that employee's ID. The last two tables, InCall and TransferCall, hold information about the calls for administrative purposes.

In order to test that these tables, with the information described above, are efficient and functional, scenarios that can happen with a caller are designed, and how these events affect the database is tested. A few scenarios are accounted for below. All the scenarios begin with a caller calling a certain telephone number, which identifies the company in the database. Knowing which company it is, the system finds the appropriate welcome message and plays the message to the caller. After the welcome message, the system asks who the caller wants to speak to. The caller then responds with a name (in our example the name is Anna). The system then searches the Tilltal table of the database, using the ID of the company as above, to find an entry for the name Anna. It finds an entry, connects it to the Employee table through the employee ID, finds the filename with the employee Anna's full name, and asks the caller if he wants to speak to Anna Matzon. If the answer is yes, the caller is connected to one of the telephone numbers in the Employee table. If no, the system has to start from the beginning, but this time eliminating the employee Anna Matzon as one of the options. In this way the system can search through the names in the Tilltal table to find a different result. This is done by eliminating the previous employee's ID from the search.

One variation of the above scenario is when a caller wants to speak to a group, for example sales or customer service. If the caller asks for customer service, the computer finds the employee that has customer service as her position. The problem comes when the computer wants to confirm the callee with the caller. If the computer says only the callee's actual name, the caller has no idea whether it is correct or not. An example of this can be seen in Figure 3.7. A simple solution to this problem is that, instead of simply having the name in the confirmation, the confirmation states the callee's position along with their full name, so that if the person calling does not know the callee's name they will still know they are being connected to the correct person.

The next scenario is how the database should handle the calls that are not connected. A first thought is that for the calls that are not answered or are busy and are not automatically connected to voicemail, the system could have a message system of

Thin Client Development and Wireless Markup Languages cont. VoiceXML and Voice Portals

Thin Client Development and Wireless Markup Languages cont. VoiceXML and Voice Portals Thin Client Development and Wireless Markup Languages cont. David Tipper Associate Professor Department of Information Science and Telecommunications University of Pittsburgh tipper@tele.pitt.edu http://www.sis.pitt.edu/~dtipper/2727.html

More information

Dialog planning in VoiceXML

Dialog planning in VoiceXML Dialog planning in VoiceXML Csapó Tamás Gábor 4 January 2011 2. VoiceXML Programming Guide VoiceXML is an XML format programming language, describing the interactions between human

More information

VoiceXML Tutorial. Part 1: VoiceXML Basics and Simple Forms

VoiceXML Tutorial. Part 1: VoiceXML Basics and Simple Forms VoiceXML Tutorial Part 1: VoiceXML Basics and Simple Forms What is VoiceXML? XML Application W3C Standard Integration of Multiple Speech and Telephony Related Technologies Automated Speech Recognition

More information

Version 2.6. Virtual Receptionist Stepping Through the Basics

Version 2.6. Virtual Receptionist Stepping Through the Basics Version 2.6 Virtual Receptionist Stepping Through the Basics Contents What is a Virtual Receptionist?...3 About the Documentation...3 Ifbyphone on the Web...3 Setting Up a Virtual Receptionist...4 Logging

More information

An Introduction to VoiceXML

An Introduction to VoiceXML An Introduction to VoiceXML ART on Dialogue Models and Dialogue Systems François Mairesse University of Sheffield F.Mairesse@sheffield.ac.uk http://www.dcs.shef.ac.uk/~francois Outline What is it? Why

More information

! <?xml version="1.0">! <vxml version="2.0">!! <form>!!! <block>!!! <prompt>hello World!</prompt>!!! </block>!! </form>! </vxml>

! <?xml version=1.0>! <vxml version=2.0>!! <form>!!! <block>!!! <prompt>hello World!</prompt>!!! </block>!! </form>! </vxml> Using VoiceXML! Language spec 2.0! Includes support for VUI and for telephony applications (call forward, transfers, etc) " Has tags specific to voice application! Simple (and classic) example! !

More information

IVR CRM Integration. Migrating the Call Center from Cost Center to Profit. Definitions. Rod Arends Cheryl Yaeger BenchMark Consulting International

IVR CRM Integration. Migrating the Call Center from Cost Center to Profit. Definitions. Rod Arends Cheryl Yaeger BenchMark Consulting International IVR CRM Integration Migrating the Call Center from Cost Center to Profit Rod Arends Cheryl Yaeger BenchMark Consulting International Today, more institutions are seeking ways to change their call center

More information

Traitement de la Parole

Traitement de la Parole Traitement de la Parole Cours 11: Systèmes à dialogues VoiceXML partie 1 06/06/2005 Traitement de la Parole SE 2005 1 jean.hennebert@unifr.ch, University of Fribourg Date Cours Exerc. Contenu 1 14/03/2005

More information

Voicemail. Advanced User s Guide. Version 2.0

Voicemail. Advanced User s Guide. Version 2.0 Advanced User s Guide Version 2.0 Contents Introduction to the Documentation... 3 About the Documentation... 3 Ifbyphone on the Web... 3 Logging in to your ifbyphone Account... 3 Setting Up a Voice Mailbox...

More information

Phone Routing Stepping Through the Basics

Phone Routing Stepping Through the Basics Ng is Phone Routing Stepping Through the Basics Version 2.6 Contents What is Phone Routing?...3 Logging in to your Ifbyphone Account...3 Configuring Different Phone Routing Functions...4 How do I purchase

More information

Speech Recognition of a Voice-Access Automotive Telematics. System using VoiceXML

Speech Recognition of a Voice-Access Automotive Telematics. System using VoiceXML Speech Recognition of a Voice-Access Automotive Telematics System using VoiceXML Ing-Yi Chen Tsung-Chi Huang ichen@csie.ntut.edu.tw rick@ilab.csie.ntut.edu.tw Department of Computer Science and Information

More information

VoiceXML-Based Dialogue Systems

VoiceXML-Based Dialogue Systems VoiceXML-Based Dialogue Systems Pavel Cenek Laboratory of Speech and Dialogue Faculty of Informatics Masaryk University Brno Agenda Dialogue system (DS) VoiceXML Frame-based DS in general 2 Computer based

More information

Specialty Answering Service. All rights reserved.

Specialty Answering Service. All rights reserved. 0 Contents 1 Introduction... 2 1.1 Types of Dialog Systems... 2 2 Dialog Systems in Contact Centers... 4 2.1 Automated Call Centers... 4 3 History... 3 4 Designing Interactive Dialogs with Structured Data...

More information

1Building Communications Solutions with Microsoft Lync Server 2010

1Building Communications Solutions with Microsoft Lync Server 2010 1Building Communications Solutions with Microsoft Lync Server 2010 WHAT S IN THIS CHAPTER? What Is Lync? Using the Lync Controls to Integrate Lync Functionality into Your Applications Building Custom Communications

More information

Mobile Application Languages XML, Java, J2ME and JavaCard Lesson 03 XML based Standards and Formats for Applications

Mobile Application Languages XML, Java, J2ME and JavaCard Lesson 03 XML based Standards and Formats for Applications Mobile Application Languages XML, Java, J2ME and JavaCard Lesson 03 XML based Standards and Formats for Applications Oxford University Press 2007. All rights reserved. 1 XML An extensible language The

More information

Hermes.Net IVR Designer Page 2 36

Hermes.Net IVR Designer Page 2 36 Hermes.Net IVR Designer Page 2 36 Summary 1. Introduction 4 1.1 IVR Features 4 2. The interface 5 2.1 Description of the Interface 6 2.1.1 Menus. Provides 6 2.1.2 Commands for IVR editions. 6 2.1.3 Commands

More information

Contents. Specialty Answering Service. All rights reserved.

Contents. Specialty Answering Service. All rights reserved. Contents 1 Abstract... 2 2 What Exactly Is IVR Technology?... 3 3 How to Choose an IVR Provider... 4 3.1 Standard Features of IVR Providers... 4 3.2 Definitions... 4 3.3 IVR Service Providers... 5 3.3.1

More information

Open Source VoiceXML Interpreter over Asterisk for Use in IVR Applications

Open Source VoiceXML Interpreter over Asterisk for Use in IVR Applications Open Source VoiceXML Interpreter over Asterisk for Use in IVR Applications Lerato Lerato, Maletšabisa Molapo and Lehlohonolo Khoase Dept. of Maths and Computer Science, National University of Lesotho Roma

More information

Voice Driven Animation System

Voice Driven Animation System Voice Driven Animation System Zhijin Wang Department of Computer Science University of British Columbia Abstract The goal of this term project is to develop a voice driven animation system that could take

More information

ABSTRACT 2. SYSTEM OVERVIEW 1. INTRODUCTION. 2.1 Speech Recognition

ABSTRACT 2. SYSTEM OVERVIEW 1. INTRODUCTION. 2.1 Speech Recognition The CU Communicator: An Architecture for Dialogue Systems 1 Bryan Pellom, Wayne Ward, Sameer Pradhan Center for Spoken Language Research University of Colorado, Boulder Boulder, Colorado 80309-0594, USA

More information

A Development Tool for VoiceXML-Based Interactive Voice Response Systems

A Development Tool for VoiceXML-Based Interactive Voice Response Systems A Development Tool for VoiceXML-Based Interactive Voice Response Systems Cheng-Hsiung Chen Nai-Wei Lin Department of Computer Science and Information Engineering National Chung Cheng University Chiayi,

More information

Support and Compatibility

Support and Compatibility Version 1.0 Frequently Asked Questions General What is Voiyager? Voiyager is a productivity platform for VoiceXML applications with Version 1.0 of Voiyager focusing on the complete development and testing

More information

The ROI. of Speech Tuning

The ROI. of Speech Tuning The ROI of Speech Tuning Executive Summary: Speech tuning is a process of improving speech applications after they have been deployed by reviewing how users interact with the system and testing changes.

More information

Develop Software that Speaks and Listens

Develop Software that Speaks and Listens Develop Software that Speaks and Listens Copyright 2011 Chant Inc. All rights reserved. Chant, SpeechKit, Getting the World Talking with Technology, talking man, and headset are trademarks or registered

More information

How To Develop A Voice Portal For A Business

How To Develop A Voice Portal For A Business VoiceMan Universal Voice Dialog Platform VoiceMan The Voice Portal with many purposes www.sikom.de Seite 2 Voice Computers manage to do ever more Modern voice portals can... extract key words from long

More information

E I M S - Interactive Voice Response System

E I M S - Interactive Voice Response System E I M S - Interactive Voice Response System Redox Technologies is a pioneer in computer telephony development and IVR service bureaus. We have developed, implemented and maintain no. of applications currently

More information

VOICE INFORMATION RETRIEVAL FOR DOCUMENTS. Except where reference is made to the work of others, the work described in this thesis is.

VOICE INFORMATION RETRIEVAL FOR DOCUMENTS. Except where reference is made to the work of others, the work described in this thesis is. VOICE INFORMATION RETRIEVAL FOR DOCUMENTS Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. Weihong

More information

customer care solutions

customer care solutions customer care solutions from Nuance white paper :: Understanding Natural Language Learning to speak customer-ese In recent years speech recognition systems have made impressive advances in their ability

More information

To help manage calls:

To help manage calls: Mobile Phone Feature Definitions To help manage calls: Call waiting and call hold Allows you to accept a second incoming call with out losing the original call, then switch back and forth between them.

More information

Enabling Speech Based Access to Information Management Systems over Wireless Network

Enabling Speech Based Access to Information Management Systems over Wireless Network Enabling Speech Based Access to Information Management Systems over Wireless Network M. Bagein, O. Pietquin, C. Ris and G. Wilfart 1 Faculté Polytechnique de Mons - TCTS Lab. Parc Initialis - Av. Copernic,

More information

How To Use Voicexml On A Computer Or Phone (Windows)

How To Use Voicexml On A Computer Or Phone (Windows) Workshop Spoken Language Dialog Systems VoiceXML Rolf Schwitter schwitt@ics.mq.edu.au Macquarie University 2004 1 PhD Scholarship at Macquarie University A Natural Language Interface to a Logic Teaching

More information

Hosted Fax Mail. Hosted Fax Mail. User Guide

Hosted Fax Mail. Hosted Fax Mail. User Guide Hosted Fax Mail Hosted Fax Mail User Guide Contents 1 About this Guide... 2 2 Hosted Fax Mail... 3 3 Getting Started... 4 3.1 Logging On to the Web Portal... 4 4 Web Portal Mailbox... 6 4.1 Checking Messages

More information

XML based Interactive Voice Response System

XML based Interactive Voice Response System XML based Interactive Voice Response System Sharad Kumar Singh PT PureTesting Software P Ltd. Noida, India ABSTRACT The paper presents the architecture of a web based interactive voice response system

More information

Table of Contents INTRODUCTION... 5 ADMINISTRATION... 6 MANAGING ACD GROUPS... 8

Table of Contents INTRODUCTION... 5 ADMINISTRATION... 6 MANAGING ACD GROUPS... 8 WorldSmart ACD Help Table of Contents INTRODUCTION... 5 OVERVIEW... 5 WHO CAN CREATE AND MANAGE ACD... 5 ADMINISTRATION... 6 CREATING A NEW GROUP... 6 ASSIGN PHONE NUMBER... 7 MANAGING ACD GROUPS... 8

More information

Email Signatures. Advanced User s Guide. Version 2.0

Email Signatures. Advanced User s Guide. Version 2.0 Advanced User s Guide Version 2.0 Contents Email Signatures... 3 About the Documentation... 3 Ifbyphone on the Web... 3 Copying Click-to-XyZ Code... 4 Logging in to your ifbyphone Account... 4 Web-Based

More information

VoiceXML. Erik Harborg SINTEF IKT. Presentasjon, 4. årskurs, NTNU, 2007-04-17 ICT

VoiceXML. Erik Harborg SINTEF IKT. Presentasjon, 4. årskurs, NTNU, 2007-04-17 ICT VoiceXML Erik Harborg SINTEF IKT Presentasjon, 4. årskurs, NTNU, 2007-04-17 1 Content Voice as the user interface What is VoiceXML? What type of applications can be implemented? Example applications VoiceXML

More information

VoiceXML. For: Professor Gerald Q. Maguire Jr. By: Andreas Ångström, it00_aan@it.kth.se and Johan Sverin, it00_jsv@it.kth.se Date: 2004-05-24

VoiceXML. For: Professor Gerald Q. Maguire Jr. By: Andreas Ångström, it00_aan@it.kth.se and Johan Sverin, it00_jsv@it.kth.se Date: 2004-05-24 Royal Institute of Technology, KTH IMIT Practical Voice over IP 2G1325 VoiceXML For: Professor Gerald Q. Maguire Jr. By: Andreas Ångström, it00_aan@it.kth.se and Johan Sverin, it00_jsv@it.kth.se Date:

More information

VIRTUAL RECEPTIONIST OVERVIEW. Cbeyond Virtual Receptionist Offers:

VIRTUAL RECEPTIONIST OVERVIEW. Cbeyond Virtual Receptionist Offers: VIRTUAL RECEPTIONIST OVERVIEW Cbeyond Virtual Receptionist Offers: MENU SETUP: Use Virtual Receptionist to create different main menus for when your company is open or closed. With Cbeyond's preconfigured

More information

Specialty Answering Service. All rights reserved.

Specialty Answering Service. All rights reserved. 0 Contents 1 Introduction... 3 2 Technology... 5 2.1 VoiceXML Architecture... 6 2.2 Related Standards... 7 2.2.1 SRGS and SISR... 7 2.2.2 SSML... 7 2.2.3 PLS... 7 2.2.4 CCXML... 7 2.2.5 MSML, MSCML, MediaCTRL...

More information

Voice Messaging. Reference Guide

Voice Messaging. Reference Guide Voice Messaging Reference Guide Table of Contents Voice Messaging 1 Getting Started 3 To Play a Message 4 To Answer a Message 5 To Make a Message 6 To Give a Message 7 Message Addressing Options 8 User

More information

CHAPTER 4 Enhanced Automated Attendant

CHAPTER 4 Enhanced Automated Attendant CHAPTER 4 Enhanced Automated Attendant 4 This chapter shows you how to design, configure and implement a multi-level auto attendant, using TFB s Enhanced Automated Attendant (Auto Attendant for short).

More information

Multimodality: The Next Wave of Mobile Interaction

Multimodality: The Next Wave of Mobile Interaction Multimodality: The Next Wave of Mobile Interaction White Paper Multimodality is exciting new technology that promises to dramatically enhance the mobile user experience by enabling network operators to

More information

Standard Languages for Developing Multimodal Applications

Standard Languages for Developing Multimodal Applications Standard Languages for Developing Multimodal Applications James A. Larson Intel Corporation 16055 SW Walker Rd, #402, Beaverton, OR 97006 USA jim@larson-tech.com Abstract The World Wide Web Consortium

More information

How To Recognize Voice Over Ip On Pc Or Mac Or Ip On A Pc Or Ip (Ip) On A Microsoft Computer Or Ip Computer On A Mac Or Mac (Ip Or Ip) On An Ip Computer Or Mac Computer On An Mp3

How To Recognize Voice Over Ip On Pc Or Mac Or Ip On A Pc Or Ip (Ip) On A Microsoft Computer Or Ip Computer On A Mac Or Mac (Ip Or Ip) On An Ip Computer Or Mac Computer On An Mp3 Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web. By C.Moreno, A. Antolin and F.Diaz-de-Maria. Summary By Maheshwar Jayaraman 1 1. Introduction Voice Over IP is

More information
