Spoken Document Retrieval from Call-Center Conversations
Jonathan Mamou, David Carmel, Ron Hoory
IBM Haifa Research Labs, Haifa 31905, Israel

ABSTRACT
We are interested in retrieving information from conversational speech corpora, such as call-center data. This data comprises spontaneous speech conversations with low recording quality, which makes automatic speech recognition (ASR) a highly difficult task. For typical call-center data, even state-of-the-art large vocabulary continuous speech recognition systems produce a transcript with a word error rate of 30% or higher. In addition to the output transcript, advanced systems provide word confusion networks (WCNs), a compact representation of word lattices associating each word hypothesis with its posterior probability. Our work exploits the information provided by WCNs in order to improve retrieval performance. In this paper, we show that the mean average precision (MAP) is improved using WCNs compared to the raw word transcripts. Finally, we analyze the effect of increasing ASR word error rate on search effectiveness, and show that MAP remains reasonable even under an extremely high error rate.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval

General Terms
Algorithms

1. INTRODUCTION
The role of call-centers has become increasingly central to corporations in recent years. There are two main reasons for this. The first is the growing importance of individual customers to companies, and the drive of the latter to acquire and keep customers for the long haul. The second relates to the rapid pace of advances in technology, which creates an ever-growing gap between users and automated systems and prompts users to require more technical assistance.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'06, August 6-10, 2006, Seattle, Washington, USA. Copyright 2006 ACM /06/ $5.00.

Call-center is a general term for help desks, information lines, and customer service centers operating via voice conversations; among the services typically provided by these centers are customer support, operator services, and inbound and outbound telemarketing. A generalization of call-centers, known as contact-centers, covers any dialog-based support system in which a user receives a service from a professional. Some contact-centers provide additional means of conversing, e.g., web-based services or online chat services. Information retrieval (IR) from call-center conversations is very useful for call-center managers and supervisors. It allows them to find calls on specific topics, and it supports mining capabilities for monitoring agent conduct and customer satisfaction. Furthermore, it enables the detection of up-sell/cross-sell opportunities and the analysis of general trends. Agents can also benefit from efficient information retrieval capabilities, for example by searching previous calls that provide the solution to the caller's problem [12]. There are two main approaches to IR on speech data. The first approach converts the speech to a phonetic transcript and represents the query as a string of phonemes. The retrieval is based on searching for the string of phonemes representing the query in the phonetic transcript, as described by Clements et al. [6].
The second approach converts the speech to a word transcript using a large vocabulary continuous speech recognition tool and applies standard text retrieval methods over the transcript when processing queries. This approach, combined with improvements in speech recognition accuracy and IR techniques, opens new opportunities in this field. In past years, most of the research effort in spoken document retrieval (SDR) was conducted in the framework of the NIST SDR tracks [7]. These tracks focused on retrieval from word transcripts of a corpus of broadcast news stories spoken by professionals. One of the conclusions of those tracks was that the effectiveness of retrieval mostly depends on the accuracy of the transcripts. While the accuracy of ASR systems depends on the scenario and environment, state-of-the-art systems achieved better than 90% accuracy in transcribing such data. In 2000, Garofolo et al. concluded that SDR is a solved problem [7]. In this paper we argue that SDR is far from being solved, especially for noisy data such as call-center telephone calls. We follow the SDR approach over call-center conversations using the call transcriptions. We experiment with call-center calls, comprising conversations of speakers from a large population, which are much harder to transcribe automatically. State-of-the-art transcription systems achieve around 60%-
70% accuracy in transcribing such noisy spontaneous telephone calls. In other words, for this kind of data, approximately one out of three words is mis-recognized. In some circumstances, such as a noisy channel, a foreign accent, or an under-trained engine, accuracy may fall to 50% or even less. The low accuracy of the data can have a dramatic effect on the precision and recall of the search. We present a novel scheme for information retrieval over noisy transcripts that uses additional output from the transcription system in order to reduce the effect of recognition errors in the word transcripts. Our approach takes into consideration all word hypotheses provided by the transcription server, as well as their posterior probabilities. We analyze the retrieval effectiveness at different word error levels and with different ranking models. We show that even at a word error rate of about 50%, our retrieval approach is able to perform reasonably well. The paper is organized as follows. We describe the ASR engine and its output in Section 2. The retrieval methods are presented in Section 3. The experimental setup and results are given in Section 4. In Section 5, we give an overview of related work. Finally, we conclude in Section 6.

2. AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEM
We use the IBM research prototype ASR system described in [16] for transcribing call-center data consisting of single-channel 6 kHz speech (agent and caller are mixed). The output words are selected from a large US English vocabulary, which has good coverage of the spoken language. The ASR engine works in speaker-independent mode, applying the same models to all speakers (agents and callers)¹. For best recognition results, a speaker-independent acoustic model and a language model are trained in advance on data with similar characteristics. Typically, ASR generates word lattices that can be considered as directed acyclic graphs.
Each vertex is associated with a timestamp, and each edge (u, v) is labeled with a word hypothesis and its prior probability, i.e., the probability of the signal delimited by the timestamps of vertices u and v, given the word hypothesis. The 1-best path transcript is obtained from the word lattice using dynamic programming techniques. Mangu et al. [11] and Hakkani-Tur et al. [9] propose a compact representation of a word lattice called a word confusion network (WCN). Each edge (u, v) is labeled with a word hypothesis and its posterior probability, i.e., the probability of the word given the signal. One of the main advantages of a WCN is that it also provides an alignment for all of the words in the lattice. As explained in [9], the three main steps for building a WCN from a word lattice are as follows:

1. Compute the posterior probabilities for all edges in the word lattice.
2. Extract a path from the word lattice (which can be the 1-best, the longest, or any random path), and call it the pivot path of the alignment.
3. Traverse the word lattice, and align all the transitions with the pivot, merging the transitions that correspond to the same word (or label) and occur in the same time interval by summing their posterior probabilities.

¹ Some ASR systems train a specific speaker-dependent model for each of the system's agents.

The 1-best path of a WCN is obtained from the path containing the best hypotheses. As stated in [11], although WCNs are more compact than word lattices, in general the 1-best path obtained from a WCN has better word accuracy than the 1-best path obtained from the corresponding word lattice. Typical structures of a word lattice and a WCN are given in Figure 1. An excerpt of the 1-best path automatic transcript of a call is shown in Figure 2. The corresponding manual transcript is also provided for reference. The automatic transcript clearly demonstrates the extremely noisy data to be handled by the SDR system.
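Step 3 of this construction, merging aligned transitions by summing posteriors, can be sketched as follows; the words and probability values are invented for illustration:

```python
from collections import defaultdict

def merge_aligned_transitions(transitions):
    """Merge lattice transitions that the alignment step has placed in
    the same time interval (step 3 of the WCN construction above).

    `transitions` is a list of (word, posterior) pairs; transitions
    carrying the same word are merged by summing their posteriors, and
    the resulting hypotheses are sorted by decreasing posterior, so the
    head of the list is this slot's contribution to the 1-best path.
    """
    merged = defaultdict(float)
    for word, posterior in transitions:
        merged[word] += posterior
    return sorted(merged.items(), key=lambda kv: -kv[1])

# Two lattice transitions for "graphic" fall in the same interval:
slot = merge_aligned_transitions(
    [("graphic", 0.20), ("graphics", 0.15), ("glass", 0.27), ("graphic", 0.10)]
)
# After merging, "graphic" (0.30) outranks "glass" (0.27).
```
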
The WCN depicted in Figure 3 corresponds to the output of the ASR for the uppercase words of the last sentence of the automatic transcript. We can see that the different hypotheses appearing at the same offset have a certain acoustic similarity.

Figure 1: Typical structures of a word lattice and a WCN.

3. RETRIEVAL MODEL
The main problem with retrieving information from call-center conversations is the low accuracy of the transcription. Generally, the accuracy of a transcript is characterized by its word error rate (WER). Three kinds of errors can occur in a transcript:

- substitution of a term that is part of the speech by another term;
- deletion of a spoken term that is part of the speech;
- insertion of a term that is not part of the speech.

Substitutions and deletions reflect the fact that an occurrence of a term in the speech signal is not recognized. These misses reduce the recall of the search. Substitutions and insertions reflect the fact that a term which is not part of the speech signal appears in the transcript. These errors reduce the precision of the search.

3.1 WCN indexing
Search recall can be enhanced by expanding the transcript with extra words. These words can be taken from the other
Manual transcript:
operator: thanks for calling the ibm customer service center this is john am i speaking with mark
caller: yes
operator: hey mark what s going on
caller: well i m trying to connect to the at and t net client and i got an error it came back and gave me error one twenty it says invalid access list and i m not sure what it means by that
operator: one twenty invalid access list
caller: uhhuh
operator: let s go and take a look at your setup real quick and see see what s going on there
caller: ok...
caller: well it ah trying to ah oh i got something um the t and t at and t net client got A GRAPHIC ON MY SCREEN so it s connecting and it s counting the connect time so let s see it s got a little downer i think i can minimise it by clicking on it oops ah that s not it ah yeah i can minimise it ok great and i ll just try to get into lotus notes and see what happens

Automatic 1-best path transcript:
operator: you for calling i b m customer service center this is john to bookmark
caller: yes
operator: hey mark what s going on
caller: well i m trying to connect to the uh a t n t net client and uh i got an error or came back and gave me an error one twenty it says invalid access list and i m not sure what it means by that
operator: ok one twenty invalid access list
caller: ok
operator: it s gonna take a look at your setup real quick and see what see what s going on there
caller: ok...
caller: i with to trying to install ok i ve got something uhhuh at a and t a t n t net client HAVE GLASSES ON MY SCREEN fast so it s connecting and it s counting the connect time off so let s see if that ll down there at the ticket number a second notes left that s not it yeah i can minimize ok great and i ll just try to get into uh lotus notes and see what happened

Figure 2: Transcripts of an excerpt of a call. Left: a manual transcript. Right: an automatic 1-best path transcript of the same call, at a 36.75% WER level.

alternatives provided by the WCN.
These alternatives may have been spoken but were not the top choice of the ASR. Such an expansion might improve recall but will probably reduce precision. In order to improve recall without decreasing precision, we exploit two pieces of information provided by the WCN about each occurrence of a term: its posterior probability and its rank among the other hypotheses. This additional information is used to improve the precision of the results, and consequently to improve search effectiveness as measured by mean average precision (MAP) and precision at k. The posterior probability reflects the confidence of the ASR in the hypothesis given the signal; thus the retrieval process boosts calls in which the query term occurs with higher probability. The rank of the hypothesis among the other alternatives reflects the importance of the term relative to those alternatives; thus, a call containing a query term that is ranked higher should be preferred over a call where the same term is ranked lower. Let D be a call-center conversation modeled by a WCN. We denote by Pr(t|o, D) the posterior probability of a term t at offset o in the WCN of D. We denote by rank(t|o, D) the rank of a term t at offset o, where all hypotheses at offset o are sorted in decreasing order of their posterior probabilities. When terms are stemmed, the posterior probability of a stem t at offset o is the sum of the posterior probabilities of all the terms having stem t at offset o. The rank of the stem is also reevaluated according to the new posterior probabilities. For example, in Figure 3, the terms graphic and graphics are both stemmed to graphic; hence the posterior probabilities of the two terms sum to 0.35 and the rank of the stem is 1. The probability of the term glass is 0.27 and its rank is 2.
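The stem-level aggregation just described can be sketched as follows; the toy stemmer and the probability values are illustrative stand-ins for a real stemmer and real WCN posteriors:

```python
from collections import defaultdict

def stem_hypotheses(hypotheses, stem):
    """Collapse the hypotheses at one WCN offset onto their stems.

    `hypotheses` maps each surface form to its posterior probability.
    Posteriors of forms sharing a stem are summed, and ranks are
    re-assigned by decreasing stemmed posterior, as described above.
    Returns {stem: (posterior, rank)}.
    """
    stemmed = defaultdict(float)
    for word, p in hypotheses.items():
        stemmed[stem(word)] += p
    ranked = sorted(stemmed.items(), key=lambda kv: -kv[1])
    return {s: (p, rank) for rank, (s, p) in enumerate(ranked, start=1)}

def toy_stem(word):
    # Toy stemmer covering only the running example from the paper.
    return "graphic" if word.startswith("graphic") else word

slots = stem_hypotheses({"graphic": 0.20, "graphics": 0.15, "glass": 0.27},
                        toy_stem)
# The stem "graphic" now has probability 0.35 and rank 1, while
# "glass" keeps probability 0.27 and drops to rank 2.
```
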
It is interesting to note that the term graphic does not occur in the automatic 1-best path transcript although it occurs in the manual transcript (see Figure 2): the automatic 1-best path wrongly transcribes the speech as "have glasses on my screen" rather than "a graphic on my screen". After stemming, the term graphic is ranked first and appears in the automatic stemmed 1-best path. To conclude, each occurrence of a term t in the WCN of a call D is indexed with the following information:

- the word offset o of t in D;
- the confidence level of the occurrence of t, evaluated by its posterior probability Pr(t|o, D);
- the rank of t among the other hypotheses at offset o, rank(t|o, D).

3.2 Ranking the search results
We use the classical tf-idf vector space model for ranking the search results. Usually, term frequency is evaluated according to the number of occurrences of the term in the document. For WCNs, we readjust the term frequency of a term in a call: it is evaluated by summing the posterior probabilities of all of its occurrences over the conversation, boosted according to the rank of the term among the other hypotheses. Consider a query Q, associated with a boosting vector B = (B_1, ..., B_l). This vector associates a boosting factor with each rank of the different hypotheses. If the rank r is larger than l, we assume B_r = 0. Let occ(t, D) = (o_1, o_2, ..., o_n) be the sequence of all the occurrences of t in D. We define the term frequency of a term t in a conversation D, tf(t, D), by the following formula:

tf(t, D) = Σ_{i=1}^{n} B_{rank(t|o_i, D)} · Pr(t|o_i, D)

There are some interesting configurations for the boosting vector B:
Figure 3: A fragment of the WCN of the call that appeared in Figure 2. Timestamps have been omitted for clarity.

- B = (1, 0, ..., 0) corresponds to the case where only the 1-best path transcript is considered for retrieval, but the confidence levels of the terms are taken into consideration.
- B = (1, 1, ..., 1) corresponds to the case where the first l hypotheses are considered for retrieval, but the rank of a term among the other alternatives is ignored.
- B = (b_1, b_2, ..., b_l), where b_i > b_{i+1} for all i, corresponds to the case where the first l hypotheses are considered, but higher-ranked hypotheses contribute more to the document score.

Our search system also considers lexical affinities [3] during query evaluation. Lexical affinities are closely related query terms found close to each other in the document. The confidence level of a lexical affinity (t1, t2) is determined by multiplying the confidence levels of its terms: Pr((t1, t2)|o, D) = Pr(t1|o1, D) · Pr(t2|o2, D). The rank of the lexical affinity (required for setting its boost) is determined by the maximal rank of its terms: rank((t1, t2)|o, D) = max(rank(t1|o1, D), rank(t2|o2, D)).

The inverse document frequency (idf) of a term is a measure of the relative importance of the term in the corpus. Usually, the inverse document frequency is evaluated as the ratio of the number of documents in the corpus to the number of documents in which the term appears. With WCNs, the occurrence of a term in a call is associated with a confidence level. We define idf(t), the inverse document frequency of a term t in the corpus C, by the following formula:

idf(t) = log(O / O_t)

where O_t reflects the number of occurrences of t in the corpus and is estimated by

O_t = Σ_{D∈C} Σ_{i=1}^{|occ(t,D)|} Pr(t|o_i, D)

and O reflects the overall number of occurrences in the corpus and is estimated by

O = Σ_t O_t
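The tf and idf formulas above can be sketched in code as follows; the occurrence lists, posteriors, and two-call corpus are invented for illustration, not taken from the paper's data:

```python
import math

def term_frequency(occurrences, B):
    """tf(t, D): posterior-weighted, rank-boosted occurrence count.

    `occurrences` lists (posterior, rank) pairs for term t in one call D;
    `B` is the boosting vector, with B_r = 0 for ranks beyond its length.
    """
    return sum((B[rank - 1] if rank <= len(B) else 0.0) * p
               for p, rank in occurrences)

def idf(term, corpus):
    """idf(t) = log(O / O_t), where O_t is the posterior-weighted number
    of occurrences of t and O is the total over all terms."""
    O_t = sum(p for doc in corpus for t, p in doc if t == term)
    O = sum(p for doc in corpus for _, p in doc)
    return math.log(O / O_t)

# A term occurring twice in a call: rank 1 with posterior 0.35,
# rank 3 with posterior 0.1.
occ = [(0.35, 1), (0.1, 3)]
tf_1best = term_frequency(occ, B=[1])           # B = (1, 0, ...): 0.35
tf_boost = term_frequency(occ, B=[10, 9, 8])    # 10*0.35 + 8*0.1

# A hypothetical two-call corpus of (term, posterior) pairs:
corpus = [[("graphic", 0.35), ("glass", 0.27)], [("glass", 0.9)]]
```
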
4. EXPERIMENTS

4.1 Experimental setup
Our corpus consists of 2236 calls made to the IBM internal customer support service. The calls deal with a large range of software and hardware problems. Manual transcripts and speaker-level segmentation of the calls are also provided. The average length of a call is 9 minutes. We divide the call-center data into two sets: a training set used to build the models of the ASR engine, and a test set used to measure retrieval performance. By reducing the number of calls in the training set, we increase the WER of the calls in the test set. We use the Juru [2] search engine to index and retrieve the calls. The calls' WCNs were indexed as described in Section 3.1, and search results were ranked as described in Section 3.2. For the experiments, we processed 120 queries that deal with typical topics of call-center discussions. We measure precision and recall by comparing the results obtained over the automatic transcripts to the results obtained over the reference manual transcripts. Our aim is to evaluate the ability of the suggested retrieval approach to handle noisy data; thus, the closer the automatic results are to the manual results, the better the search effectiveness over the automatic transcripts. The top results returned from the manual index for a given query are considered relevant and are expected to be retrieved at the top from the automatic index. This approach of measuring search effectiveness using manual data as a reference is very common in SDR research [15, 13, 4, 5]. For evaluating retrieval effectiveness, we use the standard IR measures of precision, recall, mean average precision (MAP), precision at 10 (P@10), and precision at R (P@R), where R is the number of relevant results.

4.2 WER analysis
We use the WER to characterize the accuracy of the collection of calls.
WER is defined as

WER = (S + D + I) / N × 100

where N is the total number of words in the test set, and S, I, and D are the total numbers of substitution, insertion, and deletion errors, respectively. The substitution error rate
(SUBR) is defined as

SUBR = S / (S + D + I) × 100.

The deletion error rate (DELR) and insertion error rate (INSR) are defined in a similar manner.

Table 1 gives the distribution of the error types over 1-best path transcripts extracted from automatic WCNs at different WERs. The different WER levels were achieved by providing different numbers of training examples to the ASR: the smaller the number of training calls, the larger the error rate. The first row represents a limit on the lowest error rate that can be achieved by our system, when all data is used for training. The WER of the different corpora was also measured after stop-word filtering and stemming; these corpora are denoted by stem-stpw. Note that in our collection, around 70% of the terms are stop-words, both in the manual and the automatic transcripts. This is typical for discussion transcripts, where stop-words are much more frequent than in typical written text.² Note also the decrease in errors after stop-word filtering and stemming. The reason is that many errors relate to stop-words, since shorter terms are harder to recognize. Furthermore, stemming reduces errors caused by confusion between similar terms (e.g., table and tables). Additionally, the table shows that while the WER increases with smaller training data, the ratio between the different error types is preserved: around 60% of the errors are substitution errors in all collections. The rest of the errors are split differently between insertion errors and deletion errors according to the global WER. When the WER increases, DELR increases and INSR decreases. If the ASR engine is under-trained, it does not output enough words, so more deletions are expected and the chance of extra words being inserted is smaller. If the ASR engine is over-trained, it outputs many alternatives, and more insertion errors than deletion errors are expected.
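The WER decomposition used in this section can be computed as follows; the error counts are hypothetical, chosen only to mimic the roughly 60% substitution share reported above:

```python
def error_rates(S, D, I, N):
    """WER and the error-type breakdown used in Section 4.2.

    S, D, I are the substitution, deletion, and insertion counts over
    the test set; N is the total number of words in the reference.
    SUBR/DELR/INSR are each error type's share of all errors.
    """
    errors = S + D + I
    wer = 100.0 * errors / N
    subr = 100.0 * S / errors
    delr = 100.0 * D / errors
    insr = 100.0 * I / errors
    return wer, subr, delr, insr

# Hypothetical counts: 1000 errors over 2720 reference words.
wer, subr, delr, insr = error_rates(S=600, D=250, I=150, N=2720)
# wer ≈ 36.8, subr = 60.0, delr = 25.0, insr = 15.0
```
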
4.3 Lattice-based versus WCN-based retrieval
As stated in [11], the 1-best path transcript obtained from a WCN is usually more accurate than the 1-best path transcript obtained from the corresponding word lattice. However, the distribution of the different types of errors is very similar between the two 1-best path transcripts. In order to compare the retrieval effectiveness of these two representations, we indexed the 1-best paths of all calls as extracted from the corresponding lattices, as well as the 1-best paths extracted from the WCNs, and ran the same 120 queries against the two indices. The results were compared to the results obtained from an index of the manual transcripts. For this experiment, the term frequency reflects the number of occurrences of the term in the transcript. Table 2 shows the MAP and P@10 of the results at two different WER levels. We note that the retrieval effectiveness decreases with the increase in WER for both models. The results are very similar between 1-best paths extracted from lattices and from WCNs. This can be explained by their similar WER levels. However, a WCN provides much more information that can be exploited. In the following section, we show how the extra information extracted from a WCN improves search effectiveness.

² Using the same set of stop-words, only about 50% of the terms in the TREC-8 corpus of textual documents are marked as stop-words.

4.4 Retrieval effectiveness using WCN
In this section we test the effectiveness of the ranking model presented in Section 3.2. The following list describes the different models that we compare:

- 1-best WCN TF: the 1-best path obtained from the WCNs has been indexed. The term frequency is the number of occurrences of the term in the path.
- all WCN TF: all the WCN hypotheses have been indexed as explained in Section 3.1; however, the confidence level and the rank of an occurrence of a term are ignored.
In other words, we index all the terms appearing in the WCN without taking their confidence levels into account. The term frequency is the number of occurrences of the term in the WCN. This model is expected to improve recall while reducing precision.
- 1-best WCN CL: the 1-best path obtained from the WCN has been indexed, and each term has been indexed with its confidence level. By ranking according to the confidence levels, the model distinguishes between terms detected with high confidence and terms detected with low confidence. This is achieved by using the boosting vector B = (1, 0, ..., 0).
- all WCN CL: all the hypotheses in the WCN have been indexed, with their confidence levels, as explained in Section 3.1. The ranking model corresponds to the case where all alternatives are considered with their confidence levels, but the rank of a term relative to the other hypotheses is not. This is achieved by using the boosting vector B = (1, 1, ..., 1).
- all WCN CL boost: the index is the same as for all WCN CL. The ranking model is based on the confidence levels as explained in Section 3.2, with the boosting vector B = (10, 9, 8, 7, 6, 5, 4, 3, 2, 1), in order to boost higher-ranked terms.

4.4.1 Index size
Table 3 compares the sizes of the different indices built at two different WER levels. While extracting all data from the WCN rather than only the 1-best paths increases the corpus size by a factor of 20, the index of the full data is only three times larger than the index of the 1-best paths.

4.4.2 Retrieval measures
We would like to test the search effectiveness when WCNs are used for indexing: what is the effectiveness of expanding the 1-best transcript with all the terms of the WCN? What is the effectiveness of using confidence levels boosted by term rank for all hypotheses? The average precision and recall of the 120 queries, at different WER levels, are given in Figure 4 and Figure 5, respectively.
The WER values on the x-axis correspond to the WER values for the raw transcripts presented in Table 1. Both recall and precision decrease at higher WER. It is clear that recall is significantly improved in all error
levels when all the hypotheses provided by the WCNs are indexed; however, precision is significantly reduced due to the added noise. The MAP and P@R of the search results are presented in Figure 6. The graphs show that all WCN CL boost always outperforms the other models, especially when the WER increases. Furthermore, recall and precision are the same for the 1-best WCN CL and 1-best WCN TF models; however, the former outperforms the latter in the MAP measure due to the use of the confidence levels. A comparison between the MAP performance of all WCN CL and all WCN TF supports this finding. This leads to the conclusion that using confidence levels improves retrieval effectiveness. For WER below 42%, 1-best WCN CL slightly outperforms all WCN CL in the MAP measure. However, when the boosting factors are used, all WCN CL boost outperforms 1-best WCN CL in the MAP measure. Despite the low precision of all WCN shown in Figure 4, the ranking model based on confidence levels and boosting, all WCN CL boost, achieves the highest MAP.

Table 1: Distribution of the error types over 1-best path transcripts extracted from WCNs at different WER levels. In each row, the first line relates to the raw transcript and the second line to the transcript after stemming and stop-word filtering.

Table 2: MAP and P@10 of 1-best path transcripts extracted from word lattices vs. 1-best path transcripts extracted from WCNs at different WER levels.

Figure 4: Precision results at different error levels.

Figure 5: Recall results at different error levels.
The conclusion is that extending the 1-best transcript with all the terms of the WCN, and ranking the results according to confidence levels boosted by term rank, improves the effectiveness of the retrieval.
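For reference, the MAP measure used throughout this evaluation can be sketched as follows, with the reference-relevant set playing the role of the top results from the manual index; the result lists are invented:

```python
def average_precision(ranked, relevant):
    """Average precision of one query: the mean, over the relevant set,
    of precision@k at the ranks k where a relevant call is retrieved."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over a query set; each run is a (ranked_results, relevant_set)
    pair, e.g. automatic-index results vs. a manual-index reference."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Relevant calls "a" and "c" retrieved at ranks 1 and 3:
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
# ap = (1/1 + 2/3) / 2 ≈ 0.833
```
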
Table 3: Corpus and index size (in MB) at different WER levels.

Figure 6: MAP and P@R at different error levels.

5. RELATED WORK
In the past decade, most of the research efforts on spoken data retrieval have focused on extending classical IR techniques to spoken documents. Some of these works are described by Garofolo et al. [7]. An ASR system is used to generate the transcription of the speech; this transcription is generally a 1-best path transcript. Most systems index the transcription as clean text and successfully apply a generic IR system over the text, as described by Brown et al. [1] and James [10]. This strategy works well for transcripts like broadcast news stories, which have a low WER (in the range of 15%-30%) and are redundant by nature (the same piece of information is spoken several times in different manners). Moreover, the algorithms have mostly been tested over long queries stated in plain English, and retrieval for such queries is more robust against speech recognition errors. An alternative approach consists of using word lattices in order to improve the effectiveness of SDR. Singhal et al. [14, 15] propose to add some terms to the transcript in order to alleviate the retrieval failures due to ASR errors. A classical way to bring in new terms, from an IR perspective, is document expansion using a similar corpus. Their approach consists of using word lattices to determine which words returned by a document expansion algorithm should be added to the original transcript. The necessity of using a document expansion algorithm was justified by the fact that the word lattices they worked with lack information about word probabilities. Saraclar and Sproat [13] show improvement in word spotting accuracy using phoneme and word lattices, where a confidence measure of a word or a sub-word in a speech document can be derived from the lattice.
Their experiments also concern telephone conversations. Similarly to our approach, they used lattices augmented with a confidence measure. However, in contrast to WCNs, lattices do not provide an alignment of the terms. Consequently, the offsets of the words and sub-words are not stored in the index, and proximity information between query terms cannot be exploited during retrieval. Likewise, their ranking model is based only on the confidence measure; the rank of a term among the other hypotheses is not used, since this piece of information is not easily retrievable from lattices. Moreover, their experiments were carried out only on a telephone-conversation corpus at 40% WER, and there is no analysis of retrieval effectiveness at different WER levels. Chelba and Acero [4, 5] propose a more compact word lattice, the position specific posterior lattice (PSPL). This data structure is similar to a WCN and leads to a more compact index. The offsets of the terms in the speech documents are also stored in the index. However, their evaluation framework is carried out on lectures, which are relatively planned, in contrast to conversational speech. Furthermore, they do not test any ranking of the results based on the proximity between query terms in the speech documents. Their ranking model is based on the term confidence level but does not take into consideration the rank of the term among the other hypotheses. Other research has focused on the analysis of call-center data. Hakkani-Tur et al. [17, 8] have studied the problem of spoken language understanding; more specifically, they have used WCNs for entity extraction and call classification. Mishne et al. [12] propose a system that assists call-center agents. The topic of the call is detected by a topic detection algorithm, and potential solutions are then retrieved from a Q&A knowledge base. The system also monitors the calls by detecting off-topic segments.
6. CONCLUSIONS
This work studies how SDR can be performed efficiently over very noisy call-center data. This data comprises spontaneous speech conversations with low recording quality, which makes automatic speech recognition (ASR) a highly difficult task. For typical call-center data, even state-of-the-art large vocabulary continuous speech recognition systems produce a transcript with a word error rate of 30% or higher. Our work exploits the additional information provided by a WCN in order to improve retrieval performance. In a WCN, a compact representation of the ASR's word lattice, each word is associated with a confidence level reflecting its posterior probability given the speech signal. By taking the terms' confidence levels into consideration, and by considering the relative ranks of the different hypotheses, our system is able to improve search effectiveness. Our experiments over real call-center conversations demonstrate the effect of an increasing WER level on search effectiveness. As expected, a higher WER level hurts search results. The search recall is significantly improved by indexing all the hypotheses provided by the WCN. While precision decreases, as expected, the MAP is improved compared to the 1-best path transcript, due to the confidence level consideration. Using our WCN indexing scheme, the MAP is still reasonable even under an extremely high error rate, and thus effective search can be achieved. One of the problems of ASR is out-of-vocabulary (OOV) words. OOV words are missing from the ASR system vocabulary and are replaced in the output transcript by alternatives that are probable given the recognition acoustic model and the language model. In the experiments presented in this work, all query terms are in-vocabulary, so this problem has been ignored. However, in practice, OOV queries require special treatment. This is left for further research.

7. ACKNOWLEDGEMENTS
We are grateful to Olivier Siohan from the IBM T.J.
Watson Research Center for providing the required data and for assistance on ASR topics.

8. REFERENCES
[1] M. Brown, J. Foote, G. Jones, K. Jones, and S. Young. Open-vocabulary speech indexing for voice and video mail retrieval. In Proceedings of ACM Multimedia 96, Hong Kong, November 1996.
[2] D. Carmel, E. Amitay, M. Herscovici, Y. S. Maarek, Y. Petruschka, and A. Soffer. Juru at TREC 10: experiments with index pruning. In Proceedings of the Tenth Text Retrieval Conference (TREC-10). NIST, 2001.
[3] D. Carmel, E. Farchi, Y. Petruschka, and A. Soffer. Automatic query refinement using lexical affinities with maximal information gain. In SIGIR '02: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, USA, 2002. ACM Press.
[4] C. Chelba and A. Acero. Indexing uncertainty for spoken document search. In Interspeech 2005, pages 61-64, Lisbon, Portugal, 2005.
[5] C. Chelba and A. Acero. Position specific posterior lattices for indexing speech. In Proceedings of the 43rd Annual Conference of the Association for Computational Linguistics (ACL-05), Ann Arbor, MI, 2005.
[6] M. Clements, S. Robertson, and M. Miller. Phonetic searching applied to on-line distance learning modules. In Proceedings of the 2002 IEEE 10th Digital Signal Processing Workshop and the 2nd Signal Processing Education Workshop, 2002.
[7] J. Garofolo, G. Auzanne, and E. Voorhees. The TREC spoken document retrieval track: a success story. In Proceedings of the Ninth Text Retrieval Conference (TREC-9). NIST, 2000.
[8] D. Hakkani-Tur, F. Bechet, G. Riccardi, and G. Tur. Beyond ASR 1-best: using word confusion networks in spoken language understanding. Computer Speech and Language.
[9] D. Hakkani-Tur and G. Riccardi. A general algorithm for word graph matrix decomposition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hong Kong.
[10] D. James. The Application of Classical Information Retrieval Techniques to Spoken Documents. PhD thesis, University of Cambridge, Downing College.
[11] L. Mangu, E. Brill, and A. Stolcke. Finding consensus in speech recognition: word error minimization and other applications of confusion networks. Computer Speech and Language, 14(4).
[12] G. Mishne, D. Carmel, R. Hoory, A. Roytman, and A. Soffer. Automatic analysis of call-center conversations. In CIKM '05: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, New York, NY, USA, 2005. ACM Press.
[13] M. Saraclar and R. Sproat. Lattice-based search for spoken utterance retrieval. In HLT-NAACL 2004: Main Proceedings, pages 129-136, Boston, Massachusetts, USA, 2004.
[14] A. Singhal, J. Choi, D. Hindle, D. Lewis, and F. Pereira. AT&T at TREC-7. In Proceedings of the Seventh Text Retrieval Conference (TREC-7). NIST, 1998.
[15] A. Singhal and F. Pereira. Document expansion for speech retrieval. In SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 34-41, New York, NY, USA, 1999. ACM Press.
[16] H. Soltau, B. Kingsbury, L. Mangu, D. Povey, G. Saon, and G. Zweig. The IBM 2004 conversational telephony system for rich transcription. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2005.
[17] G. Tur, D. Hakkani-Tur, and G. Riccardi. Extending boosting for call classification using word confusion networks. In ICASSP-2004, IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.