IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 1, NO. 3, JULY-SEPTEMBER 2008

Browsing within Lecture Videos Based on the Chain Index of Speech Transcription

Stephan Repp, Andreas Groß, and Christoph Meinel, Member, IEEE

Abstract: The number of digital lecture video recordings has increased dramatically since recording technology became easier to use. Access to and search within these large archives remain limited and difficult. Additionally, detailed browsing in videos is not supported due to the lack of an explicit annotation. Manual annotation and segmentation is time-consuming and therefore impractical. A promising approach is based on using the audio layer of a lecture recording to obtain information about the lecture's contents. In this paper, we present an indexing method for computer science courses based on their existing recorded videos. The transcriptions from a speech-recognition engine (SRE) are sufficient to create a chain index for detailed browsing inside a lecture video. The index structure and the evaluation of the supplied keywords are presented. The user interface for dynamic browsing of the e-learning contents concludes this paper.

Index Terms: Multimedia retrieval, speech recognition, index, semantic, user interface, browsing, recorded lecture videos.

1 INTRODUCTION

In terms of streaming media, audiovisual recordings are used more and more frequently by correspondence-course institutions, where learners can access libraries of recorded lectures independent of time and place. Fig. 1 illustrates an example of such a system (see footnote 1) that delivers three main parts of the lecture: the desktop recording (the captured desktop of the presenter's laptop) on the right side, a manual annotation in the lower left corner, and a video of the speaker in the upper left corner. In fact, such recorded lecture videos are ideal for correspondence courses and could be regarded as a complement to traditional classroom courses [1].
Footnote 1. Tele-Task System,

The authors are with the Hasso-Plattner-Institut für Softwaresystemtechnik GmbH (HPI), PO Box , D Potsdam, Germany. E-mail: {repp, gross, meinel}@hpi.uni-potsdam.de.
Manuscript received 8 Mar. 2008; revised 15 Sept. 2008; accepted 12 Dec. 2008; published online 18 Dec
For information on obtaining reprints of this article, please send e-mail to: lt@computer.org, and reference IEEECS Log Number TLT
Digital Object Identifier no /TLT

However, the accessibility and traceability of the content from large lecture archives for further use is rather limited. Two major challenges arise while preparing recorded lectures for content-based retrieval: the automated indexing of multimedia videos and the retrieval of semantically appropriate information from a lecture knowledge base. The rapid growth of multimedia data available in e-learning systems evidently requires more efficient methods for content-based browsing and retrieval of video data. The requested information is often covered by only a few minutes of the lecture recording and is therefore hidden within a full 90-minute recording stored among thousands of others. Often the problem is not to find the proper lecture in the archive but rather to find the proper position inside the video stream. It is not practical for learners to watch the whole video to get the desired information. The problem becomes how to retrieve the appropriate information in a large lecture-video database more efficiently.

Segmenting video lectures into smaller units, each related to a specific topic, is a necessary step toward finding the desired piece of information. Traditional video retrieval based on feature extraction cannot be efficiently applied to lecture recordings. Lecture recordings are characterized by a homogeneous scene composition. Most of the time, the lecturer is in focus, presenting a topic which is not visible.
Thus, image analysis of lecture videos fails even if the producer tries to loosen the scene with creative camera trajectories. A promising approach is based on using the audio layer of a lecture recording to gain information about the lecture's contents (e.g., [2], [3], [4], [5], [6]).

Transcriptions of lectures recorded live consist of unscripted and spontaneous speech. Thus, lecture data has much in common with casual or natural speech data, including false starts, extraneous filler words, and nonlexical filled pauses [7]. Furthermore, a transcript is a stream of words without punctuation marks. Of course, there is also great variation between one tutor's speech and another's. For example, one may speak accurately, while the next speaks completely differently and with many grammatical errors. One can also easily observe that the colloquial nature of the data differs dramatically in style from the presentation of the same material in a textbook. In other words, the textual format is typically more concise and better organized.

Furthermore, speech-recognition software produces error-prone output: approximately 20 to 30 percent of the detected words are incorrect. The not-in-the-vocabulary problem is a dilemma of the recognition software as well [8]. The software needs a database of all words used by the lecturer for the transcription process. If a word that occurs in the speech of the lecturer is not in the vocabulary, the wrong word is
transcribed by the engine. Switching languages within a presentation can also cause problems that the software cannot solve. A lecture presented in German may sometimes use English terminology. This causes false words to be included in the transcript.

Fig. 1. Information resources in recorded lecture videos.

Another issue is that the analysis of the audio layer of a lecture video recording provides two types of information: the spoken words (transliteration) and prosody. Prosodic cues such as pitch, duration, pauses, and energy distribution mark topic changes and points of interest [9]. A state-of-the-art speech-recognition engine (SRE) does not support this prosodic information and, further, most lecture recordings do not provide optimal sound quality. It has been shown in [8], however, that a keyword-based search in an imperfect transcript yields reliable results. Some universities (see footnote 2) offer a keyword-based search for their lecture-video archive. However, such solutions fail if word-sense disambiguation is required, i.e., in the case of words with multiple uses or meanings.

Our focus is on recorded lectures from computer science. These kinds of lectures have a typical internal structure. The tutor talks about section topics. The section topics are parts of the main topic the tutor is talking about. For example, the tutor talks successively about the section topics star-topology, bus-topology, and tree-topology. The main topic is topology in this case. The lecture structure in other courses (e.g., in social science, history of art, etc.) and the lecture structure between lecturers could vary and is not analyzed in our research. Further, the spoken language is a monologue and not a dialogue as it is in a seminar or in classroom courses.

Footnote 2. For example,
In this article, we present our efforts at putting together results from different fields and projects in order to create a user interface for browsing educational lecture videos based on a chain index. Our proposed algorithm requires neither semantically annotated lecture videos nor an external knowledge source such as WordNet [10]. The automatically supplied keywords help to resolve word-sense ambiguity.

The structure of this article is as follows: Section 2 analyzes the current state of the indexing of lecture videos and shows that a new dynamic indexing technique is required. Section 3 describes our indexing technique and the advantage of such an index. To obtain an estimate of the accuracy of the retrieval, an exploratory study is presented in Section 4. A browsing system based on the index created, as well as the appropriate interface, is presented in Section 5.

2 STATUS OF LECTURE VIDEO INDEXING

The process of establishing access points to facilitate the retrieval of information is called indexing [11]. Indexing of a recorded lecture video requires two main steps. First, the video is split into coherent segments (topic segmentation), i.e., each segment represents a topic and stands for a specific assertion. Second, each segment has to be annotated with descriptors, keywords, a summary, or a semantic description that specifies its topic. The indexing is a part of an Information
Retrieval System. Information Retrieval (IR) is the representation, storage, organization of, and access to information items [12]. IR in lecture videos is a very active, multidisciplinary research area. Spoken language, any kind of action (e.g., demonstrations, experiments, writing on a blackboard, mouse clicks, etc.), the gestures of the speaker, social tagging, manual annotations, desktop recordings (e.g., GUI of a demo program, slides, formulas, etc.), or the presented slides in document form could act as a source for better access to the contents of the lecture. Fig. 1 gives an overview of the resources in a recorded lecture video. Clearly, not all resources are available in every recorded lecture.

In this article, we do not consider systems which store the presentation in proprietary formats and have additional information about the activity in the lecture, such as [13] or [14], [15], [16]. For instance, Chu and Chen have already presented an approach for cross-media synchronization [13]. They match audio recordings with text transliterations of these recordings based on dynamic programming methods [17]. Chu and Chen make use of explicitly encoded events for synchronization instead of implicitly automated annotation. However, most lecture recordings that are accessible on the Web ordinarily consist of speech and slides that are stored in a common video format (mpg, rm). The indexing of these lecture videos is highly important for further access.

There are four main indexing options for this kind of lecture recording. The first is that the audiovisual lecture recording is indexed manually after the lecture. A different approach is that the users of the videos do the indexing (annotation) themselves with tags, which is called tagging.
The slides can also be used as a source for automatic indexing; the assumption behind this method is that a slide is a summary of the presenter's speech and thus a good source of data for the search. A fourth option is that the speech can be used as a resource for automatic indexing.

2.1 Manual

Here, both the segmentation and the annotation are done manually. Typically, a professional who is familiar with the subject segments the video and marks each segment with descriptors or other descriptions. This is done after the lecture in a time-consuming and expensive process. This may be useful for a small lecture archive, but it is not realistic for an existing large archive. This solution obtains the best results, depending on the accuracy and the diligence of the professional.

As a complex example that is limited to manageable video segments, we will present a system for a semantic search based on Description Logics (DL) [18]. It has been recognized that a digital library benefits from having its contents understandable and available in a machine-processable form, and it is widely agreed that ontologies will play a key role in providing much of the enabling infrastructure to achieve this goal. A fundamental part of the system is a common domain ontology. The ontology is basically composed of a hierarchy of concepts (taxonomy) and a language. For the former, a list of semantically relevant words is created with regard to a domain, e.g., internetworking, and organized hierarchically. For the latter, the Description Logics are used to formalize the semantic annotations.

Fig. 2. Example of terminology concerning learning objects.

Description Logics [19] are a family of knowledge representation formalisms that allow the knowledge of an application domain to be represented in a structured way and to be reasoned about.
In DL, the conceptual knowledge of an application domain is represented in terms of concepts such as $\mathit{IPAddress}$ and roles such as $\exists \mathit{composedOf}$. Concepts denote sets of individuals and roles denote binary relations between individuals. Complex descriptions are built inductively using concept constructors which rely on basic concepts and role names. Concept descriptions are used to specify terminologies that define the intensional knowledge of an application domain. For example, the following axiom states that a router is a network component that uses at least one IP address:

$\mathit{Router} \sqsubseteq \mathit{NetComp} \sqcap \exists \mathit{uses}.\mathit{IPAddress}$

Definitions allow meaningful names to be given to concept descriptions such as

$\mathit{LO}_1 \equiv \mathit{IPAddress} \sqcap \exists \mathit{composedOf}.\mathit{HostID}$

To index a video, the video is split up into segments. One segment stands for one Learning Object (LO) where the lecturer speaks about one topic. The semantic annotation of the four LOs is shown in Fig. 2, describing the following contents:

- LO 1: general explanation of IP addresses,
- LO 2: explanation that IP addresses are used in the protocol TCP/IP,
- LO 3: explanation that an IP address is composed of a host identifier, and
- LO 4: explanation that an IP address is composed of a network identifier.

Using DL has several advantages. First, DL terminologies can be serialized as OWL (Semantic Web Ontology Language) [20], a machine-readable and standardized format for semantically annotating resources. Second, DL allows the definition of detailed semantic descriptions about resources and logical inference from these descriptions [19]. Finally, the link between DL and natural language (NL) has already been shown [21]. The query of the user is represented as an OWL description as well. The way the NL processing works is described in detail in [22], [18]. The OWL description of the query and the OWL descriptions of all LOs are the input of the semantic search engine. The engine checks the LOs and determines the suitable ones.
In tests, an MRR_5 value (Mean Reciprocal Rank of the answers [23]; see footnote 3) of 75 percent is reached [24]. This system has been successfully used in a school in Luxembourg [25] for the domain of fraction arithmetic.

Footnote 3. MRR is a standard evaluation measure for Question/Answering Systems.
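As a rough illustration of the reciprocal-rank measure mentioned above (commonly known as Mean Reciprocal Rank, MRR), the following minimal Python sketch averages the reciprocal rank of the first correct answer over a set of queries. The function name, the input layout, and the `cutoff` parameter are assumptions made for illustration; the source does not describe how the measure was implemented.

```python
def mean_reciprocal_rank(ranked_answers, correct, cutoff=5):
    """Average of 1/rank of the first correct answer per query.

    ranked_answers: list of ranked answer lists, one per query (hypothetical layout).
    correct: list with the expected answer for each query.
    cutoff: only the top `cutoff` answers are considered (an assumption).
    """
    if not ranked_answers:
        return 0.0
    total = 0.0
    for answers, target in zip(ranked_answers, correct):
        for rank, answer in enumerate(answers[:cutoff], start=1):
            if answer == target:
                total += 1.0 / rank
                break  # only the first correct answer counts
    return total / len(ranked_answers)


# Three hypothetical queries: correct answer at rank 1, at rank 2, and missing.
example_rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
example_correct = ["a", "y", "none"]
example_mrr = mean_reciprocal_rank(example_rankings, example_correct)
```

A query whose correct answer does not appear within the cutoff contributes zero, so the score rewards systems that place the right learning object near the top of the result list.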
When using DL, four types of manual tasks have to be done for the manual indexing:

- Creating the taxonomy manually or using an existing one.
- Creating dictionaries which map NL words to the concepts and roles.
- Segmenting the videos into parts (LO).
- Annotating the video segments with concepts and roles.

In contrast, a purely manual indexing is based only on the segmentation of the videos into parts (LO) and the enrichment of the parts with the basic contents. Certainly, the manual process is very time-consuming and not really practical for the indexing of a large video library.

2.2 Tagging

The recent emergence and success of folksonomies and tagging have shown the great potential of this simple approach to collecting metadata about resources. Unlike traditional categorization systems, the process of tagging is nothing more than annotating documents with an unstructured list of keywords. Although the amount of research on tagging is still comparatively low, several studies have already analyzed the semantic aspects of this process and why it is so popular in practice [26], [27], [28]. Tagging strikes a balance between the individual and the community: The cost of participation is low for the individual, and tagging a document benefits both the individual and the community.

An approach to video annotation is described in [29]. There, the user is involved in the annotation process: collaborative tagging is deployed to generate and enrich video metadata in support of content-based video retrieval. When a user bookmarks a position in a lecture video, he/she stores it for later use. Users only bookmark documents that are valuable or relevant to them.
When a document is of no interest to a specific user, it is unlikely that he/she would bookmark it, and when users actually do store bookmarks in order to find them again later, they have an incentive to add meaningful metadata to them. Finding word associations for describing a document in the form of tags is a subjective user task.

For the annotation of lecture videos, the timeline is a further dimension not present in text documents. The words in a text document are also ordered, but the exact time position of each word is lost. For the annotation to be useful and exact, the user has to annotate all the segments in the lecture video, not just the one(s) relevant to him/her. The quality and the quantity of tagging grow with the square of the number of users (Metcalfe's law), but it is very doubtful that enough learners tag a single lesson for the annotation quality to be adequate, considering that many universities' lecture videos cover similar topics. To the best of our knowledge, no research exists on the quality of the annotation of lecture videos, nor do studies exist on the quality of the tagging information of lecture videos compared with a reference data set. The main disadvantage is that this annotation needs a lot of time and depends on many learners who have to watch the videos and actually make use of the ability to annotate.

2.3 Slides

Slides (e.g., PowerPoint slides) represent the main information in video segments, particularly in computer science [8], [30]. Hürst found that the slides carry the most information for a keyword search. He measured a retrieval performance based on slides of 40 percent precision and 63 percent recall, compared to a precision of 33 percent and a recall of 54 percent based on corrected transcriptions. Recall and precision are standard evaluation measures in Information Retrieval [12].
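For readers unfamiliar with the two measures, the following minimal Python sketch computes precision and recall for a single query from a set of retrieved items and a set of relevant items. The function name and the example identifiers are hypothetical; the sketch only restates the standard definitions and is not taken from the evaluation cited above.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical segment identifiers: 5 segments retrieved, 3 actually relevant.
example_precision, example_recall = precision_recall(
    retrieved=[1, 2, 3, 4, 5], relevant=[2, 4, 6]
)
```

In this toy case two of the five retrieved segments are relevant, and two of the three relevant segments are found.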
One problem is the synchronization of the slides and the video stream. An approach is described in [31] for synchronizing presentation slides by maintaining a log file during the presentation that keeps track of slide changes. Sack and Waitelonis [31] mention optical character recognition (OCR) for the identification and synchronization of the presentation slide currently being shown within a desktop recording. If the log files and the slides are given, then these annotations yield good keyword-based retrieval results. However, OCR is error-prone and adapted only to the special video format and to the PowerPoint format. Another solution is to synchronize the existing PowerPoint slides with the speech of the lecture recordings [32]. Repp et al. synchronized the slides with the speech transcription to within plus or minus one slide. If slides exist for the lectures and the time stamps of each slide are available, then this source is one of the best for access to the lecture contents [30]. However, most lecture recordings available neither support desktop recordings nor maintain a dedicated log file with the appropriate slides. So, despite positive results from these methods, the speech itself remains the only reliable source of information.

2.4 Transcripts of Speech Recognition Engines

The spoken language is one of the main information resources in lectures. The lecturer speaks about a topic with a characteristic vocabulary. Each transition of the vocabulary (or other features in the spoken language, e.g., a pause) could mark a segment boundary, and so the video could be divided into video sections. The task is to partition the text into a sequence of topically coherent segments and thereby induce a content structure (called topic segmentation). For each video segment, an index could be created with several standard indexing methods [11], [12], e.g., the distribution of words in the video segments is used for the automatic indexing process.
Topic segmentation has been extensively used in text information retrieval and text summarization [33], [34], [9], [35], [36]. The user would prefer a document passage in which the occurrence of the word or topic is concentrated in
one or two passages. The development of text segmentation algorithms is a central concern in natural language processing. The task can be stated as detecting the boundaries with standard text segmentation algorithms.

TABLE 1. Summary of the Lecture Series Archive

As a first step toward automatically indexing the lecture video based on the transcripts, a small test is arranged. Standard text segmentation algorithms are implemented to show whether it is possible to recognize the segments based on spoken language. A more detailed study of this research is presented in [37].

Test Setup

After a preprocessing step which deletes all the stop words (i.e., words without semantic relevance) and transforms all words into their canonical form (stemming) [38], the following algorithms are implemented:

- Linear. A linear distribution of the topics during the presentation time is implemented. The algorithm assumes the number of topics is given.
- Pause. The duration of silence (pause in speech) is used as a feature for a segment boundary. The time stamps of the longest silences are used as the boundaries. The algorithm assumes the number of topics is given.
- C99 and LCseg. The number of boundaries is given and a sentence length of 10 words is defined. A segment boundary is assumed to be a text boundary and can be detected by the C99 [39] and the LCseg [40] algorithms. The description and the implementations of the algorithms can be downloaded from the appropriate author pages [39], [40]. In this way, the number of segments is provided to the algorithms. A stop-word list and a stemmer, both for the German language, are adapted to C99 and LCseg.
- SlidingWindow (SW). The SlidingWindow is based on the TextTiling [41] algorithm and on the research of [42]. A sliding window system is implemented.
This window (120 words) moves across the text stream of the transcript in steps of a fixed interval (20 words) and compares the neighboring windows with each other using the cosine measure. After postprocessing, the points with the lowest similarity become boundaries.

Metrics

The WindowDiff [43] measurement (WinDiff) is used as a standard evaluation measure for text segmentation. The implementation of WindowDiff is taken from [44]. The WindowDiff measurement does not take into account the time delay between the calculated time and the real boundary time [32]. For this reason, the mean error rate and the standard deviation are used as well. The mean error rate (x) is calculated from the differences between the point in time of the reference and the point in time achieved by the algorithm for each topic boundary in the course. The mean of these signed differences is the offset of the time shift. Additionally, we calculate the mean of all absolute differences (y). Further, the standard deviation (SD) indicates how much the data are spread out from the mean. If the data are close to the mean, the standard deviation is small and the algorithm matches the boundaries very well.

First Test, Segment Boundaries Are the Slide Transitions

The text boundaries are assumed to be the slide transitions. The question posed is how well standard text segmentation algorithms can segment the erroneous transcript into coherent segments without any additional resources like the PowerPoint slides. The number of segment boundaries (slide numbers) is given for each lecture. Table 1 shows a summary of the data set's contents. The data set for the first test includes two different speakers, two different languages (German and English), and three different topics (WWW, Semantic, and Security). For the first two minutes of each lecture, the word accuracy is determined manually as depicted in [45].
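The three timing measures described under Metrics (x, y, and SD) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: boundaries are paired by their order in time (plausible because the number of boundaries is given to the algorithms), the difference is taken as reference minus computed time, and the population standard deviation is used. None of these choices is confirmed by the source, and the function name and example values are hypothetical.

```python
import statistics


def timing_errors(reference, computed):
    """Return (x, y, sd) for two equally long lists of boundary times in seconds.

    x:  mean of the signed differences (offset of the time shift)
    y:  mean of the absolute differences
    sd: population standard deviation of the signed differences (an assumption)
    """
    if len(reference) != len(computed):
        raise ValueError("both lists must contain the same number of boundaries")
    # Pair boundaries by their order in time (an assumption).
    diffs = [r - c for r, c in zip(sorted(reference), sorted(computed))]
    x = statistics.mean(diffs)
    y = statistics.mean([abs(d) for d in diffs])
    sd = statistics.pstdev(diffs)
    return x, y, sd


# Hypothetical boundaries: reference vs. times produced by an algorithm.
example_x, example_y, example_sd = timing_errors([100, 200, 300], [90, 220, 300])
```

A small x with a large y indicates that early and late errors cancel out on average, which is why both values are worth reporting next to the standard deviation.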
Table 2 shows the results as the mean, the standard deviation, and the WindowDiff of the first test. SlidingWindow has the best mean value. Table 2 shows further that the Linear segmentation yields better time results (y and the standard deviation) compared to the C99, LCseg, SlidingWindow, and Pause algorithm for the data set. For the WindowDiff measurement, the simple Pause algorithm seems to be the best one.

TABLE 2. Results (in Seconds) of the First Test Based on the Data Set of Table 1, Segment Boundaries Are the Slide Transitions

Second Test, on Real Segment Boundaries

The problem with relying primarily on the slides for creating segments is that these segments will always be
partially wrong due to the fact that speech is dynamic. The tutor uses his freedom to discuss topics not classified by the slides. Taking this into consideration, three people (a lecturer and two PhD students) were asked to discuss the gold standard of boundaries. One lecture is randomly selected from Mr. Meinel's WWW course and a perfect transcript (corrected by one person) is generated. This lecture of approximately 100 minutes in duration consists of 12,608 words, and its Word Error Rate (WER) is nearly 0 percent. Scenario A is based on the 62 slide transitions as boundaries, and scenario B is based on 42 real boundaries. Table 3 shows the mean, the standard deviation, and the WindowDiff of this second test. It is surprising that the simple Pause algorithm outperforms the other algorithms on the WindowDiff measure for the data set whose boundaries were generated by persons. Most algorithms achieve better results for the data set with the real boundaries compared with the slide transitions. Hence, the results show that slide transitions are not a good approximation of topic boundaries.

TABLE 3. Results (in Seconds) of the Second Test Based on an Improved Transcript with a WER of 0 Percent

Conclusion

The overall results are discouraging: detecting the topic segments (slide transitions or real boundaries) is practically impossible. A time shift of +/- 2 to 3 minutes occurs (SD of the C99 algorithm, second test). If the segment count isn't given to the algorithm, the results are much worse because the algorithms then have to determine the boundary count as well. Further, the definition of a topic segment is not clear. Malioutov and Barzilay [44] show that the three lecturers chosen to segment the lecture did so differently. They defined between 8.9 and 13 topic segments per lecture.
This discrepancy in the segmentation results is highly critical for information retrieval.

2.5 Current Approaches versus the Ideal Approach

All current approaches, such as manual indexing, using only the slides, tagging, or using the speech transcripts in an ordinary information search, have serious disadvantages. Manual indexing achieves the best retrieval results, but it is too time-consuming and too expensive for the libraries. Further, although the slides are a good resource, the problem of segmentation within the lecture is not resolved (see test 2). The speaker usually stays on the same topic for one or more slides, and so one slide is not a topic, but rather a topic consists of more than two slides [44]. Additionally, the question of what a topic consists of is highly debatable. Moreover, most lecture recordings available neither support desktop recordings nor maintain a dedicated log file, which poses further problems. Tagging, on the other hand, doesn't ensure a consistent annotation of the video streams if the videos are only watched by a few learners. They add some tags, but neither objectively nor automatically. In the end, a classical retrieval (segmenting, extracting keywords) of the transcription fails at the first step, the segmentation. It is not possible to reliably segment the lecture video or to find the topic boundaries (slide transitions). Thus, we need a better method, without an explicit definition of a segment, in order to give users the freedom to browse the lecture and find the desired point in the lecture.

3 INDEXING

In this section, we describe our solution for the indexing of lecture videos based on the speech transcript. First, we explain our indexing technique, called chaining. Second, we explain why this enhances the search within lecture videos.

3.1 Chaining

Clustering is used to detect cohesive areas (we call them chains) in the transcript.
Linguistic research has shown that word repetition in a text is a hint for creating thematic cohesion. A change in the lexical distributions is usually a signal for topic transitions [46], [47]. The word stream of the raw data can contain all parts of speech, such as nouns, verbs, numbers, etc. A term is any stemmed word within the lecture. From that stream, the distinct terms T are stored in a term list L. In other words, the list L consists of all distinct word stems detected by the SRE of the lecture (without stop words), and n is the count of the distinct terms:

$L = \{T_1, T_2, T_3, \ldots, T_n\}$   (1)

A chain is constructed to consist of all repetitions ranging from the first to the last appearance of the term in the lecture. The chain is divided into subparts when there is a long break (time distance d) between the terms. The chain is a segment of accumulated appearances of equal terms. The process works as follows:

1. take the term $T_1$ from the term list L,
2. build clusters (we call them chains) so that the distance between two adjacent occurrences of the term $T_i$ is not more than a distance d, count the occurrences (TN), and set the start time and end time for the chain,
3. store the chain data in the database as a chain index for the video, and
4. take the next distinct term $T_{i+1}$ from the term list L and go to 2).
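The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the input is assumed to be a list of (stemmed term, time in seconds) pairs with stop words already removed, the chain index is kept in memory instead of a database, and single occurrences are kept as chains with TN = 1. The names `Chain`, `build_chains`, and `rank_chains` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    term: str     # stemmed term T_i
    start: float  # time of the first occurrence in the chain (seconds)
    end: float    # time of the last occurrence in the chain (seconds)
    tn: int       # number of occurrences in the chain (TN)


def build_chains(occurrences, d):
    """Build the chain index from (term, time) pairs; d is the maximum gap in seconds."""
    times_by_term = {}
    for term, time in occurrences:
        times_by_term.setdefault(term, []).append(time)

    chains = []
    for term, times in times_by_term.items():  # steps 1 and 4: one term after another
        times.sort()
        start = prev = times[0]
        count = 1
        for time in times[1:]:                 # step 2: cluster adjacent occurrences
            if time - prev > d:                # long break: close the current chain
                chains.append(Chain(term, start, prev, count))
                start, count = time, 0
            count += 1
            prev = time
        chains.append(Chain(term, start, prev, count))  # step 3: store the chain
    return chains


def rank_chains(chains, term):
    """Return the chains of one term, highest TN first."""
    return sorted((c for c in chains if c.term == term),
                  key=lambda c: c.tn, reverse=True)


# Hypothetical transcript excerpt: the stem occurs in two separate bursts.
example_occurrences = [("topology", 1600), ("topology", 1700), ("topology", 1800),
                       ("ring", 1750), ("topology", 2600), ("topology", 2700)]
example_chains = build_chains(example_occurrences, d=300)
example_ranked = rank_chains(example_chains, "topology")
```

Because chains are built per term and independently of one another, the chains of different terms may overlap in time, and no explicit segment boundary has to be defined.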
The chains can overlap because they have been separately generated for each term $T_i$ used in the course. For all chains that have been identified for the terms, a weighting scheme is used. Chains containing more repeated terms receive higher scores. The term number TN is the ranking value for the chain; the higher the value of TN, the higher the relevance of the chain. In a preliminary experiment, we had a precision of 88 percent for the first relevant chain [48]. Table 4 and Fig. 3 show an example of the chain index. For example, in the first chain Topology, the stem topology occurred 10 times and the chain has a start time at 1600s and an end time at 2400s in the specific lecture. Further, there is a second chain Topology that starts at 2600s and ends at 3300s, but this chain only has a word repetition of four, so it is not ranked as high as the first chain.

TABLE 4. Example of the Chain Index

3.2 Resolving Word Disambiguation

The chain index supports the resolution of ambiguity. In Fig. 4, the term topology is used in the context of network and the term ring is used in the context of topology. The sense of a word is clear from the context it is used in. Moreover, the index supports a keyword register for each video segment. In Fig. 4, the search query of the user is IP, and the chain with the highest TN and its inside chains (Address, UDP, Suffix, Header, TCP) are returned. The inside chains represent the content of the video segment IP. In fact, a chain has a before, after, and superordinate area, too. These areas (before, inside, after, and superordinate) themselves consist of areas. The user has the opportunity to browse through these areas to find the semantically proper position in the video.

Fig. 3. Chains in a chronology sequence.

Fig. 4. The Inside, Before, After, and Superordinate Area for the Query IP.
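One way to picture the four areas is as interval relations between the chain returned for the query and every other chain. The following Python sketch is an assumption made for illustration: the text names the areas but does not define them formally, so the exact boundary conditions, the `Chain` tuple, the `area` function, and all times are hypothetical.

```python
from typing import NamedTuple


class Chain(NamedTuple):
    term: str
    start: float  # seconds
    end: float    # seconds


def area(query, other):
    """Classify `other` relative to the query chain (assumed definitions)."""
    if other.start >= query.start and other.end <= query.end:
        return "inside"          # lies completely within the query chain
    if other.start <= query.start and other.end >= query.end:
        return "superordinate"   # completely covers the query chain
    if other.start < query.start:
        return "before"          # starts and ends earlier than the query chain
    return "after"               # starts and ends later than the query chain


# Hypothetical chain index for the query IP.
example_query = Chain("IP", 1000, 2000)
example_others = [Chain("Address", 1200, 1500), Chain("Network", 500, 3000),
                  Chain("Router", 800, 1100), Chain("DNS", 1900, 2500)]
example_areas = {c.term: area(example_query, c) for c in example_others}
```

Under these assumed definitions, the terms of the inside chains act as the keyword register of the segment, while the superordinate chains supply the context that helps to disambiguate the query term.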
Furthermore, the problem of composite words can be solved with the help of the chain index. The query IP Address leads to the intersection of the chains IP and Address being returned to the user.

4 EVALUATION

In this section, we describe our evaluation of the index. First, we explain our experimental setup. Second, we evaluate the accuracy of the chaining in two tests.

4.1 Experimental Setup

The video database consists of the course Einführung in das WWW, held in German in the first semester of 2006 at the Hasso Plattner Institut in Potsdam. The second row of Table 1 summarizes the features of the data set. The data set consists of 24 lectures; each lecture is approximately 90 minutes in length. The video corpus has an overall length of approximately 1,860 minutes and is stored as RealMedia files. The SRE needs a training phase (10 minutes) for adapting the microphone to the SRE. The dictionary of the SRE is supplemented with an existing domain lexicon or, if they exist, the keywords from the slides. The domain words (in our case, the keywords from the PowerPoint slides) are trained with a standard tool in 20 minutes. So, the training phase for the SRE is approximately 30 minutes long. Note that this training phase is done only once for the lecture series. Our purpose is neither to enhance existing SREs nor to develop a new speaker-independent SRE; our purpose is to use an existing SRE in practice. If no training is done by the lecturer, the accuracy is lower. Hürst analyzed the dependency between training phase and word accuracy [31].

4.2 The Distance d

To find a suitable break distance d for the chaining, we vary d between 0.5 and 10 minutes and measure the accuracy for 10 keywords. The ten keywords are randomly selected from the domain lexicon. Fig. 5 shows the results of this test. In this figure, C(orrect) means that the retrieved video segment is correct and that its start lies within +/- 30 seconds of the start of the topic in the lecture.
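A sweep over d of this kind could be scripted roughly as follows. This is only a sketch under assumptions: the helper names (`chains_for`, `is_correct`, `sweep`), the toy time stamps, and the hand-labeled topic start times are hypothetical and are not the data used in the experiment.

```python
def chains_for(times, d):
    """Split the sorted time stamps (seconds) of one term into
    (start, end, tn) chains, breaking wherever the gap exceeds d."""
    chains = []
    start = prev = times[0]
    count = 1
    for t in times[1:]:
        if t - prev > d:
            chains.append((start, prev, count))
            start, count = t, 0
        count += 1
        prev = t
    chains.append((start, prev, count))
    return chains


def is_correct(times, true_start, d, tolerance=30):
    """True if the chain with the highest TN starts within
    +/- tolerance seconds of the hand-labeled topic start."""
    best = max(chains_for(sorted(times), d), key=lambda c: c[2])
    return abs(best[0] - true_start) <= tolerance


def sweep(occurrences, truth, d_values):
    """Return {d: fraction of keywords judged correct for that d}."""
    return {
        d: sum(is_correct(occurrences[k], truth[k], d) for k in truth) / len(truth)
        for d in d_values
    }


# Toy data (hypothetical): time stamps in seconds for two keywords.
occurrences = {
    "xml": [400, 900, 950, 1000, 1040],
    "dsl": [2000, 2100, 2250, 5000],
}
truth = {"xml": 900, "dsl": 2000}  # hand-labeled topic start times
results = sweep(occurrences, truth, [30, 180, 600])
```

On this toy input, a very small d fragments the chains and a very large d merges an early stray mention into the main chain, so both extremes shift the start of the best chain away from the labeled topic start.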
It is debatable whether this interval (between 2 minutes and 8 minutes) is an acceptable value, in part because only 10 words were tested and the measurement involved only
one speaker, but it gives a first indication of that value. Also, the question of whether there is a new segment or not does not arise because, in the topic returned, the keyword is definitely used by the speaker. The second test provides more details.

Fig. 5. Accuracy of the results as a function of the distance d.

4.3 Accuracy of the Chaining

One hundred and fifty-three keywords (topic words) are chosen for the evaluation (e.g., XML, DSL, ISDN, ATM, Token, SGML, topology, mpeg, security, exam...). These words occur at least once in the transcript of the course and are different from those in the first test. A detailed study of keyword searches in speech transcripts of lecture recordings is presented in [30]. We evaluated whether our generated chain represents the topic for the search word. For this, we took only the chain with the highest number of topic-word occurrences (TN) and decided whether this chain represents the subject accurately, whether it is the best section in the whole course, and how long the time shift is between the start of the calculated chain and the start of the topic in the lecture. We evaluated whether each hit in the result set:

1. is correct and is within +/- 30 seconds (C),
2. is correct and is around the area of +/- 120 seconds (CA),
3. is similar to the area of the topic and is within +/- 30 seconds (S),
4. is similar to the area of the topic and is around the area of +/- 120 seconds (SA), and
5. is not correct and not in the area of the beginning of the chain (W).

Fig. 6 shows this classification in detail.

Fig. 6. Classification of the results.

Similar to the area of the topic means that the lecturer speaks about a similar or related topic. For example, the search keyword is topology and, typically, several positions exist in the lecture series where this is discussed.
The correct position in the lecture series is the description of topology and not a similar area such as ATM topology or anything else related to that. Surely, the decision of whether it is the correct position or not is a subjective task, but put into the perspective that this is a special kind of lecture series, the task becomes less subjective. The lecture is a basic tutorial about internetworking and, so, each lesson consists of several new terms and definitions and each is presented only once in the whole lecture series. Certainly, the lecturer mentions the terms on several occasions, however, the decision is only whether this is the position where the lecturer explains the terms and definitions. A few retrieved positions are not easy to decide, for example, Address. The speaker discusses Address in the context of Mobile Address, Internet Address, Mailing Address, etc. In this case, we decided that these positions are all correct positions. Table 5 depicts the results of the evaluation percent of all hits are correct and only 13.1 percent are completely wrong. If we sum up all correct hits with all similar hits (similar to the topic), then we obtain a result of 86.8 percent correct hits. Comparing these results with the results of the test from the section Transcriptions of Speech-Recognition Engine is complex. In that section, the segmentation problem leads to very imprecise results (about +/- 2.5 minutes for given segment boundaries; this value is higher when segment boundaries are not given). Additionally, the retrieval within these segments and the represented keywords of the segment are not perfect. One error leads to further errors. Finding the right time position in the video stream could not be done better than with our method, although our retrieval performance of the best chain depended on the accuracy of the SRE. This dependency is evaluated by [30]. 
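The five-way classification used in this evaluation can be written down as a small Python helper. This is a sketch, not the evaluation script of the paper: the function names and the `judgement` labels are hypothetical, the topical judgement itself remains a human decision, and treating a topically correct hit that lies more than 120 seconds away as W is an assumption, since the text does not cover that case.

```python
from collections import Counter


def classify_hit(chain_start, topic_start, judgement):
    """Map one hit to C, CA, S, SA, or W.

    chain_start: start time of the retrieved chain (seconds)
    topic_start: hand-labeled start of the topic (seconds)
    judgement:   'correct', 'similar', or 'wrong' (decided by a person)
    """
    offset = abs(chain_start - topic_start)
    if judgement == "correct":
        if offset <= 30:
            return "C"
        if offset <= 120:
            return "CA"
    elif judgement == "similar":
        if offset <= 30:
            return "S"
        if offset <= 120:
            return "SA"
    # Wrong topic, or outside the +/- 120 second area (assumption).
    return "W"


def summarize(hits):
    """Return the percentage of hits per class for a list of
    (chain_start, topic_start, judgement) tuples."""
    counts = Counter(classify_hit(*h) for h in hits)
    total = len(hits)
    return {label: 100.0 * counts[label] / total
            for label in ("C", "CA", "S", "SA", "W")}
```

Summing the C, CA, S, and SA percentages from `summarize` gives the combined share of correct and similar hits discussed above.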
TABLE 5 Results of the Accuracy of the Chaining with d = 180 Seconds

5 BROWSING

Browsing is a subjective, selective process in which the user filters a large object set for navigation [49]. It is the activity of engaging in a series of glimpses, each of which exposes the browser to objects of potential interest. Depending on the interest of the user, the browser may or may not examine one or more of the objects more closely, and this may or may not lead the browser to acquire the objects [50]. In contrast with that,
retrieval is the process of recalling and finding information and sending it to the user. The system described above, based on the chain index, can retrieve topic words from the text stream to identify relevant positions in the lecture video. The user selects the appropriate information from the list offered and can then browse within the objects. In this section, we present a schematic overview of the components of the browsing system, illustrated in Fig. 7. The system is organized into three basic functional components, described in the following sections.

Fig. 7. Components of the system.

5.1 Speech Component

Lectures are recorded in a multimedia form, for example, as RealMedia files (.rm) in our implementation. The conversion of the audio data into text and the preprocessing of that text is the task of this component. An out-of-the-box SRE is used to generate text data with a time stamp on each word [6], [32]. The SRE needs a training phase for adapting the microphone to it. The dictionary of the SRE is supplemented with an existing domain lexicon or, if they exist, the keywords from the slides. The usefulness of the training phase for obtaining accurate transcripts certainly depends on the user's motivation. The transcript consists of a list of words, each with the point in time at which the word was spotted in the speaker's flow of words. A preprocessing step deletes all the stop words (i.e., words without semantic relevance) and transforms all words into their canonical form (stemming) [38]. The resulting words with their time stamps (we call them raw data) are then stored in a database.

5.2 Chaining Component

The automatic clustering is the task of this component. The input is the raw data from the speech component, and the output is an index of the term chains.
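To make the hand-over between the two components concrete, the following Python sketch shows how time-stamped transcript words might be turned into raw data. Everything here is a simplified assumption: the stop-word list is a tiny made-up sample, `toy_stem` is a crude suffix stripper standing in for a real stemmer such as the algorithm in [38], and the names `preprocess`, `STOP_WORDS`, and `SUFFIXES` are hypothetical.

```python
# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "of", "is", "and", "in"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ies", "ing", "es", "s")


def toy_stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(transcript):
    """Turn (word, time_in_seconds) pairs from the SRE into raw data:
    lower-cased, stop words removed, remaining words stemmed."""
    raw = []
    for word, t in transcript:
        token = word.lower()
        if token in STOP_WORDS:
            continue
        raw.append((toy_stem(token), t))
    return raw


transcript = [("The", 1.0), ("rings", 1.4), ("of", 1.9),
              ("networks", 2.3), ("routing", 3.0)]
raw_data = preprocess(transcript)
```

The resulting list of `(stem, time)` pairs is the kind of input the chaining component would consume.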
This component is presented in the section on the chain index.

5.3 Web Access

This well-known component consists of a Web server. The Web server uses the chain index to answer requests from Web browsers. The system is implemented with the Django framework. Django implements the Model-View-Controller concept. The model represents the data of the application (chain index), the view manages the elements of the user interface, and the controller communicates user actions to the model. The out-of-the-box SRE needs approximately 3.5 hours (dual core 2.4 GHz, 2 GB) for the transcription of a lecture (100 minutes, 12,000 words), and the chaining takes less than 3 seconds. The calculation is done once after the lesson, and the later search process is not time critical. The search process only accesses the chain index, which is based on integer values. The user interface (Fig. 8) consists of three regions (see also [51], [52]):

1. The first region is for the input of the search query and, if the information is available, the restriction of the language and the related lecture.
2. The second region shows the result of the first search. It presents the results with their time information and lists the most relevant chains for a lecture.
3. The third region shows the summary of the before, after, and inside areas and their chains.

After a click on a suitable chain, an external player plays the video from the starting point and the third region of our site is expanded. The third region then consists of the new before, after, and inside areas of the played chain.

6 DISCUSSION

It is clear that the results produced by such a search tool depend on the accuracy of the SRE. Some incorrectly detected words and incorrect compound words are not a problem for our system and user interface. The term occurs very often in relevant chains and, so, a high redundancy exists.
Thus, a few wrongly detected words have no influence on the final results [8]. The main problem, however, is that even a state-of-the-art SRE cannot recognize words from different languages in the transcription process. Lectures in German, especially lectures in informatics, contain some English words and phrases. An example of this problem is the English word Source, for which the SRE detects the German word Soße. To address this, an editor tool was developed for correcting the main chains in the areas before, inside, after, and superordinate. The editor tool (Fig. 9) has an input box for selecting the keyword. After the selection, the chains are expanded to an initial timeline and the working area (Working Frame). In some cases, the video does not start at the beginning of the sought-after video segment. The user can watch the appropriate area and now has the opportunity
to change the start and end time of the chain. The user can also change the word if it was incorrectly detected by the SRE for the chain. The editor can also browse inside the chain and take a detailed look at it. After the editing process, the user can store the changes. After that, the save mode (Approved Frame) is depicted under the working area.

Fig. 8. User interface with time information.

Fig. 9. Editor tool for the correction of the chains.

As a further option, when a chain (or a word) is corrected, all matching terms in the raw data are corrected and the indexing (chaining) starts again. The intention is not to correct the whole transcript, which would be too time-consuming, but to correct the main and fundamental
chains and words. This editor can be used as a multifunctional annotation tool as well. If the speaker-independent SRE performed better, then the training phase, the additional domain words, and the editing process would not be necessary, so our system would not necessarily need any additional resources.

7 CONCLUSION AND FURTHER WORKS

In this paper, we have evaluated a system that allows browsing in sections of videos from a multimedia knowledge base. The results show that it is possible to add data to the result set that supports the learners with the helpful information they are searching for. Regardless of the issues presented by an imperfect SRE, this user interface allows exact, easy, and fast navigation in the video archive. It also allows the disambiguation of words. The results of the tests demonstrate that the browsing and word disambiguation are effective and that learners are indeed able to retrieve the results they are searching for. Additionally, we are planning a usability study with learners to support the results we have already obtained. The searching process and the user interface will be evaluated during this test. Furthermore, we are working on improving the ranking. The part of speech (noun, verb, number, etc.), the duration of the chain, the number of chains for a term, and the occurrence of the term in different lectures could be helpful parameters for a new ranking algorithm. We are also planning to embed an MPEG-7 annotation in an MPEG-4 video container [53]. With MPEG-4 Binary Format for Scenes (BIFS), it is possible to create a scene description with navigational elements and special search facilities which require annotation. This is a compact representation of audiovisual information in a single container with all information needed to compose, consume, share, and distribute this data.
This offers a wide range of applications in video and audio retrieval, such as semantic search and content classification. The application of our algorithm is not limited to indexing university lectures or presentations in general. All activity applications, e.g., newscasts, theater plays, video material, or any kind of speech complemented by textual data, could be analyzed and annotated with the help of the proposed algorithm.

ACKNOWLEDGMENTS

This project was developed in the context of the Web University project, which aims to explore novel Internet and IT technologies in order to enhance university teaching and research. Special thanks to Carol Ebbert for correcting the language. Also to the student Johannes Köhler for the implementation of the interface components.

REFERENCES

[1] S. Linckels, S. Repp, N. Karam, and C. Meinel, The Virtual Tele-Task Professor: Semantic Search in Recorded Lectures, Proc. 38th SIGCSE Technical Symp. Computer Science Education (SIGCSE 07), pp , [2] C.-W. Ngo, F. Wang, and T.-C. Pong, Structuring Lecture Videos for Distance Learning Applications, Proc. Multimedia Software Eng., pp , [3] N. Yamamoto, J. Ogata, and Y. Ariki, Topic Segmentation and Retrieval System for Lecture Videos Based on Spontaneous Speech Recognition, Proc. Eighth European Conf. Speech Comm. and Technology, pp , [4] A. Haubold and J.R. Kender, Augmented Segmentation and Visualization for Presentation Videos, Proc. 13th ACM Int l Conf. Multimedia, pp , [5] L. Tang and J.R. Kender, Designing an Intelligent User Interface for Instructional Video Indexing and Browsing, Proc. Int l Conf. Intelligent User Interfaces (IUI 06), pp , [6] S. Repp and C. Meinel, Segmenting of Recorded Lecture Videos - The Algorithm VoiceSeg, Proc. Conf. Signal Processing and Multimedia Applications (SIGMAP 06), pp , [7] J. Glass, T.J. Hazen, L. Hetherington, and C. Wang, Analysis and Processing of Lecture Audio Data: Preliminary Investigations, Proc.
Workshop Interdisciplinary Approaches to Speech Indexing and Retrieval (HLT-NAACL 04), pp. 9-12, [8] W. Hürst, T. Kreuzer, and M. Wiesenhütter, A Qualitative Study towards Using Large Vocabulary Automatic Speech Recognition to Index Recorded Presentations for Search and Access over the Web, Proc. IADIS Int l Conf. WWW/Internet (ICWI 02), pp , [9] G. Tür, A. Stolcke, D. Hakkani-Tür, and E. Shriberg, Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation, Computational Linguistics, vol. 27, no. 1, pp , [10] G. Miller, An On-Line Lexical Database, Int l J. Lexicography, vol. 3, no. 4, pp , [11] H. Nohr, Grundlagen der automatischen Indexierung. Logos Verlag, [12] R.A. Baeza-Yates and B.A. Ribeiro-Neto, Modern Information Retrieval. ACM/Addison-Wesley, [13] W.-T. Chu and H.-Y. Chen, Cross-Media Correlation: A Case Study of Navigated Hypermedia Documents, Proc. 10th ACM Int l Cong. Multimedia (MULTIMEDIA 02), pp , [14] G.D. Abowd, C.G. Atkeson, J.A. Brotherton, T. Enqvist, P. Gulley, and J. LeMon, Investigating the Capture, Integration and Access Problem of Ubiquitous Computing in an Educational Setting, Proc. Conf. Human Factors in Computing Systems (SIGCHI 98), pp , [15] S. Mukhopadhyay and B. Smith, Passive Capture and Structuring of Lectures, Proc. ACM Int l Conf. Multimedia (Part 1), pp , [16] R. Müller and T. Ottmann, The Authoring on the Fly System for Automated Recording and Replay of (Tele)presentations, Multimedia System, vol. 8, no. 3, pp , [17] H. Ney and S. Ortmanns, Progress in Dynamic Programming Search for lvcsr, Proc. IEEE, vol. 88, no. 8, pp , [18] S. Linckels and C. Meinel, Applications of Description Logics to Improve Multimedia Information Retrieval for Efficient Educational Tools, Proc. ACM SIGMM Int l Conf. Multimedia Information Retrieval (MIR 08), [19] The Description Logic Handbook: Theory, Implementation, and Applications, F. Baader, D. Calvanese, D.L. McGuinness, D. Nardi, and P.F. Patel-Schneider, eds. Cambridge Univ. 
Press, [20] W.W.W.C. W3C, OWL Web Ontology Language, w3.org/tr/owl-features, [21] R.A. Schmidt, Terminological Representation, Natural Language & Relation Algebra, Proc. 16th German Conf. Artificial Intelligence (GWAI 93), pp , [22] S. Linckels and C. Meinel, Resolving Ambiguities in the Semantic Interpretation of Natural Language Questions, Proc. Seventh Int l Conf. Intelligent Data Eng. and Automated Learning (IDEAL 06), pp , Sept [23] E.M. Voorhees, The TREC-8 Question Answering Track Report, Proc. Text REtrieval Conf. (TREC-8), [24] S. Repp, S. Linckels, and C. Meinel, Question Answering from Lecture Videos Based on Automatically-Generated Learning Objects, Proc. Seventh Int l Conf. Web-Based Learning (ICWL 08), pp , 2008.
[25] S. Linckels, C. Dording, and C. Meinel, Better Results in Mathematics Lessons with a Virtual Personal Teacher, Proc. 34th Ann. ACM SIGUCCS Conf. User Services, pp , [26] M.G. Noll and C. Meinel, Web Search Personalization via Social Bookmarking and Tagging, The Semantic Web, Proc. Sixth Int l Semantic Web Conf., Second Asian Semantic Web Conf. (ISWC/ASWC 07), pp , [27] S. Golder and B.A. Huberman, Usage Patterns of Collaborative Tagging Systems, J. Information Science, vol. 32, no. 2, pp , [28] C. Marlow, M. Naaman, D. Boyd, and M. Davis, Ht06, Tagging Paper, Taxonomy, Flickr, Academic Article, to Read, Proc. 17th ACM Conf. Hypertext and Hypermedia (HYPERTEXT 06), pp , [29] H. Sack and J. Waitelonis, Integrating Social Tagging and Document Annotation for Content-Based Search in Multimedia Data, Proc. First Semantic Authoring and Annotation Workshop (SAAW 06), [30] W. Hürst, Multimediale Informationssuche in Vortrags- und Vorlesungsaufzeichnungen, PhD dissertation, Fakultät für Angewandte Wissenschaften, Universität Freiburg, [31] H. Sack and J. Waitelonis, Automated Annotations of Synchronized Multimedia Presentations, Proc. Workshop Mastering the Gap: From Information Extraction to Semantic Representation (ESWC 06), [32] S. Repp, J. Waitelonis, H. Sack, and C. Meinel, Segmentation and Annotation of Audiovisual Recordings Based on Automated Speech Recognition, Proc. Eighth Int l Conf. Intelligent Data Eng. and Automated Learning (IDEAL 07), pp , [33] M.A. Hearst, Multi-Paragraph Segmentation of Expository Text, Proc. 32nd Ann. Meeting Assoc. for Computational Linguistics, pp. 9-16, [34] D. Beeferman, A. Berger, and J.D. Lafferty, Statistical Models for Text Segmentation, Machine Learning, vol. 34, nos. 1-3, pp , [35] M. Utiyama and H. Isahara, A Statistical Model for Domain- Independent Text Segmentation, Proc. Conf. 39th Ann. Meeting of the Assoc.
for Computational Linguistic and 10th Conf. European Chapter, pp , [36] F.Y.Y. Choi, P. Wiemer-Hastings, and J. Moore, Latent Semantic Analysis for Text Segmentation, Proc. Conf. Empirical Methods on Natural Language Processing (EMNLP 01), pp , [37] S. Repp and C. Meinel, Segmentation of Lecture Videos Based on Spontaneous Speech Recognition, Proc. IEEE Int l Symp. Multimedia Workshops (ISMW 08), pp , [38] M. Porter, An Algorithm for Suffix Stripping, Program, vol. 14, no. 3, pp , [39] F.Y.Y. Choi, Advances in Domain Independent Linear Text Segmentation, Proc. First Conf. North American Chapter of the Assoc. for Computational Linguistics, pp , [40] M. Galley, K. McKeown, E. Fosler-Lussier, and H. Jing, Discourse Segmentation of Multi-Party Conversation, Proc. 41st Ann. Meeting of the Assoc. for Computational Linguistics (ACL 03), pp , [41] M.A. Hearst, TextTiling: Segmenting Text into Multi-Paragraph Subtopic Passages, Computational Linguistics, vol. 23, no. 1, pp , [42] M. Lin, J. Nunamaker, M. Chau, and H. Chen, Segmentation of Lecture Videos Based on Text: A Method Combining Multiple Linguistic Features, Proc. 37th Ann. Hawaii Int l Conf. System Sciences (HICSS 04) Track 1, p c, [43] L. Pevzner and M.A. Hearst, A Critique and Improvement of an Evaluation Metric for Text Segmentation, Computational Linguistics, vol. 28, no. 1, pp , [44] I. Malioutov and R. Barzilay, Minimum Cut Model for Spoken Lecture Segmentation, Proc. 21st Int l Conf. Computational Linguistics and 44th Ann. Meeting of the Assoc. for Computational Linguistics (ACL 06), pp , [45] A. Hauptmann and H. Wactlar, Indexing and Search of Multimodal Information, Proc. IEEE Int l Conf. Acoustics, Speech, and Signal Processing (ICASSP 97), p. 195, [46] M. Halliday and R. Hasan, Cohesion in English. Longman, [47] J. Reynar, Topic Segmentation: Algorithm and Applications, PhD dissertation, Univ. of Pennsylvania, [48] S. Repp and C. Meinel, Semantic Indexing for Recorded Educational Lecture Videos, Proc. 
Fourth IEEE Conf. Pervasive Computing and Comm. Workshops (PerCom 06), pp , [49] R.V. Cox, B.G. Haskell, Y. LeCun, B. Shahraray, and L. Rabiner, On the Applications of Multimedia Processing to Telecommunications, Proc. Int l Conf. Image Processing (ICIP 97), vol. 1, pp. 5-8, [50] M.J. Bates, What Is Browsing Really? A Model Drawing from Behavioural Science Research, Information Research, vol. 12, no. 4, Oct [51] S. Repp, A. Groß, and C. Meinel, Dynamic Browsing of Audiovisual Lecture Recordings Based on Automated Speech Recognition, Proc. Ninth Int l Conf. Intelligent Tutoring Systems (ITS 08), pp , [52] S. Repp, A. Groß, and C. Meinel, Webbasierte Suche in Vorlesungsvideos auf Basis der Transkripte eines Spracherkenners, e-learning Fachtagung Informatik (DeLFI), pp , [53] S.F. Chang, T. Sikora, and A. Puri, Overview of the MPEG-7 Standard, IEEE Trans. Circuits and Systems for Video Technology, vol. 11, no. 6, pp , Stephan Repp received the telecommunication engineering degree from the University of Applied Sciences Trier, Germany, in He received the master s degree in system design from the Hochschule Darmstadt, Germany, in He worked as an IT project manager in the data warehouse project of the Deutsche Post AG. He received the state examination from the Staatliches Studienseminar Trier in teaching informatics and electrical engineering in He is currently a PhD student at the Hasso-Plattner-Institute (HPI) for IT- Systems Engineering at the University of Potsdam and a teacher for informatics at the Berufsbildende Schule für Gewerbe und Technik in Trier. His current research interests revolve around information retrieval from recorded audiovisual lecture videos. He is involved in the Web- University project of the HPI. Andreas Groß received the computer science engineering degree from the University Rostock, Germany, in He is a scientific coworker and PhD student at the Hasso-Plattner-Institute for Software Systems Engineering in Potsdam, Germany. 
His current research interests revolve around information retrieval from recorded audiovisual lecture videos. He is working as the chair for Internet technologies and systems with Dr. Christoph Meinel and is involved in the development of the Web-University project space enhancement of the tele-task video recording system. Christoph Meinel studied mathematics and computer science at Humboldt University in Berlin. He received the doctorate degree in 1981 and was habilitated in After visiting positions at the University of Paderborn and the Max-Planck-Institute for computer science in Saarbrücken, he became a full professor of computer science at the University of Trier. He is now the president and CEO of the Hasso- Plattner-Institute for IT-Systems Engineering at the University of Potsdam. He is a full professor of computer science with a chair in Internet technology and systems. His research focuses on IT-security engineering, teleteaching, and telemedicine. He is the author of more than 300 peer-reviewed scientific papers, the chief editor of ECCC Electronic Colloquium on Computational Complexity and IT- Gipfelblog, the chairman of the German IPv6 council, and a member of various scientific boards and program committees. In that time, his research focus was on complexity theory and on BDD-based data structures for VLSI design. Later, he became interested in Internet research, particularly in Internet and information security, as well as in innovative forms of teleteaching. From 1998 to 2002, he was the founding director of the Institut für Telematik, e.v. in Trier. He is a member of the IEEE.
More informationSemantic Web based e-learning System for Sports Domain
Semantic Web based e-learning System for Sports Domain S.Muthu lakshmi Research Scholar Dept.of Information Science & Technology Anna University, Chennai G.V.Uma Professor & Research Supervisor Dept.of
More informationLOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE. indhubatchvsa@gmail.com
LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE 1 S.Manikandan, 2 S.Abirami, 2 R.Indumathi, 2 R.Nandhini, 2 T.Nanthini 1 Assistant Professor, VSA group of institution, Salem. 2 BE(ECE), VSA
More informationTurkish Radiology Dictation System
Turkish Radiology Dictation System Ebru Arısoy, Levent M. Arslan Boaziçi University, Electrical and Electronic Engineering Department, 34342, Bebek, stanbul, Turkey arisoyeb@boun.edu.tr, arslanle@boun.edu.tr
More informationTerm extraction for user profiling: evaluation by the user
Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,
More informationEfficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
More informationisecure: Integrating Learning Resources for Information Security Research and Education The isecure team
isecure: Integrating Learning Resources for Information Security Research and Education The isecure team 1 isecure NSF-funded collaborative project (2012-2015) Faculty NJIT Vincent Oria Jim Geller Reza
More informationAnalysis of Data Mining Concepts in Higher Education with Needs to Najran University
590 Analysis of Data Mining Concepts in Higher Education with Needs to Najran University Mohamed Hussain Tawarish 1, Farooqui Waseemuddin 2 Department of Computer Science, Najran Community College. Najran
More informationUsing News Articles to Predict Stock Price Movements
Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 gyozo@cs.ucsd.edu 21, June 15,
More informationM3039 MPEG 97/ January 1998
INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION ISO/IEC JTC1/SC29/WG11 M3039
More informationONTOLOGY BASED FEEDBACK GENERATION IN DESIGN- ORIENTED E-LEARNING SYSTEMS
ONTOLOGY BASED FEEDBACK GENERATION IN DESIGN- ORIENTED E-LEARNING SYSTEMS Harrie Passier and Johan Jeuring Faculty of Computer Science, Open University of the Netherlands Valkenburgerweg 177, 6419 AT Heerlen,
More informationEfficient Query Optimizing System for Searching Using Data Mining Technique
Vol.1, Issue.2, pp-347-351 ISSN: 2249-6645 Efficient Query Optimizing System for Searching Using Data Mining Technique Velmurugan.N Vijayaraj.A Assistant Professor, Department of MCA, Associate Professor,
More informationHow the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
More informationAn Online Automated Scoring System for Java Programming Assignments
An Online Automated Scoring System for Java Programming Assignments Hiroki Kitaya and Ushio Inoue Abstract This paper proposes a web-based system that automatically scores programming assignments for students.
More informationCHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages 8-10. CHARTE NIVEAU B2 Pages 11-14
CHARTES D'ANGLAIS SOMMAIRE CHARTE NIVEAU A1 Pages 2-4 CHARTE NIVEAU A2 Pages 5-7 CHARTE NIVEAU B1 Pages 8-10 CHARTE NIVEAU B2 Pages 11-14 CHARTE NIVEAU C1 Pages 15-17 MAJ, le 11 juin 2014 A1 Skills-based
More informationMastering CALL: is there a role for computer-wiseness? Jeannine Gerbault, University of Bordeaux 3, France. Abstract
Mastering CALL: is there a role for computer-wiseness? Jeannine Gerbault, University of Bordeaux 3, France Abstract One of the characteristics of CALL is that the use of ICT allows for autonomous learning.
More informationA Platform for Managing Term Dictionaries for Utilizing Distributed Interview Archives
1102 Web Information Systems Modeling A Platform for Managing Term Dictionaries for Utilizing Distributed Interview Archives Kenro Aihara and Atsuhiro Takasu National Institute of Informatics 2-1-2 Hitotsubashi,
More informationSemantically Enhanced Web Personalization Approaches and Techniques
Semantically Enhanced Web Personalization Approaches and Techniques Dario Vuljani, Lidia Rovan, Mirta Baranovi Faculty of Electrical Engineering and Computing, University of Zagreb Unska 3, HR-10000 Zagreb,
More informationEXPLOITING FOLKSONOMIES AND ONTOLOGIES IN AN E-BUSINESS APPLICATION
EXPLOITING FOLKSONOMIES AND ONTOLOGIES IN AN E-BUSINESS APPLICATION Anna Goy and Diego Magro Dipartimento di Informatica, Università di Torino C. Svizzera, 185, I-10149 Italy ABSTRACT This paper proposes
More informationBuilding a Question Classifier for a TREC-Style Question Answering System
Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given
More informationWeek 3. COM1030. Requirements Elicitation techniques. 1. Researching the business background
Aims of the lecture: 1. Introduce the issue of a systems requirements. 2. Discuss problems in establishing requirements of a system. 3. Consider some practical methods of doing this. 4. Relate the material
More informationExam in course TDT4215 Web Intelligence - Solutions and guidelines -
English Student no:... Page 1 of 12 Contact during the exam: Geir Solskinnsbakk Phone: 94218 Exam in course TDT4215 Web Intelligence - Solutions and guidelines - Friday May 21, 2010 Time: 0900-1300 Allowed
More informationCollecting Polish German Parallel Corpora in the Internet
Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska
More informationReviewed by Ok s a n a Afitska, University of Bristol
Vol. 3, No. 2 (December2009), pp. 226-235 http://nflrc.hawaii.edu/ldc/ http://hdl.handle.net/10125/4441 Transana 2.30 from Wisconsin Center for Education Research Reviewed by Ok s a n a Afitska, University
More informationTraining Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object
Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object Anne Monceaux 1, Joanna Guss 1 1 EADS-CCR, Centreda 1, 4 Avenue Didier Daurat 31700 Blagnac France
More informationISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining
A Review: Image Retrieval Using Web Multimedia Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,
More informationWeb Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it
Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content
More informationPersonalization of Web Search With Protected Privacy
Personalization of Web Search With Protected Privacy S.S DIVYA, R.RUBINI,P.EZHIL Final year, Information Technology,KarpagaVinayaga College Engineering and Technology, Kanchipuram [D.t] Final year, Information
More informationAN ONTOLOGICAL APPROACH TO WEB APPLICATION DESIGN USING W2000 METHODOLOGY
STUDIA UNIV. BABEŞ BOLYAI, INFORMATICA, Volume L, Number 2, 2005 AN ONTOLOGICAL APPROACH TO WEB APPLICATION DESIGN USING W2000 METHODOLOGY ANNA LISA GUIDO, ROBERTO PAIANO, AND ANDREA PANDURINO Abstract.
More informationA HUMAN RESOURCE ONTOLOGY FOR RECRUITMENT PROCESS
A HUMAN RESOURCE ONTOLOGY FOR RECRUITMENT PROCESS Ionela MANIU Lucian Blaga University Sibiu, Romania Faculty of Sciences mocanionela@yahoo.com George MANIU Spiru Haret University Bucharest, Romania Faculty
More informationConTag: Conceptual Tag Clouds Video Browsing in e-learning
ConTag: Conceptual Tag Clouds Video Browsing in e-learning 1 Ahmad Nurzid Rosli, 2 Kee-Sung Lee, 3 Ivan A. Supandi, 4 Geun-Sik Jo 1, First Author Department of Information Technology, Inha University,
More informationIncorporating Window-Based Passage-Level Evidence in Document Retrieval
Incorporating -Based Passage-Level Evidence in Document Retrieval Wensi Xi, Richard Xu-Rong, Christopher S.G. Khoo Center for Advanced Information Systems School of Applied Science Nanyang Technological
More informationC o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER
INTRODUCTION TO SAS TEXT MINER TODAY S AGENDA INTRODUCTION TO SAS TEXT MINER Define data mining Overview of SAS Enterprise Miner Describe text analytics and define text data mining Text Mining Process
More informationData and Analysis. Informatics 1 School of Informatics, University of Edinburgh. Part III Unstructured Data. Ian Stark. Staff-Student Liaison Meeting
Inf1-DA 2010 2011 III: 1 / 89 Informatics 1 School of Informatics, University of Edinburgh Data and Analysis Part III Unstructured Data Ian Stark February 2011 Inf1-DA 2010 2011 III: 2 / 89 Part III Unstructured
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationMultimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.
Multimedia Databases Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 0 Organizational Issues Lecture 21.10.2014 03.02.2015
More informationSingle Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results
, pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department
More informationAmerican Journal of Engineering Research (AJER) 2013 American Journal of Engineering Research (AJER) e-issn: 2320-0847 p-issn : 2320-0936 Volume-2, Issue-4, pp-39-43 www.ajer.us Research Paper Open Access
More informationPedagogical Use of Tablet PC for Active and Collaborative Learning
Pedagogical Use of Tablet PC for Active and Collaborative Learning Oscar Martinez Bonastre oscar.martinez@umh.es Antonio Peñalver Benavent a.penalver@umh.es Francisco Nortes Belmonte francisco.nortes@alu.umh.es
More informationSustaining Privacy Protection in Personalized Web Search with Temporal Behavior
Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior N.Jagatheshwaran 1 R.Menaka 2 1 Final B.Tech (IT), jagatheshwaran.n@gmail.com, Velalar College of Engineering and Technology,
More informationLeveraging Video Interaction Data and Content Analysis to Improve Video Learning
Leveraging Video Interaction Data and Content Analysis to Improve Video Learning Juho Kim juhokim@mit.edu Shang-Wen (Daniel) Li swli@mit.edu Carrie J. Cai cjcai@mit.edu Krzysztof Z. Gajos Harvard SEAS
More informationDoes it fit? KOS evaluation using the ICE-Map Visualization.
Does it fit? KOS evaluation using the ICE-Map Visualization. Kai Eckert 1, Dominique Ritze 1, and Magnus Pfeffer 2 1 University of Mannheim University Library Mannheim, Germany {kai.eckert,dominique.ritze}@bib.uni-mannheim.de
More informationVisualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR ccfong@umac.mo Abstract An e-government
More informationUtilising Ontology-based Modelling for Learning Content Management
Utilising -based Modelling for Learning Content Management Claus Pahl, Muhammad Javed, Yalemisew M. Abgaz Centre for Next Generation Localization (CNGL), School of Computing, Dublin City University, Dublin
More informationExtend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia
More informationText Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com
Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around
More informationBroadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
More informationEnhancing the relativity between Content, Title and Meta Tags Based on Term Frequency in Lexical and Semantic Aspects
Enhancing the relativity between Content, Title and Meta Tags Based on Term Frequency in Lexical and Semantic Aspects Mohammad Farahmand, Abu Bakar MD Sultan, Masrah Azrifah Azmi Murad, Fatimah Sidi me@shahroozfarahmand.com
More informationSearch Query and Matching Approach of Information Retrieval in Cloud Computing
International Journal of Advances in Electrical and Electronics Engineering 99 Available online at www.ijaeee.com & www.sestindia.org ISSN: 2319-1112 Search Query and Matching Approach of Information Retrieval
More informationFolksonomies versus Automatic Keyword Extraction: An Empirical Study
Folksonomies versus Automatic Keyword Extraction: An Empirical Study Hend S. Al-Khalifa and Hugh C. Davis Learning Technology Research Group, ECS, University of Southampton, Southampton, SO17 1BJ, UK {hsak04r/hcd}@ecs.soton.ac.uk
More informationCONCEPTCLASSIFIER FOR SHAREPOINT
CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running
More informationSimulating a File-Sharing P2P Network
Simulating a File-Sharing P2P Network Mario T. Schlosser, Tyson E. Condie, and Sepandar D. Kamvar Department of Computer Science Stanford University, Stanford, CA 94305, USA Abstract. Assessing the performance
More informationApproaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval
Approaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval Yih-Chen Chang and Hsin-Hsi Chen Department of Computer Science and Information
More informationCloud Storage-based Intelligent Document Archiving for the Management of Big Data
Cloud Storage-based Intelligent Document Archiving for the Management of Big Data Keedong Yoo Dept. of Management Information Systems Dankook University Cheonan, Republic of Korea Abstract : The cloud
More informationWeb Analytics Definitions Approved August 16, 2007
Web Analytics Definitions Approved August 16, 2007 Web Analytics Association 2300 M Street, Suite 800 Washington DC 20037 standards@webanalyticsassociation.org 1-800-349-1070 Licensed under a Creative
More informationA MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS
A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS Charanma.P 1, P. Ganesh Kumar 2, 1 PG Scholar, 2 Assistant Professor,Department of Information Technology, Anna University
More informationProjektgruppe. Categorization of text documents via classification
Projektgruppe Steffen Beringer Categorization of text documents via classification 4. Juni 2010 Content Motivation Text categorization Classification in the machine learning Document indexing Construction
More informationOntology based Recruitment Process
Ontology based Recruitment Process Malgorzata Mochol Radoslaw Oldakowski Institut für Informatik AG Netzbasierte Informationssysteme Freie Universität Berlin Takustr. 9, 14195 Berlin, Germany mochol@inf.fu-berlin.de
More informationLDA Based Security in Personalized Web Search
LDA Based Security in Personalized Web Search R. Dhivya 1 / PG Scholar, B. Vinodhini 2 /Assistant Professor, S. Karthik 3 /Prof & Dean Department of Computer Science & Engineering SNS College of Technology
More informationConference Navigator 2.0: Community-Based Recommendation for Academic Conferences
Conference Navigator 2.0: Community-Based Recommendation for Academic Conferences Chirayu Wongchokprasitti chw20@pitt.edu Peter Brusilovsky peterb@pitt.edu Denis Para dap89@pitt.edu ABSTRACT As the sheer
More informationINTRUSION PREVENTION AND EXPERT SYSTEMS
INTRUSION PREVENTION AND EXPERT SYSTEMS By Avi Chesla avic@v-secure.com Introduction Over the past few years, the market has developed new expectations from the security industry, especially from the intrusion
More informationCollaborative Development of Knowledge Bases in Distributed Requirements Elicitation
Collaborative Development of Knowledge Bases in Distributed s Elicitation Steffen Lohmann 1, Thomas Riechert 2, Sören Auer 2, Jürgen Ziegler 1 1 University of Duisburg-Essen Department of Informatics and
More informationAutomatic slide assignation for language model adaptation
Automatic slide assignation for language model adaptation Applications of Computational Linguistics Adrià Agustí Martínez Villaronga May 23, 2013 1 Introduction Online multimedia repositories are rapidly
More informationFiltering Noisy Contents in Online Social Network by using Rule Based Filtering System
Filtering Noisy Contents in Online Social Network by using Rule Based Filtering System Bala Kumari P 1, Bercelin Rose Mary W 2 and Devi Mareeswari M 3 1, 2, 3 M.TECH / IT, Dr.Sivanthi Aditanar College
More informationORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM
ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM IRANDOC CASE STUDY Ammar Jalalimanesh a,*, Elaheh Homayounvala a a Information engineering department, Iranian Research Institute for
More informationPSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.
PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS Project Project Title Area of Abstract No Specialization 1. Software
More informationMining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
More informationSupporting Active Database Learning and Training through Interactive Multimedia
Supporting Active Database Learning and Training through Interactive Multimedia Claus Pahl ++353 +1 700 5620 cpahl@computing.dcu.ie Ronan Barrett ++353 +1 700 8616 rbarrett@computing.dcu.ie Claire Kenny
More informationpsychology and its role in comprehension of the text has been explored and employed
2 The role of background knowledge in language comprehension has been formalized as schema theory, any text, either spoken or written, does not by itself carry meaning. Rather, according to schema theory,
More informationThe multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
More informationBest Practices in Online Course Design
Best Practices in Online Course Design For the Instructor Mark Timbrook Minot State University, Office of Instructional Technology 4/24/2014 Best Practices in Online Course Design Best Practices in Online
More information2 AIMS: an Agent-based Intelligent Tool for Informational Support
Aroyo, L. & Dicheva, D. (2000). Domain and user knowledge in a web-based courseware engineering course, knowledge-based software engineering. In T. Hruska, M. Hashimoto (Eds.) Joint Conference knowledge-based
More informationInternational Journal of Engineering Research-Online A Peer Reviewed International Journal Articles available online http://www.ijoer.
REVIEW ARTICLE ISSN: 2321-7758 UPS EFFICIENT SEARCH ENGINE BASED ON WEB-SNIPPET HIERARCHICAL CLUSTERING MS.MANISHA DESHMUKH, PROF. UMESH KULKARNI Department of Computer Engineering, ARMIET, Department
More informationOntology and automatic code generation on modeling and simulation
Ontology and automatic code generation on modeling and simulation Youcef Gheraibia Computing Department University Md Messadia Souk Ahras, 41000, Algeria youcef.gheraibia@gmail.com Abdelhabib Bourouis
More information