IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 3, MAY 1998


Adaptive Recognition of Chinese Characters: Imitation of Psychological Process in Machine Recognition

Yuan-Yuan Yang

Abstract: This paper focuses on the imitation of the human psychological process in machine recognition of Chinese characters. Some results of research on human Chinese character recognition are discussed and unified into a compound mechanism with an adaptive and self-developing nature. Comparisons reveal that the two categories of approaches to machine Chinese character recognition, based on global pattern processing and on subpattern analysis, are very similar to their corresponding human recognition routines. Their characteristics make them well suited to constructing an associated system with adaptive and self-developing abilities. A machine imitation model is thus proposed for Chinese character recognition with different routines. With some simplification, but with the crucial feature of the model retained, an experimental system for handprinted Chinese character recognition based on this concept has been built. Experimental results show that the associated routines continuously improve their performance during operation, even after supervised training is halted. The routine of the global pattern approach eventually learns most of the classes, and the recognition process gradually shifts from the subpattern approach to the global pattern approach. Finally, most of the character samples are recognized by the global pattern approach, while the overall recognition rate of the system increases dramatically.

Index Terms: Adaptive recognition, Chinese character recognition, human cognitive process, machine imitation model, machine recognition, self-development.

I. INTRODUCTION

The Chinese language and Chinese characters are used by nearly one-quarter of the world's population.
But the importance and the difficulty of machine Chinese character recognition stand in sharp contrast. Most of the difficulties are associated with recognizing off-line handwritten Chinese characters. They arise from the following facts.

1) There exists a tremendously large set of characters. A very basic set will contain about characters, while a complete set will include more than ones. Such a large number of categories introduces much complexity in their classification.

2) Most of the blockwise-constructed Chinese characters have rather complicated structures composed of strokes. A character can contain up to 30 strokes, which makes some blurring of the character image unavoidable.

3) Many of the characters are very similar to each other. Their differences may be so small that they can easily be missed during machine recognition.

4) The style of handwritten Chinese characters varies not only from character to character, but also from person to person, depending on individual writing habits, even if the characters are restricted to regular style.

Manuscript received March 16, 1996; revised May 6. This work was supported by the National Natural Science Foundation of China. The author is with the Department of Electrical Engineering, Zhejiang University, Hangzhou, Zhejiang, China (ee_yyyang@ema.zju.edu.cn). Publisher Item Identifier S (98).

In contrast to the difficulties of machine Chinese character recognition stated above, the amazing power of human recognition remains a secret of intelligence. Little is known about it, and its study is still in its infancy. Though some research progress has been made, there is a long way to go before the details of a given human recognition process can be described.
In recent years, a number of results have been reported in cognitive psychology, although their conclusions appear to differ from and even contradict each other. The study of both human and machine recognition of Chinese characters is important in cognitive science and artificial intelligence. It is meaningful to look into the human recognition process and to imitate it where this is practical and technologically necessary, so as to construct machine recognition systems for Chinese characters, especially handwritten ones, that are as good as possible. This paper examines some results of psychological studies, aiming at a possible unified explanation for the seemingly contradictory discoveries, and then sketches the human process and its machine imitation in terms of associated recognition routines and their coordination in recognizing Chinese characters. A machine imitation model with abilities of adaptation and self-development is proposed, and an experimental system for recognizing handwritten Chinese characters has been constructed, with fairly satisfying performance, as expected.

II. PSYCHOLOGICAL PROCESS

In spite of the diverse conclusions obtained from distinct psychological experiments on human Chinese character recognition, two major tendencies can be abstracted from them as follows [1], [2], [4], [5].

1) Some of the experimental results reveal a so-called effect of global structure priority in human Chinese character recognition. The human recognition mechanism tends to process a Chinese character as a single processing unit and to resist any decomposition of the character structure as a whole [3]–[5].

2) On the other hand, other studies reached a quite different conclusion. Their experiments show that the time needed to recognize a Chinese character increases as its number of strokes increases. Such a stroke-number effect implies that strokes play the role of processing units in human recognition of Chinese characters [4]–[7]. Moreover, some experiments also give evidence that, in addition to strokes, other types of structural components, such as radicals, may be taken as processing units [8]–[10].

The two conclusions seem to be opposed to each other. In fact, they can reasonably be unified by considering the following additional results obtained in most of the above experiments.

1) The effect of global structure priority depends on the frequency of occurrence and the structural complexity of the recognized character. For a character of higher frequency or simpler structure, the effect is more significant.

2) The stroke-number effect is also influenced by the frequency of occurrence and the structural complexity of the character. The effect is weakened if the frequency of occurrence is higher or the structure of the character is simpler.
The frequency of occurrence of a character is generally related to how intensively the human mechanism has learned the character, while the structural complexity of a character indicates its degree of difficulty in recognition and learning. Both effects discovered in the human psychological process of recognizing Chinese characters can therefore be considered logical and reasonable for characters of different frequency or different structural complexity. In this sense, the two effects are not contrary but complementary to each other. The human recognition process for Chinese characters is neither a singleton nor a fixed routine, but is compounded from various recognition routines developed dynamically in learning and applied adaptively in reading according to the following two conditions: 1) the reader's current level of education and 2) the familiarity and structural complexity of the recognized character. The process may thus change over time as a human being develops from an initial learner into a skilled reader.

For an initial learner, especially a child at the very beginning of learning, no Chinese characters are familiar, and all have to be recognized by structural decomposition. Thus the stroke-number effect will prevail in recognition. Furthermore, as the initial learner at that stage has no concept of strokes and radicals, stroke-elements have to be used as the simplest processing units. The so-called stroke-element is defined here as a straight-line component of a stroke and will be termed a stroxel (a word coined from stroke-element) in the following discussion. Once the concepts of stroke and radical have been acquired, they will also be involved in recognition and the process will become more powerful.
With accumulated learning and reading, the characters that are constructed very simply or appear very frequently become so familiar that they can be recognized by their global structure information. For very skilled readers, if any exist, it could be expected that all of the daily used Chinese characters would be familiar enough to be recognized by the features of the global characters, with a strong effect of global structure priority resisting any decomposition of their structures. Nevertheless, most ordinary readers, who have been considerably trained but are not so skilled, may adaptively adopt different recognition routines for different characters during reading, according to their individual conditions.

Generally speaking, recognition via structural decomposition serves as a shortcut for people to learn and read unfamiliar Chinese characters, although the recognition time needed is somewhat longer because the process goes from one component to another. By contrast, recognition via global character information is a routine that has to be perfected gradually as reading experience accumulates. However, it needs less processing time than recognition by structural decomposition and is thus the efficient way in the long run.

Association of multiple recognition routines, adaptability of the recognition mechanism, and the self-developing ability of the distinct routines can be considered three specific functions of human Chinese character recognition. They are defined and discussed in more detail below, so that they can be imitated in machine Chinese character recognition.

III. ASSOCIATED MULTIPLE ROUTINES

It has been concluded from the above discussion that the human psychological process of Chinese character recognition is most likely formed by united recognition routines that use attributes of global characters and attributes of their structural components, respectively. A Chinese character is hierarchically constructed from components of three levels: radicals, strokes, and stroxels. A radical is a structure built from strokes that frequently occurs as a substructure of many characters. Some radicals are themselves relatively simple Chinese characters. A stroke is a continuous ink trace in writing. Some strokes are unidirectional and formed by only one stroxel, but most strokes change direction one or more times and thus consist of two or more stroxels. An example of such a structural hierarchy is shown in Fig. 1. Here, the Chinese character HONG (meaning "great" or "enlarge" in English) is composed of two radicals. The left radical is itself a simple character, GONG (meaning "bow"), whereas the right one is not a character at all. They can be further decomposed into five strokes and then ten stroxels.

Fig. 1. The hierarchical structure of the Chinese character HONG, meaning "great" or "enlarge" in English.

Fig. 2. The associated multiple routines for human recognition of Chinese characters.

Accordingly, the associated multiple routines of human Chinese character recognition can now be defined by the directed graph illustrated in Fig. 2. Four structure nodes, representing the four hierarchical structural levels of a character (Chinese character, radical structures, stroke structures, and stroxel structures), are connected to each other by directed arcs. In addition, each of the four structure nodes is linked to an attribute node representing the attributes of the given structure. Any recognition routine starts at one of the four attribute nodes, goes along one or more directed arcs through the corresponding structure nodes, and finally terminates at the Chinese character node. Therefore, eight different routines in total may be associated to form the human recognition mechanism as a whole [1]. The eight possible routines can be categorized into three classes corresponding to three different levels of a human being's reading skill.

Class 1: corresponding to the skilled level of recognition ability for very familiar characters. The only routine included is
Routine 1: Attributes of character -> Chinese character

Class 2: corresponding to the not quite skilled level of recognition for not quite familiar characters. The recognition routines involved are as follows:
Routine 2: Attributes of radicals -> Radical structures -> Chinese character
Routine 3: Attributes of strokes -> Stroke structures -> Radical structures -> Chinese character
Routine 4: Attributes of strokes -> Stroke structures -> Chinese character

Class 3: corresponding to the initial learner's level of recognition. All routines starting from the node "attributes of stroxels" belong to this class. They are
Routine 5: Attributes of stroxels -> Stroxel structures -> Radical structures -> Chinese character
Routine 6: Attributes of stroxels -> Stroxel structures -> Stroke structures -> Radical structures -> Chinese character
Routine 7: Attributes of stroxels -> Stroxel structures -> Stroke structures -> Chinese character
Routine 8: Attributes of stroxels -> Stroxel structures -> Chinese character.

IV. ADAPTABILITY AND SELF-DEVELOPMENT

Adaptability of the human recognition mechanism for Chinese characters is defined as its capability to choose, from its multiple routines, the one most effective in both recognition speed and accuracy for recognizing Chinese characters under different conditions. The self-developing ability of the mechanism is defined as its intelligent aptitude for developing the mechanism itself, even without a teacher's instruction. Self-development can be divided into two phases: self-organization, for organizing new or more skilled recognition routines, and self-learning, for continuously improving the existing recognition routines.

The recognition process of a human being observed at a certain time, for a certain Chinese character under a particular condition, is only a momentary state of adaptation of the human mechanism at a certain stage of its development. The human process can adapt itself to any condition at any time and can also develop dynamically from an initial learner's level to a skilled reader's level. The process of self-development will never cease unless the human mechanism becomes so perfect that any Chinese character in any condition is recognized correctly and efficiently by Routine 1 alone, based on global character information processing. It seems impossible, however, that such a perfect state of human recognition can be reached in the real world.
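Returning briefly to the directed graph of Section III, the count of eight routines can be checked by enumerating every path from a starting structure level up to the Chinese character node. The sketch below is only an illustration: the arc set `ARCS` and the function names are assumptions read off the eight routines listed above, not taken from Fig. 2 itself, and the enumeration order is not the paper's routine numbering.

```python
# Upward arcs between structure levels. This arc set is an assumption
# inferred from the eight routines listed in the text.
ARCS = {
    "stroxel": ["stroke", "radical", "character"],
    "stroke": ["radical", "character"],
    "radical": ["character"],
    "character": [],
}


def routines_from(node):
    """Return every path of structure nodes from `node` up to 'character'."""
    if node == "character":
        return [["character"]]
    paths = []
    for nxt in ARCS[node]:
        for tail in routines_from(nxt):
            paths.append([node] + tail)
    return paths


def all_routines():
    """Collect the routines starting at each of the four attribute nodes."""
    result = []
    for start in ("character", "radical", "stroke", "stroxel"):
        result.extend(routines_from(start))
    return result


if __name__ == "__main__":
    for path in all_routines():
        print("attributes of " + path[0] + ": " + " -> ".join(path))
```

Under these assumed arcs, the character, radical, stroke, and stroxel levels contribute one, one, two, and four routines, respectively, which matches the sizes of Classes 1, 2, and 3.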
The continuous self-development of the mechanism, together with its adaptability, is the real virtue of human beings' marvelous intelligence in learning and experiencing. Self-development enables the human recognition mechanism to accumulate experience, develop more efficient recognition routines from less efficient ones, and continuously improve each of its processing routines. On the other hand, the ability of adaptation makes it possible to choose, from its updated routines, the one with the fastest response among all the routines capable of recognizing correctly under the given condition.

Fig. 3. A sketch representing the associated multiple routines, adaptability, and self-development. *Interaction among associated routines.

A. Sketching the Process

How is the human recognition mechanism developed from an initial learner's level to a skilled reader's level? How do the distinct recognition routines exist together and coordinate their work smoothly? Nothing has been revealed about the dynamic cognitive processes concerned. Here we try to explain the problem as reasonably as possible. The human psychological process for Chinese character recognition, in respect of its associated multiple routines, self-development, and adaptability, is sketched in Fig. 3. It should be stressed that the sketch does not address psychological problems of word recognition other than the three specific functions within our scope. A number of models for the general psychological process of word recognition with interaction of information can be found in the corresponding references [11]–[13]. Constructed directly from the above discussion, Fig. 3 is a reasonable representation of the process. Although it lacks detail, it can explain the principal functions of the recognition mechanism fairly well, as follows.

B. Self-Development

The continuous development of the human mechanism is considered to be mainly due to the self-learning and self-organization functions, with the aid of supervised training under teachers' instructions.
The reading skill of a human being is developed in the following three major stages.

1) At the very beginning of learning Chinese characters, pupils have no knowledge of Chinese characters and are first taught by their teachers to decompose a character into stroxels in recognition and writing, in order to speed up their learning. This means that the lowest-level recognition routine, Routine 8 in Class 3, is formed under supervised learning. Afterwards, the concepts of strokes and radicals are gradually introduced by the teacher's instructions as well as by the function of self-organization. The initial learner is now capable of using components of all three levels, i.e., stroxels, strokes, and radicals, to construct higher-level structures from lower-level structures. The recognition mechanism has now advanced to Routines 5–7 of Class 3.

2) After accumulated self-learning through more reading and writing, the learner masters the attributes of strokes and of some frequently used radicals very well. Then significant progress occurs: strokes and radicals can be directly extracted and correctly recognized without any help from stroxels. Routines 2–4 of Class 2 are then gradually formed by self-organization and self-learning. The learner thus makes a big leap in reading ability. During this stage, routines of Class 2 become the major process, substituting for the routines via stroxel analysis in Class 3. Why are routines of Class 2 unavoidable in the human mechanism? The explanation is that the number of categories of either strokes or radicals is very much smaller than that of Chinese characters, and their structures are much simpler than those of most characters, so their attributes are much easier to master in self-learning.

3) The recognition routine via attributes of global structure has its own peculiarity in the human recognition process.
It is developed all along, from the very beginning. This is probably confirmed by the fact that preschool children can successfully recognize some Chinese characters in a very limited set without any idea of their structural decomposition. But their recognition ability is too poor to recognize characters in a large set, and their learning efficiency is also rather low. Thus, Routine 1 of Class 1 does not become effective enough until certain Chinese characters are considerably familiar, or until the attributes of their global structures are mastered well enough to distinguish them from others. In this sense, Routine 1 is always ready to be trained by its self-learning process whenever Chinese characters are still being recognized by other routines. The frequently used characters and the simply structured ones can be mastered more easily and recognized earlier by Routine 1, while others are still processed via structural decomposition.

C. Adaptability

The adaptability of the mechanism may reasonably be explained by interaction among the associated routines. A routine of higher level is always faster than a routine of lower level, e.g., Routine 1 is faster than Routine 2, Routine 2 is faster than Routine 3, etc. When the routine of higher level finishes its recognition process correctly, it may signal the routines of lower level to stop their work. In this manner the mechanism always completes its work with the most effective and fastest routine at hand.
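The selection rule described under Adaptability can be illustrated with a small sketch. It is a toy model under stated assumptions, not the paper's mechanism: the `Routine` record, its `cost` values, and the `known` sets are hypothetical, and concurrency is modeled simply by taking the fastest routine that recognizes the character correctly.

```python
from dataclasses import dataclass, field


@dataclass
class Routine:
    name: str
    cost: float  # hypothetical processing time; lower means faster
    known: set = field(default_factory=set)  # characters recognized correctly


def adaptive_recognize(char, routines):
    """Return (routine name, elapsed time) for the fastest routine that knows `char`.

    All routines are imagined to start together; the first one to finish
    correctly stops the others, so the elapsed time equals its own cost.
    Returns (None, None) if no routine recognizes the character.
    """
    for routine in sorted(routines, key=lambda r: r.cost):
        if char in routine.known:
            return routine.name, routine.cost
    return None, None


if __name__ == "__main__":
    routines = [
        Routine("Routine 8", 5.0, {"A", "B", "C"}),
        Routine("Routine 1", 1.0, {"A"}),
        Routine("Routine 2", 2.0, {"A", "B"}),
    ]
    for ch in "ABCD":
        print(ch, adaptive_recognize(ch, routines))
```

In this toy setting a familiar character is handled by Routine 1 at low cost, while less familiar ones fall through to slower routines, which is the behavior the text attributes to ordinary readers.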

D. Some Comments

Consequently, the sketch in Fig. 3 gives the following characteristics of the human recognition process.

1) Self-organization, self-learning, and supervised training are the important factors in developing the human recognition mechanism, as they guarantee the advances in constructing and improving its recognition routines and transfer its work from a routine of lower level to one of higher level.

2) The recognition routines of the initial learner's level are the most efficient in respect of learning, because stroxels are the simplest processing units and their attributes are the simplest to master. This fact is important for the early-stage development of human recognition.

3) The distinct routines complement each other by being united and by interacting with each other. The association and interaction make adaptability possible for human recognition. Chinese characters are always recognized by an updated, adaptively adopted routine that is the fastest and most effective for the given condition.

V. MACHINE RECOGNITION

To discuss imitation of the human psychological process in machine recognition of Chinese characters, comparisons have to be made between the human process and the current state of machine recognition. In machine recognition of Chinese characters, the statistical approach is still the major methodology for recognizing printed Chinese characters. On the other hand, in addition to the statistical approach, the structural approach has been studied more and more in recent years for recognition of handwritten Chinese characters [14]–[17]. The principal difference between statistical and structural approaches is as follows: the former emphasizes information of the global characters, i.e., the attributes of global characters in the form of feature vectors, while the latter emphasizes structural decomposition [18]–[20].
In a sense, they are comparable to the corresponding routines of recognition based on global character information and on structural decomposition in the human mechanism, respectively. As for Chinese character recognition using artificial neural nets, which has attracted much study in recent years, it can in principle be considered another form of recognition via global structure information of characters [21]–[24]. To avoid confusion in terminology, the two major categories of approaches for machine recognition are renamed in this paper as the global pattern approach and the subpattern approach, according to whether their strategy is based on global character information or on structural decomposition. Further comparison of machine Chinese character recognition with the human process leads to the following comments.

1) Global pattern approaches to machine Chinese character recognition require intensive system learning. In statistical approaches, similarities are measured among feature vectors. Because features vary from sample to sample, high-dimensional feature vectors are often needed for pattern separability. A large number of training samples is required for system learning in order to describe the patterns statistically. Insufficient training samples or inadequate system learning will lead to erroneous recognition. This is quite similar to human recognition by global character information processing.

2) In subpattern approaches to machine recognition of Chinese characters, if the structural representation system appropriately describes the essence of the character structures, the influence of sample-to-sample variation is largely decreased. Such an effect benefits not only recognition itself but also system learning, so that much less learning and far fewer training samples are required [17], [18].
On the other hand, the recognition time becomes much longer than that of global pattern approaches, because the structural components of a character have to be extracted and matched one after another, with a rather time-consuming search process under any strategy. This is quite the same as for human recognition via structural decomposition.

3) Theoretically, the structural components for subpattern approaches in machine recognition can also be radicals, strokes, and stroxels. At present, however, stroxels are generally applied as the elementary processing units, rather than strokes and radicals. There exist a number of methods for extracting stroxels from character images [25]–[31]. However, it is quite difficult to extract strokes and radicals directly from character images without synthesizing them from stroxels. Nevertheless, in recognition of on-line handwritten Chinese characters, strokes can be extracted easily, because the process of writing provides the necessary information for their segmentation. A suggested method for direct radical extraction from printed Chinese characters, based on analysis of their morphological structure [32], is effective for clean printed characters, but it has difficulties with noisy and handwritten characters because of their serious morphological distortions. As a promising strategy, some study of direct extraction of radicals with artificial neural nets is in progress [33]. Thus, up to now, any processing method practical for the subpattern approach to machine recognition has to start by processing attributes of stroxels [34], except that on-line recognition can start with attributes of strokes. It is very interesting to notice that the subpattern approach routines suitable for off-line recognition of handwritten Chinese characters are exactly the same as those of human recognition for initial learners. Fig. 4 shows all possible routines for machine recognition, in comparison to Fig. 2.
4) In the present state of machine Chinese character recognition, the existing systems are exclusively constructed with a fixed, single processing routine via global character information or structural decomposition, or a hybrid of both. They have neither the ability of self-development nor the ability of adaptation. A fixed single routine is able to achieve success to a certain extent, but can hardly be considered adequate to achieve, in any condition, a performance as good as that of the human mechanism, especially for recognizing off-line handwritten Chinese characters. A recognition method based on global character feature processing is suitable neither for characters of unfamiliar styles nor for characters on which the system is insufficiently trained. On the other hand, recognition based on structural decomposition is rather time consuming and inefficient. Besides, structural decomposition may be troubled by blurred character images. All this illustrates that adaptability and self-development are necessary for building machine Chinese character recognition systems that are as good as possible.

5) Fortunately, the efficient learning of the subpattern approach and the efficient recognition of the global pattern approach complement each other. The two approaches can coordinate well if associated in a proper manner. It will be very favorable for machine recognition of Chinese characters to imitate the human psychological process and to construct systems with adaptive and self-developing abilities. This idea makes a machine recognition system capable of working efficiently in both learning and recognition, while its recognition rate can also be raised considerably.

VI. MACHINE IMITATION MODEL

The key problem now is how to construct a machine Chinese character recognition system in which routines of both the global pattern approach and the subpattern approach are associated, in order to provide it with abilities of adaptation and self-development, so that the system can always work in an optimal state with its best routine for the given condition and can also continuously improve the performance of the different routines as well as of the whole system.
According to the sketch of the psychological process in Fig. 3, the machine imitation of the human process can be realized without serious technical difficulties, except that the self-organization system, which builds the representation system for characters by machine learning, is a rather complicated problem requiring further research [35]. Fig. 5 illustrates a model proposed for machine imitation of the human psychological process for Chinese character recognition. In the illustrated model, recognition routines are shown in two categories, according to whether they belong to the global pattern approach (i.e., Routine 1 of Class 1) or to the subpattern approach (including the routines of Class 2 and Class 3). For simplicity of sketching, each category of routines is packed together as a whole unit instead of showing all its different routines separately. A complete subsystem, consisting of its corresponding recognition process, self-organization, self-learning, and supervised learning, is constructed for each routine (although in Fig. 5 subsystems are shown simply per category). Because it would be too complicated for practical application to provide recognition systems with the ability of self-organization, the pattern representation system can be built in advance instead of being built in the process of self-organization.

Fig. 4. Practical routines for machine recognition of Chinese characters. *Practical for on-line recognition. **Direct radical extraction under study.

Fig. 5. A multiroutine machine imitation model with adaptive and self-developing abilities.

The subsystems of the model are not isolated but interact with each other. When a subsystem finishes its recognition with a believably correct result, the interaction stops the processing of the other subsystems and signals the self-learning process to start.
In general, the processing routine of Class 1 is faster than those of Classes 2 and 3, and if the process of Class 1 is perfect enough, it will obtain correct recognition results, so the processes of Classes 2 and 3 will normally be banned. But if the process of Class 1 is not so perfect, characters insufficiently trained may be wrongly recognized. Then it will not flag down the processes of Classes 2 and 3. On the contrary, it will help them to continue their processing. In case all subsystems can not confirm whether their recognition results of a given character are correct or not, an additional general analyzer can be constructed to take a part in its final decision with some extra information provided, e.g., the contextual information of the text represented in syntactic and semantic form, etc. For realizing the interaction among different routines the recognition result of any routine for a Chinese character is estimated by a self-evaluation unit, which, for instance, can simply be a confidence function properly formulated. Two thresholds of the confidence function can be determined for identifying the recognition results, so that all of the results will be evaluated and grouped into three cases: believably true, false, and uncertain. If the value of the confidence function implies a false or uncertain result, decision should be

made or confirmed by further investigation with another recognition routine or a general analyzer. The interaction operates as follows.

1) When one of the recognition routines finishes its processing and the value of its confidence function is favorable, the recognition result is identified as true. The routine immediately stops the processing of all other routines and triggers the self-learning unit of each recognition routine. This interaction is the major phase of the self-developing ability of the imitation model. In such a case, the routine that recognizes a character correctly acts as a teacher giving instruction to each of the other routines.

2) If a routine delivers its recognition result first but the value of the confidence function shows the result to be false, the routine is evidently poorly suited to the given character and cannot contribute to the decision. The interaction unit will neither stop the work of the other routines nor start the self-learning process. Each of the other routines therefore continues its processing until it is flagged down at the moment a correct result is obtained by any one of them.

3) When a recognition result delivered by a routine is evaluated as uncertain, a case of doubt arises. What the routine can do is gather its most promising candidates and pass them on to the other routines for further verification. That is, the preceding routine serves in this case as a preclassifier.

4) A very poorly written or seriously blurred character may not be recognized correctly by any of the recognition routines. When this happens, the interaction signals an additional processor, the general analyzer, to start its work. Analysis is applied to all the doubtful candidates from the previous processes in order to make an appropriate decision.
Here the contextual knowledge of the input text may be applied. A recognition failure is finally output if none of the above measures reaches a result certified to be correct for the given character.

The proposed imitation model for machine Chinese character recognition possesses all the peculiarities of the human psychological mechanism, among which the association of different routines and the abilities of adaptation and self-development are the most important for the novel suggestion. It opens the possibility of constructing Chinese character recognition systems with much flexibility and high performance.

VII. AN EXPERIMENTAL SYSTEM

Following the suggested machine imitation model, an experimental system has been constructed for recognizing off-line handprinted Chinese characters to verify the performance. As shown in Fig. 6, the system consists of only two recognition routines: a routine of Class 1 built on the principle of the global pattern approach, and a routine of Class 3 based on a subpattern approach with stroxel analysis.

Fig. 6. An experimental system for recognizing handwritten Chinese characters.

It should be clarified that in Fig. 5 the recognition routines compete in parallel, while in the experimental system of Fig. 6 the two routines work one after another. The subpattern approach processes only the rejects of the global pattern approach. Such a modification does not change the adaptive and self-developing character of the system. For each of the two routines a subsystem is constructed with four units: a feature/structure extraction unit, a pattern classifier, a database for pattern representation, and a unit for both self-learning and supervised learning. No self-organization ability is provided.
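The sequential arrangement described above, combined with the three-way self-evaluation (believably true, false, uncertain), can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Verdict`, `evaluate`, `Outcome`, and `recognize`, the routine signature, the numeric thresholds, and the inclusive treatment of the threshold boundaries are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Verdict(Enum):
    TRUE = "believably true"
    FALSE = "false"
    UNCERTAIN = "uncertain"


def evaluate(confidence: float, false_upper: float, true_lower: float) -> Verdict:
    """Group a confidence value into three cases using two thresholds.

    Assumption: boundaries are inclusive; the paper does not state this here.
    """
    if false_upper >= true_lower:
        raise ValueError("false_upper must lie below true_lower")
    if confidence >= true_lower:
        return Verdict.TRUE
    if confidence <= false_upper:
        return Verdict.FALSE
    return Verdict.UNCERTAIN


# Hypothetical routine interface: (image, optional shortlist) ->
# (first candidate, its confidence, ranked candidate labels).
Result = Tuple[str, float, List[str]]
Routine = Callable[[object, Optional[List[str]]], Result]


@dataclass
class Outcome:
    label: Optional[str]   # None means "defer to the general analyzer"
    decided_by: str
    self_learn: bool       # whether self-learning should be triggered


def recognize(image: object, global_routine: Routine, subpattern_routine: Routine,
              false_upper: float, true_lower: float) -> Outcome:
    """Run the global routine first; the subpattern routine sees only its rejects."""
    label, confidence, candidates = global_routine(image, None)
    verdict = evaluate(confidence, false_upper, true_lower)
    if verdict is Verdict.TRUE:
        return Outcome(label, "global", True)
    # An uncertain first routine acts as a preclassifier and passes on candidates.
    shortlist = candidates if verdict is Verdict.UNCERTAIN else None
    label, confidence, _ = subpattern_routine(image, shortlist)
    if evaluate(confidence, false_upper, true_lower) is Verdict.TRUE:
        return Outcome(label, "subpattern", True)
    return Outcome(None, "general analyzer", False)
```

In this sketch a single pair of thresholds is shared by both routines for brevity; a real system would presumably calibrate thresholds per subsystem.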
All the character samples input to the system are first processed by a nonlinear normalization (NLN) unit in order to equalize their stroke-background space distribution [36]-[38]. Although a general analyzer is included in the experimental system, it makes its decision without any contextual information. Since the purpose of our experiments is to observe the performance of adaptive and self-developing abilities in Chinese character recognition, the results are not influenced by the absence of contextual information in the general analyzer. The essentials of the experimental system are presented below.

A. Subsystem 1

Subsystem 1, for the global pattern approach, is a three-stage classifier with a representation system of feature vectors of 400 global pattern attributes, which describe for each Chinese character the characteristics of the peripheral structure, the distribution of strokes inside the character, and the distribution of stroke directions over the character image. The representation system gathers the initial values of its attributes from the feature extraction unit through supervised learning under the instruction of a teacher, at the same time as the representation system of subsystem 2 for the subpattern approach is initialized. Afterwards, each time a character is recognized correctly by either subsystem, the self-learning process updates the data of the representation system to the new statistical mean value of each attribute.
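The self-learning update described above, in which each attribute is moved to its new statistical mean whenever a character is accepted, can be illustrated with an incremental mean. This is a minimal sketch, not the paper's implementation: the class name `Prototype`, the sample counter, and the assumption that the initial supervised sample counts as the first observation are all hypothetical.

```python
from typing import List, Sequence


class Prototype:
    """Running per-attribute mean for one character class (hypothetical helper)."""

    def __init__(self, initial: Sequence[float]) -> None:
        # Initial values come from supervised learning; counted as one sample.
        self.mean: List[float] = [float(v) for v in initial]
        self.count = 1

    def self_learn(self, features: Sequence[float]) -> None:
        """Fold one accepted sample into the mean without storing past samples."""
        if len(features) != len(self.mean):
            raise ValueError("feature vector length mismatch")
        self.count += 1
        self.mean = [m + (x - m) / self.count
                     for m, x in zip(self.mean, features)]


# Two attributes are used here for brevity; the paper's vectors have 400.
proto = Prototype([0.0, 2.0])
proto.self_learn([2.0, 4.0])
proto.self_learn([4.0, 0.0])
print(proto.mean)  # [2.0, 2.0]
```

The incremental form gives the same result as averaging all accepted samples, while keeping only the current mean and a counter per class.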

For the purpose of classification, Euclidean distances are applied as the dissimilarity measure of characters. Unknown characters input to the subsystem are matched by a nearest neighbor method.

B. Subsystem 2

Subsystem 2, for the subpattern approach, includes a two-stage classifier by means of stroxel analysis. The first stage plays the role of a preclassifier, matching the stroxels that form some designated key radicals, while the second stage matches the whole structure of a character and then completes the recognition. Both stages are built on the principle of a so-called structural semantic approach based on knowledge processing, in which Chinese character structures are modeled in terms of their stroxels, specifically attributed and arranged with distinct relations [39], [40]. A stroxel extracting unit extracts all stroxels with their attributes and relations from each input character; a learning system then generalizes them to form the structural semantic knowledge representation system. A brief introduction to this representation system follows.

The structural semantic representation for a character structure from stroxels is defined as a 2-tuple consisting of the set of stroxels in the character and the corresponding set of stroxel relations (1), (2). The attribute vector of a stroxel (3) includes the coordinates of the starting point and the end point of the stroxel. As for the stroxel relations, they can be described in a more detailed form, but essentially each belongs to one of four major categories: two stroxels connected at one end of each; an end of one stroxel touching the other at a non-end point; a crossing; and no connection at all.
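To make the 2-tuple of stroxels and relations more concrete, the following sketch models a stroxel by its two end points and assigns one of the four relation categories to a pair of stroxels. Everything here is an illustrative assumption: the names `Stroxel`, `RelationKind`, and `classify_relation`, the treatment of stroxels as straight segments, the tolerance `tol`, and the geometric tests are hypothetical and are not the paper's extraction method.

```python
from dataclasses import dataclass
from enum import Enum
from math import hypot
from typing import Tuple

Point = Tuple[float, float]


class RelationKind(Enum):
    END_TO_END = "connected at one end of each"
    END_TO_BODY = "an end touches a non-end point"
    CROSSING = "crossing"
    DISJOINT = "no connection"


@dataclass(frozen=True)
class Stroxel:
    start: Point
    end: Point


def _dist(p: Point, q: Point) -> float:
    return hypot(p[0] - q[0], p[1] - q[1])


def _point_segment_distance(p: Point, s: Stroxel) -> float:
    (x1, y1), (x2, y2) = s.start, s.end
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return _dist(p, s.start)
    # Project p onto the segment and clamp to its extent.
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return _dist(p, (x1 + t * dx, y1 + t * dy))


def _side(o: Point, a: Point, b: Point) -> float:
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _properly_cross(a: Stroxel, b: Stroxel) -> bool:
    # Each segment's end points lie strictly on opposite sides of the other.
    return (_side(a.start, a.end, b.start) * _side(a.start, a.end, b.end) < 0
            and _side(b.start, b.end, a.start) * _side(b.start, b.end, a.end) < 0)


def classify_relation(a: Stroxel, b: Stroxel, tol: float = 1.0) -> RelationKind:
    """Assign one of the four relation categories, checked in order of specificity."""
    if any(_dist(p, q) <= tol for p in (a.start, a.end) for q in (b.start, b.end)):
        return RelationKind.END_TO_END
    if (any(_point_segment_distance(p, b) <= tol for p in (a.start, a.end))
            or any(_point_segment_distance(q, a) <= tol for q in (b.start, b.end))):
        return RelationKind.END_TO_BODY
    if _properly_cross(a, b):
        return RelationKind.CROSSING
    return RelationKind.DISJOINT
```

A character would then be a pair of a stroxel list and a table of pairwise relations; the more detailed relation descriptions mentioned in the text are outside the scope of this sketch.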
The structural semantic knowledge of a Chinese character, constructed from attributed stroxels arranged with relations, can be represented in first-order predicate logic by the predicates STROXEL, RELATION, and CHARACTER (4)-(6). However, in the experimental system frame representations are adopted for the structural semantic knowledge in order to include more detailed descriptions for distinguishing among very similar characters, as well as some instructions for convenience of matching [41]. The structural semantic representation system of Chinese characters describes the real essence of the characters and therefore has the peculiarity of high learning efficiency. Thus the initial training required for the system can be minimized. A model-driven learning system implements the training under a teacher's supervision. In the process of recognition, stroxels are matched one after another by a strategy based on the principle of constraint satisfaction of the attributes and the stroxel relations, searching for the well-matched stroxels. A heuristic function in terms of the distances of stroxel attributes and the deviations of stroxel relations is applied, so that an unknown character input to the subsystem can be recognized along an optimal path of the well-pruned search tree [40].

C. Self-Evaluation

For both subsystems a confidence function is expressed as a piecewise function of distance (7), (8). The values of its coefficients depend on the distance distribution of characters and are determined as follows. For the distances of the first candidates two thresholds are designated: in respect of their probabilities, results on one side of the first threshold can be considered true, and results on the other side of the second threshold will be considered false. Correspondingly, two thresholds of the confidence function are chosen, one as the lower bound of true and one as the upper bound of false. The coefficients are then determined by setting the confidence function equal to these values at the corresponding distance thresholds.
When the value of the confidence function falls in the region of (i.e., when, uncertain cases will occur if the distance of the second candidates have values. In order to reject such uncertain results, a threshold is defined for the lower bound of as a guarantee for the correctness of (9) (10) The coefficients and are also determined by the distance distribution. According to and, the evaluation can be any of the following three cases. Case 1: A recognition result is evaluated to be true, if the first and second candidates and have their values of confidence function fulfilling the following conditions: or (11) (12)

and

Case 2: The recognition result is considered false, if (13) (14)

Case 3: The result is evaluated as uncertain, if the values of confidence functions and are within the range and (15) (16)

D. Interaction

The interaction between the two routines acts according to the values of the confidence functions of the first and second candidates, as follows.

1) When the global pattern approach (subsystem 1) gets a result evaluated to be true, the recognition is finished and the self-learning of both subsystems is started.

2) When the global pattern approach has an evaluation of false, the subpattern approach (subsystem 2) begins to work.

3) If the recognition result of the global pattern approach is evaluated as uncertain, ten candidate characters are handed down to the subpattern approach for further recognition.

4) When the subpattern approach gets an evaluation of true, the result is verified. The recognition process is finished and the self-learning of each subsystem is started.

5) If the subpattern approach gets an evaluation of false or uncertain, the general analyzer, described below, makes a decision for the system output. Its final decision may be a reject or a recommended but not identified result. However, no self-learning is implemented unless both subsystems get uncertain results and have the same first candidate. Blocking the self-learning process is necessary because the results thus obtained are accepted without further evaluation and also because, to a certain extent, these character samples may be in a vague condition unsuitable for training.

E. General Analyzer

For a system intended for practical application, the general analyzer following the classifiers should preferably make its final decision according to contextual information, by means of syntactic and semantic analysis or otherwise by an n-gram algorithm based on the hidden Markov model considering transition probabilities of connected words or characters. However, the experimental system discussed in this paper is specifically oriented to imitation of the human psychological process with respect to adaptive and self-developing abilities in machine Chinese character recognition. It is of no importance whether the material to be recognized is a text or a set of independent characters. Thus, the system does not use a general analyzer with textual information, but treats all the characters as isolated ones. In fact, in the experiments described below the characters to be recognized were collected independently and without any contextual meaning.

A subordinate algorithm is added to the experimental system for processing the false and uncertain results given by the foregoing classifiers. It makes the final decision without any additional information, but with a hybrid nature that considers the evaluations of both previous subsystems, based on the global pattern approach and the subpattern approach. Such a subordinate algorithm can therefore be regarded as a passive general analyzer. It outputs a reject or provides a recommended recognition result without further identification, as follows.

1) If both subsystems give false results, the general analyzer makes a final decision to reject the unknown character.

2) If one of the subsystems gives a false result and the other an uncertain one, the input character is in a condition quite unsuitable for the former subsystem. The general analyzer takes the first candidate given by the latter subsystem as a recommended result for final output.
3) If both subsystems get uncertain evaluations and obtain the same first candidate, the general analyzer accepts it as a true result for system output. Self-learning of both subsystems is then implemented.

4) If the results of the two subsystems are uncertain and different first candidates are obtained, ten candidates are handed to the general analyzer for further decision. In such a case a compound confidence function is defined for each of the candidates as the root mean square value of its confidence functions for subsystems 1 and 2 (17), (18). The candidate whose compound confidence is the largest among the candidates is taken as the recommended result for the final decision (19).

VIII. EXPERIMENTAL RESULTS

For experimental study of the system performance, 51 sample sets of 1128 Chinese characters, i.e., 57 528 samples in total, written clearly in regular style by persons of different educational backgrounds, were used. The quality and writing style of the characters in different sample sets unavoidably differ from each other to a certain extent. Therefore, the results obtained in the experiments depend more or less on the sequential order in which the sample sets are applied. However, this influence was found to be unimportant to the characteristics of the system performance and would not bring about any meaningful effect on the substance of the

experimental results. Some typical results of the tests are illustrated and discussed below.

A. Test Procedure

In the experiments, the system performance in both the training stage and the working stage was studied. Of the 51 sample sets prepared, ten were taken for the training stage, to bring the system to a state capable of working autonomously and to observe the learning property of the system. Forty of the others were used in the working stage to examine the performance of the system with respect to its adaptability and self-development. The one remaining sample set was not involved in training or self-development but was reserved for observing the improvement in recognition rates, not only for the experimental system as a whole but also for each of the subsystems working separately.

Only once, at the very beginning of the training stage, was the system trained with a whole sample set of Chinese characters by means of supervised learning under a teacher's instruction. In order to initialize the representation systems with more suitable data, the first sample set was chosen to be of relatively standard quality. After the system was initialized, the other nine training sample sets were input successively to the system for preliminary recognition as well as further training, during which self-learning was implemented if a character was classified by any of the subsystems to a category evaluated as true, while supervised learning was introduced otherwise. Afterwards, the system performance in the working stage was examined. In each cycle of the recognition process, successively using the 40 testing sample sets, no more supervised learning was given, and the further improvement of the system became the duty of self-learning.
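The training-stage protocol just described can be sketched as a loop that also records, per sample set, the share of characters that needed supervised learning. This is a minimal sketch under stated assumptions: the function name `training_stage` and the three callables `recognize`, `supervised_learn`, and `self_learn` are hypothetical stand-ins for the system's units, and `recognize` is assumed to return the first candidate together with a flag telling whether it was evaluated as true.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[object, str]  # (character image, teacher's label)


def training_stage(
    sample_sets: Sequence[Sequence[Sample]],
    recognize: Callable[[object], Tuple[str, bool]],
    supervised_learn: Callable[[object, str], None],
    self_learn: Callable[[object, str], None],
) -> List[float]:
    """Return, for each sample set, the fraction handled by supervised learning."""
    shares: List[float] = []
    for index, samples in enumerate(sample_sets):
        supervised = 0
        for image, teacher_label in samples:
            if index == 0:
                # The first set initializes the representation systems.
                supervised_learn(image, teacher_label)
                supervised += 1
                continue
            predicted, evaluated_true = recognize(image)
            if evaluated_true:
                # No teacher involved: the system trusts its own result.
                self_learn(image, predicted)
            else:
                supervised_learn(image, teacher_label)
                supervised += 1
        shares.append(supervised / len(samples) if samples else 0.0)
    return shares
```

In the working stage the same loop would apply with the supervised branch removed, so that samples not evaluated as true are simply not used for learning.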
During the experiments of the working stage, the system adaptation and self-development were observed, and the overall recognition rate of the whole system and the shares of each subsystem were recorded. To compare the development of the overall system and of each subsystem working separately in its singleton form, the extra testing sample set of characters was used in order to obtain comparable data under the same conditions. In such tests self-learning was blocked, and the recognition rates of the subsystems were counted according to the correctness of their first candidates.

B. Training Stage

When the experimental system was trained with the designated ten training sample sets, a dramatic shift from supervised learning to self-learning took place. Fig. 7 illustrates that the percentage of characters processed by supervised learning descends from 100% to 4.6%, while that processed by self-learning ascends from 0% to 95.4%. Such a quick shift is possible because of the high learning efficiency of the subpattern approach.

Fig. 7. Shift from supervised learning to self-learning in the training stage.

Fig. 8. The learning property of subsystems 1 and 2 in the training stage.

The different learning efficiencies of subsystem 1 and subsystem 2 can be compared in Fig. 8, which shows the recognition rates of both subsystems obtained after every training cycle when the characters of the extra sample set were classified by each of the subsystems working separately in its singleton form. After ten cycles of training, subsystem 1 reached a recognition rate of only 33.4%, whereas subsystem 2 reached a rate as high as 95.2%. The subsystem based on the global pattern approach learned much more slowly than the subsystem based on the subpattern approach constructed with structural semantic knowledge processing.
This can be explained by the fact that the structural semantic knowledge applied is capable of representing the essential structural peculiarities of Chinese characters, and that the complicated attributes of a global character are replaced by the much simpler attributes and relations of stroxels.

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD. Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.

More information

GOAL-BASED INTELLIGENT AGENTS

GOAL-BASED INTELLIGENT AGENTS International Journal of Information Technology, Vol. 9 No. 1 GOAL-BASED INTELLIGENT AGENTS Zhiqi Shen, Robert Gay and Xuehong Tao ICIS, School of EEE, Nanyang Technological University, Singapore 639798

More information

Numerical Field Extraction in Handwritten Incoming Mail Documents

Numerical Field Extraction in Handwritten Incoming Mail Documents Numerical Field Extraction in Handwritten Incoming Mail Documents Guillaume Koch, Laurent Heutte and Thierry Paquet PSI, FRE CNRS 2645, Université de Rouen, 76821 Mont-Saint-Aignan, France Laurent.Heutte@univ-rouen.fr

More information

QUALITY TOOLBOX. Understanding Processes with Hierarchical Process Mapping. Robert B. Pojasek. Why Process Mapping?

QUALITY TOOLBOX. Understanding Processes with Hierarchical Process Mapping. Robert B. Pojasek. Why Process Mapping? QUALITY TOOLBOX Understanding Processes with Hierarchical Process Mapping In my work, I spend a lot of time talking to people about hierarchical process mapping. It strikes me as funny that whenever I

More information

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION Pilar Rey del Castillo May 2013 Introduction The exploitation of the vast amount of data originated from ICT tools and referring to a big variety

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Preliminary Discussion on Program of Computer Graphic Design of Advertising Major

Preliminary Discussion on Program of Computer Graphic Design of Advertising Major Cross-Cultural Communication Vol. 11, No. 9, 2015, pp. 19-23 DOI:10.3968/7540 ISSN 1712-8358[Print] ISSN 1923-6700[Online] www.cscanada.net www.cscanada.org Preliminary Discussion on Program of Computer

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Brown Hills College of Engineering & Technology Machine Design - 1. UNIT 1 D e s i g n P h i l o s o p h y

Brown Hills College of Engineering & Technology Machine Design - 1. UNIT 1 D e s i g n P h i l o s o p h y UNIT 1 D e s i g n P h i l o s o p h y Problem Identification- Problem Statement, Specifications, Constraints, Feasibility Study-Technical Feasibility, Economic & Financial Feasibility, Social & Environmental

More information

Making Decisions in Chess

Making Decisions in Chess Making Decisions in Chess How can I find the best move in a position? This is a question that every chess player would like to have answered. Playing the best move in all positions would make someone invincible.

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Teaching Methodology for 3D Animation

Teaching Methodology for 3D Animation Abstract The field of 3d animation has addressed design processes and work practices in the design disciplines for in recent years. There are good reasons for considering the development of systematic

More information

Five High Order Thinking Skills

Five High Order Thinking Skills Five High Order Introduction The high technology like computers and calculators has profoundly changed the world of mathematics education. It is not only what aspects of mathematics are essential for learning,

More information

Mechanics 1: Vectors

Mechanics 1: Vectors Mechanics 1: Vectors roadly speaking, mechanical systems will be described by a combination of scalar and vector quantities. scalar is just a (real) number. For example, mass or weight is characterized

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Novelty Detection in image recognition using IRF Neural Networks properties

Novelty Detection in image recognition using IRF Neural Networks properties Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,

More information

Analysis of Micromouse Maze Solving Algorithms

Analysis of Micromouse Maze Solving Algorithms 1 Analysis of Micromouse Maze Solving Algorithms David M. Willardson ECE 557: Learning from Data, Spring 2001 Abstract This project involves a simulation of a mouse that is to find its way through a maze.

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 21 CHAPTER 1 INTRODUCTION 1.1 PREAMBLE Wireless ad-hoc network is an autonomous system of wireless nodes connected by wireless links. Wireless ad-hoc network provides a communication over the shared wireless

More information

Diagnosis of Students Online Learning Portfolios

Diagnosis of Students Online Learning Portfolios Diagnosis of Students Online Learning Portfolios Chien-Ming Chen 1, Chao-Yi Li 2, Te-Yi Chan 3, Bin-Shyan Jong 4, and Tsong-Wuu Lin 5 Abstract - Online learning is different from the instruction provided

More information

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S. AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

More information

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan Handwritten Signature Verification ECE 533 Project Report by Ashish Dhawan Aditi R. Ganesan Contents 1. Abstract 3. 2. Introduction 4. 3. Approach 6. 4. Pre-processing 8. 5. Feature Extraction 9. 6. Verification

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

How to Learn Good Cue Orders: When Social Learning Benefits Simple Heuristics

How to Learn Good Cue Orders: When Social Learning Benefits Simple Heuristics How to Learn Good Cue Orders: When Social Learning Benefits Simple Heuristics Rocio Garcia-Retamero (rretamer@mpib-berlin.mpg.de) Center for Adaptive Behavior and Cognition, Max Plank Institute for Human

More information

KEY FACTORS AND BARRIERS OF BUSINESS INTELLIGENCE IMPLEMENTATION

KEY FACTORS AND BARRIERS OF BUSINESS INTELLIGENCE IMPLEMENTATION KEY FACTORS AND BARRIERS OF BUSINESS INTELLIGENCE IMPLEMENTATION Peter Mesároš, Štefan Čarnický & Tomáš Mandičák The business environment is constantly changing and becoming more complex and difficult.

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Recognition of Handwritten Digits using Structural Information

Recognition of Handwritten Digits using Structural Information Recognition of Handwritten Digits using Structural Information Sven Behnke Martin-Luther University, Halle-Wittenberg' Institute of Computer Science 06099 Halle, Germany { behnke Irojas} @ informatik.uni-halle.de

More information

Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines.

Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines. International Journal of Computer Application and Engineering Technology Volume 3-Issue2, Apr 2014.Pp. 188-192 www.ijcaet.net OFFLINE SIGNATURE VERIFICATION SYSTEM -A REVIEW Pooja Department of Computer

More information

Effects of CEO turnover on company performance

Effects of CEO turnover on company performance Headlight International Effects of CEO turnover on company performance CEO turnover in listed companies has increased over the past decades. This paper explores whether or not changing CEO has a significant

More information

Test Automation Architectures: Planning for Test Automation

Test Automation Architectures: Planning for Test Automation Test Automation Architectures: Planning for Test Automation Douglas Hoffman Software Quality Methods, LLC. 24646 Heather Heights Place Saratoga, California 95070-9710 Phone 408-741-4830 Fax 408-867-4550

More information

Comparative Analysis on the Armenian and Korean Languages



