, pp. 385-396 http://dx.do.org/10.14257/jmue.2014.9.11.37 Topc Sentment Analyss n Chnese News Ouyang Chunpng, Zhou Wen +, Yu Yng, Lu Zhmng and Yang Xaohua School of Computer Scence and Technology, Unversty of South Chna, Hunan Hengyang, 421001, Chna + mrwentan@foxmal.com Abstract Sentment analyss n news s dfferent from normal text sentment analyss. News usually have a specfc topc, a focus semantc emoton, therefore, ths paper, based on the prncpal of usng Emoton Dependency Tuple (EDT) as the basc unt of news emoton analyss, resolves topc sentment analyss n news nto three progressve sub-problem, namely, topc sentence recognton, EDT extracton and topc sentment analyss. We use an mproved TF- IDF and cross entropy to extract feature set of topcs. Then, based on space vector model, calculate the topc assocaton of a sentence and extract topc sentence. Fnally, we construct topc sentence based on EDT and complete clusterng of news topc sentment. Ths method s evaluated usng COAE2014 dataset, and dfferental means shows that our results close to the best results. Ths shows that the topc based EDT could effectvely mprove performance of sentment analyss n news. Keywords: sentment analyss, EDT, topc sentence extracton, News context 1. Introducton Context sentment analyss, also nown as opnon mnng, s the process of mnng, analyzng and dfferentatng users opnon, hobbes and emotons. It s a cross-doman scentfc research feld, whch ncludes probablstc theory, data statstcs analyss, computer lngustcs, natural language processng, machne learnng, nformaton retreval, ontology and many other related felds. Context sentment analyss has drawn the attenton of many nsttute and researcher recently due to the mportant applcaton value. Sentment analyss can be categorzed as three levels of progressve research tas, namely sentment nformaton extracton, sentment nformaton classfcaton and sentment nformaton retreval and summarzaton. Ths paper focuses on sentment nformaton classfcaton. The objectve s to classfy context sentment nto postve or negatve terms and even specfc classes. Dependng on the granularty of context analyss, sentment nformaton classfcaton can be dvded nto word level, phrase level, sentence level and paragraph level. Currently, there are two man branches n sentment classfcaton: sentment nowledge based and feature based. The former sums up and set weghts on sentments or polar words (or language unts) based on exstng sentment nowledge base. The latter extracts meanngful class related features from context and apply machne learnng method for classfcaton. There have many researches on both sentment research felds all around the world. Km etc., [3] uses the frst dea by summng up and addng weght to Englsh text to acqure the polar of sentences and paragraphs. Turney [4] used the dea mutual nformaton n unsupervsed learnng methods and dstngush emoton of artcle by calculatng dfference n mutual nformaton of word combnaton chosen by pre-defned rule and excellent and poor of seed words. Pang etc., [5] started to use machne learnng models to classfy Englsh text. There are many researches whch are dstance learnng supervson based on ISSN: 1975-0080 IJMUE Copyrght c 2014 SERSC
SVM, renforced KNN corpus learnng, Nave Bayesan (NB) feature learnng, etc [6]. Due to language dfference, learnng method cannot be used drectly n Chnese language sentment analyss. Researchers n Chna have done research for sentment analyss based on Chnese. There some researches whch was based on HowNet emoton word dctonary [7]. They used semantc relaton feld and automaton to do emoton classfcaton. Moreover, some researchers use CRF and nformaton gan algorthm combned wth feature selecton method to acheve sentment classfcaton [8]. The above methods has made progress n sentment analyss n both englsh and chnese, but these researches all consder sentment analyss as a set of words [9] (bags-of-word). But n realty, people not only use emoton words to express emotons, but also use a certan level of emoton expresson structure. Bag-of-word models gnore syntactc and semantc relatons n emoton word thus mang accuracy low. Also, some researcher present sentment analyss based on dependency analyss. Matsumoto use dependency of a sentence as SVM features to classfy sentment. Wu [10] use dependency analyss to perform sentment analyss n comment text. These method usng dependency relatons has mproved the performance n emoton classfcaton. Dependency Grammar (DG) theory s set up by French lngustc Lucen Tesnère n 1959. He beleves that sentence s a complete organzed structure, where components are words. Words wll generate relatons wth the adjacent word, whch forms a framewor of sentence, and express meanng [11]. Most research based on dependency grammar uses dependency structure and hgh accuracy machne learnng analyss method, but does not acheve the transformaton from syntactc level to semantc level [12,13]. Ths type of syntactc dependency analyss wll ntroduce nose unrelated to emoton, but mang t more effcent for syntactc analyss usng dependency grammar reasonable assumpton. News has dfferent characterstcs. It has a stable structure, obvous topc and emoton expresson concentrate n topc sentence [14]. In news sentment analyss, f news topc and topc words are not consdered, nformaton un-related to topc wll be ntroduced. To avod unnecessary nose, ths paper borrows the dea of usng verb n dependency tree to construct sentence framewor, and propose a basc structure to descrbe news emoton: Emotonal dependency tuple (EDT) and analyze emoton tendency based on EDT. EDT uses topc feature words as core, wth other modfcaton word are attached to core word. We presented a novel sentment analyss model based on emotonal dependency tuple, whch regards the topc word of sentence as headword of the emotonal dependency tuple, mproves the effcency of the sentment analyss. In the remander of the ntroducton, we motvate our nterest n the tas, descrbe the topc sentment analyss model for Chnese news, and dscuss some examples hghlghtng some dffcultes. In Secton 2, we summares how to accomplsh sentment analyss n Chnese news and whch technques wll be appled. Tryng to llustrate the ey theores n our method, we tempt to refne the tas n Secton 3 and Secton 4. At last, we descrbe our experment data and results. The paper ends wth a concluson. 2. Analyss Ppelne for Topc Sentment of Chnese News News context data exst n portal stes, blogs and bulletn n large scale. Most of t s sentment orented and sentment analyss can provde mportant evdence for users to grasp socal dynamcs and dstngush publc status [15, 16]. News report s an mportant carrer for news events, whch has standard grammar, correct syntax, and reasonable adjectve modfers, and needs to justfy state clearly 6W (when, where, who, why, how, what) of news events [17]. 6W usually appears n the ttle, begnnng and end of a report. The ttle of a news report 386 Copyrght c 2014 SERSC
Partcple News text preprocessng Internatonal Journal of Multmeda and Ubqutous Engneerng s consdered as the eye of a news event. It s usually a length restrcted, sngle lned statement and contan rch nformaton. So, t s mportant to enhance the mnng of ttles, frst sentence and frst paragraph n order to mne more nformaton. The man source of text sentment evdence comes from emotonal words. But sngle sourced emotonal word dctonary doesn't record new words, hot words, deformed words and potental emoton words n tme, thus leadng to lmted coverage n emoton evdence [18, 19]. We use emoton word and evaluaton word from HowNet as the foundaton, and combne t wth DUT emoton dctonary after removng neutral words and classfyng seven emoton words nto postve and negatve words. Then t s unted wth Tawan Unversty s Chnese emoton word dctonary (NTUSD), Sougou word dctonary s new word poton and basc dctonary [20, 21]. Words n Chnese sentences don t have obvous separators, so lexcon processng s requred before text classfcaton. We use NLPIR tool, whch s based on HMM, as our text parser. Before parsng, nternet words that are doman related and words from the emoton ontology are add to the custom dctonary. Frst, new words extracted from the whole document are added to the parser dctonary. Then, we do parsng and POS taggng to each sentence to ncrease accuracy. The results of parsng s as shown: D { S1, S2, S3,, S j,, Sn}, where S j s the jth sentence n document D, and S j { W1, W2,, W,, Wm}, where W s the th word n the sentence S j, Based on the characterstc of new documents, topc features are frst extracted. Then, topc sentence canddate set s then extracted from the document. Later, EDT s extracted from topc sentence to construct EDT analyss model for topc polar dstnguaton. Fnally, topc sentence subjectve-objectve classfcaton s used to select subjectve news topc sentment wth prorty. Fgure 1 shows the detaled process: Sentment Dctonary Syntactc Analyss Topc Featrue Extracton EDT Extracton Topc Sentence Identfcaton Sentment Analyss of Topc Sentence Sentment Classfcaton Based on Subjectve and objectve Topc Sentence Fgure 1. The News Tendency Analyss Flow Chart News tendency (NT): there are at least two types of news emoton tendency, one s new event tendency (NET),for example, natural dsasters, human casualty and property loss are examples of negatve news, whle technology, sports and culture advance has postve tendency. The other represents the subjectve tendency of new reporters (NRT), for example, the news event dscount on hgh speed tran tcet draws both postve reports and negatve Copyrght c 2014 SERSC 387
reports. Ths paper places subjectve tendency predcton n hgher prorty. NET wll be the fnal tendency for ones that doesn t have subjectve tendency. Sentment topc sentence: sentment topc sentence must be able to express secton topc and general tendency. Therefore, sentment tendency topc sentence contans two man features: essental eyword and sentment tendency eyword. Topc eyword s used to summarze the topc of a secton, sentment tendency eyword s used to summarze tendency of secton. Emoton dependency tree (EDT): usng core words (CW) as topc features, emoton words (EW) as dependent on core word, degree word (DW) and negatve word (NW) seres as modfer for core word and emoton word, the basc structure of the emoton expresson s constructed, ts matchng pattern s shown as formula (1). EDT=[*NW/DW][*[*NW/DW]EW]CW[*[*NW/DW]EW] (1) For example, for topc sentence RouShJ ndustry n Chna s not very good, now our company have to compensate the 45 mllon now., RouShJ ndustry s regarded as core word, not very good s regarded as emoton word used to modfy core word, and negatve word not and degree word very forms a negatve degree structure used to modfy emoton word. The condtons for RouShJ ndustry or ndustry to be core words s that t must exst n topc word set. Topc sentence of News sentment classfcaton: uses news event tendency and news emoton tendency as classfcaton foundaton. Extract news sentment topc sentence from news report that s closest to topc, as subjectve emoton of news. Extract news event topc sentence from news report that has hghest smlarty to topc as objectve emoton of news. 3. Identfcaton of the Topc Sentence of Chnese News Drectly analyzng tendency of news artcle, s often dsturbed by emoton elements unrelated to topc, and s often dffcult to dstngush news report and news event. Gven the above problems, ths paper presents a two-step process: frst extract topc sentence from new artcle, and then dstngush emotons of topc sentence to suppress nterference. 3.1. Feature Sets of News Topc Constructon The topc an artcle can be expressed by topc feature words, then topc feature set of news artcle can be obtaned by usng TF-IDF. A tradtonal TF-IDF method doesn t consder the frequency of a word n a document, whch can have certan nfluence on the characterzaton of a word. Consderng ths problem, we mprove TF-IDF: IFIDF N Num a tf df f a tf log( ) ( ) (2) N Num Where f s the rato of the frequency of feature value W n document D over the total. There are totally N documents, and N represents the number of documents wth feature values. Num s the frequency of feature value W n document Num all s the frequency of feature value W n all documents. all D. And Usng formula (2), the TF-IDF value of every word can be calculated, formng a feature subset. To mprove the accuracy of topc feature, we extract topc features usng a method based on crossed entropy, thus gettng another feature subset. then we obtan a frst feature set 388 Copyrght c 2014 SERSC
by usng equaton 3 to do normalzaton on topc smlarty for the two features T(tfdf) and T(cross) : T ( temp) T( fdf ) T( cross) (3) Where α and β are weghts for adjustng the weght of the two methods. Due to the fact that new headlne can better characterze the topc, at frst, through the analyss of the dstrbuton of topc score n T (temp) features two thresholds s set to 8.5 and 5.5. Then carry out these operatons n T (temp) : sequently tae word W from topc, f T (temp) exst W wth topc value lower than 8.5, then set W to 8.5; else f T (temp) doesn t contan W, then add W to T (temp) and set topc value to 5.5. The complete topc feature set T s then obtaned by mergng topc words and T (temp). 3.2. Topc Sentence Extracton Because topc feature words are centered on topc sentence of new reports, topc sentence from news artcle s extracted by usng vector space model to calculate the cosne dstance of topc feature word and news artcle. The detaled steps are as follows: Step 1: Calculate cosne smlarty. Use all values from topc feature set T as vector dmenson space to construct the feature vector V (T ) and the sentence vector V S ) : VT ( ) weght1, weght 2 weghtn (4) V( S j ) N1 weght1, N2 weght 2 N2 weghtn (5) Where n s the number of feature value nt. V T uses the smlarty of ts feature value and topc as the weght of each dmenson. V( S j ) uses the product of the number of feature words and ts smlarty as the dmenson weght. Then calculate the cosne V T and V( S ) as the topc base value of a sentence. smlarty Score cos of j Step 2: Calculate locaton score. Topc feature extracton has not consdered the locaton feature of the sentence. Sentences at the begnnng and endng of the artcle are more lely to be topc sentences, whle sentences n the mddle of the artcle t s more lely to be detals. So, hgher scores for sentences at the begnnng and endng of the artcle and lower scores for sentences n the mddle meets the characterstcs of quadratc functon. Ths paper uses functon 2 ( j a x 0. 5 to calculate the locaton score of a sentence Score loc. Step 3: Extract sentence length feature. Longer sentences have hgher probablty of havng more feature value. Ths paper uses quotent of the sum of TF-IDF value of every word n a sentence and the square value of the sentence length as the length value Score len to adjust the nfluence of sentence length. Step 4: Calculate ttle smlarty. Ttle can reflect over 90% of the topc. Consderng the fact the ttle smlarty can complement for the defcency of feature vector method, ttle smlarty Score ttle s the count of words n a sentence that also appears n the ttle. Copyrght c 2014 SERSC 389
Step 5: The topc sentence s extracted by weght fuson after normalzaton of the above scores. an artcle at most has four topc sentences. The fuson formula s as follows: When s equal to 1 to 4, weght of each tem. 4 Score Score (6) 4. Sentment Analyss of the Topc Sentence set 1 Score ndcates the score of step 1 to 4, and γ s the Emoton expresson of news sentence uses emoton dependency tuple as unts. Emoton value of sentence s a comprehensve emoton representaton of EDT. Extractng EDT s a process of syntactc analyss of a sentence, analyss of dependency, extractng EDT from dependency, and fnally dstngush new tendency based on EDT. 4.1. Emotonal Dependency Tuple Extracton Document structured can be extracted from news artcle through syntactc analyss. Informaton extracton based on ths can obtan more precse nowledge and mprove performance of nformaton extracton system. In ths paper, Stanford parser s used to analyze dependency of a sentence and extract the dependency tree. Usng the sentence Reporters also found a lot of nvestors are very optmstc. as an example, EDT extracton process s as follows: (1) Frst convert parsed text and POS nto strng sequence Reporters/NN also/ad found/vv a lot of/cd nvestor /NN very/ad optmstc /VA, then perform syntactc analyss to construct syntactc tree and dependency relaton; (2) Extract topc feature word that can be used as evaluaton object as the head word of the emoton tuple, for example nvestor n the sentence. (3) Search for head word n all leaf nodes n syntactc tree, then usng extracton rules n Table 1 extract modfer words for head word n all sblng nodes and sub-tree of sblng nodes of head word. Three pars wth structure <head word, modfyng> can be extracted, <nvestor, very>, <nvestor, optmstc>, usng rule; Table 1. Emotonal Dependency Tuple Extracton Rules ID Head-word Modfyng Word 1 Noun Adjectve, Verb, Noun, and JJ 2 Verb Adjectve and Adverb 3 Pronoun Adjectve, Verb, Noun, and JJ (4) Extract the head word, negatve dependency of modfyng word, degree dependency advmod, nummod from the dependency tree to construct a complete EDT. The complete EDT contans a head word wth ts modfyng words and a number (or zero) of dependency relatons. 390 Copyrght c 2014 SERSC
4.2. Sentment Identfy for the Topc Sentence Construct emoton dstngushng model based on emoton atom group, S sub represents the emoton value for head word wth ntal value 1. S dec represents emoton value for modfer word wth ntal value 0. S term represents emoton base value for the emoton atom group. Obtan head word and modfer word from emoton word dctonary, where postve emoton word value of 1 and negatve emoton word value of -1. Then, calculate negatve degree NegW word of head word and each modfer word: obtan negatve dependency of head word and each modfer word, and NegW word NegW word for each negatve obtan a negatve emoton dependency. Obtan degree modfer for negatve dependency, obtan degree modfer NegW word NegW word * W word for each degree modfer word, where W word s the degree coeffcent for degree word. So, emoton value for the whole emoton tuple wll be: n S ( term) S( sub) NegW ( sub) [ S( dec) NegW ( dec ) 1] (7) Where n s the number of modfer for head word, emoton polarty of emoton tuple s dependent on polarty of head word and number of emoton polarty of modfer word. When there s no modfer word or modfer word have no emoton, then S term s determned by polarty of head word. Emoton value of sentence s the sum of all emoton dependency tuple, and when a sentence has no emoton tuple, emoton value of a sentence s calculated based on emoton word dctonary. The model for emoton value of a sentence s as: (8) Where n s the number of emoton tuple n a certan sentence. When n = 0, emoton value of a sentence s the summaton of all emoton value of emoton word where m s the number of emoton word and emo s the emoton word. Usng ths model, the emoton value of topc sentence can be calculated. 4.3. Sentment Classfcaton of Chnese News In the above, topc sentment ncludes two types of emoton, new report and new event. News report emoton represents the subjectve emoton of the author, and the news event emoton represents the objectve emoton of the news event. So, classfyng sentence emoton wth subjectve emoton wll obtan subjectve emoton based on news report. Step 1: extract subject predcate dependency from dependency tree. The subject usually appears frst, so the frst par of subject predcate relaton extracted s subject and predcate. Step 2: usng grammatcal person and POS of an object, n addton wth predcate to dstngush subjectve and objectve state of a sentence. If the object s frst person, pronoun then annotate as subject sentence. News report wth specfc predcate words as predcate s annotated as objectve words. Copyrght c 2014 SERSC 391
Step 3: select most relevant subjectve sentence to topc from canddate topc sentence as emoton ey sentence to obtan emoton of news report. When no subjectve sentences exst, and then use most relevant objectve topc sentence as fnal emoton type of news artcle. 5. Expermental Results 5.1. Parameters Settng We used the Sxth Chnese Opnon Analyss Evaluaton corpus (COAE2014) whch s released by the Chna Conference on Informaton Retreval. Ths corpus has 10000 texts ncludng news, blogs, and forums. We selected 500 News whch are randomly labeled from the corpus for our experments. For gold standard label boundares, the datasets have been labeled by annotator, so that the ey sentence and opnon of the sentment are mared, and 15 topc words are labeled n each text. Then, we utlzed a rule-based method to remove some erroneous examples. Fnally, we get a labeled corpus for our tas whch ncludes 280 texts. In Topc feature extracton, we adopt an aggregatng method of TFIDF and Cross Entropy based weghts. The methodology gets the weghts of TFIDF and Cross Entropy durng the aggregaton through analyzng the smlarty of topc feature wth ttle and annotaton results. Fgure 2 shows the smlarty result In the case of dfferent weghts. Fnally, we selected 9:1 proporton between TFIDF and Cross Entropy. Fgure 2. Topc Features Extracton Parameter Weghts Contrast In Topc feature extracton, we adopt an aggregatng method of TFIDF and Cross Entropy based weghts. The methodology gets the weghts of TFIDF and Cross Entropy durng the aggregaton through analyzng the smlarty of topc feature wth ttle and annotaton results. Fgure 2 shows the smlarty result In the case of dfferent weghts. Fnally, we selected 9:1 proporton between TFIDF and Cross Entropy. In topc sentence extracton, we used the four features whch are the Cosne dstance of topc feature vector, the sentence poston, the sentence sze, and the smlarty wth ttle. In our experments, the evaluaton of each feature was based on the number of the matchng the 392 Copyrght c 2014 SERSC
topc sentence labeled by human. We utlzed the results of evaluaton to determne the weght on each feature. The weght learnng uses the same optmzaton objectve as EM algorthm. Fnally, we get the four weghts: r 1: r2: r3: r4 0.55: 0.2: 0.1: 0. 15 5.2. Results and Observatons We use Precson (P), Recall (R), F measure (F1) and Mcro-average (Mcro) to evaluate the overall performance. Referrng to the evaluaton standards of NLP&CC 2013 [22], the formulas of evaluaton standards for News sentment classfcaton are lsted as follows. # system _ correct ( emoton Y) Precson (9) # system _ proposed ( emoton Y) # system _ correct ( emoton Y) Recall (10) # annotated ( emoton Y) 2 Precson Recall F1 (11) Precson Recall In above formulas, #system_correct denotes the number of the submtted results whch match wth the manually annotated results, #system_proposed denotes all the number of the submtted results, #annotated denotes the number of the manually annotated results. The result shows that our method leads to a hgher performance n Precson, Recall and F- measure than those approaches released by COAE2014. In Mcro-average, our results are the closest to the best results. The detals about the results are shown n Table 2. Table 2. The Results of Sentment Analyss of Chnese News R P F1 Accuracy Mcro Recall Mcro Precson Mcro F-measure EDT-Based 0.2152 0.0683 0.1037 0.0600 0.3495 0.0932 0.1472 medan 0.0676 0.0517 0.0571 0.0405 0.2201 0.0682 0.1041 max 0.2394 0.0846 0.1092 0.0652 0.3887 0.1041 0.1642 The results of our experments are the second best n all the approaches, and very close to the best. Ths shows our sentment analyss model based on emotonal dependency tuple s a practcal and worthy of further study methods. Our approach adopt topc feature to construct emotonal dependency tuple, and can avod the mpact of non-topc sentment. Our model can further enhance, for example, we mprove the recall and precson by the augment of syntactc and semantc evdence. We can also lft performance by addng synonym dctonary, sentment dctonary and headword dctonary. Consequently, the scalablty of our sentment analyss model based on emotonal dependency tuple s very strong. 6. Conclusons and Future Wor To analyze the topc sentment of Chnese news, our tas s dvded nto three sub-tas: dentfyng the topc sentence, extractng the emotonal dependent tuples, and the topc sentment analyss. Dvde - and - conquer strategy s used to complete the dfferent layer of Copyrght c 2014 SERSC 393
sub-tas to ease analyzng sentment of news as a whole. We presented a novel sentment analyss model based on emotonal dependency tuple, whch mprove the effcency of the sentment analyss. Ths model regards the topc words of sentence as head-words of the emotonal dependency tuple, and removes the nose of non-topc sentment. We found the augment of syntactc and semantc evdence has a great help to lft the performance of sentment analyss through our experments. Ths wll be our further wors. Sentment classfcaton n the news text s very useful to ndvdual and corporaton, such as tang effectve measures to reduce the negatve mpact on the news meda, Internet, and other meda. Ths paper proposed the method based on emotonal dependency tuples s prelmnary dscussed for sentment analyss, whch s a worthy of further exploraton and research, has broad applcaton prospects. Acnowledgements Ths research wor s supported by Hunan Provncal Natural Scence Foundaton of Chna (No.13JJ4076), the Scentfc Research Fund of Hunan Provncal Educaton Department for excllent talents (No.13B101), Foundaton of Unversty of South Chna (No.2012XQD28), the Construct Program for the Key Dscplne n Unversty of South Chna (No.NHx02), and the Construct Program for Innovatve Research Team n Unversty of South Chna. References [1] Y. Zhao, B. Qn and T. Lu, Sentment Analyss, Journal of Software, vol. 21, no. 8, (2010). [2] T. F. Yao, X. W. Cheng, F. Y. Xu, H. Wu and R. Wang, A Survey of Opnon Mnng for Texts, Journal of Chnese Informaton Processng, vol. 22, no. 3, (2008). [3] S. M. Km and E. Hovy, Automatc detecton of opnon bearng words and sentences, Proceedngs of the Second Internatonal Jont Conference on Natural Language Processng, (2005); Jeju Island, Korea. [4] B. Pang, L. Lee and S. Vathyanathan, Sentment classfcaton usng machne learnng technques, Proceedngs of the Conference on Emprcal Methods n Natural Language Processng, Morrstown, (2002); NJ, USA. [5] P. Turney, Semantc orentaton appled to unsupervsed of revews, Proceedngs of the 40th Annual Meetng of the Assocaton for Computaton Lngustcs, (2002); USA. [6] B. Pang and L. Lee, Opnon mnng and sentment analyss, Foundatons and trends n nformaton retreval, vol. 2, no. 1, (2008). [7] X. Jun, Y. X. Dng and X. L. Wang, Sentment classfcaton for Chnese news usng machne learnng methods, Journal of Chnese Informaton Processng, vol. 21, no. 6, (2007). [8] W. Yang, J. J. Song and J. Q. Tang, A Study on the Classfcaton Approach for Chnese McroBlog Subjectve and Objectve Sentences, Journal of Chongqng Insttute of Technology, vol. 27, no. 1, (2013). [9] M. B. Salem and J. S. Salvatore, Detectng masqueraders: A comparson of one-class bag-of-words user behavor modelng technques, Journal of Wreless Moble Networs, Ubqutous Computng, and Dependable Applcatons, vol. 1, no. 1, (2010). [10] Y. B. Wu, Q. Zhang and X. J. Huang, Phrase dependency parsng for opnon mnng, Proceedngs of the 47th Annual Meetng of the Assocaton for Computatonal Lngustcs, (2009); Boston, USA. [11] L. Y. Luo, J. L.V. Mejno Jr. and G. Q. Zhang, An Analyss of FMA Usng Structural Self-Bsmlarty, J Bomed Inform, vol. 46, no. 3, (2013). [12] J. L. Zhang, E. Lu, J. Wan, Y. Ren, M. Yue and J. Wang, Implementng Sparse Matrx-Vector Multplcaton wth QCSR on GPU, Appled Mathematcs & Informaton Scences, vol. 7, no. 2, (2013). [13] G. Qu, B. Lu, J. Bu and C. Chen, Expandng Doman Sentment Lexcon through Double Propagaton, the 21 st Internatonal Jont Conference on Artfcal Intellgence, (2009); Pasadena, Calforna, USA. 394 Copyrght c 2014 SERSC
[14] C. Zhang, D. Zeng, J. L, F. Y. Wang and W. Zuo, Sentment analyss of Chnese documents: From sentence to document level, Journal of the Amercan Socety for Informaton Scence and Technology, vol. 60, no. 12, (2009). [15] S. Feng, Y. C. Fu, F. Yang, D. Wang and D. Zhang, Blog Sentment Orentaton Analyss Based on Dependency Parsng, Journal of Computer Research and Development, vol. 49, no. 11, (2012). [16] J. L. Zhang, Y. Chen, J. Wang, E. Y. Lu, Y. Y. Yn, Y. J. Ren and C. H. L,, The Implementaton and Optmzaton of Irregular Applcaton Tas Models Based on the Cell BE Processor, Internatonal Journal of Dgtal Content Technology and ts Applcatons, vol. 6, no. 2, (2012). [17] W. Wang, D. Y. Zhao and W. Zhao, Identfcaton of Topc Sentence about Key Event n Chnese News, Acta Scentarum Naturalum Unverstats Penenss, vol. 47, no. 5, (2011). [18] J. L. Zhang, H. Dng, J. Wan, Y. Ren and J. Wang, Desgnng and Applyng an Educaton IaaS System based on OpenStac, Appled Mathematcs & Informaton Scences, vol. 7, (2013). [19] L. X. Xe, M. Zhou and M. S. Sun, Herarchcal Structure Based Hybrd Approach Sentment Analyss of Chnese Mcroblog and Its Featrue Extracton, Journal of Chnese Informaton Processng, vol. 26, no. 1, (2012). [20] H. H. Wu,, A. C. R. TSAI, R. T. H. Tsa and J. Y. J. HSU, Buldng a Graded Chnese Sentment Dctonary Based on Commonsense Knowledge for Sentment Analyss of Song Lyrcs, Journal of Informaton Scence & Engneerng, vol. 29, no. 4, (2013). [21] G. Xu, X. F. Meng and H. F. Wang, Buld Chnese Emoton Lexcons Usng A Graph-based Algorthm and Multple Resources, Proceedngs of the 23rd Internatonal Conference on Computatonal Lngustcs, (2010); Bejng, Chna. [22] C. P. Ouyang, X. H. Yang, L. Y. Le, Q. Xu, Y. Yu and Z. M. Lu, Mult-strategy Approach for Fne-graned Sentment Analyss of Chnese Mcroblog, Acta Scentarum Naturalum Unverstats Penenss, vol. 50, no. 1, (2014). Authors Ouyang Chunpng, she receved her Ph.D. degree from Unversty of Scence Technology Bejng, Chna, n 2011. Now, she s worng n Unversty of South Chna, where she s an assocate professor and the dean of software engneerng department. Her research nterests are n Semantc Web, socal computng and sentment analyss of text. Zhou Wen, he receved hs B.E. degree from Department of Computer Scence of Unversty of South Chna, Hengyang, n 2012. Now, he s worng hs M.E. Degree at Unversty of South Chna. Hs research nterests nclude natural language processng and data mnng. He s correspondng author of ths paper. Yu Yng, she receved hs M.E. degree from Department of Computer Scence of Unversty of South Chna, Hengyang, n 2012. Now, he s lecture of Unversty of South Chna. Hs research nterests nclude educaton nformaton technology and sentment analyss of mcroblog. Copyrght c 2014 SERSC 395
Lu Zhmng. He receved hs Ph.D. degree from Natonal Unversty of Defense Technology, Chna, n 2012. Now, he s professor of Unversty of South Chna, supervsor of postgraduate. He has authored and coauthored over 20 research publcatons n peer-revewed reputed journals. Hs research nterests nclude nformaton fetreval, Cloud Computng and bg-data mnng. Yang Xaohua. He receved hs Ph.D. degree from CAS, Chna, n 1999. From 2000 to 2001, he was a vstng scholar at Wollongong Unversty. Yang s vce-presdent of Unversty of South Chna, and professor of computer scence, doctoral tutor. He has authored and coauthored over 50 research publcatons n peer-revewed reputed journals. He has served as the program commttee member of varous nternatonal conferences and revewer for varous nternatonal journals. Hs research nterests nclude natural language processng and nformaton retreval. 396 Copyrght c 2014 SERSC