Towards automatic discovery of co-authorship networks in the Brazilian academic areas
|
|
- Arron Bond
- 8 years ago
- Views:
Transcription
1 Towards automatic discovery of co-authorship networks in the Brazilian academic areas Jesús. Mena-Chalco, Roberto M. Cesar Junior Department of Computer Science Institute of Mathematics and Statistics University of São aulo - Brazil {jmena,cesar}@vision.ime.usp.br Abstract In Brazil, individual curricula vitae of academic researchers, that are mainly composed of professional information and scientific productions, are managed into a single software platform called Lattes. Currently, the information gathered from this platform is typically used to evaluate, analyze and document the scientific productions of Brazilian research groups. Despite the fact that the Lattes curricula has semi-structured information, the analysis procedure for medium and large groups becomes a time consuming and highly error-prone task. In this paper, we describe an extension of the scriptlattes (an open-source knowledge extraction system from the Lattes platform), for analysing individuals Lattes curricula and automatically discover largescale co-authorship networks for any academic area. Given some knowledge domain (academic area), the system automatically allows to identify researchers associated with the academic area, extract every list of scientific productions of the researchers, discretized by type and publication year, and for each paper, identify the co-authors registered in the Lattes latform. The system also allows the generation of different types of networks which may be used to study the characteristics of academic areas at large scale. In particular, we explored the node s degree and AuthorRank measures for each identified researcher. Finally, we confirm through experiments that the system facilitates a simple way to generate different co-authorship networks. To the best of our knowledge, this is the first study to examine large-scale co-authorship networks for any Brazilian academic area. I. INTRODUCTION Data mining is a process of extracting large volumes of data, commonly used to identify or reveal possible relationships between instances of the treated elements [1]. The extraction of scientific data, identification of bibliometric patterns, and effective modeling and visualization of networks of interaction among co-authors are relevant topics in e-science, Bibliometrics and Scientometrics. In recent years, there has been growing interest in such topics because of the knowledge discovery process that may be implemented from the treatment of datasets available in the repositories of scientific production (e.g. datasets of bibliographic production, academic guidances and research projects). On the other hand, in Brazil, the National Council for Scientific and Technological Development (CNq) performs an important work on the integration of academic curricula databases of public and private institutions into a single platform called Lattes. The so-called Lattes Curriculum are considered a national standard of assessment or evaluation, representing a history of scientific activities academic and professional of researchers registered on the platform [2]. Lattes curricula have been designed to show individual public information for each user registered on the platform. In this context, performing a summarization of bibliographic production, or generation of co-authorship networks, for a group of registered users of medium or large size (e.g. researchers from a specific academic area), really requires a lot of manual effort and a myriad of process, susceptible to failures, which could influence the obtained results. Therefore, starting from the work 1 [3], an automatic system for generation of large-scale co-authorship networks, based on Lattes curricula belongings to researchers associated to specific academic areas, has been created. It is worth noting that this previous work (i) do not explore the automatic identification of researchers associated with an academic area, and (ii) do not extract measures to characterize co-authorship networks. It is important to note that our approach assumes the existence of a large set of Lattes curricula. Our local dataset contains over 1.3 million of curricula. The data was gathered by crawling publicly accessible information on to the Lattes platform. We believe that our work is the first study to automatically examine Brazilian co-authorship networks in such large-scale. In addition to the generation of co-authorships networks, the system allows to automatic computation of two metrics related to the node s degree and the AuthorRank measure [4] for each identified researcher. Considering our experiments, we provide an insight into the real-world co-authorship networks. We observe a significant correlation between the node measures such as the degree value and the AuthorRank measure calculated of each researcher associated with an academic area. However, we observe that the AuthorRank measure is an indicator more representative than the degree value, because the AuthorRank better revels the status of a researcher, as mentioned by X. Liu et al. [4]. Finally, note that this system can be used for exploring, identifying and validating patterns of scientific activities, thus bringing bibliometric and/or scientometric information about a group of interest [5]. The relevance of our work rest on the benefits of using the process considered for the automatic 1 The scriptlattes is available as open-source code that can be adapted and improved for specific research needs:
2 analysis of scientific production and relationship among the actors of research academic areas [6], [7]. This paper is organized as follows. The acquisition procedures used to collect the used dataset is described in Section II. The system is described in Section III. Some experiments and results are shown in Section IV. The paper is concluded with some comments on our ongoing work in Section V. II. DATA COLLECTION The data presented in this paper is from the Lattes platform on May 5th, 2011 and contains about 1.3 million curricula. We have created several procedures in order to crawl curricula, in HTML format, by accessing the public websearch interface provided by the Lattes platform. Several webquery were executed for the purpose of access to large Lattes curricula dataset. Each web-query has been referred to a specific knowledge area. In summary, it was performed eighty queries to the Lattes platform, each one associated to an knowledge domain. For every result a specific HTML parser has been created in order to automatically obtain the list of curricula. With this in mind, we used this list to automatically download the Lattes curricula. It is important to note that with this collection strategy, we do not attempt to study the entire Lattes database, but a representative quantity data, since in 2007 the CNq announced that the Lattes platform achieved one million of registered curriculum which means that it should be much larger now. III. THE SYSTEM Our proposed system is composed of four main modules, namely (i) data selection, (ii) data extraction, (iii) redundancy treatment, (iv) co-authorship networks generation, and (v) AuthorRank computation. The basic operation of the system is as follows. Firstly, a list of Lattes curricula, in HTML format, must be selected following some criteria of analysis given by the user. This requires multiple queries to the local database. The resulting Lattes curricula are processed through a deterministic HTML parser that extracts all information of interest (e.g. professional information, expertise areas, list of academic productions). Therefore, the processed productions lists, associated with each researcher, are combined into lists, organized by type and publication year. The obtained filtered lists are used to create different types of networks. This is achieved by identifying all co-authors (registered on Lattes platform) of each publication. The AuthorRank computation is then performed to obtain a measure related to the status of each researcher. Fig. 1 shows the high level view of the proposed system. A. Data selection This module allows to select the Lattes curricula, in HTML format, from the local database. Therefore, only the subset of researchers Lattes curricula is chosen in accordance to the academic (expertise) areas informed by the user. In order to facilitate the comparison between strings, the curricula were normalized to use the same characters codification. As Fig. 1. A high level view of our system to automatically discover coauthorship networks belongings to academic areas. indicated in Section II, public users of the Lattes platform do not have access to the complete Lattes database. For this reason, special attention is given to extract the information about the scientific productions, as explained in the data extraction module. B. Data extraction In this module, a deterministic HTML arser [8] is used to automatically extract all information of each researcher related to full name, bibliographic citation name, academic areas, professional address, scholarship type, and gender. Additionally, the designed parser allows to extract the list of academic productions limited by the years indicated in the time period (given by the user). Currently, it is considered three types of academic production, namely (i) bibliographical production, (ii) technical production, and (iii) artistic production. See in Table I the types of academic production considered by the system. Is worth mentioning that a computational challenge for the system is processing the data in HTML format, where the constituent parts of the academic productions (e.g. authors names, publication title, the journal name of publication, number, volume, pages, year) are presented without any indication of division/separation. Thus, the designed parser identifies, in the vast majority of cases, all the constituent parts of the academic productions. C. Redundancy treatment It is common that some academic productions are made in co-authorship with one or more researches of the selected group in the Data selection module. Thus, the same academic production may be being registered in each Lattes curriculum of the corresponding co-authors. The developed system has a redundancy treatment module that allows the detection of
3 TABLE I INFORMATION EXTRACTED FROM THE LATTES CURRICULUM. (i) Bibliographical production Articles in scientific journals Book published/organized Book chapter published Articles in newspapers/magazines Complete works published in proceedings of conferences Expanded summary published in proceedings of conferences Summary published in proceedings of conferences Articles accepted for publication resentations of work Other kinds of bibliographical production (ii) Technical production atented or registered software Not patented or registered software Technological products Techniques or process Technical works Other kinds of technical production (iii) Artistic productions Artistic/cultural production equal or similar academic productions. In this regard, the productions are used to detect co-authorship among the selected group of researchers: two or more researchers are considered co-authors if there is a common production between them. The detection of equal or similar productions is accomplished through comparisons among all extracted academic productions discretized by year and type (e.g., journal paper, book chapter), so that the productions with different publication s year are not used in any comparison, allowing a substantial reduction of processing time of this module. Due to inconsistencies in filling information in the Lattes curriculum (typos or lack of standardization of co-authors names [9]), the comparison of any two productions is made through an inexact matching procedure between the title of the academic productions. Two productions are considered equal if the similarity percentage between the titles is greater than a percentage given by the user. The similarity between two strings is based on the distance proposed by Levenshtein [10]. The Levenshtein distance is obtained through the minimum number of insertions, deletions or substitutions required to transform one string into another (Levenshtein distance equals to 0 indicate that two considered titles are exactly equal). An important feature of this module is the conjunction of data in the similar productions, so that the missing information of a production can be combined/complemented with information from the other. Currently, complementation refers only to the exclusive use of the string with larger size. With this module, the system is able to keep a record of all coauthors associated with a particular academic production. This information is important in numerical weighting of academic productions, such as the normalization of edges weights belonging to co-authorship networks. D. Co-authorship networks generation A co-authorship network describes research activities that have been carried out by a research group [6], [11]. In this module, the designed system uses a graph to represent the coauthorship among researchers based on scientific productions. Each researcher is represented by a node. An edge is created between a pair of nodes whenever a common production of the corresponding researchers is detected by the redundancy treatment module, i.e., if two researchers have co-authored an academic production, their respective nodes in the coauthorship network are linked by an edge (co-author relationship between authors). Our work is focused on creating different co-authorship networks. Below we describe briefly each of them. 1) Networks without weights: this type of network is commonly represented by an undirected graph without weights (or undirected unit-weighted graph), where the edges represent only connections for co-authored academic productions. A simple measure that can be harnessed on this type of graphs is the node degree, that represents the number of edges connected to it. See an example in Fig. 2(a). The degree of the node A1 is 3. 2) Networks with un-normalized weights: this type of network is represented by an undirected weighted graph, where the edges represent the number of academic productions developed in co-authorship between two nodes. In this type of networks the node degree can be seen as the sum of the edges weights of edges connected to it. Note that with this representation, the total number of productions made in co-authorship is overlooked. For example, in Fig. 2(b) the author A1 collaboratively has participated in three productions, however, the node degree is 4 (this value does not correspond with a common sense notion of coauthorship). 3) Networks with co-authorship frequencies: to work around the problem reported in the previous subsection, in [4] it was defined the terms of exclusivity and co-authorship frequency. Exclusivity is related to the value at which two authors have an exclusive co-authorship relation for a particular academic production. Thus, given an academic production co-authored by n authors, the exclusivity value of this production can be defined as 1/(n 1). This measure gives more weight to coauthor relationships in academic production with fewer total co-authors than articles with large numbers of co-authors. See in the top of Fig. 2 examples of exclusivity values computed for three productions. The co-authorship frequency between two co-authors is defined as the sum of all exclusivity values between them. This measure gives more weight to authors who co-publish more academic productions together. See in Fig. 2(c) examples of co-authorship frequencies. Note that with this representation,
4 Fig. 2. Example of co-authorship networks obtained from automatic identification of 3 productions written by 4 authors: A1, A2, A3 and A4. given an author, the total number of productions made in coauthorship is computed by the sum of co-authorship frequencies obtained from co-authors connected to the author. For example, the author A1 has participated in three productions ( ). 4) Networks with normalized weights: this type of network is represented by a directed weighted graph where, given an author, the salient edges are represented by the co-authorship frequencies normalized by the total number of productions made in co-authorship. This normalization ensures that the weights of an author s relationships sum to one. See in Fig. 2(d) an example of network with normalized weights. The normalized weights intuitively gives an idea of the importance of a co-author in the production carried out in collaboration with others. For example, the author A1 participates in 75% of the production made in co-authorship of A2 (i.e., A1 is 75% important for A2). However, the author A2 is important only about 50% for A1. On the other hand, A1 is 100% important for A4, but A4 is important only about 33% for A1. This behavior is typical in the relationship advisor-advisee (A1-A4). E. AuthorRank computation The weights of the aforementioned directed graph show the importance of collaboration and reciprocity, being also used in the system as a basis for the calculation of the AuthorRank measures [4] of researchers. We could understand the AuthorRank measure as a numeric value that represents the status of a researcher within the co-authorship network (the algorithm proposed by X. Liu et al. [4], called AuthorRank, it is an adaptation of the agerank algorithm [12] used in the search for relevant pages in the Google search engine). Therefore, the researchers with more collaborative status have the highest AuthorRank values. The AuthorRank (AR) of an author i, which has n coauthors, is given as follows: AR (t) (i) = (1 d) + d n ( AR (t 1) ) (j) w j,i j=1 where AR (t 1) (j) corresponds to the AuthorRank of the j coauthor of i (j = 1,..., n), w j,i corresponds to the normalized weight between author j and author i, and d is the damping factor (usually set to 0.85 as described in [13]). For this iterative process we considered the initialization of Author- Rank measures as AR (0) (k) = 1 k. In our experiments the AuthorRank measures converged when t = 100. For the example shown in Fig. 2, A1 is the author with the highest AuthorRank. It is important to note that, given the node degree values, the authors A2 and A3 could be classified at the same level. However, considering the AuthorRank measures, the ranking of author A2 is better than the ranking of author A3. IV. EXERIMENTS AND RESULTS Researchers from four domains have been considered in the present study: Computer Science, Mathematics, hysics, and Statistics. In order to standardize the analysis of automatic generation of co-authorship networks, only the journal papers published between 2000 and 2010 were considered. The obtained co-authorship networks are presented in Fig. 3. For each knowledge domain, the Data selection module (Section III) has been configured to consider only researchers with a doctoral degree (at least). In the regular representation (first column), the researchers are represented by nodes, and the co-authorships are represented by unweighted straight edges. The graph drawing was obtained through the Fruchterman- Reingold algorithm [14]. The entire graph is simulated as if it were a physical system. The aforementioned algorithm is based on the attribution of forces between the nodes and the (1)
5 Regular representation Node size is proportional to node s degree Node size is proportional to node s AuthorRank measure (a) Computer Science 6014 researchers (b) Mathematics 5008 researchers (c) hysics 6410 researchers (d) Statistics 2274 researchers Fig. 3. Co-authorship networks belonging to researchers associated to four Brazilian academic areas. The researchers are represented by nodes, the coauthorships are represented by edges. The networks were created considering the journal papers published between 2000 and 2010 by researchers with doctoral degree. Isolated nodes are not shown.
6 cumulative frequency Computer sciences Mathematics hisics Statistics All degree Fig. 4. Cumulative degree distribution of co-authorship networks of four Brazilian academic areas. edges, as if the edges were straight springs and nodes are electrically charged particles. The forces are applied iteratively to the nodes, pulling them closer together or pushing them more distant until the system enter into an equilibrium state. This algorithm was adopted to visualize the complete co-authorship networks because the vertices are evenly distributed, making edge lengths uniform and reflecting symmetry. In general, each co-authorship network apparently reflects characteristics of professional/academic practice. For instance, mathematicians usually have a few co-authors in publishing their work. On the other hand, physicists generally tend to have several co-authors in the publications. This approach may be better understood by calculating the cumulative degree distribution of each co-authorship network (see Fig. 4). In the representation of co-authorship networks of the second column of Fig. 3, each node size is proportional to its own measure of degree of connectivity, i.e., the size of the node represents the number of co-authors of each researcher, so that the more collaborative researcher will have a larger size of the node. On the other hand, in the third column of Fig. 3, the node size is proportional to its own measure of AuthorRank. Note that the nodes maintain the same relative positions for all co-authorship networks of each area. As expected, only one giant component was obtained for each academic area. As a global analysis we execute our proposed system in order to obtain a complete co-authorship network composed by the four aforementioned areas. A unique label was assigned for each academic area hd researchers have been identified in our local database researchers were associated with more than one academic area. Fig. 5 presents the co-authorship networks created considering the journal papers published between 2000 and Note that in the regular representation of the network (first column), the researchers associated with each area are slightly grouped among themselves forming components. Note also that the area of mathematics (nodes represented in green color) is transversal to the other three areas considered in our experiment. In the second and third column of Fig. 5 the node size associated to each researcher was modified in order to represent the degree value and AuthorRank measures, respectively. The higher value is represented by a node with a higher size. Note that the more connected academic area corresponds to hysics. Therefore, it is expected that the majority of nodes with large size (i.e., nodes width high degree value) belong to this area. However, a better sense of collaboration is given by the AuthorRank measures. Regarding the relationship between node measures such as degree and AuthorRank that allows to characterize the topology of real-world networks, few studies had been conducted [15]. In this sense, in our experiment, we computed the earson s correlation coefficient of (p-value < 2.2e- 16) between the degree values and the AuthorRank measures. This significant correlation indicates that the AuthorRank measure is related to degree value of each node. In fact, both measures indicate how related is a researcher in the academic community. Fig. 6 displays the graphic correlation between both measures. In order to compare the representation of co-authorship networks by both measures, in Table II the top 10 researchers are presented sorted by degree values and AuthorRank measures. Note that, considering the nodes degree value, the top 10 identified researchers are associated with the area of hysics (highly connected area). On the other hand, considering the nodes AuthorRank measure, the researchers more collaborative from other areas are shown in evidence in the top 10 researchers associated with the four considered academic areas. It is also interesting to note that both measures are apparently linked in part to the number of papers published and the type of academic scholarship conferred by the CNq (the best types of scholarships associated to the top researchers are:,, 1C and 1D). V. CONCLUSION AND ERSECTIVES In the present work we have presented a system which can improve the usefulness significantly of the scriptlattes (an open-source knowledge extraction system from the Lattes platform) in order to automatically discover co-authorship networks through data extracted from the Lattes platform. Our initiative contributes mainly in two aspects: (i) allow the automatic generation of large-scale co-authorship networks of several Brazilian academic area, and (ii) provides a good opportunity to analyze, through metrics, co-authorship networks derived from different areas associated with the eighty knowledge areas classified by CNq. All things considered, the co-authorship networks may be examined by complementary tools for social network analysis such as ajek, UCInet or R. Thereby, the obtained networks
7 Regular representation Node size is proportional to node s degree value Node size is proportional to node s AuthorRank measure Fig. 5. Co-authorship network belonging to researchers associated to four Brazilian academic areas: Computer Science (red), Mathematics (green), hysics (blue), Statistics (magenta). Researchers with two or more academic areas are presented in black color. The networks were created considering the journal papers published between 2000 and 2010 by researchers with doctoral degree. Isolated nodes are not shown. TABLE II T O 10 RESEARCHERS W. R. T. DEGREE VALUES ( LEFT ) AND AUTHOR R ANK MEASURES ( RIGHT ) OBTAINED FROM JOURNAL AERS UBLISHED BETWEEN 2000 AND 2010 BY RESEARCHERS WITH H D DEGREE. CS=C OMUTER S CIENCE, M=M ATHEMATICS, = HYSICS, S=S TATISTICS. Researcher A6201 A14631 A2763 A12704 A1839 A260 A3599 A14752 A9190 A11411 Degree AuthorRank apers Area(s) CS, Type 1C 1D 1C 1C can be easily treated with various metrics such as: distance, clustering and centrality (closeness and betweenness) [16], [17]. These measures are widely studied to: (i) characterize social networks and automatically identify co-authorship subcommunities of research groups [18], (ii) study and characterize the temporal evolution of co-authorship among researchers, i.e., analyze the cooperation dynamics of researchers across different periods of time [19], or (iii) correlate quantitative with qualitative collaboration data in order to examine the trending of research activities on specific axes [20], [21]. Finally, we emphasize that the use of automatic tools similar to the presented (e.g. DBL, CiteSeer, Google Scholar, Microsoft Academic Search, and ArnetMiner [22]) have been increasingly necessary because there is a increasing volume of academic production data that must be correctly computed [1] and effectively visualized by using computational methods [23]. ACKNOWLEDGMENT The authors would like to thank CNq, CAES, FAES and FINE for their financial support. Researcher A14631 A4600 A9220 A15939 A14752 A2763 A11411 A14054 A16431 A6201 AuthorRank Degree apers Area(s) CS, CS, S CS, M, M CS Type 1C R EFERENCES [1] S. Issue, Communications of ACM: Surviving the data deluge. New York, NY, USA: ACM, 2008, vol. 51, no. 12. [2] C. Amorin, Curriculum vitae organization: the Lattes software platform, esquisa Odontolgica Brasileira, vol. 17, no. 1, pp , [3] J.. Mena-Chalco and R. M. Cesar-Jr., scriptlattes: An open-source knowledge extraction system from the lattes platform, Journal of the Brazilian Computer Society, vol. 15, no. 4, pp , [4] X. Liu, J. Bollen, M. Nelson, and H. V. de Sompel, Co-authorship networks in the digital library research community, Informations rocessing and Management, vol. 41, no. 6, pp , [5] S. Nicholson, The basis for bibliomining: Frameworks for bringing together usage-based data mining and bibliometrics through data warehousing in digital library services, Informations rocessing and Management, vol. 42, no. 3, pp , [6] S. Klink,. Reuther, A. Weber, B. Walter, and M. Ley, Analysing social networks within bibliographical data, in 7th International Conference on Database and Expert Systems Applications, ser. Lecture Notes in Computer Science, vol. 4080, 2006, pp [7] R. Kouzes, G. Anderson, S. Elbert, I. Gorton, and D. Gracio, The changing paradigm of data-intensive computing, Computer, vol. 42, no. 1, pp , [8] M. Tomita, Current issues in parsing technology. Kluwer Academic ublishers, Boston, [9] I. S. Kang, S. H. Na, S. Lee, H. Jung,. Kim, W. K. Sung, and J. H. Lee, On co-authorship for author disambiguation, Information rocessing and Management, vol. 45, no. 1, pp , [10] G. Navarro, A guided tour to approximate string matching, ACM Computing Surveys, vol. 33, no. 1, pp , 2001.
8 Degree AuthorRank Fig. 6. Correlation between degree and AuthorRank measures. [11] M. Maia and S. Caregnato, Co-autoria como indicador de redes de colaborao cientfica, erspectivas em Cincia da Informao, vol. 13, no. 2, pp , [12] L. age, S. Brin, R. Motwani, and T. Winograd, The pagerank citation ranking: Bringing order to the web, Stanford InfoLab, Technical Report , November 1999, previous number = SIDL-W [13] S. Brin and L. age, The anatomy of a large-scale hypertextual web search engine, in Seventh International World-Wide Web Conference, [14] T. M. J. Fruchterman and E. M. Reingold, Graph drawing by forcedirected placement, Software - ractice and Experience, vol. 21, no. 11, pp , [15] A. Jamakovic and S. Uhlig, On the relationships between topological measures in real-world networks, Networks and Heterogeneous Media, vol. 3, no. 2, pp , [16] S. Wasserman and K. Faust, Social network analysis: methods and applications. Cambridge University ress, [17] J.. Scott, Social Network Analysis: A Handbook, 2nd ed. SAGE ublications, [18] M. Newman and M. Girvan, Finding and evaluating community structure in networks, hysical Review E, vol. 69, no. 2, p , [19] B. Wu, F. Zhao, S. Yang, L. Suo, and H. Tian, Characterizing the evolution of collaboration network, in roceeding of the 2nd ACM workshop on Social web search and mining, ser. SWSM 09, 2009, pp [20] L. L. Leydesdorff, Can scientific journals be classified in terms of aggregated journal-journal citation relations using the journal citation reports? Journal of the American Society for Information Science and Technology, vol. 57, no. 5, pp , [21] L. Leydesdorff, Betweenness centrality as an indicator of the interdisciplinarity of scientific journals, Journal of the American Society for Information Science and Technology, vol. 58, no. 9, pp , [22] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su, Arnetminer: Extraction and mining of academic social networks, in roceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 2008, pp [23] D. A. Keim, Information visualization and visual data mining, IEEE transactions on Visualization and Computer Graphics, vol. 8, no. 1, pp. 1 8, 2002.
scriptlattes: an open-source knowledge extraction system from the Lattes platform
Journal of the Brazilian Computer Society, 9; 5():-9. ISSN -65 scriptlattes: an open-source knowledge extraction system from the Lattes platform Jesús Pascual Mena-Chalco*, Roberto Marcondes Cesar Junior
More informationALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search
Project for Michael Pitts Course TCSS 702A University of Washington Tacoma Institute of Technology ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Under supervision of : Dr. Senjuti
More informationA comparative study of social network analysis tools
Membre de Membre de A comparative study of social network analysis tools David Combe, Christine Largeron, Előd Egyed-Zsigmond and Mathias Géry International Workshop on Web Intelligence and Virtual Enterprises
More informationComplex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics Zhao Wenbin 1, Zhao Zhengxu 2 1 School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu
More informationTemporal Visualization and Analysis of Social Networks
Temporal Visualization and Analysis of Social Networks Peter A. Gloor*, Rob Laubacher MIT {pgloor,rjl}@mit.edu Yan Zhao, Scott B.C. Dynes *Dartmouth {yan.zhao,sdynes}@dartmouth.edu Abstract This paper
More informationSOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS Carlos Andre Reis Pinheiro 1 and Markus Helfert 2 1 School of Computing, Dublin City University, Dublin, Ireland
More informationA SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION
A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending
More informationUSE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS
USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu ABSTRACT This
More informationA Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1
A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 Yannis Stavrakas Vassilis Plachouras IMIS / RC ATHENA Athens, Greece {yannis, vplachouras}@imis.athena-innovation.gr Abstract.
More informationSemantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
More informationUSING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu
More informationVisualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
More informationA Framework of User-Driven Data Analytics in the Cloud for Course Management
A Framework of User-Driven Data Analytics in the Cloud for Course Management Jie ZHANG 1, William Chandra TJHI 2, Bu Sung LEE 1, Kee Khoon LEE 2, Julita VASSILEVA 3 & Chee Kit LOOI 4 1 School of Computer
More informationScientific Collaboration Networks in China s System Engineering Subject
, pp.31-40 http://dx.doi.org/10.14257/ijunesst.2013.6.6.04 Scientific Collaboration Networks in China s System Engineering Subject Sen Wu 1, Jiaye Wang 1,*, Xiaodong Feng 1 and Dan Lu 1 1 Dongling School
More informationProteinQuest user guide
ProteinQuest user guide 1. Introduction... 3 1.1 With ProteinQuest you can... 3 1.2 ProteinQuest basic version 4 1.3 ProteinQuest extended version... 5 2. ProteinQuest dictionaries... 6 3. Directions for
More informationSo today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
More informationGraph Processing and Social Networks
Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph
More informationDiagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
More informationTIBCO Spotfire Network Analytics 1.1. User s Manual
TIBCO Spotfire Network Analytics 1.1 User s Manual Revision date: 26 January 2009 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO
More informationIntelligent Analysis of User Interactions in a Collaborative Software Engineering Context
Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context Alejandro Corbellini 1,2, Silvia Schiaffino 1,2, Daniela Godoy 1,2 1 ISISTAN Research Institute, UNICEN University,
More informationNetwork-Based Tools for the Visualization and Analysis of Domain Models
Network-Based Tools for the Visualization and Analysis of Domain Models Paper presented as the annual meeting of the American Educational Research Association, Philadelphia, PA Hua Wei April 2014 Visualizing
More informationBig Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network
, pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and
More informationIdentifying Influential Scholars in Academic Social Media Platforms
Identifying Influential Scholars in Academic Social Media Platforms Na Li, Denis Gillet École Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland {na.li, denis.gillet}@epfl.ch Abstract
More informationLog Mining Based on Hadoop s Map and Reduce Technique
Log Mining Based on Hadoop s Map and Reduce Technique ABSTRACT: Anuja Pandit Department of Computer Science, anujapandit25@gmail.com Amruta Deshpande Department of Computer Science, amrutadeshpande1991@gmail.com
More informationDynamical Clustering of Personalized Web Search Results
Dynamical Clustering of Personalized Web Search Results Xuehua Shen CS Dept, UIUC xshen@cs.uiuc.edu Hong Cheng CS Dept, UIUC hcheng3@uiuc.edu Abstract Most current search engines present the user a ranked
More informationSix Degrees of Separation in Online Society
Six Degrees of Separation in Online Society Lei Zhang * Tsinghua-Southampton Joint Lab on Web Science Graduate School in Shenzhen, Tsinghua University Shenzhen, Guangdong Province, P.R.China zhanglei@sz.tsinghua.edu.cn
More informationRANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS
ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for
More informationMapReduce Approach to Collective Classification for Networks
MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty
More informationA STUDY REGARDING INTER DOMAIN LINKED DOCUMENTS SIMILARITY AND THEIR CONSEQUENT BOUNCE RATE
STUDIA UNIV. BABEŞ BOLYAI, INFORMATICA, Volume LIX, Number 1, 2014 A STUDY REGARDING INTER DOMAIN LINKED DOCUMENTS SIMILARITY AND THEIR CONSEQUENT BOUNCE RATE DIANA HALIŢĂ AND DARIUS BUFNEA Abstract. Then
More informationHow to Analyze Company Using Social Network?
How to Analyze Company Using Social Network? Sebastian Palus 1, Piotr Bródka 1, Przemysław Kazienko 1 1 Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, Poland {sebastian.palus,
More information. Learn the number of classes and the structure of each class using similarity between unlabeled training patterns
Outline Part 1: of data clustering Non-Supervised Learning and Clustering : Problem formulation cluster analysis : Taxonomies of Clustering Techniques : Data types and Proximity Measures : Difficulties
More informationSocial Network Discovery based on Sensitivity Analysis
Social Network Discovery based on Sensitivity Analysis Tarik Crnovrsanin, Carlos D. Correa and Kwan-Liu Ma Department of Computer Science University of California, Davis tecrnovrsanin@ucdavis.edu, {correac,ma}@cs.ucdavis.edu
More informationInternational Journal of Engineering Research-Online A Peer Reviewed International Journal Articles are freely available online:http://www.ijoer.
RESEARCH ARTICLE SURVEY ON PAGERANK ALGORITHMS USING WEB-LINK STRUCTURE SOWMYA.M 1, V.S.SREELAXMI 2, MUNESHWARA M.S 3, ANIL G.N 4 Department of CSE, BMS Institute of Technology, Avalahalli, Yelahanka,
More informationGraph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
More informationCharacter Image Patterns as Big Data
22 International Conference on Frontiers in Handwriting Recognition Character Image Patterns as Big Data Seiichi Uchida, Ryosuke Ishida, Akira Yoshida, Wenjie Cai, Yaokai Feng Kyushu University, Fukuoka,
More informationCOMMON CORE STATE STANDARDS FOR
COMMON CORE STATE STANDARDS FOR Mathematics (CCSSM) High School Statistics and Probability Mathematics High School Statistics and Probability Decisions or predictions are often based on data numbers in
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationTHREE-DIMENSIONAL CARTOGRAPHIC REPRESENTATION AND VISUALIZATION FOR SOCIAL NETWORK SPATIAL ANALYSIS
CO-205 THREE-DIMENSIONAL CARTOGRAPHIC REPRESENTATION AND VISUALIZATION FOR SOCIAL NETWORK SPATIAL ANALYSIS SLUTER C.R.(1), IESCHECK A.L.(2), DELAZARI L.S.(1), BRANDALIZE M.C.B.(1) (1) Universidade Federal
More informationSubgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro
Subgraph Patterns: Network Motifs and Graphlets Pedro Ribeiro Analyzing Complex Networks We have been talking about extracting information from networks Some possible tasks: General Patterns Ex: scale-free,
More informationCiteSeer x in the Cloud
Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar
More informationBig Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
More informationCHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful
More informationExploring Big Data in Social Networks
Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
More informationSocial Analysis of the SEKE Co-Author Network
Social Analysis of the SEKE Co-Author Network Rehab El Kharboutly Swapna S. Gokhale Software Engineering Computer Science & Engg. Quinnipiac University Univ. of Connecticut Hamden, CT 06518 Storrs, CT
More informationVisual Analysis Tool for Bipartite Networks
Visual Analysis Tool for Bipartite Networks Kazuo Misue Department of Computer Science, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, 305-8573 Japan misue@cs.tsukuba.ac.jp Abstract. To find hidden features
More informationViral Marketing in Social Network Using Data Mining
Viral Marketing in Social Network Using Data Mining Shalini Sharma*,Vishal Shrivastava** *M.Tech. Scholar, Arya College of Engg. & I.T, Jaipur (Raj.) **Associate Proffessor(Dept. of CSE), Arya College
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationDMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
More informationChapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks
Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks Imre Varga Abstract In this paper I propose a novel method to model real online social networks where the growing
More informationLearn Software Microblogging - A Review of This paper
2014 4th IEEE Workshop on Mining Unstructured Data An Exploratory Study on Software Microblogger Behaviors Abstract Microblogging services are growing rapidly in the recent years. Twitter, one of the most
More informationHow To Find Local Affinity Patterns In Big Data
Detection of local affinity patterns in big data Andrea Marinoni, Paolo Gamba Department of Electronics, University of Pavia, Italy Abstract Mining information in Big Data requires to design a new class
More informationSearch in BigData2 - When Big Text meets Big Graph 1. Introduction State of the Art on Big Data
Search in BigData 2 - When Big Text meets Big Graph Christos Giatsidis, Fragkiskos D. Malliaros, François Rousseau, Michalis Vazirgiannis Computer Science Laboratory, École Polytechnique, France {giatsidis,
More informationApplying Social Network Analysis to the Information in CVS Repositories
Applying Social Network Analysis to the Information in CVS Repositories Luis Lopez-Fernandez, Gregorio Robles, Jesus M. Gonzalez-Barahona GSyC, Universidad Rey Juan Carlos {llopez,grex,jgb}@gsyc.escet.urjc.es
More informationCan Twitter Predict Royal Baby's Name?
Summary Can Twitter Predict Royal Baby's Name? Bohdan Pavlyshenko Ivan Franko Lviv National University,Ukraine, b.pavlyshenko@gmail.com In this paper, we analyze the existence of possible correlation between
More informationChapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
More informationNOVEL APPROCH FOR OFT BASED WEB DOMAIN PREDICTION
Volume 3, No. 7, July 2012 Journal of Global Research in Computer Science RESEARCH ARTICAL Available Online at www.jgrcs.info NOVEL APPROCH FOR OFT BASED WEB DOMAIN PREDICTION A. Niky Singhai 1, B. Prof
More informationAbstract. Introduction
CODATA Prague Workshop Information Visualization, Presentation, and Design 29-31 March 2004 Abstract Goals of Analysis for Visualization and Visual Data Mining Tasks Thomas Nocke and Heidrun Schumann University
More informationRecord Deduplication By Evolutionary Means
Record Deduplication By Evolutionary Means Marco Modesto, Moisés G. de Carvalho, Walter dos Santos Departamento de Ciência da Computação Universidade Federal de Minas Gerais 31270-901 - Belo Horizonte
More informationInvestigating Clinical Care Pathways Correlated with Outcomes
Investigating Clinical Care Pathways Correlated with Outcomes Geetika T. Lakshmanan, Szabolcs Rozsnyai, Fei Wang IBM T. J. Watson Research Center, NY, USA August 2013 Outline Care Pathways Typical Challenges
More informationEvaluating Software Products - A Case Study
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATION: A CASE STUDY ON GAMES Özge Bengur 1 and Banu Günel 2 Informatics Institute, Middle East Technical University, Ankara, Turkey
More informationWeb Database Integration
Web Database Integration Wei Liu School of Information Renmin University of China Beijing, 100872, China gue2@ruc.edu.cn Xiaofeng Meng School of Information Renmin University of China Beijing, 100872,
More informationExtracting Information from Social Networks
Extracting Information from Social Networks Aggregating site information to get trends 1 Not limited to social networks Examples Google search logs: flu outbreaks We Feel Fine Bullying 2 Bullying Xu, Jun,
More informationBlog Post Extraction Using Title Finding
Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School
More informationMALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationThe Importance of Analytics
CIPHER Briefing The Importance of Analytics July 2014 Renting 1 machine for 1,000 hours will be nearly equivalent to renting 1,000 machines for 1 hour in the cloud. This will enable users and organizations
More informationIMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD
Journal homepage: www.mjret.in ISSN:2348-6953 IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Deepak Ramchandara Lad 1, Soumitra S. Das 2 Computer Dept. 12 Dr. D. Y. Patil School of Engineering,(Affiliated
More informationTOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
More informationORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM
ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM IRANDOC CASE STUDY Ammar Jalalimanesh a,*, Elaheh Homayounvala a a Information engineering department, Iranian Research Institute for
More informationData Mining in Web Search Engine Optimization and User Assisted Rank Results
Data Mining in Web Search Engine Optimization and User Assisted Rank Results Minky Jindal Institute of Technology and Management Gurgaon 122017, Haryana, India Nisha kharb Institute of Technology and Management
More informationCS 207 - Data Science and Visualization Spring 2016
CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler sorelle@cs.haverford.edu An introduction to techniques for the automated and human-assisted analysis of data sets. These
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationGraph Mining and Social Network Analysis
Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationGephi Tutorial Quick Start
Gephi Tutorial Welcome to this introduction tutorial. It will guide you to the basic steps of network visualization and manipulation in Gephi. Gephi version 0.7alpha2 was used to do this tutorial. Get
More informationAvailable online at www.sciencedirect.com Available online at www.sciencedirect.com. Advanced in Control Engineering and Information Science
Available online at www.sciencedirect.com Available online at www.sciencedirect.com Procedia Procedia Engineering Engineering 00 (2011) 15 (2011) 000 000 1822 1826 Procedia Engineering www.elsevier.com/locate/procedia
More informationA Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader
A Performance Evaluation of Open Source Graph Databases Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader Overview Motivation Options Evaluation Results Lessons Learned Moving Forward
More informationFiskP, DLLP and XML
XML-based Data Integration for Semantic Information Portals Patrick Lehti, Peter Fankhauser, Silvia von Stackelberg, Nitesh Shrestha Fraunhofer IPSI, Darmstadt, Germany lehti,fankhaus,sstackel@ipsi.fraunhofer.de
More informationText Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationCorporate Leaders Analytics and Network System (CLANS): Constructing and Mining Social Networks among Corporations and Business Elites in China
Corporate Leaders Analytics and Network System (CLANS): Constructing and Mining Social Networks among Corporations and Business Elites in China Yuanyuan Man, Shuai Wang, Yi Li, Yong Zhang, Long Cheng,
More informationAnalysis of an Outpatient Consultation Network in a Veterans Administration Health Care System Hospital.
Analysis of an Outpatient Consultation Network in a Veterans Administration Health Care System Hospital. Alon Ben-Ari. December 8, 2015 Introduction Many healtcare organizations face the challenge of scaling
More informationA Prototype System for Educational Data Warehousing and Mining 1
A Prototype System for Educational Data Warehousing and Mining 1 Nikolaos Dimokas, Nikolaos Mittas, Alexandros Nanopoulos, Lefteris Angelis Department of Informatics, Aristotle University of Thessaloniki
More informationMedical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu
Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
More informationSECURITY METRICS: MEASUREMENTS TO SUPPORT THE CONTINUED DEVELOPMENT OF INFORMATION SECURITY TECHNOLOGY
SECURITY METRICS: MEASUREMENTS TO SUPPORT THE CONTINUED DEVELOPMENT OF INFORMATION SECURITY TECHNOLOGY Shirley Radack, Editor Computer Security Division Information Technology Laboratory National Institute
More informationSpecific Usage of Visual Data Analysis Techniques
Specific Usage of Visual Data Analysis Techniques Snezana Savoska 1 and Suzana Loskovska 2 1 Faculty of Administration and Management of Information systems, Partizanska bb, 7000, Bitola, Republic of Macedonia
More informationEnhancing the Ranking of a Web Page in the Ocean of Data
Database Systems Journal vol. IV, no. 3/2013 3 Enhancing the Ranking of a Web Page in the Ocean of Data Hitesh KUMAR SHARMA University of Petroleum and Energy Studies, India hkshitesh@gmail.com In today
More informationUsing Big Data Analytics
Using Big Data Analytics to find your Competitive Advantage Alexander van Servellen a.vanservellen@elsevier.com 2013 Electronic Resources and Consortia (November 6 th, 2013) The Topic What is Big Data
More informationUsing bibliometric maps of science in a science policy context
Using bibliometric maps of science in a science policy context Ed Noyons ABSTRACT Within the context of science policy new softwares has been created for mapping, and for this reason it is necessary to
More informationLDIF - Linked Data Integration Framework
LDIF - Linked Data Integration Framework Andreas Schultz 1, Andrea Matteini 2, Robert Isele 1, Christian Bizer 1, and Christian Becker 2 1. Web-based Systems Group, Freie Universität Berlin, Germany a.schultz@fu-berlin.de,
More informationTypo-Squatting: The Curse of Popularity
Typo-Squatting: The Curse of Popularity Alessandro Linari 1,2 Faye Mitchell 1 David Duce 1 Stephen Morris 2 1 Oxford Brookes University, 2 Nominet UK {alinari,frmitchell,daduce}@brookes.ac.uk, stephen.morris@nominet.org.uk
More informationA Color Placement Support System for Visualization Designs Based on Subjective Color Balance
A Color Placement Support System for Visualization Designs Based on Subjective Color Balance Eric Cooper and Katsuari Kamei College of Information Science and Engineering Ritsumeikan University Abstract:
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationApplication of Social Network Analysis to Collaborative Team Formation
Application of Social Network Analysis to Collaborative Team Formation Michelle Cheatham Kevin Cleereman Information Directorate Information Directorate AFRL AFRL WPAFB, OH 45433 WPAFB, OH 45433 michelle.cheatham@wpafb.af.mil
More informationRUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY
RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY I. INTRODUCTION According to the Common Core Standards (2010), Decisions or predictions are often based on data numbers
More informationHandling the Complexity of RDF Data: Combining List and Graph Visualization
Handling the Complexity of RDF Data: Combining List and Graph Visualization Philipp Heim and Jürgen Ziegler (University of Duisburg-Essen, Germany philipp.heim, juergen.ziegler@uni-due.de) Abstract: An
More informationFault Analysis in Software with the Data Interaction of Classes
, pp.189-196 http://dx.doi.org/10.14257/ijsia.2015.9.9.17 Fault Analysis in Software with the Data Interaction of Classes Yan Xiaobo 1 and Wang Yichen 2 1 Science & Technology on Reliability & Environmental
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationA Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
More informationNetwork Analysis of a Large Scale Open Source Project
2014 40th Euromicro Conference on Software Engineering and Advanced Applications Network Analysis of a Large Scale Open Source Project Alma Oručević-Alagić, Martin Höst Department of Computer Science,
More information