WHAT DEVELOPERS ARE TALKING ABOUT?
- Brett Ray
WHAT DEVELOPERS ARE TALKING ABOUT? AN ANALYSIS OF STACK OVERFLOW DATA

1. Abstract

We implemented a methodology to analyze the textual content of Stack Overflow discussions. We used latent Dirichlet allocation (LDA), a statistical topic modeling technique, to automatically discover the main topics present in developer discussions. We analyzed the discovered topics, as well as their relationships and trends over time, to gain insights into the development community.

2. Topic Modelling

2.1 Topic Model - LDA

A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Intuitively, given that a document is about a particular topic, we can expect particular words to appear in the document more or less frequently. Currently, latent Dirichlet allocation (LDA) is one of the most common topic models in use. LDA is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar.

2.2 MALLET

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. It includes sophisticated tools for document classification: efficient routines for converting text to "features", a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics.

3. Stack Overflow Data Set

In this section, we discuss the relevance of Stack Overflow data and the organization of the data dump.

3.1 Stack Overflow

In recent years, stackoverflow.com has become a major source of information for the developer community. This Q&A site is quite popular among software developers, and the discussions on it reflect the current usage and popularity of technologies.
3.2 Data Set

Stack Overflow data is publicly available under the Creative Commons license. The data set is organised into five XML documents: badges.xml, comments.xml, posts.xml, users.xml and votes.xml. We were particularly interested in posts.xml, which contains post information (questions and answers with tags). The data set we analysed spanned 3 years, from July 2008 to June. The posts.xml file for these three years was around 10 GB, which made parsing and processing it a challenge.

4. Method Overview

Figure 1 depicts the various phases of data processing, which are discussed in this section.

Figure 1: Overall working of the project

4.1 Data extraction and pre-processing

posts.xml was parsed using a SAX XML parser in Python, and the content (title, tags and body) of each post was written to plain text files. The majority of the posts are coding-related discussions and hence contain code snippets. We removed all code snippets (if, while, etc.) from the posts and used the remaining information. The content of the posts is stored in HTML format, so the HTML tags were also removed to get the actual text content of each post.

4.2 Topic Modeling

The text files generated by data extraction and pre-processing were then fed to the topic modeling component of MALLET. By default, this package takes one input text file and performs topic modeling over that file; we modified it to process a large number of files and to discover files automatically, given a directory name. The stop word list was also extended incrementally with more technical stop words to reduce noise.

4.3 Post processing

Topic modeling was performed over quarterly data to generate the trends discussed in section 6.3, and over the posts related to the most popular topics to generate the results for Research Question 1.

5. Research Question 1 - Does a question in one topic trigger answers in another?
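Since the analysis for this question builds on the pipeline of section 4, a minimal sketch of the extraction and clean-up step from section 4.1 may be useful here. It is an illustration, not the project's actual code. It assumes that each post in posts.xml is a `<row>` element carrying `Id`, `Title`, `Tags` and `Body` attributes, and that code snippets are the content of `<pre>` and `<code>` elements in the HTML body. The names `clean_body`, `PostHandler` and `parse_posts` are hypothetical.

```python
import io
import re
import xml.sax
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text that lies outside <pre> and <code> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a code snippet
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("pre", "code"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("pre", "code") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_body(html_body):
    """Drop code snippets and HTML tags, return whitespace-normalised text."""
    extractor = _TextExtractor()
    extractor.feed(html_body)
    extractor.close()
    return re.sub(r"\s+", " ", " ".join(extractor.parts)).strip()


class PostHandler(xml.sax.ContentHandler):
    """Streams over <row> elements so the whole file never sits in memory."""

    def __init__(self, on_post):
        super().__init__()
        self.on_post = on_post

    def startElement(self, name, attrs):
        if name != "row":  # assumed element name for one post
            return
        self.on_post({
            "id": attrs.get("Id"),
            "title": attrs.get("Title", ""),
            "tags": attrs.get("Tags", ""),
            "text": clean_body(attrs.get("Body", "")),
        })


def parse_posts(source, on_post):
    """Parse a posts file (path or binary file object), calling on_post per post."""
    xml.sax.parse(source, PostHandler(on_post))


if __name__ == "__main__":
    sample = (b'<posts><row Id="1" Title="Loop help" Tags="&lt;java&gt;" '
              b'Body="&lt;p&gt;Why does this fail?&lt;/p&gt;'
              b'&lt;pre&gt;&lt;code&gt;while (x) {}&lt;/code&gt;&lt;/pre&gt;" /></posts>')
    parse_posts(io.BytesIO(sample), print)
```

A streaming parser is a natural fit for a file of around 10 GB, because each post is handled and discarded as soon as its element is read. In a full pipeline the callback would write the title, tags and cleaned text to plain text files rather than print them.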
5.1 Motivation

We investigate whether some topics are related to other topics in terms of questions and answers. This can help us identify closely coupled topics, where questions in one topic tend to generate answers in seemingly unrelated topics. Moreover, it can help point out cross-cutting areas of concern for developers: problems so common that they span multiple domains. For instance, if many questions about both mobile application development and web development generate answers related to user interfaces, this hints that user interface development is a cross-cutting concern faced by developers on two different platforms.

5.2 Solution

The steps involved are:
1. Find the top K topics.
2. Generate mappings (topic to question posts, and question posts to answer posts).
3. Get all answer posts for each of the top K topics.
4. Run topic modeling for each topic over the answer posts from the previous step.
5. Present the collected data in a comprehensible manner using data visualization.

5.3 Results

The entire space represents Stack Overflow. Each outer circle represents a topic, and the size of each circle is proportional to the popularity of the topic. Each topic in turn contains nested circles that represent the topics triggered by it; again, the size of each is proportional to the popularity of the topic. We did some post-processing to remove a few obvious topics in each category that are of little interest to our research question. For instance, the topic Java did generate topics like data structures, library usage and so on, but any language is bound to trigger activity in such areas, so we removed them in the post-processing stage.

5.4 Analysis

A few of the results shown in Figure 2 are surprising and very informative. Let's talk about the most popular topic, Java. Java has triggered activity in areas like Hibernate (an ORM tool), SQL, etc. This information would be useful to a business analyst for working out statistics such as the most sought-after ORM tool used with Java, the most used backend database with Java, and much more. One surprising finding is the growth of GitHub due to Ruby on Rails. GitHub is known to have been gaining popularity in recent times, but this analysis shows that more interactions have been triggered by Ruby on Rails. Thus, this analysis gives us an overall view of the relations between various topics and an insight into the activity triggered in cross-cutting areas of concern.
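The mapping steps of the solution for this research question (finding the top K topics, then collecting the answer posts that belong to questions in each of them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: it assumes each question has already been assigned a single dominant topic, and that each answer records the id of its parent question. The function names and the two dictionaries are hypothetical.

```python
from collections import Counter, defaultdict


def top_k_topics(question_topic, k):
    """Return the k topics that have the most questions.

    question_topic maps a question id to its (assumed) dominant topic.
    """
    counts = Counter(question_topic.values())
    return [topic for topic, _ in counts.most_common(k)]


def answers_by_topic(question_topic, answer_parent, topics):
    """Group answer ids under the topic of the question they answer.

    answer_parent maps an answer id to the id of its parent question.
    Answers whose question is unknown or outside `topics` are skipped.
    """
    wanted = set(topics)
    grouped = defaultdict(list)
    for answer_id, question_id in answer_parent.items():
        topic = question_topic.get(question_id)
        if topic in wanted:
            grouped[topic].append(answer_id)
    return dict(grouped)


if __name__ == "__main__":
    question_topic = {1: "java", 2: "java", 3: "rails"}
    answer_parent = {10: 1, 11: 3, 12: 2, 13: 99}
    popular = top_k_topics(question_topic, 1)
    print(popular, answers_by_topic(question_topic, answer_parent, popular))
```

The answer posts grouped under each topic would then be written out and passed to a separate topic modeling run, one run per top-K topic.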
Figure 2: Visualization of the results

6. How does developer interest change over time?

6.1 Motivation and Research Question

By analyzing the rise and fall of interest in different topics, product developers will be able to assess the relative popularity of their products. This will also help in identifying marketing and research opportunities and trends. For example, if interest in the .NET Framework topic is rising while interest in the Java topic is dropping, then companies, book publishers, and researchers might want to direct their attention to .NET problems and challenges. The trend analysis also helps in reasoning about the rise or fall of certain topics in developer discussions.
6.2 Solution

We divided the entire data set into chunks of fixed time frames, with each chunk covering the posts of 3 months. This gave us 12 partitions over the entire data set, covering 3 years in all. Topic modeling was performed on each chunk separately to find the trending topics. We also wished to analyze the temporal trends of topics. To do so, we define the impact of a topic z_k in time frame m as

\[ \mathrm{impact}(z_k, m) = \frac{1}{|D(m)|} \sum_{d_i \in D(m)} \theta(d_i, z_k) \]

where D(m) is the set of all posts in the 3-month time frame m, and θ(d_i, z_k) represents the topic score of z_k for the document d_i. The impact metric measures the relative proportion of posts related to that topic compared to the other topics in that time frame.

All the statistics thus collected are plotted in a 2-dimensional space in which the impact of a topic is shown against time. We grouped the topics into categories to make our comparisons more meaningful and comprehensible. This gave us 4 different comparisons:

Category 1 - Programming Languages: Java, C++, Python
Category 2 - Web Technologies: JavaScript, PHP, Ruby on Rails, Django and HTML/CSS
Category 3 - Application Development: iPhone application development and Android application development
Category 4 - General Trend: Web Technologies, Server-side Technologies and Mobile application development

The last category is a more general comparison in which we combined several topics to give a holistic idea of which layer of the stack is trending more among developers. Server-side technologies include the .NET Framework and MySQL; web technologies include PHP, JavaScript, Ruby on Rails, HTML/CSS and Django; and mobile technologies include iPhone application development and Android application development. This analysis gives us an overall picture of the general trend among developers: whether they are more interested in server-side development, web development or mobile application development.
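The quarterly partitioning and the impact metric can be illustrated with a short sketch. It assumes that the impact of a topic in a time frame is the average of the topic scores θ(d_i, z_k) over the posts in D(m), and that time frames are calendar quarters, so that Apr-June and July-Sept fall into separate chunks. The input layout, a creation date paired with a dictionary of topic scores per post, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def quarter_key(day):
    """Map a date to its 3-month time frame as (year, quarter), quarter in 1..4."""
    return (day.year, (day.month - 1) // 3 + 1)


def topic_impact(posts):
    """Compute the impact of every topic in every time frame.

    posts is an iterable of (creation_date, theta) pairs, where theta maps
    a topic name to that post's topic score. Returns
    {time_frame: {topic: mean topic score over the posts of that time frame}}.
    """
    totals = defaultdict(lambda: defaultdict(float))  # summed scores per frame
    sizes = defaultdict(int)                          # |D(m)| per frame
    for created, theta in posts:
        frame = quarter_key(created)
        sizes[frame] += 1
        for topic, score in theta.items():
            totals[frame][topic] += score
    return {
        frame: {topic: total / sizes[frame] for topic, total in by_topic.items()}
        for frame, by_topic in totals.items()
    }


if __name__ == "__main__":
    sample = [
        (date(2009, 4, 10), {"java": 0.5, "php": 0.5}),
        (date(2009, 6, 1), {"java": 1.0}),
        (date(2009, 7, 1), {"php": 1.0}),
    ]
    for frame, impacts in sorted(topic_impact(sample).items()):
        print(frame, impacts)
```

Plotting the value for one topic across consecutive time frames gives the kind of impact-versus-time curve used for the comparisons.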
Figure 3: Languages (legend: Java, C++, Python, PHP)

Figure 6: Technology Domains (legend: Web Technologies, Mobile Technologies, Server Side Technologies)

6.3 Results

The graph above compares web technologies with the other technology domains over the 3-year time frame, with each point representing a 3-month period. It shows that web technologies are clearly the winner among all the related technology domains, as they remain the top player during most of the quarters. The analysis thus gives a good comparison of the popularity of various technologies among developers. It also helps us reason about the highs and lows of a particular technology, as explained in the next section.

6.4 Real time events

During this trend analysis, some technologies surfaced as trending at particular points in time, which made us curious about the reasons behind the sudden increase in their popularity. The following is the list of trends that surfaced and their association with real-time events:
1. iPhone OS 2.0 SDK was released in March 2008, which led to iPhone application development trending in Apr-June.
2. Rails version 2.3 (with major changes) was released in March 2009, leading to Ruby on Rails surfacing in the trends in Apr-June.
3. An Adobe Flex version was released in March 2010, and it started trending in Apr-June.

7. Challenges faced and Future Work

One of the challenges we faced was the size of the data. The posts.xml file was 10 GB, which took a lot of processing time.

Another challenge was that MALLET does not remove technical stop words from the data. In other words, there are technical words that are quite general in nature and do not help in topic modeling. To remove such technical stop words we used explicit code.

A further challenge was wrongly tagged questions. On Stack Overflow, the person who asks a question has to tag it with keywords related to the question, and there is a chance of questions being wrongly tagged. Wrongly tagged questions create noise that is hard to eliminate.

MALLET gives us only the set of keywords related to a topic; it does not give us the name of the topic corresponding to that set of keywords. So we had to go through all the keywords of each topic manually and name it accordingly. This process was arduous and time-consuming. Also, a few topics had keywords that were general in nature, which made it difficult to name the specific topic.

As future work, we would like to extend our work to compare trends of specific technologies and how interest in related or competing technologies differs over time.

8. Conclusion

In this project, we implemented a methodology to discover and quantify the topics and trends in Stack Overflow, a popular Q&A website with millions of active users. Our methodology is based on LDA, a widely applied statistical topic model, which discovers topics from the textual content of Stack Overflow. We used various metrics to quantify the topics and their changes over time, which allowed us to gain insight into the discussions on Stack Overflow. Our analysis provides an approximation of the wants and needs of the contemporary developer. It can also be used by the Stack Overflow team to better understand the content generated by its users. Knowing what topics are present, and which are popular at any given time, could help in the moderation of the website.

9. Source Code

References

[1] Anton Barua, Stephen W. Thomas, and Ahmed E. Hassan, "What are developers talking about? An analysis of topics and trends in Stack Overflow", Empirical Software Engineering, 2012.
[2] MALLET
What Are Developers Talking About? An Analysis of Topics and Trends in Stack Overflow
Empirical Software Engineering manuscript No. (will be inserted by the editor) What Are Developers Talking About? An Analysis of Topics and Trends in Stack Overflow Anton Barua Stephen W. Thomas Ahmed
More informationA Manual Categorization of Android App Development Issues on Stack Overflow
2014 IEEE International Conference on Software Maintenance and Evolution A Manual Categorization of Android App Development Issues on Stack Overflow Stefanie Beyer Software Engineering Research Group University
More informationDeposit Identification Utility and Visualization Tool
Deposit Identification Utility and Visualization Tool Colorado School of Mines Field Session Summer 2014 David Alexander Jeremy Kerr Luke McPherson Introduction Newmont Mining Corporation was founded in
More informationIT services for analyses of various data samples
IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical
More informationDATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
More informationWeb Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it
Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationVisualization of Semantic Windows with SciDB Integration
Visualization of Semantic Windows with SciDB Integration Hasan Tuna Icingir Department of Computer Science Brown University Providence, RI 02912 hti@cs.brown.edu February 6, 2013 Abstract Interactive Data
More information10CS73:Web Programming
10CS73:Web Programming Question Bank Fundamentals of Web: 1.What is WWW? 2. What are domain names? Explain domain name conversion with diagram 3.What are the difference between web browser and web server
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationLegal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION
Brian Lao - bjlao Karthik Jagadeesh - kjag Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND There is a large need for improved access to legal help. For example,
More informationData Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority
More informationWeb Frameworks. web development done right. Course of Web Technologies A.A. 2010/2011 Valerio Maggio, PhD Student Prof.
Web Frameworks web development done right Course of Web Technologies A.A. 2010/2011 Valerio Maggio, PhD Student Prof.ssa Anna Corazza Outline 2 Web technologies evolution Web frameworks Design Principles
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationActive Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
More informationSenior Business Intelligence/Engineering Analyst
We are very interested in urgently hiring 3-4 current or recently graduated Computer Science graduate and/or undergraduate students and/or double majors. NetworkofOne is an online video content fund. We
More informationVisualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
More informationGoogle Analytics for Robust Website Analytics. Deepika Verma, Depanwita Seal, Atul Pandey
1 Google Analytics for Robust Website Analytics Deepika Verma, Depanwita Seal, Atul Pandey 2 Table of Contents I. INTRODUCTION...3 II. Method for obtaining data for web analysis...3 III. Types of metrics
More informationWeb 2.0 Technology Overview. Lecture 8 GSL Peru 2014
Web 2.0 Technology Overview Lecture 8 GSL Peru 2014 Overview What is Web 2.0? Sites use technologies beyond static pages of earlier websites. Users interact and collaborate with one another Rich user experience
More information2. Distributed Handwriting Recognition. Abstract. 1. Introduction
XPEN: An XML Based Format for Distributed Online Handwriting Recognition A.P.Lenaghan, R.R.Malyan, School of Computing and Information Systems, Kingston University, UK {a.lenaghan,r.malyan}@kingston.ac.uk
More informationEducation. Relevant Courses
and s and s COMM/CS GPA: topsecret Developed application and designed logo: https://play.google.com/- store/apps/details?id=com.teamhex. colorbird Permanent Address 759 East 221 Street Apt. Website: 1B
More informationBraindumps.C2150-810.50 questions
Braindumps.C2150-810.50 questions Number: C2150-810 Passing Score: 800 Time Limit: 120 min File Version: 5.3 http://www.gratisexam.com/ -810 IBM Security AppScan Source Edition Implementation This is the
More informationPowerful. Flexible. Intelligent
Powerful. Flexible. Intelligent The Highland Business Research Quick Guide to new features in Released 20 th October 2009 Google has just announced a range of new features available to Google Analytics
More informationidashboards FOR SOLUTION PROVIDERS
idashboards FOR SOLUTION PROVIDERS The idashboards team was very flexible, investing considerable time working with our technical staff to come up with the perfect solution for us. Scott W. Ream, President,
More informationOperationalise Predictive Analytics
Operationalise Predictive Analytics Publish SPSS, Excel and R reports online Predict online using SPSS and R models Access models and reports via Android app Organise people and content into projects Monitor
More informationHexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
More informationADHAWK WORKS ADVERTISING ANALTICS ON A DASHBOARD
ADHAWK WORKS ADVERTISING ANALTICS ON A DASHBOARD Mrs. Vijayalaxmi M. 1, Anagha Kelkar 2, Neha Puthran 2, Sailee Devne 2 Vice Principal 1, B.E. Students 2, Department of Information Technology V.E.S Institute
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationDATA EXPERTS MINE ANALYZE VISUALIZE. We accelerate research and transform data to help you create actionable insights
DATA EXPERTS We accelerate research and transform data to help you create actionable insights WE MINE WE ANALYZE WE VISUALIZE Domains Data Mining Mining longitudinal and linked datasets from web and other
More informationSyllabus INFO-GB-3322. Design and Development of Web and Mobile Applications (Especially for Start Ups)
Syllabus INFO-GB-3322 Design and Development of Web and Mobile Applications (Especially for Start Ups) Spring 2015 Stern School of Business Norman White, KMEC 8-88 Email: nwhite@stern.nyu.edu Phone: 212-998
More informationTrollhättan, Sweden. http://keryx.se/ http://twitter.com/itpastorn/ http://itpastorn.blogspot.com/
Trollhättan, Sweden Lars Gunther is a web developer, computer science teacher and a pastor, who lives in Trollhättan, Sweden. He is the lead editor of several courses for WaSP Interact and invited expert
More informationCrossreader. Open Positions
Open Positions Crossreader CrossReader develops a Revolutionary product to enhance the mobile web experience by enabling content discovery and search in tablets and ereaders Job Title Team leader for application
More informationPentaho Data Mining Last Modified on January 22, 2007
Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org
More informationA Study of Web Log Analysis Using Clustering Techniques
A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept
More informationWhitepapers at Amikelive.com
Brief Overview view on Web Scripting Languages A. Web Scripting Languages This document will review popular web scripting languages[1,2,12] by evaluating its history and current trends. Scripting languages
More informationMENDIX FOR MOBILE APP DEVELOPMENT WHITE PAPER
MENDIX FOR MOBILE APP DEVELOPMENT WHITE PAPER TABLE OF CONTENTS Market Demand for Enterprise Mobile Mobile App Development Approaches Native Apps Mobile Web Apps Hybrid Apps Mendix Vision for Mobile App
More informationBuilding a Question Classifier for a TREC-Style Question Answering System
Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given
More informationFIVE STEPS FOR DELIVERING SELF-SERVICE BUSINESS INTELLIGENCE TO EVERYONE CONTENTS
FIVE STEPS FOR DELIVERING SELF-SERVICE BUSINESS INTELLIGENCE TO EVERYONE Wayne Eckerson CONTENTS Know Your Business Users Create a Taxonomy of Information Requirements Map Users to Requirements Map User
More informationBEST WEB PROGRAMMING LANGUAGES TO LEARN ON YOUR OWN TIME
BEST WEB PROGRAMMING LANGUAGES TO LEARN ON YOUR OWN TIME System Analysis and Design S.Mohammad Taheri S.Hamed Moghimi Fall 92 1 CHOOSE A PROGRAMMING LANGUAGE FOR THE PROJECT 2 CHOOSE A PROGRAMMING LANGUAGE
More informationSTATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
More informationYour Own Web Page: Quick and Dirty
Your Own Web Page: Quick and Dirty A Special Language for the Web In the early 1990 s web pages were mostly described using a special purpose language, called Hyper- Text Markup Language, HTML HTML provides
More informationBazaarvoice SEO implementation guide
Bazaarvoice SEO implementation guide TOC Contents Bazaarvoice SEO...3 The content you see is not what search engines see...3 SEO best practices for your review pages...3 Implement Bazaarvoice SEO...4 Verify
More informationSo today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
More informationAutomatic Text Analysis Using Drupal
Automatic Text Analysis Using Drupal By Herman Chai Computer Engineering California Polytechnic State University, San Luis Obispo Advised by Dr. Foaad Khosmood June 14, 2013 Abstract Natural language processing
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationTOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
More informationDATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2
DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.
More informationA Cost Effective GPS-GPRS Based Women Tracking System and Women Safety Application using Android Mobile
A Cost Effective GPS-GPRS Based Women Tracking System and Women Safety Application using Android Mobile Devendra Thorat, Kalpesh Dhumal, Aniket Sadaphule, Vikas Arade B.E Computer Engineering, Navsahyadri
More informationSyllabus INFO-UB-3322. Design and Development of Web and Mobile Applications (Especially for Start Ups)
Syllabus INFO-UB-3322 Design and Development of Web and Mobile Applications (Especially for Start Ups) Fall 2014 Stern School of Business Norman White, KMEC 8-88 Email: nwhite@stern.nyu.edu Phone: 212-998
More informationFinal Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
More informationVisualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR ccfong@umac.mo Abstract An e-government
More informationSEO Techniques for Higher Visibility LeadFormix Best Practices
Introduction How do people find you on the Internet? How will business prospects know where to find your product? Can people across geographies find your product or service if you only advertise locally?
More informationA review and analysis of technologies for developing web applications
A review and analysis of technologies for developing web applications Asha Mandava and Solomon Antony Murray state University Murray, Kentucky Abstract In this paper we review technologies useful for design
More informationA Comparative Study on Vega-HTTP & Popular Open-source Web-servers
A Comparative Study on Vega-HTTP & Popular Open-source Web-servers Happiest People. Happiest Customers Contents Abstract... 3 Introduction... 3 Performance Comparison... 4 Architecture... 5 Diagram...
More informationBringing Big Data Modelling into the Hands of Domain Experts
Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the
More informationStart up Jobs Germany FEB 2014
Start up Jobs y FEB 2014 JOB TITLE LANGUAGE LOCATION REQUIREMENTS REF Lead English Berlin Lots of PHP, Magento, Zend, 80H PHPUnit, MySQL Snr ERP English Berlin Navision ERP development, Version 80I 2009
More informationFinancial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationCOMPASS Database Work in 2014/15