RECOMMENDATION SYSTEM
|
|
- Alan Barnett
- 8 years ago
- Views:
Transcription
1 RECOMMENDATION SYSTEM October 8, 2013 Team Members: 1) Duygu Kabakcı, , 2) Işınsu Katırcıoğlu, , 3) Sıla Kaya, , 4) Mehmet Koray Kocakaya, , 1. Problem Definition and Background Information Recommendation systems are software tools or knowledge discovery techniques that provide suggestions for items to a user. Items can be music, books, movies, people or social groups. The first purpose of this system is to recommend items that a user is likely to be interested in. The second purpose is to learn more about their preferences and constraints. The main source of our data is user behavior. Data collection from user behavior can be classified as either implicit or explicit [2]. Implicit data collection consists of observing the items that a user views in an online store, analyzing item/user viewing times, keeping a record of the items that a user purchases online, obtaining a list of items that a user has listened to or watched on his/her computer and analyzing the user s social network and discovering similar likes and dislikes. On the other hand, explicit data collection consists of asking a user to rate an item on a sliding scale, asking a user to rank a collection of items from favorite to least favorite, presenting two items to a user and asking him/her to choose the better one of them and asking a user to create a list of items that he/she likes [1]. When a user is asked to give an opinion through a rating scale, the following forms can be considered: Numerical ratings such as 1-5 stars, ordinal ratings such as strongly agree, agree, neutral, disagree, strongly disagree, binary ratings that model choices in which the user is simply asked to decide if a certain item is good or bad, unary ratings that can indicate whether a user has observed or purchased an item. In unary cases, the absence of a rating indicates that we have no information relating the user to the item. The recommendations are usually personalized so that each user receives specific suggestions. Besides, there are non-personalized recommendations which gather information based on general popularity. This is done through Active Learning (AL), Data mining, Machine Learning and Artificial Intelligence which provide diverse knowledge for constructing probabilistic results. The aim is to help users become more aware of their preferences and feed new data into the system for further analysis. The vast majority of today s enterprises Page 1 of 10
2 make use of recommender systems in the fields of commerce, Internet streaming and social networking services. The leading projects and contributions about recommendation systems belong to Amazon, Netflix, Pandora Radio, YouTube, Facebook, Last.fm, 8tracks, ebay, LinkedIn, MySpace, Internet Movie Database (IMDb) and so forth. Some of their recommendation approach will be discussed later on. There are two main paradigms studied for the problem of recommending items: Contentbased and collaborative filtering paradigms. Content-based filtering paradigm is based on matching the characteristics of an item to the preferences of a user. In order to come up with an accurate match, attributes of the item must be specified. An attribute defines a unique feature that enables us to classify an item. The attributes are then represented in a weighted vector for each item. The weight corresponds to the degree of importance that a user assigns to a feature. The next thing to consider is the user profile. User profile describes the detailed model of tastes and preferences of a user. In addition to the user profile, the user is expected to give direct feedback through rating the item offered by the system. This can be used to assign higher or lower weights to certain attributes of the item. The entire system using this approach improves as better personalization is maintained [2]. A simple movie example can illustrate the content-based approach. If a user rates a movie from Sci-Fi genre or includes this genre in his/her preferences then a movie from this genre is recommended to the user. The well-known example of contentbased filtering is Music Genome Project of Pandora Radio [5]. Pandora s recommendations are based on the inherent qualities of the music. This comprehensive project uses mathematical algorithms to organize the songs according to attributes such as melody, harmony, lyrics, orchestration, vocal character, gender of the lead vocalist, level of distortion on a specific instrument and so on. Pandora calls these musical attributes genes and each gene is assigned to a number between 0-5. The database of Pandora consists of thousands of songs classified according to hundreds of such attributes. The second well-known filtering method is collaborative filtering. It is based on collecting and analyzing huge amount of information on users behaviors, activities, preferences and similarities to other users. Therefore, it is referred as people-to-people correlation. Similar users are determined according to the similarities between their evaluations of certain attributes for a specific group of items. Analysis concerning user s past behavior has extensive applications such as previously purchased item, recently watched movie (or music preference) and search history. One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com recommendation system. It is based on customizing the browsing experiences of users instead of matching similar users. Amazon.com algorithm can support massive data sets and millions of customer information. The algorithm matches each of the user s purchased and rated items and puts them in a Page 2 of 10
3 recommendation list [4]. From this list, it is more likely to construct a product-to-product matrix which scans through all pairs and assigns a similarity metric to them. Last.fm is another example of the collaborative filtering system. Last.fm recommends music through comparing the listening habits of similar users. It collects the profile information of each user by keeping track of the songs that the user has listened. Their data set is also rich in tag information. Moreover, Facebook, MySpace, LinkedIn and other social networks use collaborative filtering to recommend new friends, groups and other social connections through examining the network of connections between a user and his/her friends. It has been experienced that both the content-based and the collaborative approaches have some downsides. The scope of content-based approach is limited and it may suffer from little information of user. The collaborative approach requires large amount of data to be accurate but this might be a problem at the very beginning of the system when the data set is insufficient. Therefore, hybrid models are widely used today. They provide combination of previously mentioned approaches. We will be using hybrid models in order to contribute to the existing systems. There is a wide range of methods proposed to solve and improve the existing recommendation systems. One such method is constructing a rating matrix R that stores the vote for an item for each user-item pair. Accordingly, there is a non-negative number r in the cells of the matrix. However, there might be many missing values due to the insufficient rating. Therefore, predictions of ratings are calculated by finding the similarities between users. A rating of some user u for an item i is deduced from ratings of that item by users with high similarity to the user u [1]. For this purpose, neighborhood based model and regularized matrix factorization model are used together in hybrid methods [8]. Neighborhood formation scheme usually uses Pearson correlation which is currently weak for large and sparse databases [7]. Another recent method is Association Rule Mining which is a technique in data mining. The purpose is to find relation among different items in databases such as the books borrowed together from a library or the items bought together in a supermarket [3]. Apart from these methods, Bayesian classifier, uncertaintybased active learning, error-based active learning [6], cluster analysis, decision trees and artificial neural networks are used to group similar items and estimate the probability that a user will prefer them. Netflix Prize, which is an open competition for the best collaborative filtering algorithm to predict user ratings for films, encourages the teams to improve the existing methods and reduce the root mean square error to yield more accurate predictions. In this project, we will be working on the recommendation system of TTNET Müzik. TTNET Müzik is a music downloading and listening service supported by TTNET. This service provides TTNET ADSL users with the legal right of accessing music files. It also enables the users to download music in mp3 format in case of purchasing music packages. The service provides the Page 3 of 10
4 opportunity of displaying latest news and music lists along with going over the lyrics and album information. Their recommendation system allows creating and sharing personal music lists. A user can also view the music lists generated by other members. This is done through suggesting relevant lists to users. Our goal is to use and extend the existing recommender system of TTNET Müzik. Our targeted user profile is ordinary web or mobile users. According to 2012 data, it was expected to affect more than 3.5 million TTNET Müzik users monthly with a more efficient recommendation system. This data shows that there are several millions of target users for this end product. Moreover, another expectation was to increase the number of times a track is played up to 67 million. TTNET Müzik web application is only available in Turkey. Due to the IP restrictions, its web application is not accessible from other countries. But its mobile application has no such restrictions and can be used worldwide. 2. Significance of the Problem and Motivation Recommendation systems suffer from three major problems; cold start, huge data and sparsity. Cold start is a potential problem in computer-based information systems which involve a degree of automated data modelling. Specifically, it concerns the issue that the system cannot draw any inferences for users or items about which it has not yet gathered sufficient information. There are basically two types of this problem. The first one occurs when a new user starts using the system. The system knows little about their preferences and it is necessary to pick some training points for rating so that the system begins learning what the user wants [6]. The second type occurs when a new product is introduced to the system. Again it is essential to rate this item in order to quickly improve the prediction accuracy about it [6]. The next problem is huge data. In many of the environments that these systems make recommendations in, there are millions of users and products. It is quite a challenge to produce high quality recommendations and perform many recommendations per second for millions of customers and items. Thus, a large amount of computation power is often necessary to calculate recommendations. The runtime and memory usage of existing algorithms can still be improved. The third problem is sparsity. Users that are very active contribute to the rating for a few number of items available in the database and even very popular items are rated by only a few number of users available in the database. Because of sparsity, it is possible that the similarity between two users cannot be defined, rendering collaborative filtering useless. The rating matrix R, which is mentioned in the previous section, might be full of missing entries since many users vote for just a few of the items. It is obvious that the companies having excellent recommendation systems are those with a lot of user data: Google, Amazon and Netflix and so forth. Page 4 of 10
5 We would like to work on recommendation systems because it is currently a hot topic. Of the main reasons why we chose this project, two in particular stand out. To begin with, a lot of web or mobile applications use this system to improve their popularity in social area. User-oriented technologies are more important parts of the social area. The new trend is to get feedback from the users and perform active learning on their behaviors or habits. This type of interaction resembles real world and it is more efficient to increase the popularity of items through making suggestions. The second reason is academic. We are interested in machine learning, artificial intelligence and data mining, so we would like to gain experience in these fields. The improvements in recommendation systems include a closer look at certain algorithms and mathematical notions. We believe that this subject will be a milestone for a quick introduction to academic researches. There are many big companies which handle this problem effectively as we explained in the first part. The conflict for the existing systems is the big data. Having a large dataset is important to have more accurate predictions. But big data means a more complex set of computations. At this point, the existing algorithms suffer from the big data. Besides, the root mean square error for the predictions can be further improved, therefore there are many open problems in this area such as context-awareness. Apart from this, some companies still don't have effective recommendation system such as TTNET Müzik website and mobile application. We are planning to improve TTNET's recommendation features. After this project is completed, TTNET Müzik will have better suggestions to its users. In this way, both the company and users are affected positively. TTNET Müzik will be more successful at urging people to buy and download songs legally. We will also benefit from learning many new algorithms and platforms. The data of our project are retrieved from an existing commercial product. We aim to develop the recommendation algorithm of this product. The main steps we will follow are literature survey, documentation, algorithm design, algorithm improvement, testing and deployment. After the project is completed, we want to turn it into an academic work if we can come a long way and bring a new perspective to this field. 3. Draft Project Plan Our aim in this project is to improve the existing recommendation system in such a way that the major problems discussed in the previous section will almost be overcome. The existing system is at the moment TTNET Müzik application but it will become definite after the meeting with the advisors. In order to keep track of contemporary models and software tools, we will be using Neo4j, NoSQL, and Mahout. Neo4j is an open-source graph database implemented in Java. Its community edition is licensed under GNU General Public License. It is much faster than relational databases. NoSQL provides is a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases. We will use it because it Page 5 of 10
6 is highly optimized for big data and real-time web applications. Apache Mahout is a tool for collaborative filtering, classification and clustering. It consists of machine learning algorithms. The suggested model for this project is MapReduce. It is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. We will also study Google s mapreduce algorithm, other learning algorithms such as SVM and as a starting point we will examine the neighborhood based models and hybrid models. Later on, we will study graph algorithms and databases which have evolved recently. Finally, we are aiming to make contributions to the problem of higher dimensions due to context. The end product won t be a standalone product. TTNET already uses a recommendation system, but we will develop it because the current algorithm is not sufficient. After we complete the project, there will not be a major user interface oriented difference in TTNET Müzik application but the users will receive more realistic and reliable recommendations. The distribution of the major tasks among project members is shown in the table below: Task Literature Survey & Documentation Database Design UI Interactions Algorithm Design Testing and Deployment Error Estimation Members Duygu Kabakcı, Işınsu Katırcıoğlu, Sıla Kaya, Koray Kocakaya Duygu Kabakcı, Işınsu Katırcıoğlu Sıla Kaya, Koray Kocakaya Duygu Kabakcı, Işınsu Katırcıoğlu, Sıla Kaya, Koray Kocakaya Duygu Kabakcı, Işınsu Katırcıoğlu, Sıla Kaya, Koray Kocakaya Duygu Kabakcı, Işınsu Katırcıoğlu, Sıla Kaya, Koray Kocakaya The diagrams below are all designed according to the putative TTNET Müzik dataset and the intended recommendation system. TTNET Müzik provides users to play songs and to apply basic functionalities on these songs (Figure 1). Users can rate an item and like an item to create their own profile sufficiently so that she/he can get accurate recommendations. Users can also search items by keywords such as album title, singer or released year. If users want to purchase an item, he/she can download it legally on the system. Moreover, TTNET Müzik gives an opportunity to tag items. Users can generate lists by gathering songs together which provide significant data to our algorithm for classifying items and estimating similar songs. These music lists can also be viewed by other users to discover new songs. All these transactions give insight about user's taste of music so that the system makes more reliable recommendations. On the other hand, operators can add or remove items (songs or albums) to and from the system. They can also classify items according to their contents before uploading them to the system. Another important operator Page 6 of 10
7 mission is collecting data from user activities for recommendation algorithm. They gather all the data from user transactions such as rating, liking, purchasing an item so on. Operators make suggestions to users based on the results of our estimation system. Figure 1: Use Case Diagram The UI package of the product that our recommendation system will be integrated to already exists and we will not be dealing with UI design unless we encounter some incompatibility. But we will be dealing with UI and Business Logic interactions. The View module must be consistent with PC, Smart Phone and Tablet PC. The MVC architecture is roughly shown in Figure 2. The Business Logic package mainly consists of our algorithms. Since we will be working on big data, graph algorithm component is necessary. It provides faster computation. Machine learning algorithm component allows us to learn the behavior of user and train the user data to construct a rule. This component might include Bayesian statistics, SVM, decision trees or nearest neighbor approach. Accuracy estimation algorithm component allows to derive the degree of belief in our method. Mahout library component provides clustering, classification and pattern mining Page 7 of 10
8 algorithms. All of these algorithms can deal with big data that reside on our NoSQL database in the database package. In case that there are previously constructed relational databases, we will convert them into non-relational databases. Because they can express queries as traversals and support semi-structured information as well. We will use Neo4j as a NoSQL database and we are planning to implement our algorithm packages in Java unless otherwise specified. Figure 2: Component Diagram We can use TTNET Müzik application via PC, Tablet PC and Smart Phone (Android and IOS). Our final algorithm will be compatible with these three hardware components (Figure 3). The PC product must be compatible with Unix/Windows operating systems. We are going to use map-reduce as a programming model, Neo4j as a graph database, Mahout to produce free implementations and NoSQL to store big data. Our large scale databases will be located in the server. Due to the big data size, we will be provided a computer cluster to which other hardware components belong and interact. Moreover, we are going to implement our project with Java unless otherwise specified. Page 8 of 10
9 Figure 3: Deployment Diagram 4. Support In this project, we will be supported by Prof. Dr. İsmail Hakkı Toroslu, Dr. Güven Fidan and AGMLab. Prof. Dr. İsmail Hakkı Toroslu is a faculty member at the Department of Computer Engineering, METU. He has acknowledged expertise in Databases, Algorithms, Data Mining, Web Mining, Graph Algorithms, Logic Programming and Query Processing Algorithms. He is the leading advisor in this project and we will get academic support from him. Contact Info: toroslu@ceng.metu.edu.tr Dr. Güven Fidan is the Co-Founder of AGMLab. He and his colleagues in R&D team will provide us the big data that we will work on. AGMLab team will also indicate the requirements for developing the recommendation system for the given data. Contact Info: guven.fidan@agmlab.com Address: ODTÜ Teknokent, Silikon Binası Kat: 1 No: ODTÜ Ankara / TÜRKIYE Tel: All rights of the end product are reserved. We will make contributions to an existing recommendation system and further regulations about the intellectual property rights of the endproduct will be mentioned in SRS report. Page 9 of 10
10 5. References [1] Herlocker, J., Konstan, J. A., & Riedl, J. (2002). An Empirical Analysis of Design Choices in Neighborhood-Based Collaborative Filtering Algorithms. Information Retrieval, 5(4), doi: /a: [2] Herlocker, J. L., Konstan, J. A., Terveen, L. G., & Riedl, J. T. (2004). Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems, 22(1), doi: / [3] Jafarkarimi, H., Sim, A. T. H., & Saadatdoost, R. (2012). A Naïve Recommendation Model for Large Databases. International Journal of Information and Education Technology, 2(2), doi: /ijiet.2012.v2.113 [4] Linden, G., Smith, B., & York, J. (2003). Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing, 7(1), doi: /mic [5] Pandora. (2005). About The Music Genome Project. Retrieved October 8, 2013, from [6] Rubens, N., Kaplan, D., & Sugiyama, M. (2011). Active Learning in Recommender Systems. In P. B. Kantor, F. Ricci, L. Rokach & B. Shapira (Eds.), Recommender Systems Handbook (pp ). New York, USA: Springer. [7] Sarwar, B. M., Karypis, G., Konstan, J., & Riedl, J. (2002). Recommender Systems for Large-scale E-Commerce: Scalable Neighborhood Formation Using Clustering. Paper presented at the 5th International Conference on Computer and Information Technology (ICCIT), December Retrieved from [8] Töcher, A., Jahrer, M., & Legenstein, R. (2008). Improved Neighborhood-Based Algorithms for Large-Scale Recommender Systems. Paper presented at the 2nd Netflix-KDD Workshop, 24 August Retrieved from Page 10 of 10
The Need for Training in Big Data: Experiences and Case Studies
The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor
More informationRecommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek
Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...
More informationRecommendation Tool Using Collaborative Filtering
Recommendation Tool Using Collaborative Filtering Aditya Mandhare 1, Soniya Nemade 2, M.Kiruthika 3 Student, Computer Engineering Department, FCRIT, Vashi, India 1 Student, Computer Engineering Department,
More informationA NOVEL RESEARCH PAPER RECOMMENDATION SYSTEM
International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 7, Issue 1, Jan-Feb 2016, pp. 07-16, Article ID: IJARET_07_01_002 Available online at http://www.iaeme.com/ijaret/issues.asp?jtype=ijaret&vtype=7&itype=1
More informationKeywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics
More information4, 2, 2014 ISSN: 2277 128X
Volume 4, Issue 2, February 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Recommendation
More informationIntelligent Web Techniques Web Personalization
Intelligent Web Techniques Web Personalization Ling Tong Kiong (3089634) Intelligent Web Systems Assignment 1 RMIT University S3089634@student.rmit.edu.au ABSTRACT Web personalization is one of the most
More informationA Collaborative Filtering Recommendation Algorithm Based On User Clustering And Item Clustering
A Collaborative Filtering Recommendation Algorithm Based On User Clustering And Item Clustering GRADUATE PROJECT TECHNICAL REPORT Submitted to the Faculty of The School of Engineering & Computing Sciences
More informationBUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE
BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE Alex Lin Senior Architect Intelligent Mining alin@intelligentmining.com Outline Predictive modeling methodology k-nearest Neighbor
More informationBig Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
More informationChallenges and Opportunities in Data Mining: Personalization
Challenges and Opportunities in Data Mining: Big Data, Predictive User Modeling, and Personalization Bamshad Mobasher School of Computing DePaul University, April 20, 2012 Google Trends: Data Mining vs.
More informationCollaborative Filtering. Radek Pelánek
Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains
More informationA Web Recommender System for Recommending, Predicting and Personalizing Music Playlists
A Web Recommender System for Recommending, Predicting and Personalizing Music Playlists Zeina Chedrawy 1, Syed Sibte Raza Abidi 1 1 Faculty of Computer Science, Dalhousie University, Halifax, Canada {chedrawy,
More informationSemantically Enhanced Web Personalization Approaches and Techniques
Semantically Enhanced Web Personalization Approaches and Techniques Dario Vuljani, Lidia Rovan, Mirta Baranovi Faculty of Electrical Engineering and Computing, University of Zagreb Unska 3, HR-10000 Zagreb,
More informationContact Recommendations from Aggegrated On-Line Activity
Contact Recommendations from Aggegrated On-Line Activity Abigail Gertner, Justin Richer, and Thomas Bartee The MITRE Corporation 202 Burlington Road, Bedford, MA 01730 {gertner,jricher,tbartee}@mitre.org
More informationBIG DATA TOOLS. Top 10 open source technologies for Big Data
BIG DATA TOOLS Top 10 open source technologies for Big Data We are in an ever expanding marketplace!!! With shorter product lifecycles, evolving customer behavior and an economy that travels at the speed
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationVolume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image
More informationRole of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,
More informationIMAV: An Intelligent Multi-Agent Model Based on Cloud Computing for Resource Virtualization
2011 International Conference on Information and Electronics Engineering IPCSIT vol.6 (2011) (2011) IACSIT Press, Singapore IMAV: An Intelligent Multi-Agent Model Based on Cloud Computing for Resource
More informationApplication and practice of parallel cloud computing in ISP. Guangzhou Institute of China Telecom Zhilan Huang 2011-10
Application and practice of parallel cloud computing in ISP Guangzhou Institute of China Telecom Zhilan Huang 2011-10 Outline Mass data management problem Applications of parallel cloud computing in ISPs
More informationCollaborative Filteration for user behavior Prediction using Big Data Hadoop
Collaborative Filteration for user behavior Prediction using Big Data Hadoop Jayshri Manohar Somwanshi, Prof. Y. B. Gurav Department of Computer Engineering, P.V.P.I.T Bavdhan, Pune, Maharashtra, India
More informationWeb analytics: Data Collected via the Internet
Database Marketing Fall 2016 Web analytics (incl real-time data) Collaborative filtering Facebook advertising Mobile marketing Slide set 8 1 Web analytics: Data Collected via the Internet Customers can
More informationEnsemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008
Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles
More informationMobile Storage and Search Engine of Information Oriented to Food Cloud
Advance Journal of Food Science and Technology 5(10): 1331-1336, 2013 ISSN: 2042-4868; e-issn: 2042-4876 Maxwell Scientific Organization, 2013 Submitted: May 29, 2013 Accepted: July 04, 2013 Published:
More informationA Near Real-Time Personalization for ecommerce Platform Amit Rustagi arustagi@ebay.com
A Near Real-Time Personalization for ecommerce Platform Amit Rustagi arustagi@ebay.com Abstract. In today's competitive environment, you only have a few seconds to help site visitors understand that you
More informationOPINION MINING IN PRODUCT REVIEW SYSTEM USING BIG DATA TECHNOLOGY HADOOP
OPINION MINING IN PRODUCT REVIEW SYSTEM USING BIG DATA TECHNOLOGY HADOOP 1 KALYANKUMAR B WADDAR, 2 K SRINIVASA 1 P G Student, S.I.T Tumkur, 2 Assistant Professor S.I.T Tumkur Abstract- Product Review System
More informationText Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
More informationUtility of Distrust in Online Recommender Systems
Utility of in Online Recommender Systems Capstone Project Report Uma Nalluri Computing & Software Systems Institute of Technology Univ. of Washington, Tacoma unalluri@u.washington.edu Committee: nkur Teredesai
More informationComposite Data Virtualization Composite Data Virtualization And NOSQL Data Stores
Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores Composite Software October 2010 TABLE OF CONTENTS INTRODUCTION... 3 BUSINESS AND IT DRIVERS... 4 NOSQL DATA STORES LANDSCAPE...
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationAchieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services
Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services Ms. M. Subha #1, Mr. K. Saravanan *2 # Student, * Assistant Professor Department of Computer Science and Engineering Regional
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationSECURE BACKUP SYSTEM DESKTOP AND MOBILE-PHONE SECURE BACKUP SYSTEM HOSTED ON A STORAGE CLOUD
SECURE BACKUP SYSTEM DESKTOP AND MOBILE-PHONE SECURE BACKUP SYSTEM HOSTED ON A STORAGE CLOUD The Project Team AGENDA Introduction to cloud storage. Traditional backup solutions problems. Objectives of
More informationA THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING
A THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING Ahmet Selman BOZKIR Hacettepe University Computer Engineering Department, Ankara, Turkey selman@cs.hacettepe.edu.tr Ebru Akcapinar
More informationWhere is... How do I get to...
Big Data, Fast Data, Spatial Data Making Sense of Location Data in a Smart City Hans Viehmann Product Manager EMEA ORACLE Corporation August 19, 2015 Copyright 2014, Oracle and/or its affiliates. All rights
More informationAutomated Collaborative Filtering Applications for Online Recruitment Services
Automated Collaborative Filtering Applications for Online Recruitment Services Rachael Rafter, Keith Bradley, Barry Smyth Smart Media Institute, Department of Computer Science, University College Dublin,
More informationSURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
More informationImplementing Graph Pattern Mining for Big Data in the Cloud
Implementing Graph Pattern Mining for Big Data in the Cloud Chandana Ojah M.Tech in Computer Science & Engineering Department of Computer Science & Engineering, PES College of Engineering, Mandya Ojah.chandana@gmail.com
More informationIT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users
1 IT and CRM A basic CRM model Data source & gathering Database Data warehouse Information delivery Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data
More informationUser Data Analytics and Recommender System for Discovery Engine
User Data Analytics and Recommender System for Discovery Engine Yu Wang Master of Science Thesis Stockholm, Sweden 2013 TRITA- ICT- EX- 2013: 88 User Data Analytics and Recommender System for Discovery
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationBig Data Analytics. Prof. Dr. Lars Schmidt-Thieme
Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,
More informationJournal of Chemical and Pharmaceutical Research, 2015, 7(3):1388-1392. Research Article. E-commerce recommendation system on cloud computing
Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2015, 7(3):1388-1392 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 E-commerce recommendation system on cloud computing
More informationINTRODUCTION TO CASSANDRA
INTRODUCTION TO CASSANDRA This ebook provides a high level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open
More informationMonitoring Web Browsing Habits of User Using Web Log Analysis and Role-Based Web Accessing Control. Phudinan Singkhamfu, Parinya Suwanasrikham
Monitoring Web Browsing Habits of User Using Web Log Analysis and Role-Based Web Accessing Control Phudinan Singkhamfu, Parinya Suwanasrikham Chiang Mai University, Thailand 0659 The Asian Conference on
More informationTowards Effective Recommendation of Social Data across Social Networking Sites
Towards Effective Recommendation of Social Data across Social Networking Sites Yuan Wang 1,JieZhang 2, and Julita Vassileva 1 1 Department of Computer Science, University of Saskatchewan, Canada {yuw193,jiv}@cs.usask.ca
More informationBusiness Challenges and Research Directions of Management Analytics in the Big Data Era
Business Challenges and Research Directions of Management Analytics in the Big Data Era Abstract Big data analytics have been embraced as a disruptive technology that will reshape business intelligence,
More informationHow To Understand And Understand The Theory Of Computational Finance
This course consists of three separate modules. Coordinator: Omiros Papaspiliopoulos Module I: Machine Learning in Finance Lecturer: Argimiro Arratia, Universitat Politecnica de Catalunya and BGSE Overview
More informationarxiv:1506.04135v1 [cs.ir] 12 Jun 2015
Reducing offline evaluation bias of collaborative filtering algorithms Arnaud de Myttenaere 1,2, Boris Golden 1, Bénédicte Le Grand 3 & Fabrice Rossi 2 arxiv:1506.04135v1 [cs.ir] 12 Jun 2015 1 - Viadeo
More informationMammoth Scale Machine Learning!
Mammoth Scale Machine Learning! Speaker: Robin Anil, Apache Mahout PMC Member! OSCON"10! Portland, OR! July 2010! Quick Show of Hands!# Are you fascinated about ML?!# Have you used ML?!# Do you have Gigabytes
More informationNetworks Using Data Mining
Recommending People in Social Networks Using Data Mining Slah A. Alsaleh MIT. Hons. (Griffith University, Australia), BEd. (CS) (King Saud University, Saudi Araibia) Thesis submitted in fulfilment of the
More informationSunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
More informationSanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a
More informationNoSQL and Hadoop Technologies On Oracle Cloud
NoSQL and Hadoop Technologies On Oracle Cloud Vatika Sharma 1, Meenu Dave 2 1 M.Tech. Scholar, Department of CSE, Jagan Nath University, Jaipur, India 2 Assistant Professor, Department of CSE, Jagan Nath
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationIPTV Recommender Systems. Paolo Cremonesi
IPTV Recommender Systems Paolo Cremonesi Agenda 2 IPTV architecture Recommender algorithms Evaluation of different algorithms Multi-model systems Valentino Rossi 3 IPTV architecture 4 Live TV Set-top-box
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationBig Data. Fast Forward. Putting data to productive use
Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize
More informationMobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationPersonalization of Web Search With Protected Privacy
Personalization of Web Search With Protected Privacy S.S DIVYA, R.RUBINI,P.EZHIL Final year, Information Technology,KarpagaVinayaga College Engineering and Technology, Kanchipuram [D.t] Final year, Information
More informationDistributed Computing and Big Data: Hadoop and MapReduce
Distributed Computing and Big Data: Hadoop and MapReduce Bill Keenan, Director Terry Heinze, Architect Thomson Reuters Research & Development Agenda R&D Overview Hadoop and MapReduce Overview Use Case:
More informationInternational Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology Email: editor@ijermt.org November-2015 Volume 2, Issue-6 www.ijermt.org Modeling Big Data Characteristics for Discovering
More informationOutline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
More informationIntroduction to Recommender Systems Handbook
Chapter 1 Introduction to Recommender Systems Handbook Francesco Ricci, Lior Rokach and Bracha Shapira Abstract Recommender Systems (RSs) are software tools and techniques providing suggestions for items
More informationMining Signatures in Healthcare Data Based on Event Sequences and its Applications
Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1
More informationData Mining for Fun and Profit
Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationUnderstanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationKnowledge Pump: Community-centered Collaborative Filtering
Knowledge Pump: Community-centered Collaborative Filtering Natalie Glance, Damián Arregui and Manfred Dardenne Xerox Research Centre Europe, Grenoble Laboratory October 7, 1997 Abstract This article proposes
More informationImportance of Online Product Reviews from a Consumer s Perspective
Advances in Economics and Business 1(1): 1-5, 2013 DOI: 10.13189/aeb.2013.010101 http://www.hrpub.org Importance of Online Product Reviews from a Consumer s Perspective Georg Lackermair 1,2, Daniel Kailer
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationUser Behavior Analysis Based On Predictive Recommendation System for E-Learning Portal
Abstract ISSN: 2348 9510 User Behavior Analysis Based On Predictive Recommendation System for E-Learning Portal Toshi Sharma Department of CSE Truba College of Engineering & Technology Indore, India toshishm.25@gmail.com
More informationVisualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR ccfong@umac.mo Abstract An e-government
More informationUsing LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.
White Paper Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. Using LSI for Implementing Document Management Systems By Mike Harrison, Director,
More informationMonitis Project Proposals for AUA. September 2014, Yerevan, Armenia
Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop
More informationAutomatic measurement of Social Media Use
Automatic measurement of Social Media Use Iwan Timmer University of Twente P.O. Box 217, 7500AE Enschede The Netherlands i.r.timmer@student.utwente.nl ABSTRACT Today Social Media is not only used for personal
More informationAlexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
More informationRECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE
RECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE Reena Pagare and Anita Shinde Department of Computer Engineering, Pune University M. I. T. College Of Engineering Pune India ABSTRACT Many clients
More informationIntinno: A Web Integrated Digital Library and Learning Content Management System
Intinno: A Web Integrated Digital Library and Learning Content Management System Synopsis of the Thesis to be submitted in Partial Fulfillment of the Requirements for the Award of the Degree of Master
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
More informationA STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
More informationA Survey on Web Research for Data Mining
A Survey on Web Research for Data Mining Gaurav Saini 1 gauravhpror@gmail.com 1 Abstract Web mining is the application of data mining techniques to extract knowledge from web data, including web documents,
More informationBIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
More informationHow To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
More informationOnline Content Optimization Using Hadoop. Jyoti Ahuja Dec 20 2011
Online Content Optimization Using Hadoop Jyoti Ahuja Dec 20 2011 What do we do? Deliver right CONTENT to the right USER at the right TIME o Effectively and pro-actively learn from user interactions with
More informationThe big data revolution
The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing
More informationA Practical Approach to Process Streaming Data using Graph Database
A Practical Approach to Process Streaming Data using Graph Database Mukul Sharma Research Scholar Department of Computer Science & Engineering SBCET, Jaipur, Rajasthan, India ABSTRACT In today s information
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationAnalysis of Cloud Solutions for Asset Management
ICT Innovations 2010 Web Proceedings ISSN 1857-7288 345 Analysis of Cloud Solutions for Asset Management Goran Kolevski, Marjan Gusev Institute of Informatics, Faculty of Natural Sciences and Mathematics,
More informationHow In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
More informationA Social Network-Based Recommender System (SNRS)
A Social Network-Based Recommender System (SNRS) Jianming He and Wesley W. Chu Computer Science Department University of California, Los Angeles, CA 90095 jmhek@cs.ucla.edu, wwc@cs.ucla.edu Abstract. Social
More informationContent-Based Recommendation
Content-Based Recommendation Content-based? Item descriptions to identify items that are of particular interest to the user Example Example Comparing with Noncontent based Items User-based CF Searches
More informationGigaSpaces Real-Time Analytics for Big Data
GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and
More informationQDquaderni. UP-DRES User Profiling for a Dynamic REcommendation System E. Messina, D. Toscani, F. Archetti. university of milano bicocca
A01 084/01 university of milano bicocca QDquaderni department of informatics, systems and communication UP-DRES User Profiling for a Dynamic REcommendation System E. Messina, D. Toscani, F. Archetti research
More informationISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS
CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant
More informationInternational Journal of Engineering Research-Online A Peer Reviewed International Journal Articles available online http://www.ijoer.
REVIEW ARTICLE ISSN: 2321-7758 UPS EFFICIENT SEARCH ENGINE BASED ON WEB-SNIPPET HIERARCHICAL CLUSTERING MS.MANISHA DESHMUKH, PROF. UMESH KULKARNI Department of Computer Engineering, ARMIET, Department
More information