HCRS: A hybrid clothes recommender system based on user ratings and product features

Similar documents
Colour Image Segmentation Technique for Screen Printing

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER

Image Classification for Dogs and Cats

Local features and matching. Image classification & object localization

Recommending News Articles using Cosine Similarity Function Rajendra LVN 1, Qing Wang 2 and John Dilip Raj 1

A Genetic Algorithm-Evolved 3D Point Cloud Descriptor

Data Mining Project Report. Document Clustering. Meryem Uzun-Per

BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE

CONTENT-BASED IMAGE RETRIEVAL FOR ASSET MANAGEMENT BASED ON WEIGHTED FEATURE AND K-MEANS CLUSTERING

Predict the Popularity of YouTube Videos Using Early View Data

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

How To Filter Spam Image From A Picture By Color Or Color

Recommender Systems for Large-scale E-Commerce: Scalable Neighborhood Formation Using Clustering

Course Syllabus For Operations Management. Management Information Systems

Environmental Remote Sensing GEOG 2021

The Delicate Art of Flower Classification

Visualization and Feature Extraction, FLOW Spring School 2016 Prof. Dr. Tino Weinkauf. Flow Visualization. Image-Based Methods (integration-based)

Image Segmentation and Registration

Canny Edge Detection

Utility of Distrust in Online Recommender Systems

ENHANCED WEB IMAGE RE-RANKING USING SEMANTIC SIGNATURES

Android Ros Application

Annotated bibliographies for presentations in MUMT 611, Winter 2006

Simultaneous Gamma Correction and Registration in the Frequency Domain

Machine Learning using MapReduce

Image Normalization for Illumination Compensation in Facial Images

Saving Mobile Battery Over Cloud Using Image Processing

Automated Collaborative Filtering Applications for Online Recruitment Services

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Module II: Multimedia Data Mining

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Study on Recommender Systems for Business-To-Business Electronic Commerce

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Representation of Electronic Mail Filtering Profiles: A User Study

Recognization of Satellite Images of Large Scale Data Based On Map- Reduce Framework

Tracking and Recognition in Sports Videos

arxiv: v1 [cs.ir] 12 Jun 2015

Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services

Visibility optimization for data visualization: A Survey of Issues and Techniques

Robert Collins CSE598G. More on Mean-shift. R.Collins, CSE, PSU CSE598G Spring 2006

Probabilistic Latent Semantic Analysis (plsa)

The Scientific Data Mining Process

A Web Recommender System for Recommending, Predicting and Personalizing Music Playlists

Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

AN IMPROVED DOUBLE CODING LOCAL BINARY PATTERN ALGORITHM FOR FACE RECOGNITION

Emotion Detection from Speech

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

EHR CURATION FOR MEDICAL MINING

The Effect of Correlation Coefficients on Communities of Recommenders

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Multiscale Object-Based Classification of Satellite Images Merging Multispectral Information with Panchromatic Textural Features

A STUDY REGARDING INTER DOMAIN LINKED DOCUMENTS SIMILARITY AND THEIR CONSEQUENT BOUNCE RATE

Data Mining Yelp Data - Predicting rating stars from review text

Novelty Detection in image recognition using IRF Neural Networks properties

A Learning Based Method for Super-Resolution of Low Resolution Images

Comparison of K-means and Backpropagation Data Mining Algorithms

Personalized advertising services through hybrid recommendation methods: the case of digital interactive television

Overview. Raster Graphics and Color. Overview. Display Hardware. Liquid Crystal Display (LCD) Cathode Ray Tube (CRT)

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

An Overview of Knowledge Discovery Database and Data mining Techniques

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.

Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Analecta Vol. 8, No. 2 ISSN

Classifying Manipulation Primitives from Visual Data

Lecture 2: The SVM classifier

Object Recognition. Selim Aksoy. Bilkent University

Big Data: Rethinking Text Visualization

Big Data with Rough Set Using Map- Reduce

Supervised and unsupervised learning - 1

Modeling and Design of Intelligent Agent System

Content-Boosted Collaborative Filtering for Improved Recommendations

Visualizing Relationships and Connections in Complex Data Using Network Diagrams in SAS Visual Analytics

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation

Open issues and research trends in Content-based Image Retrieval

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Accurate is not always good: How Accuracy Metrics have hurt Recommender Systems

Music Genre Classification

PSG College of Technology, Coimbatore Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

Assessment of Camera Phone Distortion and Implications for Watermarking

Information Retrieval and Web Search Engines

Knowledge Discovery from patents using KMX Text Analytics

A successful market segmentation initiative answers the following critical business questions: * How can we a. Customer Status.

Spazi vettoriali e misure di similaritá

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow

Big Data: Image & Video Analytics

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Collaborative Filtering. Radek Pelánek

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and

Optimization of Image Search from Photo Sharing Websites Using Personal Data

Social Media Mining. Data Mining Essentials

A Hybrid Model of Data Mining and MCDM Methods for Estimating Customer Lifetime Value. Malaysia

Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Intelligent Diagnose System of Wheat Diseases Based on Android Phone

Transcription:

HCRS: A hybrid clothes recommender system based on user ratings and product features Xiaosong Hu, Wen Zhu, Qing Li Sch. of Economic Inf. Eng. Southwestern Univ. of Finance & Econ. Chengdu, China Abstract-Nowadays, online clothes-selling business has become popular and extremely attractive because of its convenience and cheap-and-fine price. Good examples of these successful Web sites include Yintai.com, Vancl.com and Shop.vipshop.com which provide thousands of clothes for online shoppers. The challenge for online shoppers lies on how to find a good product from lots of options. In this article, we propose a collaborative clothes recommender for easy shopping. One of the unique features of this system is the ability to recommend clothes in terms of both user ratings and clothing attributes. Experiments in our simulation environment show that the proposed recommender can better satisfy the needs of users. Keywords-Collaborative filtering; Clothes recommender system; Probabilistic model; Information filtering I. INTRODUCTION The concept of a recommender system was first put forward in the mid 99s []. The basic function of a recommender system is to suggest interesting products to end users in terms of users behavior. With the fast growth of e-commerce, recommenders have been playing a vital role in e-commerce and have attracted great attention from chief-officers in the big e-business companies including Amazon.com, Tmail.com. In factuality, many companies find out that recommenders can not only recommend goods which fit customers, but also take enormous effect in converting browsers into buyers, increasing cross-sell and building loyalty []. The adoption of recommender system in business is widespread. In general, the recommender can be categorized into three categories, i.e., the content-based recommender, the collaborative recommender, and the hybrid recommender. The content-based recommender recommends products that are most similar to what the user like most in terms of the inner attributes of the products. It derived from information retrieval with a specific focus on long-term information filtering. A good example is the PicSOM which recommends similar images based on the color, texture, and shape of images [3]. For the collaborative recommender system [5], it usually recommends products considering the opinions of others. It is a sort of word-of-mouth advertisement. Specifically, it first finds friends of a target user who share similar interest on the same products based on the historical information. And then, it predicts the preference of the target user on a certain product based on the opinions of his/her friends [4]. It can be further categorized into memory-based recommender and model-based recommender [6]. The hybrid system takes both forms of the contentbased system and the collaborative system [6]. It recommends products based on both the product attributes and user ratings. A good example of such hybrid recommender is the music recommender system developed by Li et al. [7], which suggests music in terms of user ratings and audio features. In this article, we propose a hybrid recommender system for easy clothes shopping. One of the unique features of this system is the ability to recommend clothes in terms of both user ratings and clothing attributes. Experiments in our simulation environment show that the proposed recommender can better satisfy the needs of users. II. SYSTEM DESIGN Figure is the outline of our proposed hybrid clothes recommendation system, dubbed HCRS, which takes the utilization of both user ratings and product features. In particular, it first applies the human detection techniques to detect the clothes area in an image. Second, it analyzes the clothes area and calculates the percentage of each color in this area. Third, it extends the item-rating matrix with grouprating matrix to accommodate product features for recommendation. Fourth, similar products of the target product to be recommended are selected based on such extended rating matrix. At last, the recommended products are determined by the ratings earned by the similar products.

Integrator Rating Matrix Clothes ID Tom Lily Bill Clothes Clothes Clothes 3 Preprocessor User Data Clustering Engine Clothes Features Clothes ID Clothes Clothes Clothes 3 Item Clusters Extended Rating Matrix Tom Lily Bill Recommendations Group Group Cluster-based Collaborative Filter Algorithm : Adjusted K-means Clustering Input : the number of clusters k and item attribute features. Output : a set of k clusters that minimizes of the squared-error criterion, and the probability of each item belonging to each cluster center. () Choose k objects as the initial cluster centers; () Repeat (a) and (b) until small change; (a) (Re)assign each object to the cluster to which the object is the most similar, based on the mean value of the objects in the cluster; (b) Update the cluster means, i.e., calculate the mean value of the objects for each cluster; (3) Compute the possibility between each object and each cluster center. Main Color Extraction Human Profile Human Detection Color Images Figure. Outline of the HCRS system. III. Figure. Outline of the HCRS system. RATING MATRIX CONSTRUCTION A. Group-Rating Matrix Construction User ratings generally are not comprehensive enough. To handle this challenge, we utilize the product features as a complement to capture the inner connections of products. More specifically, we first group clothes into clusters based on the product features, each clothes belongs one or more product groups. Such memberships are utilized as groupratings to construct an extended rating matrix. Essentially, these group-ratings provide abstract summarization of the primitive low-level features, revealing the high-level semantic information. The traditional K-means clustering [8] is a hard clustering technique, in which each item only can be assigned into one cluster. In this study, we extend it with fuzzy set theory to support soft clustering, in which each item belongs to one or more groups, for our recommendation purpose. The details of this adjusted K-means algorithm is shown in Figure. In particular, the probability of one product assigned to a certain clique k is defined as: where Pro(j,k) is the possibility of object j belonging to cluster k; CS(j,k) is the counter-similarity between the object j and the cluster k, which is calculated based on the Clothes ID Tom Lily Jack Bill Group Group Clothes 3.357.89 Clothes..45 Clothes 3 3 3..789 Clothes 4.8.43 Clothes 5.5.53 Clothes 6 3.367.3 Figure 3. Extend item-rating matrix. Euclidean distance; and MaxCS(i,k) is the maximum counter-similarity between an object and cluster k. B. Extend Item-Rating Matrix With the group-ratings, we extend the traditional itemrating matrix for recommending products. The group-ratings are attached to the item-ratings as pseudo-ratings as shown in Figure 3. C. Collaborative Recommendations To determine whether a product should be recommended or not, HCRS takes collaborative filtering techniques to predict the probability of recommendation. In particular, the prediction for recommending a product is calculated by performing a weighted average of deviations from the neighbors mean as where represents the prediction for user i of item k, n denotes the top N nearest neighbors of item k, and represent the average ratings of item k and u, respectively.

Input image Normalize gamma & color Compute gradients Weighted Vote into spatial & orientation cells Contrast normalize over overlapping spatial blocks Collect HOG's over detection window Linear SVM Person / nonperson classificatoin Figure 4. An overview of the human detection chain IV. FEATURE EXTRACTION A user s preference of clothes color is a key factor for his or her shopping decision-making [9]. In this section, we describe the approach to extract physical features from clothes images to provide additional information for clothes recommendation. Specifically, we first use histograms of oriented gradients [] to detect the profile of human being in a cloth image, and then analyze dominant colors from the extracted area. A. Human Detection In this section, we adopt the method proposed by Navneet Dalal and Bill Triggs [] to detect the profile of fashion models in clothes image (Figure 4). Specifically,. Gamma/Color Normalization: First, we normalize an image for later feature detection and extraction. However, gamma coding was put forward to compensate for the properties of human vision, hence to maximize the use of the bits or bandwidth relative to how humans perceive light and color. And it is widely used in study of photography. According to the work of Navneet Dalal and Bill Triggs [], RGB color space with no gamma correction displays superior performance comparing to the normalizations with grayscale, RGB and LAB. Therefor, we use RGB color method in this work.. Gradient Computation: An image gradient is a directional change in the intensity or color in the image which is widely used to extract information from images. In this part, we use [-,, ] gradient filter with no smoothing to calculate gradient for each color channel in the color image, and take the one with the largest norm as the pixel s gradient vector to enhance the sensitivity. 3. Spatial / Orientation Binning: After step we found that gradient strengths vary over a wide range owing to local variations, so effective local contrast normalization turns out to be essential for good performance. In the experiment, we take 8 8 pixel as a cell. And for each cell, we apply the method of linear gradient voting into 9 orientation bins in 8 to get cells. 4. Normalization and descriptor blocks: To obtain useful information from larger area, we group four 8 8 pixel cells into 6 6 pixel blocks and get blocks with block spacing stride of 8 pixels (hence 4-fold coverage of each cell). And then we use L-Hys to normalize blocks. 5. Detector Window and Context: we use 64 8 pixel to be a detector window to represent a part of an image. 6. SVM is one of the most popular algorithms for pattern recognition. As we mention before, in order to detect human being, we link SVM with the human-image and background-image to training a model for human detection. B. Main Color Extraction According to biology, human being cannot distinguish more than 5 colors. HSV which is short for hue, saturation, and value is formally described by Alvy Ray Smith []. It can be converted from RGB (red, green and blue) to represent a color with an array consisted of 3 numbers. And it is more intuitive and perceptually relevant than the Cartesian representation. Therefore, we can use different combination to represent different colors that can be distinguished by human being. However, in the theory of HSV, hue has a greater influence on color detecting than saturation and value. So we use Equation 3 and equation in Figure 5 to generate the number that can present a color. (3) where L stands for the number representing a set of color which can not be distinguished by human. And H, S and V stand for the result of the function in Figure 5. HSV H S V 3 4 5 6 7 Figure 5. Way of caculating the color

.86.94 Combination Approach Classic Pearson.85.9.84.9 MAE.83 MAE.88.86.8.84.8 3 4 5 6 No. of Cluster.8 4 6 8 No. of Neighbors Figure 6. Clustering Figure 7. Comparison of our approach and classic Pearson method C. Vector for Clothes In this section, we calculate L for each of the pixels in the picture. And then, we use Equation 4 to calculate the percentage for each color (represented by L in Equation 3) in each picture. (4) stands for the percentage of the kind of color in the image where stands for the number of pixels belonging to this kind of color and stands for the total amount of the pixels in the image. V. EXPERIMENT EVALUATIONS To gauge the performance of our approach, we carried out a series of experiments. A. Data Set & Evaluation Metric We performed experiments on the data set generated by 63 users who usually purchase their new clothes on e- commercial Web sites. Every user is supported to give a mark on 5 clothes of different styles. However, after wiping out the blank record, we achieve a total of 783 rating record. And then we divide the data set into a training set and a test data set. Twenty percent of users are randomly selected to be the test users and the rest servers as the training set. MAE [4] is used as the metric in our experiments, which is calculated by summing these absolute errors of corresponding rating prediction pairs and then computing the average. A lower MAE indicates greater accuracy. (5) where is the system s rating (prediction) of item k for user i, and is the rating of the user i for item k in the test data. N is the number of rating-prediction pairs between the test data and the prediction result. B. Behaviors of The Proposed Recommender ) Number of clusters: We implement group-rating method described as in part III and found that the number of clusters to a massive extent as show in Figure 6. It can be observed that the number of clusters affects the quality of prediction. And when the number of clusters is about 3, MAE reach the optimal value. ) Methods for computing user-user similarity: As we have observed, the value scale of the group-rating matrix and user-rating is distinct. Hence, we should modify the value of the same scale or separately compute the user-user similarity. In our experiments, we enlarge the value scale of the group-rating matrix from [ ] to [ 5]. And then we use the Pearson correlation-based algorithm to calculate the user-user similarity from the user-rating matrix and adjusted cosine algorithm for the similarity from the group-rating matrix. Finally, the total user-user similarity is then calculated as the linear combination of the above two. And Figure 7 shows that our linear combination method displays better performance than then classic algorithm. 3) Neighborhood size: The size of the neighborhood has a significant effect on the prediction quality []. In our experiment, we vary the number of neighbors and compute the MAE. It can be observed from Figure 7 that the size of the neighbors affect the quality of the prediction. When the number of neighbors is changed from 4 to 6 in our approach, the optimal MAE value is obtained.

VI. CONCLUTION In this article, we propose a hybrid recommender system for easy clothes shopping. One of the unique features of this system is the ability to recommend clothes in terms of both user ratings and clothing attributes. To simplify your challenge, we only extract the color information of clothes to enrich the experience learning of recommender system. Experiments in our simulation environment show that the proposed recommender can better satisfy the needs of users. However, it would be interesting to explore more features of clothes including clothe texture or even fashion style. VII. ACKNOWLEDGMENT This work has been supported by the National Natural Science Foundation of China (NSFC Grant No. 6836, 6733, 693, 983), the Fok Ying-Tong Education Foundation, China (Grant No. 68), the Program for Key Innovation Research Team in Sichuan Province (JTD8). REFERENCES [] Hill, W., Stead, L., Rosenstein, M., & Furnas, G. Recommending and evaluating choices in a virtual community of use. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 94-, 995. [] Schafer, J. B., Konstan, J. A., & Riedl, J. E-commerce recommendation applications. In Applications of Data Mining to Electronic Commerce, pages 5-53,. [3] Laaksonen, J., Koskela, M., Laakso, S., & Oja, E. PicSOM contentbased image retrieval with self-organizing maps. Pattern Recognition Letters, (3): 99-7,. [4] Breese, J. S., Heckerman, D., & Kadie, C. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence, pages 43-5, 998. [5] Billsus, D., & Pazzani, M. J. Learning collaborative information filters. In Proceedings of the fifteenth international conference on machine learning, 54: 48, 998. [6] Kim, B. M., Li, Q., Park, C. S., Kim, S. G., & Kim, J. Y. A new approach for combining content-based and collaborative filters. Journal of Intelligent Information Systems, 7(): 79-9, 6. [7] Li, Q., Myaeng, S. H., & Kim, B. M. A probabilistic music recommender considering user opinions and audio features. Information processing & management, 43(): 473-487, 7. [8] MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, (8-97): 4, 967. [9] Biggam, C. P. New directions in colour studies,. [] Dalal, N., & Triggs, B.. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 5. CVPR 5. IEEE Computer Society Conference, : 886-893, 5. [] Smith, A. R.. Color gamut transform pairs. In ACM Siggraph Computer Graphics, (3): -9, 978. [] Herlocker, J. L., Konstan, J. A., Borchers, A., & Riedl, J. An algorithmic framework for performing collaborative filtering. In Proceedings of the nd annual international ACM SIGIR conference on Research and development in information retrieval, 999: 3-37, 999.