Introduction to Recommendation Systems

Have you ever wondered how, after you log into an e-commerce website and make a purchase, you are shown related items to buy? Maybe you’ve wondered how Facebook recommends “friends of a friend” to you. Or maybe, as in my case, you decided to watch just one movie on Netflix and ended up watching three. It’s all based on what the system “feels” you would be interested in. The word “feels” here describes the ultimate goal of this kind of system, which is called personalization. This technology is referred to as a recommendation system. This article won’t delve into the details of implementing one; instead, it aims to give an introduction to the technology. It is intended for both developers and service providers who want to get into this kind of technology. For developers, we aim to show the techniques and algorithms used in recommendation systems, and for service providers, the usefulness of a recommendation system.

Recommendation system: What is it?

A recommendation system is a personalized information filtering technique that provides a user with information that he or she is, with high probability, interested in. During a live interaction, a recommendation system applies knowledge discovery techniques (in other words, data mining techniques) to the problem of making personalized recommendations. The amount of available information has grown exponentially as technology has advanced, so we need technology to help us navigate to the right information. The output of a recommendation system is a ranked list of recommendations, generated from item features, the user’s previous interactions with items (transactions), and user preferences.

A quick look at the meaning of these terms. Item features: in a recommendation system, items are what we want to recommend to users. Examples of items are books (in the case of Amazon), movies (in the case of Netflix), and groceries (in the case of an online grocery store). Item features are the characteristics of those items; the item features of a movie are its rating, title, and category. Transactions: these are the interactions of a user with items. More often than not, the user-item interaction data is the history of a user’s interactions with items, such as the items the user added to the cart or marked as favourite. User preferences: these are what the user likes. Consider a young teenager who watched 10 movies in one week. The user-item interactions are the list of movies he interacted with; an interaction could be watching the movie, visiting the movie’s link, or rating the movie. Imagine he rated 4 movies as excellent, 4 as average, and the remaining 2 as poor. From these ratings we can determine the boy’s preferences.
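To make these three terms concrete, here is a minimal Python sketch of how they might be represented. The class names, the movie titles, the categories, and the 1 to 5 rating scale are all assumptions made for illustration; a real system would store far richer data. The average rating per category is only a rough stand-in for "user preference".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    """An item together with its item features (hypothetical fields)."""
    title: str
    category: str


@dataclass(frozen=True)
class Interaction:
    """One user-item interaction: a rating from 1 (poor) to 5 (excellent)."""
    user: str
    movie: Movie
    rating: int


def category_preferences(interactions):
    """Average rating per category, used as a rough proxy for preference."""
    totals = defaultdict(lambda: [0, 0])  # category -> [sum of ratings, count]
    for it in interactions:
        totals[it.movie.category][0] += it.rating
        totals[it.movie.category][1] += 1
    return {cat: s / n for cat, (s, n) in totals.items()}


# Hypothetical history for the teenager: 4 excellent, 4 average, 2 poor ratings.
history = (
    [Interaction("teen", Movie(f"Action {i}", "action"), 5) for i in range(4)]
    + [Interaction("teen", Movie(f"Comedy {i}", "comedy"), 3) for i in range(4)]
    + [Interaction("teen", Movie(f"Drama {i}", "drama"), 1) for i in range(2)]
)

preferences = category_preferences(history)
favourite = max(preferences, key=preferences.get)
print(preferences, favourite)
```

In this made-up history the highest average belongs to the category the teenager rated as excellent, which is the kind of signal a recommender would pick up on.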

Why the need for a recommendation system?

Netflix, for example, knew the importance of a good recommendation system and offered a $1 million prize to the team that could increase the accuracy of its system by 10%. They knew that their service could be ditched as soon as a user feels there’s nothing to watch. With their recommendation system, they were able to increase their revenue and, with over 100 million users, they reportedly have about 250 million active profiles. Hence, the need for a recommendation system cannot be overemphasized. While there are quite a lot of benefits to having a recommendation system, I will give 3 important reasons why a service provider needs one.

  1. Increase user satisfaction: A user who continuously uses your system should find the recommendations interesting and relevant. As a service provider, once your recommendation system (RS) combines accurate recommendations with a usable interface, system usage increases and users evaluate your service more favourably.
  2. Increase the number of items sold: Consider an online grocery store. One of its aims is to increase sales and thereby achieve a high return on investment (ROI). A good recommendation system should be able to sell additional items. When a user enters the system to order an item, the recommendation system should suggest similar items or items that are often bought together. This is referred to as basket analysis.
  3. Better understanding of what the user wants: This is an important feature of an RS because it holds a description of the user’s preferences, either collected explicitly or predicted by the system. The business can then use this information for a number of other goals, such as improving the management of item stock.
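
The basket analysis mentioned in point 2 can be sketched very simply by counting which items show up in the same order. The baskets below are made up, and the function names are hypothetical; real basket analysis usually adds measures such as support and confidence on top of raw counts.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one fixed order
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def bought_together(item, baskets, top_n=3):
    """Items most often bought in the same basket as `item`."""
    related = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return [other for other, _ in related.most_common(top_n)]


# Hypothetical order history of an online grocery store.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "cereal"],
]

print(bought_together("bread", baskets))
```

Here butter is suggested first for a customer ordering bread, because the two appear together in two of the four baskets.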

Recommendation Techniques and Algorithms

Recommendation Techniques

Recommendation techniques can be broadly classified into 2 main categories: content-based filtering and collaborative filtering. This article is not meant to go into the details of each recommendation model; it aims to give a high-level understanding of the different models and some algorithms that can be used.

Content-Based Filtering

This type of recommendation system learns to recommend items that are similar to the ones the user liked in the past. The similarity is calculated from the item features of the compared items. For example, suppose a user buys a book from Amazon in the computer science category and then buys another from the artificial intelligence category. The recommended list should rank artificial intelligence books first, followed by computer science books, which shows that the system learns to recommend books based on the user’s taste, or user preference. The same applies to travel services: if for a week my searches on a travel agent’s website are related to beaches, then a good recommendation system should recommend a list of possible beaches based on my other preferences (like location).
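
As a rough illustration of the book example, the sketch below builds a user profile by summing the feature vectors of the books the user bought and then ranks the remaining books by cosine similarity to that profile. The catalog, the titles, and the choice of cosine similarity are assumptions for illustration; in particular, the sketch assumes artificial intelligence books also carry the computer science tag.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse feature vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_profile(liked_titles, catalog):
    """User profile: the sum of the feature vectors of the liked items."""
    profile = {}
    for title in liked_titles:
        for feature, weight in catalog[title].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return profile


def recommend(liked_titles, catalog, top_n=3):
    """Rank items the user has not bought by similarity to the profile."""
    profile = build_profile(liked_titles, catalog)
    candidates = [t for t in catalog if t not in liked_titles]
    scored = [(cosine(profile, catalog[t]), t) for t in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [title for _, title in scored[:top_n]]


# Hypothetical catalog: each book maps to its item features (categories).
catalog = {
    "Operating Systems Basics": {"computer science": 1.0},
    "Neural Networks Primer": {"computer science": 1.0, "artificial intelligence": 1.0},
    "Deep Learning Handbook": {"computer science": 1.0, "artificial intelligence": 1.0},
    "Compilers in Practice": {"computer science": 1.0},
    "French Cooking": {"cooking": 1.0},
}
liked = ["Operating Systems Basics", "Neural Networks Primer"]

print(recommend(liked, catalog))
```

With these made-up books, the artificial intelligence title is ranked first, the plain computer science title second, and the cookbook last, which mirrors the ordering described above.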

Content-based filtering exploits the content of the data items to predict their relevance based on the user’s profile. One advantage of this approach is that it is user-independent: it relies solely on the ratings provided by the active user to build that user’s own profile. Another advantage is that it is capable of recommending new items that haven’t been rated by any user yet. But it has its own limitation too, which is over-specialization. This means that if a user only rates books written by Stephen King, a content-based system keeps recommending that same kind of book. So there is little novelty in the recommendations.

Examples of algorithms used in content-based recommendation systems are probabilistic methods such as Naive Bayes. This kind of approach generates a probabilistic model based on previously observed data. Other methods employed are decision trees, nearest-neighbour algorithms, and relevance feedback with Rocchio’s algorithm. The idea of Rocchio’s algorithm is to let users rate the items suggested by the recommendation system with respect to their information need, and to use that feedback to refine the user’s profile.
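
To give a feel for relevance feedback, here is a small sketch of a Rocchio-style profile update: the profile is moved towards the average of the items the user rated as relevant and away from the average of the items rated as not relevant. The feature names and example vectors are made up, and the weights alpha, beta and gamma are commonly quoted starting values rather than tuned ones. Clipping negative weights to zero is a common convention that this sketch assumes.

```python
def centroid(vectors):
    """Component-wise mean of a list of sparse vectors (dicts)."""
    total = {}
    for vec in vectors:
        for key, value in vec.items():
            total[key] = total.get(key, 0.0) + value
    return {key: value / len(vectors) for key, value in total.items()}


def rocchio_update(profile, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """new = alpha * profile + beta * mean(relevant) - gamma * mean(non_relevant)."""
    pos = centroid(relevant) if relevant else {}
    neg = centroid(non_relevant) if non_relevant else {}
    updated = {}
    for key in set(profile) | set(pos) | set(neg):
        weight = (alpha * profile.get(key, 0.0)
                  + beta * pos.get(key, 0.0)
                  - gamma * neg.get(key, 0.0))
        updated[key] = max(weight, 0.0)  # assumption: clip negative weights to zero
    return updated


# Hypothetical feedback: the user liked one suggestion and disliked another.
profile = {"thriller": 1.0}
liked = [{"thriller": 1.0, "horror": 1.0}]
disliked = [{"romance": 1.0}]

updated = rocchio_update(profile, liked, disliked)
print(sorted(updated.items()))
```

After one round of feedback the profile gains some weight on horror and stays at zero for romance, so the next list of suggestions would shift accordingly.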

Collaborative Filtering

Collaborative filtering methods are based on collecting and analyzing a large amount of information on users’ behaviors, activities or preferences and predicting what users will like based on their similarity to other users (source: Wikipedia). This approach is widely used, and its simplest implementation recommends to the active user the items that other users with similar tastes liked in the past. The similarity in taste between users is calculated from the similarity of their rating histories. This area of RS has gained a lot of attention from the research community, and many methods and algorithms can be used to build a collaborative model.
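
The simplest implementation described above can be sketched as follows: compute how similar the active user's rating history is to every other user's, then score each unseen item by a similarity-weighted average of the ratings those users gave it. The users, movies and ratings are made up, and cosine similarity over the full rating vectors is just one of several possible similarity measures (Pearson correlation is another common choice).

```python
import math


def similarity(a, b):
    """Cosine similarity of two users' rating histories (missing rating = 0)."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(user, item, ratings):
    """Similarity-weighted average of the ratings other users gave `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = similarity(ratings[user], their)
        num += sim * their[item]
        den += sim
    return num / den if den else 0.0


def recommend_for(user, ratings, top_n=3):
    """Items the user has not rated, ordered by predicted rating."""
    seen = set(ratings[user])
    unseen = {item for their in ratings.values() for item in their} - seen
    ranked = sorted(unseen, key=lambda item: (-predict(user, item, ratings), item))
    return ranked[:top_n]


# Hypothetical rating histories on a 1 to 5 scale.
ratings = {
    "ada": {"Alien": 5, "Blade Runner": 4, "Notting Hill": 1},
    "ben": {"Alien": 5, "Blade Runner": 5, "Notting Hill": 1,
            "The Matrix": 5, "Love Actually": 2},
    "cat": {"Alien": 1, "Notting Hill": 5, "The Matrix": 2, "Love Actually": 5},
}

print(recommend_for("ada", ratings))
```

Because ada's history looks much more like ben's than cat's, ben's opinion dominates and the movie he rated highly is recommended first.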

This kind of model overcomes some of the limitations of content-based models. It can recommend items with different content, as long as those items gained the interest of other users. Collaborative filtering methods fall into 2 general groups: model-based and neighbourhood-based methods. In a model-based approach, we learn a predictive model from the user-item interactions and, once trained, use this model to predict users’ ratings for new items. Some algorithms used in a model-based approach are Singular Value Decomposition (SVD), Bayesian clustering, and Boltzmann machines, amongst others. In neighbourhood-based approaches, the user-item ratings stored in the system are used directly to predict ratings for new items.
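
For the model-based case, the sketch below learns a small latent-factor model from known ratings. Note that this is not an exact SVD: it is an SVD-style matrix factorization trained with stochastic gradient descent, which is a common way to handle rating data with missing entries. The ratings, the number of factors, the learning rate and the regularization strength are all assumptions chosen for illustration.

```python
import math
import random


def dot(p, q):
    return sum(x * y for x, y in zip(p, q))


def train_factors(ratings, n_factors=2, epochs=500, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P[u] . Q[i] ~ rating."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # gradient step on the squared error with L2 regularization
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the model on the given ratings."""
    squared = [(r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical (user, item, rating) triples on a 1 to 5 scale.
ratings = [
    ("ada", "Alien", 5), ("ada", "Blade Runner", 4), ("ada", "Notting Hill", 1),
    ("ben", "Alien", 5), ("ben", "Blade Runner", 5), ("ben", "The Matrix", 5),
    ("ben", "Notting Hill", 1),
    ("cat", "Notting Hill", 5), ("cat", "Love Actually", 5), ("cat", "Alien", 1),
    ("dan", "Love Actually", 4), ("dan", "Notting Hill", 5), ("dan", "The Matrix", 2),
]

P, Q = train_factors(ratings)
print("training RMSE:", rmse(ratings, P, Q))
print("predicted rating, ada / The Matrix:", dot(P["ada"], Q["The Matrix"]))
```

Once trained, the model can predict a rating for any user-item pair, including pairs that never appeared in the data, by taking the dot product of the two learned factor vectors. A neighbourhood-based method, by contrast, keeps no such model and works from the stored ratings directly.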

 

There are various domains where an RS can be applied. They include, but are not limited to, entertainment (recommending movies or music), content (personalized newspapers, recommendations for documents or web pages), e-commerce (recommending products for consumers to buy), and services (travel services, houses to rent, or matchmaking services). As a developer, you need to understand the development challenges, limitations, and algorithms suitable for the domain in which you’re applying an RS.

In conclusion, building a personalized information filtering system should be considered when aiming to increase sales, traffic, or user loyalty, or, better still, to win over your competition.