Ranking product reviews at Flipkart

Flipkart is one of the biggest e-commerce platforms in India, with over 80 million products sold across more than 80 categories. The platform also receives more than 200 million users each month, and thousands of purchases are made daily.

Given all of these statistics, the company invests heavily in research and development for making its platform a little better each day for its consumers. In this article, we will be discussing how Flipkart researched and developed its own algorithm to rank product reviews and why they did it.

The importance of product reviews

Product reviews and their ranking (order of display) play a vital role for an e-commerce platform like Flipkart. According to a study, reviews and ratings are the “most important” factor determining customers’ purchase decisions, and reviews can boost sales by giving customers information about a product or service they are considering purchasing.

However, it is not easy for an e-commerce platform to rank and display appropriate product reviews since a single product can receive a variety of negative, positive or even mischievous reviews due to a number of reasons.

For this purpose, Flipkart used to have a fairly straightforward algorithm that ranked reviews based on the following two metrics:

  1. The proportion of upvotes – More preference is given to reviews with a high proportion of upvotes
  2. The recency of reviews – More preference is given to reviews posted recently

As you can guess, this algorithm had many shortcomings.

Reviews with a high proportion of upvotes could be misleading. Some of the most upvoted reviews were vague and did not describe the product well.

The recency of a review, on the other hand, carried only limited value for users. A recent review with many upvotes may still be uninformative and may negatively influence a customer’s purchase decision.

Thus, Flipkart needed a more robust approach to ranking reviews, so that customers would be shown in-depth, valuable and honest reviews.
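
To make the shortcoming concrete, here is a minimal sketch of a ranking built from these two metrics. The article does not say how Flipkart combined them, so the `Review` fields, the `recency_weight` blend, the exponential decay with `half_life_days`, and the sample reviews are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    text: str
    upvotes: int
    downvotes: int
    posted: date


def baseline_score(review: Review, today: date,
                   recency_weight: float = 0.3,
                   half_life_days: float = 30.0) -> float:
    """Blend the upvote proportion with a decaying recency term (assumed form)."""
    votes = review.upvotes + review.downvotes
    upvote_share = review.upvotes / votes if votes else 0.0
    age_days = (today - review.posted).days
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 today, 0.5 after one half-life
    return (1 - recency_weight) * upvote_share + recency_weight * recency


# Made-up reviews: a vague but popular one and a detailed but older one.
today = date(2024, 1, 31)
reviews = [
    Review("Nice", upvotes=90, downvotes=10, posted=date(2024, 1, 31)),
    Review("Battery lasts two days; camera struggles in low light.",
           upvotes=5, downvotes=5, posted=date(2023, 12, 2)),
]
ranked = sorted(reviews, key=lambda r: baseline_score(r, today), reverse=True)
print([r.text for r in ranked])
```

Under these assumptions the vague review ranks first, because neither metric looks at what the review actually says.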

Flipkart’s NLP and Machine Learning based approach

After recognizing the shortcomings of its review ranking algorithm, Flipkart shifted to a different approach. They came up with a new NLP and ML-based algorithm that ranks reviews based on structure, relevance, and depth. Here is a step-by-step overview of what they did:

Step 1: Extract cluster seeds using catalog data

The words best describing the product can be found in that product’s catalog data. So they used the catalog data to extract nouns (seeds for clustering).

Step 2: Extend seed list using word2vec

Word2vec is a shallow, two-layer neural network that learns a feature vector (embedding) for each word from the contexts in which the word appears, so that words used in similar contexts end up with similar vectors. They used this model to extend the initially extracted seeds into a list that also contains words similar to the seeds.
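
The idea of extending a seed list through vector similarity can be sketched as follows. This is an illustration only: the three-dimensional vectors are made up rather than produced by a trained word2vec model, and the `extend_seeds` helper and its `threshold` of 0.8 are assumptions, not details from Flipkart.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extend_seeds(seeds, embeddings, threshold=0.8):
    """For each seed, collect the vocabulary words whose vectors are close to it."""
    clusters = {}
    for seed in seeds:
        members = {seed}
        seed_vec = embeddings.get(seed)
        if seed_vec is not None:
            for word, vec in embeddings.items():
                if cosine(seed_vec, vec) >= threshold:
                    members.add(word)
        clusters[seed] = members
    return clusters


# Toy vectors, invented for this example.
embeddings = {
    "battery": [0.9, 0.1, 0.0],
    "charge": [0.8, 0.2, 0.1],
    "camera": [0.1, 0.9, 0.1],
    "photo": [0.2, 0.8, 0.0],
    "delivery": [0.0, 0.1, 0.9],
}
clusters = extend_seeds(["battery", "camera"], embeddings)
print(clusters)
```

With these toy vectors, "charge" joins the "battery" seed and "photo" joins the "camera" seed, while "delivery" matches neither.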

Step 3: Extract phrases from review text

They then extracted nouns and adjectives from the review text using Stanford’s Part-of-Speech (POS) tagger.

Step 4: Cluster phrases

The extracted phrases from the review text are then clustered according to the seed list obtained in step two. The extracted phrases that are not a part of the seed list are reviewed and added to the original seed list if needed.
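
A minimal sketch of this clustering step is shown below, assuming that a phrase belongs to a cluster when any of its words appears in that cluster's seed list. The `cluster_phrases` helper, the matching rule, and the sample data are hypothetical; the article does not describe the exact matching procedure.

```python
def cluster_phrases(phrases, clusters):
    """Group phrases by the first cluster whose seed list contains one of their words."""
    word_to_cluster = {
        word: name for name, words in clusters.items() for word in words
    }
    assigned = {name: [] for name in clusters}
    unmatched = []  # candidates for manual review and possible seed-list additions
    for phrase in phrases:
        hit = next(
            (word_to_cluster[t] for t in phrase.lower().split() if t in word_to_cluster),
            None,
        )
        if hit is None:
            unmatched.append(phrase)
        else:
            assigned[hit].append(phrase)
    return assigned, unmatched


# Made-up seed lists and phrases.
seed_clusters = {
    "battery": {"battery", "charge"},
    "camera": {"camera", "photo"},
}
phrases = ["good battery", "blurry photo", "fast charge", "sturdy packaging"]
assigned, unmatched = cluster_phrases(phrases, seed_clusters)
print(assigned, unmatched)
```

Phrases that match no seed, such as "sturdy packaging" here, are returned separately so they can be reviewed and added to the seed list if needed.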

Step 5: Assign weights

Weights are assigned to clusters based on how frequently each cluster occurs in the entire review corpus. In other words, clusters that occur more often in reviews are given higher weights than clusters that occur less often.
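
One way to turn cluster frequency into weights is sketched below. It assumes that frequency means the number of reviews mentioning a cluster and that weights are normalised to sum to 1. The article specifies neither choice, and the `cluster_weights` helper and the sample corpus are invented for illustration.

```python
from collections import Counter


def cluster_weights(reviews, clusters):
    """Weight each cluster by its share of cluster mentions across all reviews."""
    counts = Counter()
    for review in reviews:
        tokens = set(review.lower().split())
        for name, words in clusters.items():
            if tokens & words:  # the review mentions this cluster at least once
                counts[name] += 1
    total = sum(counts.values())
    return {name: (counts[name] / total if total else 0.0) for name in clusters}


# Made-up seed lists and corpus.
seed_clusters = {
    "battery": {"battery", "charge"},
    "camera": {"camera", "photo"},
}
corpus = [
    "battery lasts long",
    "great camera and battery",
    "charge is quick",
    "camera is ok",
]
weights = cluster_weights(corpus, seed_clusters)
print(weights)
```

In this toy corpus the "battery" cluster is mentioned in three reviews and "camera" in two, so "battery" receives the larger weight.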

Step 6: Score each review

Finally, they developed a scoring algorithm that assigns each review a score called the Review Quality Score (RQS). A brief overview of the algorithm is as follows:

1. Extract the keywords in the review that match the cluster definitions and assign the respective cluster weights (w_1, w_2, \dots, w_n).
2. If the review contains keywords such as “pros” or “cons”, apply a multiplier of 1.2 to bump up the score.
3. Finally, calculate the RQS as the sum of the cluster weights. For a review without “pros” or “cons” keywords:

    \[\textbf{RQS} = w_1 + w_2 + \dots + w_n\]

   For a review that contains such keywords:

    \[\textbf{RQS} = 1.2 \times (w_1 + w_2 + \dots + w_n)\]
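
The scoring rule can be sketched in a few lines. This is an illustrative reading of the steps above, not Flipkart's implementation: it assumes each matched cluster contributes its weight once per review, uses a simple regular-expression tokenizer, and relies on made-up seed lists and weights.

```python
import re


def review_quality_score(review, clusters, weights, multiplier=1.2):
    """Sum the weights of the clusters a review mentions; boost if it lists pros/cons."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    score = sum(
        weights[name] for name, words in clusters.items() if tokens & words
    )
    if tokens & {"pros", "cons"}:
        score *= multiplier
    return score


# Made-up seed lists and cluster weights.
seed_clusters = {
    "battery": {"battery", "charge"},
    "camera": {"camera", "photo"},
}
cluster_w = {"battery": 0.6, "camera": 0.4}

samples = [
    "Nice product",
    "Battery lasts two days and the camera is sharp",
    "Pros: battery. Cons: camera",
]
scores = [review_quality_score(s, seed_clusters, cluster_w) for s in samples]
ranked = sorted(zip(scores, samples), reverse=True)
print(ranked)
```

Under these assumptions the vague "Nice product" scores zero, while the structured pros-and-cons review scores highest because of the 1.2 multiplier.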


With the new approach, reviews with a higher Review Quality Score were ranked higher. As a result, more relevant and unbiased reviews were displayed to customers, which helped them make more informed purchase decisions. The new ranking also significantly improved key KPIs at Flipkart, such as product page views and the number of reviews read per product page view, with increases of up to 142%.
