Frontier of Crowd Wisdom: Public ratings are outdated, emotional ratings are more useful


Original article by Xiong Hongjin, Jizhi Club. Included in the topic #Frontiers of Complex Science 202183


Online reviews give people an immediate channel to the wisdom of the crowd. Across online reviews on Amazon and Yelp, positive reviews make up the vast majority, yet people's actual behavior toward the reviewed items differs markedly. So how can we identify the truly valuable and successful items in this “ocean” of positive online reviews? Can the popular “star ratings” serve as a reliable basis for predicting an item's success? A recent article in Nature Human Behaviour answers these questions. This article is an overview of that paper.

The Jizhi Club has started recruiting for a reading club in which a number of experts lead discussions of issues in social, economic and other fields from the interdisciplinary perspective of computational science and complexity science. This article is written for members of the reading club. The reading club lasts 10-12 weeks and meets every Thursday night. See the end of the article for details.

Xiong Hongjin (熊宏晋) | Author

Deng Yixue | Editor

Paper title:

Mass-scale emotionality reveals human behaviour and marketplace success

Paper URL:


  1. Extracting user sentiment from “massive” evaluations on online platforms

    With the growth of online crowdsourcing platforms, goods and offline services sold through them now come with evaluation information attached, most commonly a star rating. In principle, this gives potential consumers the most direct reference at the lowest cost when choosing goods or services. But is that really the case? Research has shown that these online rating systems have a clear limitation: most online reviews are positive [1]. For example, on Amazon.com the average star rating is about 4.2 (out of 5), and far more than half of the reviews are 5-star ratings [2]. Nearly half of Yelp reviews are 5-star reviews [3], and nearly 90% of Uber reviews may be 5 stars [4].
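
    The kind of summary quoted above (mean rating, share of 5-star reviews) can be sketched in a few lines. This is a minimal illustration, not the cited authors' code: the function name `rating_summary`, the choice of 4 stars as the cutoff for a "positive" rating, and the toy ratings are all assumptions made for the example.

```python
from collections import Counter


def rating_summary(stars, positive_threshold=4):
    """Summarize a list of integer star ratings on a 1-5 scale.

    positive_threshold is an assumption for this sketch: ratings at or
    above it are counted as "positive".
    """
    if not stars:
        raise ValueError("need at least one rating")
    counts = Counter(stars)
    n = len(stars)
    positive = sum(c for s, c in counts.items() if s >= positive_threshold)
    return {
        "mean": sum(stars) / n,
        "share_five_star": counts[5] / n,
        "share_positive": positive / n,
    }


# Made-up ratings for illustration only (not data from any platform).
toy_ratings = [5] * 6 + [4] * 2 + [3] * 1 + [1] * 1
summary = rating_summary(toy_ratings)
print(summary)
```

    When most of the mass sits at 4 and 5 stars, as in this toy list, items end up with nearly identical averages, which is the situation the article describes.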

    As a result, individuals often have to choose among many items with similar star ratings, especially since people tend not to even consider options with fewer than 3 stars. The star rating itself may therefore fail to predict an item's success reliably, and it cannot serve as a valid reference for the item's true value. When evaluations are this heavily concentrated at the positive end, the rating stops being an informative signal. So how can we extract useful information from this mass of positive evaluations? Professor Matthew D. Rocklage and his research team from the School of Management at the University of Massachusetts in the United States call this challenge of identifying success among a large number of positive reviews the “massive” evaluation positivity problem. They first demonstrated how widespread this positivity problem is, and then proposed that emotional language in reviews can give individuals a more meaningful reference. They published these results in Nature Human Behaviour under the title “Mass-scale emotionality reveals human behaviour and marketplace success”.

    They studied the positivity problem in four large-scale online evaluation settings: movie box office revenue, Amazon book sales, new brand followers after Super Bowl advertisements, and restaurant reservations on Yelp. Across these four settings, they showed that 80% to 100% of online star ratings are positive, and that star ratings are unreliable predictors of behavior and success: more positive ratings usually do not predict greater success. The emotionality of the review text, however, can predict behavior and the likelihood of success. This is because emotional language indicates to the individual that something particularly impactful has happened [5,6], so it serves as an especially clear signal for individuals to understand their own attitudes. This strong signal in turn leads to a stronger attitude in memory [7], which is a widely accepted predictor of how influential and persistent an attitude is.

    The four cases they studied are shown below:

  2. Emotional factors predict movie box office

    The researchers obtained online reviews of all movies from 2005 to 2018 from Metacritic.com, and used the first 30 user reviews written for each movie to measure the movie's star rating (0 to 10 stars) and the emotional language of the review text. They found that a higher average star rating was a significant negative predictor of a movie's box office revenue. When all movies were included, even those initially rated negatively, star ratings had no significant predictive effect on box office revenue.

    They then added the average emotionality of the review text and, as a control, the average text valence to the same model. Star ratings remained a significant negative predictor of movie box office revenue (see Figure 1, left). Most importantly, the emotionality of the review text was a significant positive predictor of future box office revenue (see Figure 1, right).

    Figure 1. (Left) Predicted movie box office revenue as a function of the movie's star rating; (Right) predicted movie box office revenue as a function of the emotionality of the movie review text
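
    To make the structure of such a model concrete, here is a minimal sketch of an ordinary least squares fit with three predictors (star rating, text valence, text emotionality). It is not the paper's model: the exact specification, transformations and controls used by the authors are not given in this overview, and the helper names `solve_linear_system` and `fit_ols` as well as every number below are invented. The toy outcome is built from known coefficients (negative for stars, positive for emotionality) purely so that the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, outcome):
    """Least squares with an intercept, solved via the normal equations.

    Returns [intercept, coefficient for each predictor column in order].
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented per-movie averages: (star rating, text valence, text emotionality).
toy_rows = [
    (7.0, 2.0, 3.0), (8.0, 2.5, 5.0), (9.0, 3.0, 4.0),
    (7.5, 3.5, 6.0), (8.5, 2.0, 7.0), (9.5, 3.5, 3.5),
]
# Invented outcome built from known coefficients so the fit can be verified.
toy_revenue = [10.0 - 1.0 * s + 0.5 * v + 2.0 * e for s, v, e in toy_rows]

intercept, b_star, b_valence, b_emotionality = fit_ols(toy_rows, toy_revenue)
print(b_star, b_valence, b_emotionality)
```

    Fitting all three predictors together is what lets the emotionality coefficient be read as an effect over and above star rating and valence. In practice a statistics package would be used so that standard errors and significance tests are available.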

  3. Book sales: text sentiment is more important than rating

    In the second case study, the researchers predicted the success of all books on Amazon.com from 1995 to 2015 (20 years of data). They again used the first 30 reviews of each book to measure the book's star rating (1 to 5 stars), text valence, and text emotionality.

    The regression results for the average star rating are mixed. Among positively rated books, star rating is a negative predictor of the number of purchases. When negatively rated books are also included, star ratings become a significant positive predictor of purchases. Even so, the overall evidence is mixed, because in one-third of the book categories star ratings are non-significant or negative predictors.

    When analyzing positively reviewed books, they predicted each book's purchase volume from its average star rating and the emotionality of its review text. They found that the average star rating is a negative predictor of purchases, while the emotionality of the text is a significant positive predictor. Over and above these effects, more positive emotional language in the first 30 reviews predicts more purchases, and this result holds in 93% of the book categories.

  4. New brand followers in advertising: evaluation predicts fan growth

    In Case Study 3, the researchers examined whether the emotionality of real-time tweets about TV commercials can predict success and human behavior, measured as the daily increase in a brand's new followers. For the 2016 and 2017 Super Bowls, they collected all real-time tweets posted on the day of the game that mentioned advertisements broadcast during it. There were 94 advertisements from 84 companies, and 187,206 tweets about these advertisements in total. They then used an evaluative word dictionary to quantify the average valence and emotionality of the tweets about each advertisement.
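
    Dictionary-based scoring of this kind can be sketched as follows. This is only an illustration of the general technique: the mini-lexicon `TOY_LEXICON`, its words, its 0-9 scale and every score in it are invented for the example and are not the dictionary used in the paper, and the function name `score_texts` is hypothetical.

```python
import re

# Hypothetical mini-lexicon: word -> (valence, emotionality).
# Both scores use an invented 0-9 scale; the values are made up.
TOY_LEXICON = {
    "amazing": (8.0, 7.0),
    "lovely": (7.5, 6.5),
    "helpful": (7.0, 3.0),
    "flawless": (8.0, 4.0),
    "awful": (1.5, 6.5),
}


def score_texts(texts, lexicon=TOY_LEXICON):
    """Average valence and emotionality over all lexicon words found in texts.

    Returns None when no lexicon word appears in any of the texts.
    """
    hits = []
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in lexicon:
                hits.append(lexicon[word])
    if not hits:
        return None
    n = len(hits)
    return {
        "valence": sum(v for v, _ in hits) / n,
        "emotionality": sum(e for _, e in hits) / n,
        "matched_words": n,
    }


# Made-up tweets about one advertisement.
toy_tweets = ["That ad was amazing!", "Lovely spot, truly amazing."]
ad_scores = score_texts(toy_tweets)
print(ad_scores)
```

    Keeping valence and emotionality as separate scores matters here: "helpful" and "amazing" are both positive in this toy lexicon, but only the latter is strongly emotional, and that distinction is what the study relies on.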

    They found that the number of followers a company had accumulated before the Super Bowl predicted the number it had accumulated afterwards, but the company's star ratings on USA Today had no predictive effect on followers.

    They then added the emotionality of the tweets about each ad as the main predictor, with the average valence of the text as a control variable. They found that neither the USA Today star ratings nor the valence of the tweets predicted the number of new followers. However, the more positive the emotional language in the tweets about a company's ad, the more Facebook followers the company accumulated over the following two weeks.

  5. Restaurant reservations: scores and emotions are both useful

    In Case Study 4, the researchers studied restaurant success, measured by the number of table reservations, using the first 30 Yelp.com reviews of all restaurants in Chicago, Illinois as of 2017. They used these reviews to measure each restaurant's average star rating (1 to 5 stars), text valence, and text emotionality.

    This time, the results differ from the previous three cases. For restaurant reservations, a higher average star rating does predict more table reservations. The researchers then added the emotionality and valence of each restaurant's first 30 reviews to the regression model. With these included, the average star rating becomes non-significant (see Figure 2, left), and text valence is a positive predictor. Over and above these effects, restaurants whose reviews contain more positive emotional language receive more table reservations (see Figure 2, right).

    Figure 2. (Left) Predicted table reservations as a function of the restaurant's star rating; (Right) predicted table reservations as a function of the emotionality of the restaurant review text

  6. Solutions to the “massive” evaluation positivity problem

    The “massive” evaluation positivity problem is becoming increasingly common in large-scale online reviews, and it is sometimes compounded by businesses that generate favorable reviews themselves to boost sales of their products or services. This makes it even harder for consumers to pick out informative signals. Emotional language in reviews may be an effective way to address this problem. It calls on third-party platforms to pay more attention to the emotional nature of personal attitudes. Platform managers could consider summarizing reviewers' language and providing an “emotional star rating” to give individuals a more meaningful reference. Exploring which measures predict effectively and could replace star ratings is left to researchers interested in this question.
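
    One way a platform might turn review language into an “emotional star rating” is a simple linear rescaling of the average emotionality score onto a 1-5 star range. This is a hypothetical sketch, not a method proposed in the paper: the function name `emotional_star_rating`, the assumed 0-9 input scale and the linear mapping are all assumptions for illustration.

```python
def emotional_star_rating(emotionality_scores, scale_min=0.0, scale_max=9.0):
    """Map the mean emotionality of an item's reviews onto 1-5 stars.

    scale_min and scale_max describe the assumed range of the input
    scores; the mapping is linear and rounded to one decimal place.
    """
    if not emotionality_scores:
        raise ValueError("need at least one emotionality score")
    mean_score = sum(emotionality_scores) / len(emotionality_scores)
    clipped = min(max(mean_score, scale_min), scale_max)
    fraction = (clipped - scale_min) / (scale_max - scale_min)
    return round(1.0 + 4.0 * fraction, 1)


# Made-up per-review emotionality scores for one item.
toy_scores = [7.0, 6.5, 3.0]
print(emotional_star_rating(toy_scores))
```

    Whether a linear mapping is the right choice, and whether such a rating predicts behavior better than ordinary stars, is exactly the kind of open question the article leaves to future research.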


    [1] Hu, N., Zhang, J. & Pavlou, P. A. Overcoming the J-shaped distribution of product reviews. Commun. ACM 52, 144–147 (2009).

    [2] Woolf, M. Playing with 80 million Amazon product review ratings using Apache Spark. minimaxir http://minimaxir.com/2017/01/amazon-spark/ (2017).

    [3] Yelp Factsheet (Yelp, 2017); https://www.yelp.com/factsheet

    [4] Athey, S., Castillo, J. C. & Knoepfle, D. Service quality in the gig economy: empirical evidence about driving quality at Uber. White Paper. https://doi.org/10.2139/ssrn.3499781 (2019).

    [5] Tooby, J. & Cosmides, L. The past explains the present. Ethol. Sociobiol. 11, 375–424 (1990).

    [6] Ekman, P. E. & Davidson, R. J. The Nature of Emotion: Fundamental Questions (Oxford Univ. Press, 1994).

    [7] Rocklage, M. D. & Fazio, R. H. Attitude accessibility as a function of emotionality. Pers. Soc. Psychol. Bull. 44, 508–520 (2018).

    Social Computing Reading Club Series Now Recruiting

    With the continuous accumulation of big data and the ongoing iteration of digital technology, the interdisciplinary field of social computing is rapidly emerging. Social network analysis, natural language processing, machine learning, system dynamics, multi-agent modeling and other techniques collide and fuse in this field, gradually uncovering the deep laws of social behavior in the information age.

    With “Social Computing” as its theme, the Jizhi Club is organizing a 10-12 week reading club in which a number of experts lead the study of classic and cutting-edge literature, exchanging ideas and sparking research inspiration. The reading club was initiated by Wang Shuo, and the expert advisory group includes Meng Xiaofeng, Luo Jiade, Wang Xiao, Lu Peng, Wang Jingyuan, Li Yong and others.

    For details and registration methods, see:
