Social media has become one of the main communication channels between brands and customers, as well as a place where supporters and opponents of a brand or company can gather and voice their opinions. As a result, most companies, brands and even public figures have a large presence on social media, and each platform allows a different form of interaction with the brand. Some, such as Facebook or Google, let users score companies with a one-to-five star rating; some allow liking or disliking them; and some, such as Reddit or Twitter, let users post text descriptions, reactions and responses. When somebody looks at a new product, they may be interested not only in the reviews and ratings published on certain websites and in magazines, but also in what people who have actually come into contact with the brand or the product think. Collecting this information is tedious work, as it requires searching for the product on many different platforms and reading through arguments for and against it in order to form one’s own opinion about the product or brand.
In this blog post, I explore how reviews can serve as the basis for a machine learning model that labels social media posts with star ratings, and how these ratings can be translated into a metric that any company or consumer can use.
In this post, I focus on the American fast-food chain Wendy’s, as it has a strong presence on social media. I use three datasets, from Yelp, Twitter and Reddit, to show how the metric works.
The Twitter dataset was created by scraping Twitter in the first half of 2019 for any mention of Wendy’s, including reposts and comments. The dataset contains each tweet and its text, the number of retweets it received, whether the tweet itself is a retweet, the number of times it was favorited and, where available, the location it was sent from. Some additional information is associated with each tweet, such as who sent it, who they replied to and the ID of the tweet, but these fields are mainly used to identify unique tweets. During the research, I gathered 161 708 tweets about or addressed to Wendy’s. These tweets were gathered using the (now deprecated) Twitter “Tweet” API, which allowed 1000 requests per hour, with an R script running multiple times each day for 3 months. Note the retweet count and the isRetweet flag: they indicate that some tweets are duplicates, for which the original tweet needs to be found. In the figure, some fields, such as Date, Latitude and Longitude, are omitted.
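The duplicate handling mentioned above could be sketched as follows. This is a minimal illustration in Python, not the R script used for the collection: the field names `text`, `is_retweet`, `retweet_count` and `favorite_count` are hypothetical stand-ins for the dataset's columns, and matching a retweet to its original by stripping the conventional `RT @user: ` prefix is an assumption.

```python
import re

# Conventional prefix that classic retweets carry in their text ("RT @user: ...").
RT_PREFIX = re.compile(r"^RT @\w+:\s*")

def collapse_retweets(tweets):
    """Keep one record per distinct tweet text.

    `tweets` is a list of dicts with the hypothetical keys
    text, is_retweet, retweet_count and favorite_count.
    """
    unique = {}
    for tweet in tweets:
        text = tweet["text"]
        if tweet["is_retweet"]:
            # Assumption: the original text is what follows the "RT @user: " prefix.
            text = RT_PREFIX.sub("", text)
        kept = unique.get(text)
        if kept is None:
            unique[text] = {
                "text": text,
                "retweet_count": tweet["retweet_count"],
                "favorite_count": tweet["favorite_count"],
            }
        else:
            # Copies may have been scraped at different times; keep the largest counts seen.
            kept["retweet_count"] = max(kept["retweet_count"], tweet["retweet_count"])
            kept["favorite_count"] = max(kept["favorite_count"], tweet["favorite_count"])
    return list(unique.values())

sample = [
    {"text": "Spicy nuggets are back!", "is_retweet": False,
     "retweet_count": 10, "favorite_count": 4},
    {"text": "RT @fan: Spicy nuggets are back!", "is_retweet": True,
     "retweet_count": 12, "favorite_count": 0},
    {"text": "Long queue at the drive-through today", "is_retweet": False,
     "retweet_count": 0, "favorite_count": 1},
]
unique_tweets = collapse_retweets(sample)
```

Retweet texts may be truncated by the API, so exact text matching should be treated as a first approximation only.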
There is a subreddit dedicated to Wendy’s, where people comment, ask questions, and post warnings or praise about the fast-food chain. During this research, I scraped the subreddit throughout the first half of 2019 and gathered 543 posts about the chain. Reddit’s structure allows each post to have multiple comments, each of which can have replies that in turn have replies, nested as deep as users desire. In my Reddit dataset, I capture this structure (the location of the comment in relation to the post), the date on which the comment was made, the total number of comments on the post, the score the post received, the score the comment received, the full text of the comment, and the full text of the title and the post. I also record whether the post had a picture or an outgoing link attached. Note that the up-vote proportion and post score columns together give a clear indication of how many people interacted with the post in total, as the post score is calculated by subtracting all down-votes from all up-votes. In the figure, some fields, such as Date, post id and domain, are omitted.
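To make the relationship between the two columns concrete: if the score is up-votes minus down-votes and the up-vote proportion is up-votes divided by all votes, then the total number of votes is score / (2 × proportion − 1). A small Python sketch of that arithmetic, where the function name `estimate_votes` is mine and not part of any Reddit API:

```python
def estimate_votes(score, upvote_ratio):
    """Estimate (up_votes, down_votes) for a post.

    Uses score = up - down and upvote_ratio = up / (up + down),
    which gives total = score / (2 * upvote_ratio - 1).
    Returns None when the votes are evenly split, because the
    total cannot be recovered from a score of zero.
    """
    denominator = 2 * upvote_ratio - 1
    if abs(denominator) < 1e-9:
        return None
    total = score / denominator
    up = round(upvote_ratio * total)
    down = round(total) - up
    return up, down

# A post with score 60 and 80% up-votes was voted on by about 100 people.
votes = estimate_votes(60, 0.8)
```

Because the reported proportion is rounded, the result is an estimate rather than an exact count.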
Each year, Yelp runs a challenge for which it releases its data in JSON format for researchers to work on. I used this dataset to find reviews related to Wendy’s. On Yelp, each Wendy’s location is a separate entity that users can review, so the reviews need to be aggregated across locations to cover Wendy’s as a whole. There is a lookup table of business IDs and the businesses they belong to, from which I selected all the IDs that belong to a Wendy’s location. Filtering the reviews to these IDs leads to around 50 000 reviews from 2012 to 2019. Each review contains the full text of the review, the number of stars awarded to the location and the service, the number of people who found the review useful, funny or cool, and the date on which it was posted. After filtering out all the reviews made outside the time-frame of this research, 4812 reviews are left.
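The filtering steps above could look roughly like this in Python. It is a sketch under a few assumptions: the data is read as one JSON object per line, the records carry the fields `business_id`, `name`, `stars`, `text` and `date`, locations are matched on an exact name of "Wendy's", and the date bounds shown are placeholders for the research time-frame.

```python
import json

def wendys_business_ids(business_lines, name="Wendy's"):
    """Collect the IDs of every business record whose name matches."""
    ids = set()
    for line in business_lines:
        business = json.loads(line)
        if business["name"].strip().lower() == name.lower():
            ids.add(business["business_id"])
    return ids

def filter_reviews(review_lines, business_ids, start, end):
    """Keep reviews of the selected businesses dated within [start, end] (ISO dates)."""
    kept = []
    for line in review_lines:
        review = json.loads(line)
        day = review["date"][:10]  # handles "YYYY-MM-DD" and "YYYY-MM-DD hh:mm:ss"
        if review["business_id"] in business_ids and start <= day <= end:
            kept.append(review)
    return kept

# Tiny in-memory stand-ins for the business and review files.
businesses = [json.dumps(b) for b in [
    {"business_id": "b1", "name": "Wendy's"},
    {"business_id": "b2", "name": "Taco Place"},
    {"business_id": "b3", "name": "Wendy's"},
]]
reviews = [json.dumps(r) for r in [
    {"business_id": "b1", "stars": 4, "text": "Fresh fries", "date": "2019-03-02 12:30:00"},
    {"business_id": "b1", "stars": 2, "text": "Slow service", "date": "2017-08-15 09:00:00"},
    {"business_id": "b2", "stars": 5, "text": "Great tacos", "date": "2019-04-01 18:45:00"},
]]

ids = wendys_business_ids(businesses)
recent = filter_reviews(reviews, ids, "2019-01-01", "2019-06-30")
```

Both functions accept any iterable of lines, so an open file handle can be passed in place of the in-memory lists.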
Creating the star-classifier for the data
The Yelp dataset serves as the base for our classifier. As each review already contains both a text and a star rating, a classifier trained on the reviews can then be applied to the rest of the social media posts. However, social media posts are very different from well-written reviews, so the learning algorithm needs to be able to learn the structure of these posts and how they use language. Before using the Yelp dataset as the training set, we need to make sure that its classes are balanced. For this, I used the Synthetic Minority Oversampling Technique (SMOTE), which creates artificial points from the minority classes. The balanced data can then be used to train the initial classifiers and to predict star ratings for the social media datasets. Once this is done, the most confident predictions are added to the training data and the model is retrained, now with social media lingo included. The new model is then used to predict the labels of the remaining social media data, the most confident predictions are once again added to the training data, and the process repeats until all the data has been labelled confidently.
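The iterative labelling loop described above is a form of self-training. The sketch below shows the control flow only, in Python. The toy word-vote classifier is a stand-in I made up so the example runs on its own; it is not the classifier used in this research. The SMOTE step is left out, and the confidence threshold of 0.8 and the round limit are assumptions.

```python
from collections import Counter, defaultdict

def train_word_votes(labeled):
    """Toy stand-in classifier: each known word votes for the labels it appeared under."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(text.lower().split())

    def predict(text):
        words = text.lower().split()
        votes = {label: sum(c[w] for w in words) for label, c in counts.items()}
        total = sum(votes.values())
        if total == 0:
            return None, 0.0  # no known words: no prediction, zero confidence
        best = max(votes, key=votes.get)
        return best, votes[best] / total

    return predict

def self_train(train_fn, labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly train, pseudo-label the confident posts, and retrain."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        if not remaining:
            break
        predict = train_fn(labeled)
        unsure = []
        added = 0
        for text in remaining:
            label, confidence = predict(text)
            if confidence >= threshold:
                labeled.append((text, label))  # promote to training data
                added += 1
            else:
                unsure.append(text)
        remaining = unsure
        if added == 0:
            break  # nothing confident left; stop rather than loop forever
    return train_fn(labeled), labeled, remaining

reviews = [("the burger was great and fresh", 5), ("cold fries and rude staff", 1)]
posts = ["great nuggets lol", "nuggets lol", "fries were cold smh"]
model, labeled, remaining = self_train(train_word_votes, reviews, posts)
```

In this toy run, "nuggets lol" shares no words with the reviews and cannot be labelled in the first round. Once "great nuggets lol" has been added to the training data, the retrained model recognises the new vocabulary and labels it in the second round, which is the effect the loop is meant to achieve.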
The final metric
To build the trust metric, a couple of concepts about social media first need to be established. Social media posts, in contrast with simple reviews, can be reacted to, shared and retweeted, and even the brand itself can reply to them directly. This means that while reviews are equal and can be aggregated based on their star ratings alone, social media posts are much more fickle. One needs to take into account who writes about the brand, how many people the post reaches and resonates with, and what the overall reach of the brand is in the digital world of social media. For example, a tweet with a thousand retweets influences the perception of the brand much more than a tweet with no retweets at all. The star rating classifications of the SVM models are used as a baseline to assign each social media post a numeric value. This value is then weighted by the number of retweets, likes, thumbs-ups or any other indicators the post receives that show it was seen and agreed with. The score is then a simple weighted average over a given time-frame.
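Written out with the symbols defined in the next paragraph, and assuming the one extra unit of weight per post is applied in both the numerator and the denominator (which keeps the result between 1 and 5), the weighted average can be expressed as:

$$
\text{score} = \frac{\sum_{i=1}^{n} (w_i + 1)\, X_i}{J + n}, \qquad J = \sum_{i=1}^{n} w_i
$$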
In this formula, the score is a numeric value between 1 and 5, as it represents the star rating; n is the number of posts observed in the given time-frame; w is the number of interactions a post had; X is the predicted star rating of the post, strictly an integer between 1 and 5; and J is the total number of interactions received by all posts in the given time-frame. Each post receives one extra unit of weight, so that it is still counted even if no one has interacted with it. With this formula, any time-frame can be picked, and all the social media posts across Reddit and Twitter can be aggregated into a single score between 1 and 5 that reflects the social media community’s perception of the brand. Furthermore, on Twitter, most tweets are assigned a geo code that denotes the location the tweet was sent from. With this, scores can also be broken down by location.
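As a rough illustration of the calculation, here is a short Python sketch. The function names and the record keys `stars`, `interactions` and `country` are hypothetical, and the sketch assumes the extra unit of weight is added to every post before averaging.

```python
from collections import defaultdict

def trust_score(posts):
    """Interaction-weighted average star rating.

    `posts` is an iterable of (stars, interactions) pairs. Each post gets one
    extra unit of weight so that posts without interactions still count.
    Returns None when there are no posts.
    """
    weighted_sum = 0
    total_weight = 0
    for stars, interactions in posts:
        weight = interactions + 1
        weighted_sum += weight * stars
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

def trust_score_by(records, key):
    """Group records by `key` (for example a country or a day) and score each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append((record["stars"], record["interactions"]))
    return {group: trust_score(items) for group, items in groups.items()}

# One widely shared 5-star post outweighs two rarely seen lower-rated posts.
posts = [(5, 999), (1, 0), (3, 9)]
overall = trust_score(posts)

records = [
    {"stars": 5, "interactions": 3, "country": "US"},
    {"stars": 2, "interactions": 0, "country": "CA"},
    {"stars": 3, "interactions": 1, "country": "US"},
]
by_country = trust_score_by(records, "country")
```

Grouping by a date field instead of a country gives the score per day, which is how a time series of the metric can be built.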
A real-world event and the metric
Wendy’s announcement of the return of the Spicy Nuggets was met with cheers from social media users. On the day of the announcement, the data shows that over 4000 tweets were made, either in response to the news or as individual tweets mentioning Wendy’s. On Reddit, 8 new posts appeared as well, gathering over 100 comments in total. On the day of the announcement, there was a clear increase in the score for the brand, which persisted over the following days. The graph below shows the score rising to 4.6 on the day of the announcement from the usual 3-3.5, and staying elevated for the 2 days that followed. In fact, it took about 8 days for the score to return to Wendy’s normal range. The 0 point on the graph is the day of the announcement.
Since Wendy’s mainly operates in the USA and Canada, and the return of the Spicy Nuggets was only confirmed for the USA, it was worth looking at how the score changes when all tweets sent from the USA are ignored. The trend for the rest of the world shows a slight drop compared to the usual score at around 3.3, as people expressed their disappointment at not being able to try the new menu item.
If you are interested in how social media mining and machine learning can help you understand your customers better, reach out to us on LinkedIn or at email@example.com.