One of the most valuable features of the ClearVoice software is our comprehensive search index of content and influencers, which we call Content Studio. As of today, the index contains 122 million posts and more than 380,000 author profiles from over 240,000 of the most popular publishers across the web. The index is updated in real time as we discover new content — currently about 200,000 new posts per day.
Content Studio enables us to get a deep understanding of what content is being published where and how it performs on social networks. This is our first data study in a series that will bring you insights into the world of content.
The first topic we’re investigating: the presidential primaries, where we found some interesting data. We focused our search on articles on all types of sites with headlines that explicitly contained a candidate’s first and last names. Looking at articles from May 2015 to June 2016, we found 147,139 articles in our index that met these criteria.
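For readers who want to try a similar filter, here is a minimal sketch of the headline-matching step and the per-candidate share of posts. The matching rule (both name parts present as whole words, case-insensitive), the `CANDIDATES` table and the function names are assumptions for illustration, not a description of how Content Studio queries its index.

```python
import re
from collections import Counter

# Hypothetical lookup table: display name -> (first name, last name), lowercase.
CANDIDATES = {
    "Trump": ("donald", "trump"),
    "Hillary": ("hillary", "clinton"),
    "Bernie": ("bernie", "sanders"),
}


def candidates_in(headline):
    """Return the candidates whose first AND last names both appear in the headline."""
    words = set(re.findall(r"[a-z]+", headline.lower()))
    return [
        name
        for name, parts in CANDIDATES.items()
        if all(part in words for part in parts)
    ]


def share_of_headlines(headlines):
    """Return each candidate's share of all candidate matches (0.0 to 1.0)."""
    counts = Counter()
    for headline in headlines:
        counts.update(candidates_in(headline))
    total = sum(counts.values())
    return {name: counts[name] / total for name in counts} if total else {}


# Made-up headlines, only to exercise the functions.
sample = [
    "Donald Trump wins Indiana primary",
    "Hillary Clinton and Bernie Sanders debate",
    "Trump rally draws crowd",  # skipped: first name is missing
    "Bernie Sanders campaigns in Vermont",
]
shares = share_of_headlines(sample)
```

A headline that names two candidates counts once for each of them in this sketch, so the shares are fractions of matches rather than of articles.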
First we looked at who made the most headlines:
With 53% of the posts written about Trump, the presumptive GOP candidate is the subject of more articles than Hillary and Bernie combined. The majority of Trump-related news also carries over into Facebook, where he dominates activities in “likes,” comments and shares.
While there are more overall posts about Hillary than Bernie, the Vermont senator leads Hillary in Facebook activity, with nearly double the number of “likes.”
Then we looked at the Facebook activity garnered by all this press:
Trump clearly dominates overall Facebook activity. He has more than double Hillary’s “like” count on pieces about him (any publicity is good publicity?). The surprise here is that Bernie drew nearly double the number of “likes” Hillary nabbed.
Next, we analyzed sentiment across all sites:
Sentiment analysis is the process of determining whether a piece of content is positive, negative or neutral. To understand the overall tone of content about each candidate, we analyzed the average sentiment across all headlines for each candidate using the VADER model.
When we looked at the average sentiment of all headlines across each candidate, we found:
- Trump is the only one who has more negative-sentiment headlines than positive
- Bernie has the largest gap between positive and negative: he has 42 percent more positive headlines than negative, a far larger margin than Hillary at 11 percent and certainly Trump at -10 percent
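The aggregation behind numbers like these can be sketched in a few lines. This sketch assumes each headline already has a VADER compound score between -1 and 1 (for example from the `vaderSentiment` package's `SentimentIntensityAnalyzer().polarity_scores(headline)["compound"]`). The ±0.05 cutoffs are the ones commonly used with VADER, and the gap is read as "percent more positive than negative headlines"; the study does not state which cutoffs or formula it used, so treat both as assumptions. All names and scores below are made up.

```python
from collections import defaultdict

# Assumed cutoffs for turning a compound score into a label.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label(compound):
    """Map a compound score in [-1, 1] to positive / negative / neutral."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def summarize(scored_headlines):
    """Summarize (candidate, compound_score) pairs per candidate."""
    by_candidate = defaultdict(list)
    for candidate, compound in scored_headlines:
        by_candidate[candidate].append(compound)

    summary = {}
    for candidate, scores in by_candidate.items():
        labels = [label(score) for score in scores]
        pos = labels.count("positive")
        neg = labels.count("negative")
        summary[candidate] = {
            "mean": sum(scores) / len(scores),
            "positive": pos,
            "negative": neg,
            # Percent more positive than negative headlines; undefined if neg == 0.
            "gap_pct": (pos - neg) / neg * 100 if neg else None,
        }
    return summary


# Made-up scores for two placeholder candidates.
scored = [
    ("A", 0.5), ("A", 0.3), ("A", -0.4), ("A", 0.0),
    ("B", -0.6), ("B", -0.2), ("B", 0.4),
]
result = summarize(scored)
```

With this reading, a negative `gap_pct` means a candidate has more negative-sentiment headlines than positive ones.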
… and sentiment for the candidates across major news sites:
Once we had the data, we also looked at the average sentiment for each candidate on popular news sites.
- Fox News published a dramatic number of negative-sentiment posts about Hillary
- We were surprised to see how close to neutral Trump was across the board: his headline sentiment is fairly balanced across the sites we analyzed. Huffington Post came out neutral to slightly positive on Trump based on this analysis, which surprised us, but after sampling a set of headlines, we felt confident in the results
And the big surprise in this graph: The sentiment across all news organizations has been incredibly positive for Bernie. In our news site analysis, Bernie stood out with far more positive-sentiment headlines than Hillary or Trump (the second biggest surprise, which is really no surprise, is that there’s any type of news slant at all).
Facebook ‘like’ data tells incredible story
As we saw above, Bernie and Trump both had more Facebook activity than Hillary… but look at what the numbers tell us here (just focusing on “likes”):
- Hillary had 18 percent more “likes” on positive headlines than negative ones
- The “likes” on Bernie’s positive-sentiment headlines nearly exceed the “likes” on Trump’s negative headlines
- Bernie had 77 percent more “likes” on positive-sentiment headlines vs. his negative-sentiment ones, demolishing both Hillary and Trump in this regard
- Trump has an incredible number of “likes” on negative headlines: 15 percent more than on his positive headlines.
What the Buzz Looks Like
We wanted to look at the most-talked-about subjects for each candidate. Again, we focused our search on articles on all types of sites with headlines that explicitly contained a candidate’s first and last names. Check out these great word clouds:
Can’t escape those emails. In the first set, only the “stop” words were removed — words which are not significant to search queries, such as and, only, but, the, have, etc.
- Hillary’s email scandal is the most-talked-about subject in her word cloud
- “GOP” is the number one word in Trump headlines
- Overall, Bernie’s words are far less scandal-ridden than the words in Hillary’s or Trump’s
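The word counts that feed a cloud like this can be approximated as follows. This is a minimal sketch: the stop-word list is a small illustrative subset, the tokenizer simply keeps runs of letters, and the function name is hypothetical. The study does not say which stop-word list or tokenizer was used.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {
    "and", "only", "but", "the", "have", "a", "an",
    "of", "to", "in", "on", "for", "is", "s",
}


def word_frequencies(headlines, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across the headlines."""
    counts = Counter()
    for headline in headlines:
        tokens = re.findall(r"[a-z]+", headline.lower())
        counts.update(token for token in tokens if token not in stop_words)
    return counts


# Made-up headlines, only to exercise the function.
sample_headlines = [
    "The emails and the server",
    "Emails have only begun to surface",
]
frequencies = word_frequencies(sample_headlines)
top_word = frequencies.most_common(1)
```

The resulting counts are what a word-cloud renderer would scale font sizes by: the higher the count, the larger the word.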
Johnny Depp & Donald Trump? Here we pulled out words that appeared in headlines mentioning a specific candidate but were never mentioned in the other two candidates’ headlines.
- “Apprentice,” “Macy’s” and “Depp” stand out for Trump (if you’re wondering how Depp fits into this, check out the Funny or Die short featuring the actor)
- “Jerry’s,” “David’s” and “postal” stand out in Bernie’s cloud (if you’re wondering what “David’s” refers to, check out this SNL skit starring Larry David)
- “Blumenthal,” “pantsuit” and “Britney” stand out in Hillary’s cloud (“pantsuit?” We Googled it for you)
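Finding words that appear for one candidate and never for the other two comes down to a set difference. The sketch below assumes each candidate's headline vocabulary has already been collected into a set; the function name and the tiny vocabularies are made up for illustration.

```python
def unique_words(words_by_candidate):
    """For each candidate, keep only the words no other candidate's headlines use."""
    unique = {}
    for candidate, words in words_by_candidate.items():
        # Union of every other candidate's vocabulary.
        others = set().union(
            *(w for c, w in words_by_candidate.items() if c != candidate)
        )
        unique[candidate] = set(words) - others
    return unique


# Made-up vocabularies, only to exercise the function.
vocab = {
    "Trump": {"gop", "apprentice", "poll"},
    "Hillary": {"email", "poll", "pantsuit"},
    "Bernie": {"poll", "postal", "gop"},
}
exclusive = unique_words(vocab)
```

Here "poll" drops out everywhere because all three vocabularies share it, and "gop" drops out for both candidates that use it.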
Politically charged words
Hot-button topics. Here we pulled a list of the top 200 politically charged words that we saw across all of the headlines we analyzed.
- The dominant words in Trump’s cloud: Bush, attacks, war, Twitter, race and Muslim
- The dominant words in Bernie’s cloud: black, socialist, race, political, socialism and SNL
- The dominant words in Hillary’s cloud: email, Benghazi, black, Bush, gun and women
Look closer to find many more interesting terms in each candidate’s word cloud.
Download the complete infographic here.
If you want more details on this study or how we collected the information, email us at firstname.lastname@example.org.
What should we look at next? Let us know on Twitter.