Thursday, November 3, 2016

How To Be Twee About Tweeting

When science sounds like a Lesley Gore song, you might be doing it wrong.  (Science Daily:  Make America tweet again)

Computer scientists from the University of Utah's College of Engineering have developed what they call "sentiment analysis" software that can automatically determine how someone feels based on what they write or say.  To test out the accuracy of this software's machine-learning model, the team used it to analyze the individual sentiments of more than 1.6 million (and counting) geo-tagged tweets about the U.S. presidential election over the last five months.  A database of these tweets is then examined to determine whether states and their counties are leaning toward the Republicans or Democrats.

- Science Daily

I know what you're feeling because ... it's my party and I'll cry if I want to.  (Lesley Gore)


Using 1.6 million tweets sounds impressive but, even if each tweet represents one American, that many tweets would still cover less than 1/150th of the population.  Given that pitiful sampling, there shouldn't be any predictive value in the data obtained.
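For anyone who wants to check the arithmetic, here is a quick sketch.  The tweet count is the figure quoted by Science Daily; the population figure is my own assumption (roughly 323 million Americans in 2016), and it takes the generous view that every tweet comes from a different person.

```python
# Rough check of the sampling fraction discussed above.
TWEETS = 1_600_000            # figure quoted by Science Daily
US_POPULATION = 323_000_000   # assumption: approximate 2016 U.S. population


def sampling_fraction(sample: int, population: int) -> float:
    """Return the share of the population covered by the sample."""
    return sample / population


fraction = sampling_fraction(TWEETS, US_POPULATION)
print(f"share of population: {fraction:.4%}")
print(f"roughly 1 in {round(1 / fraction)}")
```

Under that assumed population the share comes out nearer 1/200th than 1/150th, so "less than 1/150th" is if anything generous.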

In fact, there wasn't (larfs).

Some interesting facts about this year's U.S. presidential election based on a sample of what people are tweeting: 

• Based on the number of positive tweets posted since June toward each party, the computer model predicts that Hillary Clinton will win the presidential election.

- Science Daily

So far it hasn't successfully predicted anything, since the election hasn't happened yet, but it would have been more useful predicting which team would win the World Series since we could have made some money from that knowledge.


There may actually have been some valid science in this ... but they don't tell us what it was.  Predictions are all very well but your Aunt Mabel can give you some of those.  We want to know what basis justifies calling the predictions accurate.

Then those tweets were sifted through the team's "sentiment analysis" software where each tweet was analyzed and assigned a score from 0 to 1 where 0 is the most negative sentiment, 1 is the most positive sentiment, and 0.5 is neutral.  The scores are then collected in a database that can calculate a state or county's political leanings in real time based on the tweets. The database is constantly updated with new tweets.  To measure the accuracy of the model, the team compared its results to the New York Times Upshot election forecast website and found the state-by-state analysis was very similar.

"I think it works really well.  It matches up with the major events that happened during this election season. That's a good indicator that the results are accurate," says Li.  "We're hoping to develop some more scientific measurements to confirm this observation for an upcoming paper, but the early results are very positive."

- Science Daily
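Since they don't show their work, here is a minimal sketch of what the scoring-and-counting step described in the quotes might look like.  It is not the Utah team's code.  The sample records, the names `SAMPLE_TWEETS` and `state_leanings`, and the rule "the party with the most positive tweets takes the state" are all my assumptions, pieced together from the 0-to-1 scale and the "number of positive tweets" remark.

```python
from collections import Counter, defaultdict

NEUTRAL = 0.5  # per the article: 0 is most negative, 1 is most positive

# Hypothetical records: (state, party the tweet is about, sentiment score).
SAMPLE_TWEETS = [
    ("UT", "R", 0.8),
    ("UT", "D", 0.4),
    ("UT", "R", 0.6),
    ("CA", "D", 0.9),
    ("CA", "R", 0.3),
    ("CA", "D", 0.7),
]


def state_leanings(tweets):
    """Return {state: party} for the party with the most positive tweets.

    Assumption: a tweet counts as positive when its score is above 0.5.
    States with no positive tweets at all are left out of the result.
    """
    positive = defaultdict(Counter)
    for state, party, score in tweets:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        if score > NEUTRAL:
            positive[state][party] += 1
    return {
        state: counts.most_common(1)[0][0]
        for state, counts in positive.items()
    }


print(state_leanings(SAMPLE_TWEETS))
```

Note what the sketch cannot tell you: whether the leaning it prints matches how anyone actually votes.  That needs a comparison against real results, which is exactly the detail the summary leaves out.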


Arrggghhh ... frustration as they tell us the predictive results were generally accurate but they don't give us any details.

The link is there for you to review.  The science may actually have done something well and the summary wasn't so good at conveying it; maybe I have misinterpreted that which was conveyed; maybe Minnie Mouse should get a sex change operation for Millennial significance.  Unknown.
