Two Arizona State University computer science doctoral students are developing computer models for analytical systems that can gather data generated by Twitter and organize it in a way that can lead to analysis of public opinion. ASU Computer Informatics professor Subbarao Kambhampati will talk about the Twitter analysis tools.
Ted Simons: ASU computer science doctoral students are working on research to use data from Twitter to gauge and analyze public opinion. ASU computer informatics professor Subbarao Kambhampati is here to talk about it. Welcome.
Subbarao Kambhampati: Thank you.
Ted Simons: Talk to us about using Twitter to analyze public opinion.
Subbarao Kambhampati: I was thinking of a time when people tried to figure out what an event- how is an event actually received by just looking at the town square and what the people in the town square say. Twitter has become the modern day town square. People can get on and start expressing themselves. We were watching the fact that oftentimes, let's say there was a presidential debate going on last year, they would look at what people were saying about various, for example, the Big Bird thing became a big issue, for example. One of the questions, there are obviously journalists and some of the time is going to be invested into actually counting which parts of the event are being talked about, which of the tweets, and whether they are positive or negative, rather than going to the higher level. We thought we can use the computer science techniques to support this actual alignment and sentiment analysis. And then the journalists can do more of an interesting analysis of public opinion.
Ted Simons: How do you do a higher level of analysis when these Twitter feeds are segmented and fragmented? What are the metrics?
Subbarao Kambhampati: So the way Twitter works, you can get copies of tweets in real time. Of course we are not rich enough to get all of them, so we get a fraction of it, a sample of it. And for many of these, the public events in particular, you know that there is a prespecified tag to which people are tweeting. So you do know what tweets are relevant to this particular event. What you don't know is which part of the event they are talking about. For example, something as simple as the shutdown, it's a single-point issue. Something more complicated like an entire debate, something like a speech by Obama, this could have multiple different things going, and you want to be able to see which pieces are being talked about by which of the tweets.
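The alignment problem described in this answer can be illustrated with a minimal sketch. It is not the ASU group's method, which the interview does not detail; it assumes the simplest possible approach, assigning each tweet to the event segment with which it shares the most words. The function names, the segment text, and the tweets are all made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z']+", text.lower()))

def align_tweet(tweet, segments):
    """Return the index of the segment sharing the most words with the tweet.

    Returns None when the tweet shares no words with any segment.
    Ties go to the earliest segment. Common words such as "the" are
    counted too, which a real system would likely filter out.
    """
    tweet_words = tokenize(tweet)
    best_index, best_overlap = None, 0
    for index, segment in enumerate(segments):
        overlap = len(tweet_words & tokenize(segment))
        if overlap > best_overlap:
            best_index, best_overlap = index, overlap
    return best_index

# Hypothetical event segments and tweets, invented for this example.
segments = [
    "we will cut the subsidy to public television including big bird",
    "my plan lowers tax rates for middle income families",
]
tweets = [
    "Leave Big Bird alone! #debate",
    "so what are the actual tax rates in this plan #debate",
    "lol #debate",
]

for tweet in tweets:
    print(align_tweet(tweet, segments), tweet)
```

The third tweet carries the event tag but matches no segment, which is the case the answer describes: the tag says the tweet is relevant to the event, not which part of the event it is about.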
Ted Simons: How do you define what's being said? Twitter has a language all its own. You have to figure out and align fragmented words, slang words, words that aren't even words.
Subbarao Kambhampati: The thing I'd like to say is the following. If you look at the metaphor of the search engines, when you're looking for something on Google, how could the search engine know the specific query I'm looking for? Mostly they're doing word-level analysis and trying to show you files. And what's interesting is the sweet spot where the search engine gets you close to something that you're looking for, and then you can go through the documents yourself actually doing an analysis of English. In Twitter they only want 45 words and there is lots of slang and so on. What's interesting is because there are so many of them and essentially looking at- it's a public event with a large number of people tweeting. So you can still get, even if you don't get specific tweets or if you don't get the exact nuance correctly, you will get the sentiment expressed by a large number of people.
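The point about word-level analysis adding up over many tweets can be sketched as follows. This is an illustration under stated assumptions, not the researchers' system: it assumes a tiny hand-written word list (POSITIVE and NEGATIVE are invented here) and simply counts matching words per tweet, then tallies the labels across all tweets.

```python
import re
from collections import Counter

# Hypothetical sentiment word lists; a real lexicon would be far larger.
POSITIVE = {"good", "great", "love", "strong", "win"}
NEGATIVE = {"bad", "awful", "hate", "weak", "lose"}

def tweet_sentiment(text):
    """Label one tweet by counting positive and negative words in it."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def aggregate_sentiment(tweets):
    """Tally the per-tweet labels across the whole collection."""
    return Counter(tweet_sentiment(t) for t in tweets)

# Invented sample tweets, including slang the word list cannot read.
sample = [
    "great answer, love it",
    "that was awful",
    "gr8 speech tbh",
    "weak and bad response",
    "strong finish",
]
print(aggregate_sentiment(sample))
```

The slang tweet is labeled neutral because the word list misses "gr8", yet the tally over the collection still reflects the overall split. That is the trade-off described in the answer: individual tweets may be misread, but the aggregate over many tweets remains informative.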
Ted Simons: Indeed, a large number of people. But are you getting public opinion in general or public opinion on the Twitter-verse? You have to figure out who these people tweeting are.
Subbarao Kambhampati: I am a computer science professor, I'm not a journalist. What we are not doing here is getting an exact carefully selected sample of people. The work we are doing is not going to redo the careful studies of focus groups, Gallup Polls and so on. They express opinions whether you want it or not. More in journalism, in fact if you look at the "New York Times," many of the events that happen, people lead by saying what was the Twitter universe saying. For example, there was like a beauty pageant and somebody of Indian origin won. The interesting thing is that most people probably shrugged about what happened but a few people did get onto Twitter. The media did report it. All we're saying is that you can support analyzing the self-selected people who watched and what they are saying. If you want to use Twitter to see how a particular public event was received, we will support it from a technical view.
Ted Simons: Computational science in general, how far can it go to understand human behavior, human thought, human action?
Subbarao Kambhampati: That's one of the most fascinating things that's going on right now in the context of Twitter. People wanted to express their opinions. But since there are so many people expressing opinions at the same time, Twitter has become a footprint of the human behaviors. It was started some time back, we would know different people in different countries sleep at different times. So these guys would look at the mass of tweets and decide well, these are the times when people are sleeping and these are the times when they are waking up, and so on. Because of the sheer numbers involved, you do get a good understanding of the footprint of the collective human behavior, not a single person's behavior. I may not know what you are trying to say by your tweet, but for the mass of our audience actually tweeting about this event, I can get a general idea of which topics were received well.
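The sleep-time example can be illustrated with a short sketch. It assumes nothing about how the study mentioned was actually carried out; it simply counts tweets per hour of the day and treats the hours with the fewest tweets as the likely sleeping hours. The function names and the timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime

def tweets_per_hour(timestamps):
    """Count tweets in each hour of the day, returned as a 24-item list."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

def quietest_hours(timestamps, n=3):
    """Return the n hours with the fewest tweets, quietest first."""
    per_hour = tweets_per_hour(timestamps)
    return sorted(range(24), key=lambda hour: (per_hour[hour], hour))[:n]

# Invented activity profile: ten tweets in most hours, a dip overnight.
activity = {hour: 10 for hour in range(24)}
activity[2], activity[3], activity[4] = 2, 0, 1
timestamps = [
    datetime(2013, 10, 1, hour, 0)
    for hour, count in activity.items()
    for _ in range(count)
]

print(quietest_hours(timestamps))
```

No single timestamp says anything about when its author sleeps; the pattern only appears in the counts over many tweets, which is the collective footprint described in the answer.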
Ted Simons: So very quickly, response so far to the research.
Subbarao Kambhampati: So one of the interesting things is that we started by looking at- so there are two different aspects from a technical point of view. One is that the tweets are going on and you need to connect to which part of the event they are talking about. There is the alignment work and the sentiment analysis part. So we are focusing on presenting the technical details of this work to the community, not so much on talking to journalists to see if they can do this. This is like a good beginning, I guess.
Ted Simons: Good beginning, good information, good to have you here. Thank you so much for joining us.
Subbarao Kambhampati: Thank you.
Ted Simons: I'm Ted Simons, thank you so much for joining us.
Subbarao Kambhampati: Computer Informatics Professor, Arizona State University