Sentiment analytics: Lexicons construction and analysis


Scholars’ Mine
Masters Theses
Student Theses and Dissertations
Spring 2017
Sentiment analytics: Lexicons construction and analysis
Bo Yuan
Follow this and additional works at: https://scholarsmine.mst.edu/masters_theses
Part of the Technology and Innovation Commons
Recommended Citation
Yuan, Bo, “Sentiment analytics: Lexicons construction and analysis” (2017). Masters Theses. 7668.
https://scholarsmine.mst.edu/masters_theses/7668
This thesis is brought to you by Scholars’ Mine, a service of the Missouri S&T Library and Learning Resources. This
work is protected by U. S. Copyright Law. Unauthorized use including reproduction for redistribution requires the
permission of the copyright holder. For more information, please contact scholarsmine@mst.edu.

SENTIMENT ANALYTICS: LEXICONS CONSTRUCTION AND ANALYSIS

by

BO YUAN

A THESIS

Presented to the Faculty of the Graduate School of the

MISSOURI UNIVERSITY OF SCIENCE AND TECHNOLOGY

In Partial Fulfillment of the Requirements for the Degree

MASTER OF SCIENCE IN INFORMATION SCIENCE AND TECHNOLOGY

2017

Approved by

Keng Siau, Advisor
Fiona Nah
Michael Gene Hilgers
Pei Yin

ABSTRACT
With the increasing amount of text data, sentiment analysis (SA) is becoming
more and more important. An automated approach is needed to parse online reviews
and comments and analyze their sentiments. Since the lexicon is the most important
component in SA, enhancing the quality of lexicons will improve the efficiency and
accuracy of sentiment analysis. This research presents the effect of coupling a general
lexicon with a specialized lexicon (for a specific domain) and its impact on sentiment
analysis. Two special domains and one general domain were studied. The two
special domains are the petroleum domain and the biology domain. The general domain
is the social network domain. The specialized lexicon for the petroleum domain was
created as part of this research. The results, as expected, show that coupling a general
lexicon with a specialized lexicon improves the sentiment analysis. However, coupling a
general lexicon with another general lexicon does not improve the sentiment analysis.

ACKNOWLEDGMENTS
I would like to express the deepest appreciation to my advisor, Professor Keng
Siau, who has the attitude and the substance of a genius: he continually and convincingly
conveyed a spirit of adventure in regard to research and scholarship and an excitement in
regard to teaching. Without his guidance and persistent help, this thesis would not have
been possible.
I would like to thank my committee members, Professor Fiona Nah, Professor
Michael Gene Hilgers, and Professor Pei Yin. They helped me in this journey and were
concerned about my research progress and my well-being.
Finally, I would like to thank all my friends, the IST staff, and my family for
helping me survive all the stress during the last two years and not letting me give up.

TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
LIST OF ILLUSTRATIONS
LIST OF TABLES
NOMENCLATURE
SECTION
1. INTRODUCTION
    1.1. SENTIMENT ANALYSIS
    1.2. SENTIMENT LEXICON
    1.3. DESIGN SCIENCE
2. LITERATURE REVIEW
    2.1. SENTIMENT ANALYSIS
    2.2. LEXICON
    2.3. APPLICATIONS OF SA
3. METHODOLOGY
    3.1. IDENTIFY THE PROBLEM
    3.2. SOLUTIONS
        3.2.1. Original Data Extraction
        3.2.2. LDA Model and NLP
        3.2.3. The Calculation of Polarity Scores
4. EVALUATION AND COMPARISON
    4.1. METHOD
    4.2. PETROLEXICON, BIOLEXICON AND SOCIALSENT LEXICON
    4.3. RESULTS
5. DISCUSSIONS
6. CONTRIBUTIONS AND FUTURE RESEARCH
BIBLIOGRAPHY
VITA

LIST OF ILLUSTRATIONS
Figure
1.1. SA Lexicon Network
2.1. Sentiment Analysis Techniques
2.2. Commonly Used Sentiment Analysis Methods
2.3. Applications of Sentiment Analysis
4.1. Analysis Procedure

LIST OF TABLES
Table
2.1. Sentiment Analysis Techniques
2.2. Commonly Used Sentiment Analysis Methods
2.3. Applications of Sentiment Analysis
4.1. Results for Petrolexicon
4.2. Results for Biolexicon
4.3. Results for SocialSent

NOMENCLATURE
Symbol    Description
          Dirichlet prior
θ         a multinomial distribution
ϕ         a multinomial distribution

1. INTRODUCTION
1.1. SENTIMENT ANALYSIS
Generally, data mining is the process of analyzing data toward a goal and
consolidating the findings into useful information (Palace, 1996). Text mining applies
various mining algorithms to extract useful information from text (Text Mining, 2015).
Sentiment analysis emerged after text mining, bringing more advanced technology for
more accurate text mining. It recognizes and extracts meaningful information from data
using natural language processing (NLP) and computational linguistics. Sentiment
analysis is applied in marketing, customer service, education, and even the energy field
(Sentiment analysis, 2015). It is an advanced method in text mining, especially for
online social media data. As the Internet develops rapidly, it is common to find reviews
or comments on products, services, events, and brand names online (Matheus Araújo;
Pollyanna Gonçalves; Meeyoung Cha; Fabrício Benevenuto, 2014). The goal of sentiment
analysis is to identify the attitude of customers from the polarity of the reviews and
comments that they leave online. Sentiment analysis thus works with a new type of data:
data are no longer only numbers but also reviews and comments, which reveal what
people think about a subject. This information may come from tweets, blogs, and news
articles. A huge number of sentences, conversations, product reviews, and social media
posts are produced every second. All of them are data that can be analyzed to provide
information to people, whether those people work in companies or are customers or
users who have experienced a product.
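
To make the idea of polarity concrete, the following is a minimal sketch of labeling a review by counting positive and negative words. The two word lists and the function name polarity are assumptions made for illustration only; they are not taken from any lexicon discussed in this thesis, and real lexicons are far larger.

```python
import re

# Toy word lists, assumed for illustration only; real lexicons are far larger.
POSITIVE = {"good", "great", "excellent", "love", "useful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def polarity(text):
    """Label a text by counting positive and negative word-list hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "Great phone, I love the camera",
    "Terrible battery and a broken screen",
    "It arrived on Tuesday",
]
labels = [polarity(r) for r in reviews]
```

Counting hits ignores negation and context, so a sketch like this is only a starting point; it does show why the quality of the word lists drives the quality of the result.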

1.2. SENTIMENT LEXICON
The lexicon comes into play after data cleaning and before feature selection in
sentiment analysis, so lexicon/corpus construction is generally viewed as a prerequisite
for sentiment analysis. Since the middle of the 20th century, many lexicons have been
built and developed, such as Harvard Inquirer, Linguistic Inquiry and Word Count, MPQA
Subjectivity Lexicon, Bing Liu’s Opinion Lexicon, and SentiWordNet (Matheus Araújo;
Pollyanna Gonçalves; Meeyoung Cha; Fabrício Benevenuto, 2014).

However, there are few specialized lexicons for specialized domains. Two existing
specialized lexicons are biolexicon and socialsent. As part of this research, a specialized
lexicon, petrolexicon, was developed for the petroleum industry. The idea is to establish
an SA lexicon network whose center is SentiWordNet, which can be coupled with
domain lexicons such as a business domain lexicon and a petroleum domain lexicon
(Figure 1.1).
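
One way to picture coupling a general lexicon with a domain lexicon is sketched below. It assumes that both lexicons map words to a score in [-1, 1] and that a domain entry overrides the general entry for the same word; both assumptions, the function names couple and score, and every word score shown are hypothetical and are not taken from SentiWordNet or the petrolexicon.

```python
# Hypothetical scores in [-1, 1]; these entries are illustrative and are not
# taken from SentiWordNet or the petrolexicon.
general_lexicon = {"good": 0.7, "leak": -0.2, "spill": -0.1, "crude": -0.4}
petroleum_lexicon = {"spill": -0.9, "leak": -0.8, "crude": 0.0, "gusher": 0.6}


def couple(general, domain):
    """Merge two lexicons; domain entries override general ones (an assumption)."""
    coupled = dict(general)
    coupled.update(domain)
    return coupled


def score(text, lexicon):
    """Average the scores of the words found in the lexicon (0.0 if none)."""
    hits = [lexicon[w] for w in text.lower().split() if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


coupled_lexicon = couple(general_lexicon, petroleum_lexicon)
sentence = "crude spill after pipeline leak"
general_score = score(sentence, general_lexicon)
coupled_score = score(sentence, coupled_lexicon)
```

With these toy values the coupled lexicon rates the sentence as more negative than the general lexicon alone, because "crude" is neutral in the petroleum sense while "spill" and "leak" carry stronger negative weight. Other merge policies, such as averaging the two scores, would be equally easy to express.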

Figure 1.1. SA Lexicon Network (central lexicons, SentiWordNet and Bing’s lexicon,
linked to domain lexicons: Edu-lexicon, Biolexicon, Petrolexicon, Cs-lexicon, and others)

1.3. DESIGN SCIENCE
Design science research (DSR) focuses on exploring new methods for problems,
known or unknown (Alan R. Hevner, Salvatore T. March, Jinsoo Park, Sudha Ram,
2004). In this research, the design science method is used to structure the methodology.
DSR differs from the widespread qualitative and quantitative methods in two key
respects: 1) DSR tries to solve a generic problem and is considered an activity for
testing hypotheses for future research. 2) Qualitative and quantitative methods aim to
explore real-life situations and come up with a theory that explains current or past
problems (Alan R. Hevner, Salvatore T. March, Jinsoo Park, Sudha Ram, 2004). Design
science involves two steps: 1) Start in a specific space and find a solution. 2) Generalize
the problem and solution when moving to the generic space (Alan R. Hevner,
Salvatore T. March, Jinsoo Park, Sudha Ram, 2004).
In this thesis, the design science method was used to guide the research. After a
thorough literature review, the specialized lexicon, petrolexicon, was constructed for the
petroleum industry. This is followed by an analysis of the three lexicons (petrolexicon,
biolexicon, and socialsent) in text analysis. Finally, suggestions on how to improve
lexicon creation and future research directions for sentiment analysis are presented.
In this paper, the design science method was used to guide the research. After a
thorough literature review, the specialized lexicon, petrolexicon, was constructed for the
petroleum industry. This is followed by an analysis of the three lexicons — petrolexicon,
biolexicon, and socialsent — in text analysis. Finally, the suggestions on how to improve
lexicon creation and the future research directions for sentiment analysis were presented.

2. LITERATURE REVIEW
2.1. SENTIMENT ANALYSIS
There are several main sentiment analysis techniques and methods, such as machine
learning, lexical dictionaries, natural language processing, psychometric scales,
imagematics, and cloud-based techniques (Matheus Araújo; Pollyanna Gonçalves;
Meeyoung Cha; Fabrício Benevenuto, 2014). Machine learning needs a large data
resource because of its training phase. The linguistic method is much easier than machine
learning in terms of operation and comprehension. Nowadays, these two methods are
usually combined. For example, in ‘Sentiment Analysis-A Study on Product
Features’ (Meng, 2012), unsupervised and supervised machine learning incorporate many
linguistic rules and constraints that can improve the accuracy of calculations and
classifications. The psychometric scale method covers a more specific area. It mainly
analyzes people’s mood and introduces the smile or cry index as a formalized measure of
societal happiness and sadness; it is sometimes combined with lexical dictionaries. The
lexical dictionary method is, to some extent, a development of lexical affinity and the
linguistic method. It is simple enough for a beginner to operate and does not require
many data resources or calculations. Natural language processing is a technique that
enables interaction between humans and computers, and it can help analyze the polarity
of texts. SenticNet is based on these techniques; it is an approach that classifies texts as
positive or negative (Matheus Araújo; Pollyanna Gonçalves; Meeyoung Cha; Fabrício
Benevenuto, 2014).
Sentiment analysis techniques can be broadly classified into two categories –
Machine Learning and Linguistic Method (as shown in Figure 2.1). Table 2.1 lists some
papers in these two categories.
Machine learning is currently the most popular method in the sentiment analysis
area. It covers many techniques, such as Support Vector Machine, Decision Tree, and
Neural Network learning. Both supervised and unsupervised machine learning play
important roles.
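
As an illustration of the supervised approach, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. Naive Bayes is chosen here only because it is short to write out (it is one of the classifiers that appear in Table 2.1); the training sentences, labels, and function names are assumptions made for illustration and do not come from any study reviewed here.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in label_counts.items():
        logp = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            logp += math.log((count + 1) / (total_words + len(vocabulary)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Hypothetical labeled examples; a real training set needs far more data.
training_data = [
    ("great product works well", "positive"),
    ("love it excellent quality", "positive"),
    ("terrible product broke quickly", "negative"),
    ("poor quality waste of money", "negative"),
]
model = train(training_data)
prediction = predict("excellent product love it", model)
```

The need for labeled examples in train is the reason machine learning requires a large data resource, in contrast to the lexicon-based methods, which need only a word list.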

Figure 2.1. Sentiment Analysis Techniques (Machine Learning; Linguistic Method)

Table 2.1. Sentiment Analysis Techniques

Category: Machine Learning

Paper Title: A Novel Hybrid HDP-LDA Model for Sentiment Analysis (Wanying Ding, Xiaoli
Song, Lifan Guo, Zunyan Xiong, Xiaohua Hu, 2013)
Techniques Used: This paper proposes a novel hybrid Hierarchical Dirichlet Process-Latent
Dirichlet Allocation (HDP-LDA) model. This model can automatically determine the
number of aspects, distinguish factual words from opinionated words, and effectively
extract aspect-specific sentiment words.

Paper Title: Deep Learning for the Web (Kyomin Jung, Byoung-Tak Zhang, Prasenjit Mitra,
2015)
Techniques Used: Deep learning is a machine learning technology that automatically
extracts higher-level representations from raw data by stacking multiple layers of
neuron-like units. The stacking allows for extracting representations of increasingly
complex features without time-consuming, offline feature engineering.

Paper Title: iFeel: A Web System that Compares and Combines Sentiment Analysis
Methods (Matheus Araújo; Pollyanna Gonçalves; Meeyoung Cha; Fabrício Benevenuto,
2014)
Techniques Used: iFeel, a Web application system, is introduced in this paper. iFeel can
access seven existing sentiment analysis methods: Happiness Index, SentiWordNet,
PANAS-t, SenticNet, SentiStrength, SASA, and Emoticons. iFeel can combine these
methods to achieve a high F-measure.

Paper Title: A Comparative Study of Feature Selection and Machine Learning Techniques
for Sentiment Analysis (Anuj Sharma, Shubhamoy Dey, 2012)
Techniques Used: In this paper, machine learning based on Naïve Bayes, Support Vector
Machine, Maximum Entropy, Decision Tree, K-Nearest Neighbor, Winnow, and Adaboost
is applied.

Paper Title: Sentence-based Plot Classification for Online Review Comments (Hidenari
IWAI, Yoshinori HIJIKATA, Kaori IKEDA, Shogo NISHIDA, 2014)
Techniques Used: Many shopping sites provide functions to submit a user review for a
purchased item. Reviews of items that include stories, such as novels and movies,
sometimes contain spoilers (undesired and revealing plot descriptions) along with the
opinions of the review author. A system was proposed that lets users see reviews without
seeing plot descriptions. This system classifies each sentence in a user review as plot or
review. Five common machine-learning algorithms were tested to ascertain the
appropriate algorithm to address this problem.

Paper Title: Sentiment analysis in twitter using machine learning techniques (Neethu M S,
Rajasree R, 2013)
Techniques Used: Twitter posts about electronic products such as mobiles and laptops
are analyzed by machine learning.

Paper Title: Sentiment analysis of Facebook statuses using Naive Bayes classifier for
language learning (Christos Troussas, Maria Virvou, Kurt Junshean Espinosa, Kevin
Llaguno, Jaime Caro, 2013)
Techniques Used: This paper uses a Naïve Bayes classifier to pattern the educational
process and experimental results.

Paper Title: Resolving Inconsistent Ratings and Reviews on Commercial Webs Based on
Support Vector Machines (Xiaojing Shi, Xun Liang, 2015)
Techniques Used: The dataset consists of 852,071 ratings and reviews from the Taobao
website. The support vector machine is used to resolve inconsistent ratings and reviews.

Paper Title: Sentiment Word Identification Using the Maximum Entropy Model (Xiaoxu
Fei, Huizhen Wang, Jingbo Zhu, 2010)
Techniques Used: The maximum-entropy classification model is constructed to detect
sentiment words in an opinion sentence.

Paper Title: Sentiment Analysis of Twitter Data Using Machine Learning Approaches and
Semantic Analysis (Geetika Gautam, Divakar Yadav, 2014)
Techniques Used: The dataset was preprocessed first. Adjectives that carry meaning were
then extracted from the dataset to form the feature vector, and the feature vector list was
selected. Thereafter SVM, Naive Bayes, and Maximum Entropy were applied together
with WordNet, which was used to extract synonyms for the content feature.

Category: Linguistic Method

Paper Title: Pathways for irony detection in tweets (Larissa A. de Freitas, Aline A. Vanin,
Denise N. Hogetop, Marco N. Bochernitsan, Renata Vieira, 2014)
Techniques Used: After observing the general data obtained and a corpus constituted by
tweets, a set of patterns that might suggest ironic/sarcastic statements is proposed. The
extracted texts for each pattern were analyzed by a judge in order to classify whether
those texts represent ironic/sarcastic statements or not.

Paper Title: Big Data Sentiment Analysis using Hadoop (Ramesh R, Divya G, Divya D,
Merin K Kurian, Vishnuprabha V, 2015)
Techniques Used: Sentiment analysis on Big Data is achieved by combining Big Data
with Hadoop. The proposed approach, a dictionary-based technique, uses Hadoop to
classify texts as positive, negative, or neutral.

Figure 2.2 depicts the commonly used sentiment analysis methods.
Representative papers are listed in Table 2.2.
As shown in Figure 2.2, commonly used sentiment analysis methods are machine
learning, lexical dictionaries, natural language processing, and psychometric scale.
Natural language processing is applied not only in the big data area but also in statistics
and finance. It helps researchers recognize words, sentences, and paragraphs by
computer. Popular tools include OpenNLP, FudanNLP, and the Language Technology
Platform (LTP). Applying NLP raises some difficulties. The first is how to recognize
every word. The second, since many words have more than one meaning, is how to
recognize the meaning of every word.

Figure 2.2. Commonly Used Sentiment Analysis Methods (Machine Learning; Lexical
Dictionaries; Natural Language Processing; Psychometric Scale; Other Methods: Coding,
Imagematics, Kernel Method, Cloud-Based Program, A Fuzzy Logic Approach)

Table 2.2. Commonly Used Sentiment Analysis Methods

Category: Machine Learning

Same as those in Table 2.1.

Category: Lexical Dictionaries

Paper Title: Big Data Sentiment Analysis using Hadoop (Ramesh R, Divya G, Divya D,
Merin K Kurian, Vishnuprabha V, 2015)
Techniques Used: Sentiment analysis on Big Data is achieved by combining Big Data
with Hadoop. The focus of this research was to devise an approach that can perform
sentiment analysis more quickly, because a vast amount of data needs to be analyzed. It
also had to ensure that accuracy is not compromised too much while focusing on speed.

Paper Title: Microblogging sentiment analysis with lexical based and machine learning
approaches (Maharani, 2013)
Techniques Used: There are two main methods: lexical based and machine learning
(model based). This research classifies tweets using both methods.

Paper Title: Chinese sentiment classification using a neural network tool - Word2vec
(Zengcai Su, Hua Xu, Dongwen Zhang, Yunfeng Xu, 2014)
Techniques Used: A neural network model based on word2vec is constructed to learn
vector representations in a higher dimension.

Paper Title: Analysing market sentiment in financial news using lexical approach (Tan Li
Im, Phang Wai San, Chin Kim On, Rayner, Patricia Anthony, 2013)
Techniques Used: A lexicon-based approach is used to analyze financial news.

Paper Title: Emotions on Facebook: A Content Analysis of Mexico’s Starbucks Page
(Anatoliy Gruzd, Jenna Jacobson, Philip Mai, Barry Wellman, 2015)
Techniques Used: Emoticons are a newly developing language for sentiment analysis.
Their polarity is simple to detect, but establishing a well-functioning emoticon
dictionary is a huge project.

Category: Natural Language Processing

Paper Title: iFeel: A Web System that Compares and Combines Sentiment Analysis
Methods (Matheus Araújo; Pollyanna Gonçalves; Meeyoung Cha; Fabrício Benevenuto,
2014)
Techniques Used: iFeel, a Web application system, is introduced in this paper. iFeel can
access seven existing sentiment analysis methods: Happiness Index, SentiWordNet,
PANAS-t, SenticNet, SentiStrength, SASA, and Emoticons. iFeel can combine these
methods to achieve a high F-measure.

Paper Title: A Localization Toolkit for Sentic Net (Yunqing Xia, Xiaoyu Li, Erik Cambria,
Amir Hussain, 2014)
Techniques Used: A toolkit for creating non-English versions of SenticNet in a time- and
cost-effective way is proposed.

Paper Title: Enhanced SenticNet with Affective Labels for Concept-Based Opinion Mining
(Soujanya Poria, Alexander Gelbukh, Amir Hussain, Newton Howard, Dipankar Das,
Sivaji Bandyopadhyay, 2013)
Techniques Used: Enhanced SenticNet with Affective Labels for Concept-Based Opinion
Mining (Soujanya Poria, Alexander Gelbukh, Amir Hussain, Newton Howard, Dipankar
Das, Sivaji Bandyopadhyay, 2013)

Category: Psychometric Scale

Paper Title: Collective Smile: Measuring Societal Happiness from Geolocated Images
(Saeed Abdullah, Elizabeth L. Murnane, Jean M.R. Costa, Tanzeem Choudhury, 2015)
Techniques Used: This paper introduces the Smile Index as a standard measurement of
general happiness in society.

Paper Title: iFeel: A Web System that Compares and Combines Sentiment Analysis
Methods (Matheus Araújo; Pollyanna Gonçalves; Meeyoung Cha; Fabrício Benevenuto,
2014)
Techniques Used: iFeel, a Web application system, is introduced in this paper. iFeel can
access seven existing sentiment analysis methods: Happiness Index, SentiWordNet,
PANAS-t, SenticNet, SentiStrength, SASA, and Emoticons. iFeel can combine these
methods to achieve a high F-measure.

Paper Title: Emotions on Facebook: A Content Analysis of Mexico’s Starbucks Page
(Anatoliy Gruzd, Jenna Jacobson, Philip Mai, Barry Wellman, 2015)
Techniques Used: Emoticons are a newly developing language for sentiment analysis.
Their polarity is simple to detect, but establishing a well-functioning emoticon
dictionary is a huge project.

Category: Current New Methods

Paper Title: Tweeting Live Shows: A Content Analysis of Live-Tweets from Three
Entertainment Programs (Qihao Ji, Danyang Zhao, 2015)
Techniques Used: In terms of the coding schema, each tweet was categorized by its
Language (whether a tweet was written in English), Relevancy (whether it was relevant
to the show), Nature of Tweet (whether it was a retweet, a tweet sent to a specific user,
or a tweet sent to other users), and Character Name (whether the tweet contained any
character’s name from the show). The coding procedure was then carried out.

Paper Title: Towards Social Imagematics: sentiment analysis in social multimedia
(Quanzeng You, Jiebo Luo, 2013)
Techniques Used: This paper looks at not only textual but also visual features in
sentiment analysis.

Paper Title: Enhanced Factored Sequence Kernel for Sentiment Classification (Luis
Trindade, Hui Wang, William Blackburn, Philip S. Taylor, 2014)
Techniques Used: A very active line of work focuses on the application of existing
machine learning methods to sentiment analysis problems, for example the support
vector machine, which is a popular kernel method for text classification. This paper
focuses on sequence kernels, which have been successfully employed for various natural
language processing tasks including sentiment analysis.

Paper Title: Tweeting Live Shows: A Content Analysis of Live-Tweets from Three
Entertainment Programs (Qihao Ji, Danyang Zhao, 2015)
Techniques Used: For data collection, Discovertext™, a cloud-based program, was used.

Paper Title: A Fuzzy Logic Approach for Opinion Mining on Large Scale Twitter Data (Li
Bing, Keith C. C. Chan, 2014)
Techniques Used: This paper proposes a novel matrix-based fuzzy algorithm, called the
FMM system, to mine the defined multi-layered Twitter data.

2.2. LEXICON
The lexicon, as mentioned above, is an important tool in sentiment analysis.
Among existing lexicons, SentiWordNet is the most well-known and the most
popular. SentiWordNet assigns three sentiment scores to each opinion word: positivity,
negativity, and objectivity (dell’Informazione). SentiWordNet has developed from
version 1.0 to version 3.0. The differences between SentiWordNet 1.0 and 3.0 are
(1) the version of WordNet that is annotated and (2) the algorithm used for annotating
WordNet automatically, which now includes a random-walk step for refining the scores.
SentiWordNet 3.0 mainly aims to improve part (2) (dell’Informazione).
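
The three scores can be pictured with the short sketch below. In SentiWordNet the positivity, negativity, and objectivity scores of an entry sum to 1, so objectivity can be derived from the other two. The class name SynsetScore, the polarity property, and the example values are assumptions made for illustration; the values are not copied from SentiWordNet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScore:
    """Positivity and negativity of one entry; objectivity is the remainder."""
    positivity: float
    negativity: float

    def __post_init__(self):
        # Reject scores that cannot be completed to a triple summing to 1.
        if not (0.0 <= self.positivity and 0.0 <= self.negativity
                and self.positivity + self.negativity <= 1.0):
            raise ValueError("scores must be non-negative and sum to at most 1")

    @property
    def objectivity(self):
        return 1.0 - self.positivity - self.negativity

    @property
    def polarity(self):
        # One simple way to collapse the triple into a single signed score.
        return self.positivity - self.negativity


# Hypothetical values for illustration; not copied from SentiWordNet.
entry = SynsetScore(positivity=0.625, negativity=0.125)
```

Storing only two of the three scores keeps the sum constraint true by construction; collapsing them into a single polarity value is a design choice that discards how objective the entry is.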

2.3. APPLICATIONS OF SA
Some argue that sentiment analysis originates from customer products and
services. Amazon.com is a representative example. Twitter and Facebook are also
popular sites for many sentiment analysis applications.
The applications for sentiment analysis are many. Thousands of text documents
can be processed by sentiment analysis in minutes, compared to the hours it would take a
team of people to complete manually. The data can be words, sentences, or paragraphs. In
Chinese, sentiment analysis is literally called feeling analysis, which suggests that
people’s feelings or moods can be analyzed. Numerical data, on the other hand, cannot
tell us what people feel; they can only tell us sales volume or market distribution.
Because SA can be efficient and can produce relatively high and reliable accuracy, many
businesses and researchers are adopting text and sentiment analysis and incorporating
them into their own research processes.
In business, the most widely used applications are in finance and sales marketing.
One example is the Stock Sonar (www.thestocksonar.com), a sentiment system in which
positive and negative assessments for each stock are updated every minute. In China,
Yun Ma, Alibaba’s CEO, created a miracle on Nov. 11th, a nation-wide shopping
holiday on Taobao, Alibaba’s shopping website and the biggest online shopping site in
China. There was 100 billion RMB sales volume in one minute after the online shopping
holiday opened. Every product there has customer reviews, and the reviews have already
been summarized and separated into different groups: good product, bad product, nice
looking, useful, bad quality, and so on. Customers can check them more easily than on
Amazon: because Amazon presents only raw data, it is not easy for customers to find
out whether there are bad reviews. Sentiment applications in health care mainly focus on
patients’ reviews of drugs or health care services. Figure 2.3 and Table 2.3 present some
of the application areas for sentiment analysis.

Figure 2.3. Applications of Sentiment Analysis (Business; Health care; Education;
Energy; Politics)

Table 2.3. Applications of Sentiment Analysis

Category: Business

Paper Title: A Large-Scale Sentiment Analysis for Yahoo! Answers (Onur Kucuktunc, B.
Barla Cambazoglu, Ingmar Weber, Hakan Ferhatosmanoglu, 2012)
Applications: This paper uses a sentiment extraction tool to investigate information such
as gender, education level, and age in a large online question-answering site. Analysis
of what can affect the mood of customers can be applied in advertisement,
recommendation, and search.

Paper Title: Emotions on Facebook: A Content Analysis of Mexico’s Starbucks Page
(Anatoliy Gruzd, Jenna Jacobson, Philip Mai, Barry Wellman, 2015)
Applications: Emoticons are a newly developing language for sentiment analysis. Their
polarity is simple to detect, but establishing a well-functioning emoticon dictionary is a
huge project.
