The Effects of Natural Language Processing on Big Data Analysis: Sentiment Analysis Case Study

  • Princess Sumaya University for Technology

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

24 Scopus citations

Abstract

Social networks are one of the main sources of big data. They continuously produce huge volumes of data of many types at high velocity. This data contains valuable information that requires efficient and scalable analysis techniques to extract. Hadoop/MapReduce is considered the most suitable framework for handling big data because of its scalability, reliability, and simplicity. One of the basic applications for extracting valuable information from data is sentiment analysis, which studies people's opinions by classifying their written text as positive or negative in polarity. In this work, a sentiment analysis method for a Twitter data set is studied. The method uses the Naive Bayes algorithm to classify the text into positive and negative polarity. Several linguistic and NLP preprocessing techniques were applied to the data set in order to study their effect on the quality of big data classification. The applied preprocessing techniques improved the classification accuracy of the Naive Bayes algorithm. The experiments show that NLP and linguistic processing improve the performance of the sentiment analysis by 5%, yielding an accuracy of 73% on the data set used.
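The pipeline described in the abstract (preprocess the tweets, then classify them with Naive Bayes) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the keywords indicate they worked with Mahout on MapReduce, whereas the code below is plain Python. The stop-word list, the toy tweets, and the specific preprocessing steps are all assumptions made for illustration, since the abstract does not list which techniques were applied.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; not taken from the paper.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "to", "and", "i", "so"}


def preprocess(text):
    """Lowercase, drop URLs and @mentions, tokenize, remove stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training tweets, invented for this example.
TRAIN = [
    ("I love this phone, it is great!", "positive"),
    ("Great service and a happy customer @shop", "positive"),
    ("This is the worst update, I hate it", "negative"),
    ("Terrible battery, so bad http://example.com", "negative"),
]

model = NaiveBayes().fit(
    [preprocess(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)
print(model.predict(preprocess("I hate this terrible phone")))
```

Comparing accuracy with and without the `preprocess` step on a held-out set is one simple way to measure the kind of effect the abstract reports, although a toy corpus like this one says nothing about the 5% figure itself.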

Original language: English
Title of host publication: ACIT 2018 - 19th International Arab Conference on Information Technology
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728103853
State: Published - 2 Jul 2018
Externally published: Yes
Event: 19th International Arab Conference on Information Technology, ACIT 2018 - Werdanye, Lebanon
Duration: 28 Nov 2018 to 30 Nov 2018

Publication series

Name: ACIT 2018 - 19th International Arab Conference on Information Technology

Conference

Conference: 19th International Arab Conference on Information Technology, ACIT 2018
Country/Territory: Lebanon
City: Werdanye
Period: 28/11/18 to 30/11/18

Keywords

  • Big Data
  • Mahout
  • MapReduce Framework
  • Naive Bayes
  • Natural Language Processing
  • Sentiment Analysis
