A large-scale sentiment data classification for online reviews under apache spark

  • Princess Sumaya University for Technology
  • University of Jordan

Research output: Contribution to journal › Conference article › peer-review

36 Scopus citations

Abstract

new platforms and tools that can handle large volumes of data. In this paper, we present new evaluation experiments on sentiment analysis of a large-scale dataset of online customer reviews under the Apache Spark data processing system. Apache Spark's scalable machine learning library (MLlib) is used, and three classification techniques from the library are applied: Naïve Bayes, support vector machine, and logistic regression. The results are evaluated using the accuracy metric. Experimental results show that the support vector machine classifier outperforms the Naïve Bayes and logistic regression classifiers.
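
The abstract names accuracy as the evaluation metric but does not define it here. As a minimal sketch, assuming the usual definition (correctly classified reviews divided by all test reviews), the comparison between classifiers can be illustrated in plain Python. The labels, the predictions, and the names `classifier_a` and `classifier_b` are invented for illustration; they are not the paper's data, models, or results, and the paper itself uses Spark MLlib rather than this code.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of examples whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Made-up test labels (1 = positive review, 0 = negative review).
actual = [1, 0, 1, 1, 0, 1, 0, 0]

# Made-up predictions from two hypothetical classifiers.
predictions = {
    "classifier_a": [1, 0, 1, 0, 0, 1, 0, 1],
    "classifier_b": [1, 0, 1, 1, 0, 1, 0, 1],
}

# Score every classifier on the same test labels and pick the highest.
scores = {name: accuracy(pred, actual) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
```

Scoring every classifier against the same held-out labels is what makes the accuracy values comparable. Accuracy on its own can be misleading when the positive and negative classes are heavily imbalanced.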

Original language: English
Pages (from-to): 183-189
Number of pages: 7
Journal: Procedia Computer Science
Volume: 141
DOIs
State: Published - 2018
Externally published: Yes
Event: 9th International Conference on Emerging Ubiquitous Systems and Pervasive Networks, EUSPN 2018 - Leuven, Belgium
Duration: 5 Nov 2018 to 8 Nov 2018

Keywords

  • Apache spark
  • Big Data
  • MLlib
  • Machine Learning
  • Sentiment
