Abstract
new platforms and tools that can handle large volumes of data. In this paper, we present new evaluation experiments on sentiment analysis of a large-scale dataset of online customer reviews using the Apache Spark data processing system. Apache Spark's scalable machine learning library (MLlib) is used, and three classification techniques from the library are applied: Naïve Bayes, support vector machine, and logistic regression. The results are evaluated using the accuracy metric. Experimental results show that the support vector machine classifier outperforms the Naïve Bayes and logistic regression classifiers.
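The accuracy metric named in the abstract is the fraction of reviews whose predicted sentiment matches the true label. The following is a minimal plain-Python sketch of that comparison, not the paper's Spark pipeline: the function name `accuracy`, the labels `classifier_a` and `classifier_b`, and all toy values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that equal the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical toy data (1 = positive review, 0 = negative review).
# These values are invented for illustration; they are not results from the paper.
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = {
    "classifier_a": [1, 0, 1, 0, 0, 0, 1, 0],
    "classifier_b": [1, 1, 1, 0, 0, 1, 1, 0],
}

# Score each classifier on the same held-out labels and pick the best one.
scores = {name: accuracy(preds, true_labels) for name, preds in toy_predictions.items()}
best = max(scores, key=scores.get)
```

Scoring every classifier against the same held-out labels is what makes the accuracies directly comparable; in a Spark setting the same ratio would be computed over a distributed test set rather than in-memory lists.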
| Original language | English |
|---|---|
| Pages (from-to) | 183-189 |
| Number of pages | 7 |
| Journal | Procedia Computer Science |
| Volume | 141 |
| State | Published - 2018 |
| Externally published | Yes |
| Event | 9th International Conference on Emerging Ubiquitous Systems and Pervasive Networks, EUSPN 2018 - Leuven, Belgium. Duration: 5 Nov 2018 → 8 Nov 2018 |
Keywords
- Apache Spark
- Big Data
- MLlib
- Machine Learning
- Sentiment