Selecting the Best Compiler Optimization by Adopting Natural Language Processing

  • NED University of Engineering and Technology
  • Iqra University
  • Universidade Federal do Rio Grande do Sul

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

A compiler is a tool that converts high-level language code into assembly code, applying the relevant optimizations along the way. Automatically selecting suitable optimizations from a large optimization space is a non-trivial task, mainly accomplished using hardware profiling and application-level features. These features are then passed to an intelligent algorithm that predicts the desired optimizations. However, collecting these features requires executing the application beforehand, which incurs high overhead. With the evolution of Natural Language Processing (NLP), the performance of an application can be predicted at compile time solely through source code analysis. There has been substantial work on source code analysis using NLP, but most of it focuses on offloading computation to suitable devices or detecting code vulnerabilities; it has yet to be used to identify the best optimization sequence for an application. Similarly, most works have focused on finding the best machine learning or deep learning algorithm, ignoring the other important phases of the NLP pipeline. This paper pioneers the use of NLP to predict the best set of optimizations for a given application at compile time. Furthermore, it studies the impact of four vectorization and seven regression techniques on predicting application performance. For most applications, we show that TF-IDF vectorization and Huber regression yield the best outcomes. On average, the proposed technique predicts an optimization sequence whose performance is 18% below that of the actual best combination, with a minimum drop of merely 0.5%.
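
The pipeline outlined in the abstract (vectorize the source text, regress on performance, pick the best predicted optimization sequence) can be illustrated with a small dependency-free sketch. This is not the paper's implementation. The abstract does not say how source code and optimization flags are encoded, so appending flag tokens to the program tokens is an assumption, as is measuring performance as run time. All function names (`fit_tfidf`, `transform`, `fit_huber`, `select_sequence`, `performance_drop`) are hypothetical, and the smoothed IDF formula and gradient-descent fit are simple stand-ins for whatever library routines the authors used.

```python
import math
from collections import Counter


def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized documents."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf


def transform(doc, vocab, idf):
    """Map one tokenized document to a TF-IDF vector over the known vocabulary."""
    tf = Counter(doc)
    total = max(len(doc), 1)
    # Tokens outside the vocabulary contribute no feature.
    return [tf[t] / total * idf[t] for t in vocab]


def huber_grad(residual, delta=1.0):
    """Derivative of the Huber loss: linear near zero, clipped beyond delta."""
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta


def predict(w, b, x):
    """Linear prediction b + w . x."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_huber(X, y, delta=1.0, lr=0.1, epochs=2000):
    """Fit a linear model by batch gradient descent on the Huber loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            g = huber_grad(predict(w, b, xi) - yi, delta)
            for j, xj in enumerate(xi):
                gw[j] += g * xj
            gb += g
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def select_sequence(program_tokens, candidates, vocab, idf, w, b):
    """Return the candidate flag sequence with the lowest predicted run time.

    Assumption: a sample is the program's tokens followed by the flag tokens.
    """
    def predicted_time(flags):
        features = transform(list(program_tokens) + list(flags), vocab, idf)
        return predict(w, b, features)
    return min(candidates, key=predicted_time)


def performance_drop(chosen_time, best_time):
    """Relative slowdown of the chosen sequence versus the true best one."""
    return (chosen_time - best_time) / best_time
```

In use, one would fit `fit_tfidf` and `fit_huber` on measured (program, flags, run time) samples, call `select_sequence` for an unseen program, and then compare the measured run time of the selected sequence against the measured best with `performance_drop`. The Huber loss is a natural choice when a few run-time measurements are outliers, because large residuals are penalized linearly rather than quadratically.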

Original language: English
Pages (from-to): 121700-121711
Number of pages: 12
Journal: IEEE Access
Volume: 12
State: Published - 2024

Keywords

  • Compiler
  • natural language processing
  • optimization
  • regression
  • source code analysis
  • vectorization
