Performance-efficient Recommendation and Prediction Service for Big Data frameworks focusing on Data Compression and In-memory Data Storage Indicators

Hrachya Astsatryan; Arthur Lalayan; Aram Kocharyan; Daniel Hagimont

doi:10.12694/scpe.v22i4.1945

Authors

Hrachya Astsatryan, Institute for Informatics and Automation Problems National Academy of Sciences of Armenia, Armenia
Arthur Lalayan, National Polytechnic University of Armenia, Armenia
Aram Kocharyan, Université Fédérale Toulouse Midi-Pyrénées, Toulouse, France
Daniel Hagimont, Université Fédérale Toulouse Midi-Pyrénées, Toulouse, France

DOI:

https://doi.org/10.12694/scpe.v22i4.1945

Keywords:

Hadoop; Spark; MapReduce; data compression; in-memory file system

Abstract

The MapReduce framework manages Big Data sets by splitting large datasets into distributed blocks and processing them in parallel. Data compression and in-memory file systems are widely used in Big Data processing to reduce resource-intensive I/O operations and thereby improve the I/O rate. This article presents a performance-efficient, modular, configurable, and robust decision-making service that relies on data compression and in-memory data storage indicators. The service consists of Recommendation and Prediction modules: it predicts the execution time of a given job based on metrics and recommends the best configuration parameters to improve the performance of the Hadoop and Spark frameworks. Several CPU- and data-intensive applications and micro-benchmarks, including Log Analyzer, WordCount, and K-Means, were used to evaluate the performance improvement.
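
The interplay between the two modules described in the abstract can be illustrated with a minimal sketch: a prediction function estimates the execution time for each candidate configuration, and the recommendation step picks the candidate with the lowest estimate. This is not the authors' implementation. The names `Config`, `recommend`, and `toy_predict`, the codec list, and every coefficient in the toy cost model are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Config:
    """A candidate configuration (hypothetical, for illustration only)."""
    codec: str       # compression codec name, e.g. "none", "snappy", "gzip"
    in_memory: bool  # True if data is served from an in-memory file system


def recommend(
    job_metrics: Dict[str, float],
    candidates: Iterable[Config],
    predict: Callable[[Dict[str, float], Config], float],
) -> Tuple[Config, float]:
    """Return the candidate with the lowest predicted execution time."""
    scored = [(predict(job_metrics, cfg), cfg) for cfg in candidates]
    best_time, best_cfg = min(scored, key=lambda pair: pair[0])
    return best_cfg, best_time


def toy_predict(metrics: Dict[str, float], cfg: Config) -> float:
    """Toy cost model with made-up coefficients; NOT taken from the article."""
    size_ratio = {"none": 1.0, "snappy": 0.5, "gzip": 0.3}[cfg.codec]
    cpu_sec_per_gb = {"none": 0.0, "snappy": 1.0, "gzip": 5.0}[cfg.codec]
    bandwidth_gb_per_sec = 2.0 if cfg.in_memory else 0.2  # assumed values
    io_time = metrics["input_gb"] * size_ratio / bandwidth_gb_per_sec
    codec_time = metrics["input_gb"] * cpu_sec_per_gb
    return io_time + codec_time + metrics["compute_sec"]


# Enumerate every codec / storage combination and pick the fastest estimate.
candidates = [Config(c, m) for c, m in product(["none", "snappy", "gzip"], [False, True])]
best, seconds = recommend({"input_gb": 10.0, "compute_sec": 60.0}, candidates, toy_predict)
print(best, seconds)
```

Passing the predictor as a parameter keeps the recommendation step independent of how the estimate is produced, so a model fitted on measured job metrics could replace the toy function without changing `recommend`.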

Author Biographies

Hrachya Astsatryan, Institute for Informatics and Automation Problems National Academy of Sciences of Armenia, Armenia

Associate Professor, Head of the Centre for Scientific Computing (http://csc.iiap.sci.am)

Arthur Lalayan, National Polytechnic University of Armenia, Armenia

Arthur Lalayan is currently a Ph.D. student in computer science at the National Polytechnic University of Armenia (NPUA). He received his Bachelor’s and Master’s degrees in informatics and computer science from NPUA in 2019 and 2021, respectively. His research interests include large-scale data analytics and optimization.

Aram Kocharyan, Université Fédérale Toulouse Midi-Pyrénées, Toulouse, France

Aram Kocharyan received his Ph.D. from the Polytechnic National Institute of Toulouse and the Institute for Informatics and Automation Problems of the National Academy of Sciences of Armenia in 2019. His main research interests are virtualization, cloud computing, and operating systems.

Daniel Hagimont, Université Fédérale Toulouse Midi-Pyrénées, Toulouse, France

Daniel Hagimont is a Professor at the Polytechnic National Institute of Toulouse, France, and a member of the IRIT laboratory, where he leads a group working on operating systems, distributed systems, and middleware. He received a Ph.D. from the Polytechnic National Institute of Grenoble, France, in 1993. After a postdoctoral position at the University of British Columbia, Vancouver, Canada, in 1994, he joined INRIA Grenoble in 1995.
