Big Data 2.0 Processing Systems

Taxonomy and Open Challenges

Fuad Bajaber, Radwa Elshawi, Omar Batarfi, Abdulrahman Altalhi, Ahmed Barnawi, Sherif Sakr

Research output: Contribution to journal › Article › Research › peer-review

27 Citations (Scopus)

Abstract

Data is a key resource in the modern world. Big data has become a popular term that describes the exponential growth and availability of data. In practice, the growing demand for large-scale data processing and data analysis applications has spurred the development of novel solutions from both industry and academia. For a decade, the MapReduce framework and its open-source realization, Hadoop, have been highly successful, creating so much momentum in both the research and industrial communities that it has become the de facto standard among big data processing platforms. However, in recent years, academia and industry have started to recognize the limitations of the Hadoop framework in several application domains and big data processing scenarios, such as large-scale processing of structured data, graph data, and streaming data. Thus, we have witnessed unprecedented interest in tackling these challenges with new solutions, which constitute a new wave of mostly domain-specific, optimized big data processing platforms. In this article, we refer to this new wave of systems as Big Data 2.0 processing systems. To better understand the latest developments in the world of big data processing systems, we provide a taxonomy and detailed analysis of the state of the art in this domain. In addition, we identify a set of current open research challenges and discuss some promising directions for future research.

Original language: English
Pages (from-to): 379-405
Number of pages: 27
Journal: Journal of Grid Computing
Volume: 14
Issue number: 3
DOIs: https://doi.org/10.1007/s10723-016-9371-1
Publication status: Published - 1 Sep 2016

Keywords

  • Big data
  • Hadoop

Cite this

Bajaber, Fuad ; Elshawi, Radwa ; Batarfi, Omar ; Altalhi, Abdulrahman ; Barnawi, Ahmed ; Sakr, Sherif. / Big Data 2.0 Processing Systems : Taxonomy and Open Challenges. In: Journal of Grid Computing. 2016 ; Vol. 14, No. 3. pp. 379-405.
@article{865ca90fde7b479cad876174d1f98fe2,
title = "Big Data 2.0 Processing Systems: Taxonomy and Open Challenges",
keywords = "Big data, Hadoop",
author = "Fuad Bajaber and Radwa Elshawi and Omar Batarfi and Abdulrahman Altalhi and Ahmed Barnawi and Sherif Sakr",
year = "2016",
month = "9",
day = "1",
doi = "10.1007/s10723-016-9371-1",
language = "English",
volume = "14",
pages = "379--405",
journal = "Journal of Grid Computing",
issn = "1570-7873",
publisher = "Springer Netherlands",
number = "3",

}

Bajaber, F, Elshawi, R, Batarfi, O, Altalhi, A, Barnawi, A & Sakr, S 2016, 'Big Data 2.0 Processing Systems: Taxonomy and Open Challenges', Journal of Grid Computing, vol. 14, no. 3, pp. 379-405. https://doi.org/10.1007/s10723-016-9371-1

Big Data 2.0 Processing Systems : Taxonomy and Open Challenges. / Bajaber, Fuad; Elshawi, Radwa; Batarfi, Omar; Altalhi, Abdulrahman; Barnawi, Ahmed; Sakr, Sherif.

In: Journal of Grid Computing, Vol. 14, No. 3, 01.09.2016, p. 379-405.

TY - JOUR
T1 - Big Data 2.0 Processing Systems
T2 - Taxonomy and Open Challenges
AU - Bajaber, Fuad
AU - Elshawi, Radwa
AU - Batarfi, Omar
AU - Altalhi, Abdulrahman
AU - Barnawi, Ahmed
AU - Sakr, Sherif
PY - 2016/9/1
Y1 - 2016/9/1
KW - Big data
KW - Hadoop
UR - http://www.scopus.com/inward/record.url?scp=84976271964&partnerID=8YFLogxK
U2 - 10.1007/s10723-016-9371-1
DO - 10.1007/s10723-016-9371-1
M3 - Article
VL - 14
SP - 379
EP - 405
JO - Journal of Grid Computing
JF - Journal of Grid Computing
SN - 1570-7873
IS - 3
ER -