Modeling performance of Hadoop applications: A journey from queueing networks to stochastic well formed nets
Politecnico di Milano, Italy.
University Center for Defense, Spain.
Politecnico di Milano, Italy.
Sharif University of Technology, Iran.
2016 (English). In: Algorithms and Architectures for Parallel Processing: 16th International Conference, ICA3PP 2016, Granada, Spain, December 14-16, 2016, Proceedings / [ed] Jesus Carretero, Javier Garcia-Blas, Ryan K.L. Ko, Peter Mueller, Koji Nakano, Springer, 2016, pp. 599-613. Conference paper, Published paper (Refereed)
Abstract [en]

Nowadays, many enterprises commit to the extraction of actionable knowledge from huge datasets as part of their core business activities. Applications belong to very different domains, such as fraud detection or one-to-one marketing, and encompass business analytics and decision support in both the private and public sectors. In these scenarios, a central place is held by the MapReduce framework and in particular its open source implementation, Apache Hadoop. In such environments, new challenges arise in the area of job performance prediction, driven by the need to provide Service Level Agreement guarantees to end users and to avoid wasting computational resources. In this paper we provide performance analysis models to estimate MapReduce job execution times in Hadoop clusters governed by the YARN Capacity Scheduler. We propose models of increasing complexity and accuracy, ranging from queueing networks to stochastic well formed nets, able to estimate job performance under a number of scenarios of interest, including unreliable resources. The accuracy of our models is evaluated with the TPC-DS industry benchmark, running experiments on Amazon EC2 and at the CINECA Italian supercomputing center. The results have shown that the average accuracy we can achieve is in the range 9–14%.

Place, publisher, year, edition, pages
Springer, 2016. pp. 599-613
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10048
Keywords [en]
MapReduce, Performance Models
National subject category
Software Engineering
Research subject
Computer Science, Software Engineering
Identifiers
URN: urn:nbn:se:lnu:diva-68933
DOI: 10.1007/978-3-319-49583-5_47
Scopus ID: 2-s2.0-85007258340
ISBN: 978-3-319-49582-8 (print)
ISBN: 978-3-319-49583-5 (digital)
OAI: oai:DiVA.org:lnu-68933
DiVA, id: diva2:1160184
Conference
16th International Conference, ICA3PP 2016, Granada, Spain, December 14-16, 2016
Available from: 2017-11-24. Created: 2017-11-24. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Perez-Palacin, Diego
