  • 1.
    Abraham, Erika
    et al.
    RWTH Aachen University, Germany.
    Bekas, Costas
    IBM Research, Switzerland.
    Brandic, Ivona
    Vienna University of Technology, Austria.
    Genaim, Samir
    Complutense University of Madrid, Spain.
    Johnsen, Einar
    University of Oslo, Norway.
    Kondov, Ivan
    Karlsruhe Institute of Technology, Germany.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Streit, Achim
    Karlsruhe Institute of Technology, Germany.
    Preparing HPC Applications for Exascale: Challenges and Recommendations. 2015. In: Proceedings: 2015 18th International Conference on Network-Based Information Systems, NBiS 2015 / [ed] Barolli, L; Takizawa, M; Hsu, HH; Enokido, T; Xhafa, F, IEEE conference proceedings, 2015, pp. 401-406. Conference paper (Peer-reviewed)
    Abstract [en]

    While the HPC community is working towards the development of the first Exaflop computer (expected around 2020), after reaching the Petaflop milestone in 2008 still only few HPC applications are able to fully exploit the capabilities of Petaflop systems. In this paper we argue that efforts for preparing HPC applications for Exascale should start before such systems become available. We identify challenges that need to be addressed and recommend solutions in key areas of interest, including formal modeling, static analysis and optimization, runtime analysis and optimization, and autonomic computing. Furthermore, we outline a conceptual framework for porting HPC applications to future Exascale computing systems and propose steps for its implementation.

  • 2.
    Achilleos, Achilleas
    et al.
    Frederick University, Cyprus.
    Mettouris, Christos
    University of Cyprus, Cyprus.
    Yeratziotis, Alexandros
    University of Cyprus, Cyprus.
    Papadopoulos, George
    University of Cyprus, Cyprus.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Huber, Florian
    SYNYO GmbH, Austria.
    Jäger, Bernhard
    SYNYO GmbH, Austria.
    Leitner, Peter
    SYNYO GmbH, Austria.
    Ocsovszky, Zsófia
    BioTalentum Ltd, Hungary.
    Dinnyés, András
    BioTalentum Ltd, Hungary.
    SciChallenge: A Social Media Aware Platform for Contest-Based STEM Education and Motivation of Young Students. 2019. In: IEEE Transactions on Learning Technologies, ISSN 1939-1382, E-ISSN 1939-1382, Vol. 12, no. 1, pp. 98-111. Journal article (Peer-reviewed)
    Abstract [en]

    Scientific and technological innovations have become increasingly important as we face the benefits and challenges of both globalization and a knowledge-based economy. Still, enrolment rates in STEM degrees are low in many European countries and consequently there is a lack of adequately educated workforce in industries. We believe that this can be mainly attributed to pedagogical issues, such as the lack of engaging hands-on activities utilized for science and math education in middle and high schools. In this paper, we report our work in the SciChallenge European project, which aims at increasing the interest of pre-university students in STEM disciplines, through its distinguishing feature, the systematic use of social media for providing and evaluating the student-generated content. A social media-aware contest and platform were thus developed and tested in a pan-European contest that attracted >700 participants. The statistical analysis and results revealed that the platform and contest positively influenced participants' STEM learning and motivation, while only the gender factor for the younger study group appeared to affect the outcomes (confidence level – p<.05).

  • 3.
    Alam Khan, Fakhri
    et al.
    University of Vienna.
    Han, Yuzhang
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Brezany, Peter
    University of Vienna.
    An Ant-Colony-Optimization Based Approach for Determination of Parameter Significance of Scientific Workflows. 2010. In: 24th IEEE International Conference on Advanced Information Networking and Applications, IEEE, 2010, pp. 1241-1248. Conference paper (Peer-reviewed)
    Abstract [en]

    In the process of a scientific experiment a workflow is executed multiple times using various values of the parameters of activities. For real-world workflows that may contain hundreds of activities, each having several parameters, it is practically not feasible to conduct a parameter sensitivity study by simply following a "brute-force approach" (that is, experimental evaluation of all possible cases). We believe that a heuristic-guided approach makes it possible to find a near-optimal solution using a reasonable amount of resources without the need for the evaluation of all possibilities. In this paper we present a novel methodology for determination of parameter significance of scientific workflows that is based on Ant Colony Optimization (ACO). We refer to our methodology, which is a customization of ACO for Parameter Significance determination, as ACO4PS. We use ACO4PS to identify (1) which parameter strongly affects the overall result of the workflow and (2) for which combination of parameter values we obtain the expected result. ACO4PS generates a list of all workflow parameters sorted by significance and is also capable of generating a subset of significant parameters. We empirically evaluate our methodology using a real-world scientific workflow that deals with the Non-Invasive Glucose Measurement.

  • 4.
    Alam Khan, Fakhri
    et al.
    University of Vienna.
    Han, Yuzhang
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Brezany, Peter
    University of Vienna.
    Estimation of Parameters Sensitivity for Scientific Workflows. 2009. In: 2009 International Conference on Parallel Processing Workshops, IEEE, 2009, pp. 457-462. Conference paper (Peer-reviewed)
    Abstract [en]

    Usually workflow activities in the scientific domain depend on a collection of parameters. These parameters determine the output of the activity, and consequently the output of the whole workflow. In the scientific domain, workflows have exploratory nature and are used to understand a scientific phenomenon or answer scientific questions. In the process of a scientific experiment a workflow is executed multiple times using various values of the parameters of activities. It is relevant to identify (1) which parameter strongly affects the overall result of the workflow and (2) for which combination of parameter values we obtain the expected result. Foreseeing these issues, in this paper we present our methodology to estimate the significance of all scientific workflow parameters as well as to estimate the most significant parameter to the workflow. The estimation of parameter significance will enable the scientist to fine tune, and optimize his results efficiently. Furthermore, we empirically validate our methodology on Non-Invasive Glucose Measurement workflow and discuss our results. The NIGM workflow uses the neural network model to calculate the glucose level in patient blood. The neural network model has a set of parameters, which affect the result of the workflow significantly. But, unfortunately the impact significance of these parameters is commonly unknown to the user. We present our approach for estimating and quantifying impact significance of neural network parameters.

  • 5.
    Alam Khan, Fakhri
    et al.
    University of Vienna.
    Han, Yuzhang
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Brezany, Peter
    University of Vienna.
    Provenance Support for Grid-Enabled Scientific Workflows. 2008. In: Fourth International Conference on Semantics, Knowledge and Grid, IEEE Computer Society, 2008, pp. 173-180. Conference paper (Peer-reviewed)
    Abstract [en]

    The Grid is evolving and new concepts like Semantic Grid, Knowledge Grid are rapidly emerging, where humans and distributed machines share, exchange, and manage data and resources intelligently. Computational scientists typically use workflows to describe and manage scientific discovery processes. However, the credibility of the obtained results in the scientific community is questionable if the computational experiment is not reproducible. This issue is being addressed in our research reported in this paper via development of workflow provenance system for Grid-enabled scientific workflows. Workflow provenance collects data on workflow activities, data flow and workflow clients. Provenance information can be used to trace and test workflows and the data produced. Our approach supports reproducibility (i.e. to support re-enactment of workflow by an independent user) and dataflow visualization (i.e. visualization of statistical characteristics of input/output data). We illustrate our approach on the Non-Invasive Glucose Measurement (NIGM) application.

  • 6.
    Alsouda, Yasser
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för fysik och elektroteknik (IFE).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Kurti, Arianit
    RISE Interactive, Sweden.
    A Machine Learning Driven IoT Solution for Noise Classification in Smart Cities. 2018. In: Machine Learning Driven Technologies and Architectures for Intelligent Internet of Things (ML-IoT), August 28, 2018, Prague, Czech Republic, Euromicro, 2018, pp. 1-6. Conference paper (Peer-reviewed)
    Abstract [en]

    We present a machine learning based method for noise classification using a low-power and inexpensive IoT unit. We use Mel-frequency cepstral coefficients for audio feature extraction and supervised classification algorithms (that is, support vector machine and k-nearest neighbors) for noise classification. We evaluate our approach experimentally with a dataset of about 3000 sound samples grouped in eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of support vector machine and k-nearest neighbors algorithms to estimate the optimal parameter values for classification of sound samples in the dataset under study. We achieve a noise classification accuracy in the range 85% - 100%. Training and testing of our k-nearest neighbors (k = 1) implementation on Raspberry Pi Zero W takes less than a second for a dataset with features of more than 3000 sound samples.

  • 7.
    Alsouda, Yasser
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Kurti, Arianit
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    IoT-based Urban Noise Identification Using Machine Learning: Performance of SVM, KNN, Bagging, and Random Forest. 2019. In: Proceedings of the International Conference on Omni-Layer Intelligent Systems (COINS '19), New York: ACM Publications, 2019, pp. 62-67. Conference paper (Peer-reviewed)
    Abstract [en]

    Noise is any undesired environmental sound. A sound at the same dB level may be perceived as annoying noise or as pleasant music. Therefore, it is necessary to go beyond the state-of-the-art approaches that measure only the dB level and also identify the type of noise. In this paper, we present a machine learning based method for urban noise identification using an inexpensive IoT unit. We use Mel-frequency cepstral coefficients for audio feature extraction and supervised classification algorithms (that is, support vector machine, k-nearest neighbors, bootstrap aggregation, and random forest) for noise classification. We evaluate our approach experimentally with a data-set of about 3000 sound samples grouped in eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of the four algorithms to estimate the optimal parameter values for classification of sound samples in the data-set under study. We achieve a noise classification accuracy in the range 88% - 94%.

  • 8. Ayguadé, Eduard
    et al.
    Pnevmatikatos, Dionisios
    Eigenmann, Rudolf
    Luján, Mikel
    Pllana, Sabri
    Topic 11: Multicore and Manycore Programming. 2012. Conference proceedings (Peer-reviewed)
    Abstract [en]

    Modern multicore and manycore systems enjoy the benefits of technology scaling and promise impressive performance. However, harvesting this potential is not straightforward. While multicore and manycore processors alleviate several problems that are related to single-core processors – known as memory-, power-, or instruction-level parallelism-wall – they raise the issue of the programmability and programming effort. This topic focuses on novel solutions for multicore and manycore programmability and efficient programming in the context of general-purpose systems.

  • 9. Barolli, Leonard
    et al.
    Pllana, Sabri
    University of Vienna.
    Xhafa, Fatos
    2009 International Workshop on Multi-Core Computing Systems (MuCoCoS 2009): Message from the Workshop Co-Chairs. 2009. Conference proceedings (Peer-reviewed)
  • 10. Barolli, Leonard
    et al.
    Pllana, Sabri
    University of Vienna.
    Xhafa, Fatos
    International Conference on Complex, Intelligent and Software Intensive Systems, 2008. CISIS 2008. 2008. Conference proceedings (Peer-reviewed)
  • 11.
    Benkner, Siegfried
    et al.
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Träff, Jesper
    University of Vienna.
    Tsigas, Philippas
    Chalmers University of Technology.
    Dolinsky, Uwe
    Codeplay Software.
    Augonnet, Cedric
    INRIA Bordeaux.
    Bachmayer, Beverly
    Intel GmbH.
    Kessler, Christoph
    Linköping University.
    Moloney, David
    Movidius.
    Osipov, Vitaly
    Karlsruhe Institute of Technology.
    PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems. 2011. In: IEEE Micro, ISSN 0272-1732, E-ISSN 1937-4143, Vol. 31, no. 5, pp. 28-41. Journal article (Peer-reviewed)
    Abstract [en]

    PEPPHER, a three-year European FP7 project, addresses efficient utilization of hybrid (heterogeneous) computer systems consisting of multicore CPUs with GPU-type accelerators. This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project. A larger example demonstrates performance portability with the PEPPHER approach across hybrid systems with one to four GPUs.

  • 12.
    Benkner, Siegfried
    et al.
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Träff, Jesper
    University of Vienna.
    Tsigas, Philippas
    Chalmers University of Technology.
    Richards, Andrew
    Namyst, Raymond
    Bachmayer, Beverly
    Intel GmbH.
    Kessler, Christoph
    Linköping University.
    Moloney, David
    Movidius.
    Sanders, Peter
    The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures. 2012. In: Applications, Tools and Techniques on the Road to Exascale Computing / [ed] Koen De Bosschere, Erik H. D'Hollander, Gerhard R. Joubert, David Padua, Frans Peters, Mark Sawyer, IOS Press, 2012, pp. 361-368. Conference paper (Peer-reviewed)
    Abstract [en]

    The European FP7 project PEPPHER is addressing programmability and performance portability for current and emerging heterogeneous many-core architectures. As its main idea, the project proposes a multi-level parallel execution model comprised of potentially parallelized components existing in variants suitable for different types of cores, memory configurations, input characteristics, optimization criteria, and couples this with dynamic and static resource and architecture aware scheduling mechanisms. Crucial to PEPPHER is that components can be made performance aware, allowing for more efficient dynamic and static scheduling on the concrete, available resources. The flexibility provided in the software model, combined with a customizable, heterogeneous, memory and topology aware run-time system is key to efficiently exploiting the resources of each concrete hardware configuration. The project takes a holistic approach, relying on existing paradigms, interfaces, and languages for the parallelization of components, and develops a prototype framework, a methodology for extending the framework, and guidelines for constructing performance portable software and systems – including paths to migration of existing software – for heterogeneous many-core processors. This paper gives a high-level project overview, and presents a specific example showing how the PEPPHER component variant model and resource-aware run-time system enable performance portability of a numerical kernel.

  • 13.
    Benkner, Siegfried
    et al.
    University of Vienna, Austria.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Träff, Jesper
    University of Vienna, Austria.
    Tsigas, Philippas
    Chalmers University of Technology.
    Richards, Andrew
    Codeplay Software Limited, United Kingdom.
    Russell, George
    INRIA, France.
    Thibault, Samuel
    INRIA, France.
    Augonnet, Cedric
    INRIA, France.
    Namyst, Raymond
    INRIA, France.
    Cornelius, Herbert
    Intel Gmbh, France.
    Kessler, Christoph
    Linköping University.
    Moloney, David
    Movidius Ltd, Ireland.
    Sanders, Peter
    Karlsruher Institut für Technologie, Germany.
    Peppher: Performance Portability and Programmability for Heterogeneous Many-Core Architectures. 2017. In: Programming multi-core and many-core computing systems / [ed] Sabri Pllana, Fatos Xhafa, John Wiley & Sons, 2017, 1, pp. 243-260. Book chapter, part of anthology (Peer-reviewed)
    Abstract [en]

    PEPPHER takes a pluralistic and parallelization agnostic approach to programmability and performance portability for heterogeneous many-core architectures. The PEPPHER framework is in principle language independent but focuses on supporting C++ code with PEPPHER-specific annotations as pragmas or external annotations. The framework is open and extensible; the PEPPHER methodology details how new architectures are incorporated. The PEPPHER methodology consists of rules for how to extend the framework for new architectures. This mainly concerns adaptivity and autotuning for algorithm libraries, the necessary hooks and extensions for the run-time system and any supporting algorithms and data structures that this relies on. Offloading is a specific technique for programming heterogeneous platforms that can sometimes be applied with high efficiency. Offload as developed by the PEPPHER partner Codeplay is a particular, nonintrusive C++ extension allowing portable C++ code to support diverse heterogeneous multicore architectures in a single code base.

  • 14.
    Brandic, Ivona
    et al.
    University of Vienna, Austria.
    Pllana, Sabri
    University of Vienna, Austria.
    Benkner, Siegfried
    University of Vienna, Austria.
    An approach for the high-level specification of QoS-aware grid workflows considering location affinity. 2006. In: Scientific Programming, ISSN 1058-9244, E-ISSN 1875-919X, Vol. 14, no. 3-4, pp. 231-250. Journal article (Peer-reviewed)
    Abstract [en]

    Many important scientific and engineering problems may be solved by combining multiple applications in the form of a Grid workflow. We consider that for the wide acceptance of Grid technology it is important that the user has the possibility to express requirements on Quality of Service (QoS) at workflow specification time. However, most of the existing workflow languages lack constructs for QoS specification. In this paper we present an approach for high level workflow specification that considers a comprehensive set of QoS requirements. Besides performance related QoS, it includes economical, legal and security aspects. For instance, for security or legal reasons the user may express the location affinity regarding Grid resources on which certain workflow tasks may be executed. Our QoS-aware workflow system provides support for the whole workflow life cycle from specification to execution. Workflow is specified graphically, in an intuitive manner, based on a standard visual modeling language. A set of QoS-aware service-oriented components is provided for workflow planning to support automatic constraint-based service negotiation and workflow optimization. For reducing the complexity of workflow planning, we introduce a QoS-aware workflow reduction technique. We illustrate our approach with a real-world workflow for maxillo facial surgery simulation.

  • 15.
    Brandic, Ivona
    et al.
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Specification, Planning, and Execution of QoS-aware Grid Workflows. 2009. In: Market-Oriented Grid and Utility Computing / [ed] Rajkumar Buyya, Kris Bubendorfer, John Wiley & Sons, 2009, pp. 309-334. Book chapter, part of anthology (Peer-reviewed)
  • 16.
    Brandic, Ivona
    et al.
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Specification, Planning, and Execution of QoS-aware Grid Workflows within the Amadeus Environment. 2008. In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 20, no. 4, pp. 331-345. Journal article (Peer-reviewed)
    Abstract [en]

    Commonly, at a high level of abstraction Grid applications are specified based on the workflow paradigm. However, the majority of Grid workflow systems either do not support Quality of Service (QoS), or provide only partial QoS support for certain phases of the workflow lifecycle. In this paper we present Amadeus, which is a holistic service-oriented environment for QoS-aware Grid workflows. Amadeus considers user requirements, in terms of QoS constraints, during workflow specification, planning, and execution. Within the Amadeus environment workflows and the associated QoS constraints are specified at a high level using an intuitive graphical notation. A distinguishing feature of our system is the support of a comprehensive set of QoS requirements, which considers in addition to performance and economical aspects also legal and security aspects. A set of QoS-aware service-oriented components is provided for workflow planning to support automatic constraint-based service negotiation and workflow optimization. For improving the efficiency of workflow planning we introduce a QoS-aware workflow reduction technique. Furthermore, we present our static and dynamic planning strategies for workflow execution in accordance with user-specified requirements. For each phase of the workflow lifecycle we experimentally evaluate the corresponding Amadeus components.

  • 17.
    Chozas, Adridan Calvo
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Memeti, Suejb
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV). Linnaeus Univ, S-35195 Vaxjo, Sweden.
    Using Cognitive Computing for Learning Parallel Programming: An IBM Watson Solution. 2017. In: International Conference on Computational Science (ICCS 2017) / [ed] Koumoutsakos, P; Lees, M; Krzhizhanovskaya, V; Dongarra, J; Sloot, P, Elsevier, 2017, pp. 2121-2130. Conference paper (Peer-reviewed)
    Abstract [en]

    While modern parallel computing systems provide high performance resources, utilizing them to the highest extent requires advanced programming expertise. Programming for parallel computing systems is much more difficult than programming for sequential systems. OpenMP is an extension of the C++ programming language that makes it possible to express parallelism using compiler directives. While OpenMP alleviates parallel programming by reducing the lines of code that the programmer needs to write, deciding how and when to use these compiler directives is up to the programmer. Novice programmers may make mistakes that may lead to performance degradation or unexpected program behavior. Cognitive computing has shown impressive results in various domains, such as health or marketing. In this paper, we describe the use of the IBM Watson cognitive system for education of novice parallel programmers. Using the dialogue service of IBM Watson we have developed a solution that assists the programmer in avoiding common OpenMP mistakes. To evaluate our approach we have conducted a survey with a number of novice parallel programmers at the Linnaeus University, and obtained encouraging results with respect to usefulness of our approach. (C) 2017 The Authors. Published by Elsevier B.V.

  • 18. Dokulil, Jirí
    et al.
    Bajrovic, Enes
    Pllana, Sabri
    University of Vienna.
    Sandrieser, Martin
    Bachmayer, Beverly
    High-level Support for Hybrid Parallel Execution of C++ Applications Targeting Intel® Xeon Phi™ Coprocessors. 2013. In: Procedia Computer Science, Elsevier, 2013, pp. 2508-2511. Conference paper (Peer-reviewed)
    Abstract [en]

    The introduction of Intel® Xeon Phi™ coprocessors opened up new possibilities in development of highly parallel applications. Even though the architecture allows developers to use familiar programming paradigms and techniques, high-level development of programs that utilize all available processors (host+coprocessors) in a system at the same time is a challenging task.

    In this paper we present a new high-level parallel library construct which makes it easy to apply a function to every member of an array in parallel. In addition, it supports the dynamic distribution of work between the host CPUs and one or more coprocessors. We describe associated runtime support and use a physical simulation example to demonstrate that our library can facilitate the creation of C++ applications that benefit significantly from hybrid execution. Experimental results show that a single optimized source code is sufficient to simultaneously exploit all of the host's CPU cores and coprocessors efficiently.

  • 19.
    Fahringer, Thomas
    et al.
    University of Innsbruck, Austria.
    Jugravu, Alexandru
    University of Vienna, Austria.
    Pllana, Sabri
    University of Vienna, Austria.
    Prodan, Radu
    University of Vienna, Austria.
    Seragiotto Jr., Clovis
    University of Vienna, Austria.
    Truong, Hong-Linh
    University of Vienna, Austria.
    ASKALON: a tool set for cluster and Grid computing. 2005. In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 17, no. 2-4, pp. 143-169. Journal article (Peer-reviewed)
    Abstract [en]

    Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction, to performance measurement, experiment management, data collection, and bottleneck analysis. There is no evidence so far that all of these phases should/can be integrated into a single monolithic tool. Moreover, the emergence of computational Grids as a common single wide-area platform for high-performance computing raises the idea to provide tools as interacting Grid services that share resources, support interoperability among different users and tools, and, most importantly, provide omnipresent services over the Grid. We have developed the ASKALON tool set to support performance-oriented development of parallel and distributed (Grid) applications. ASKALON comprises four tools, coherently integrated into a service-oriented architecture. SCALEA is a performance instrumentation, measurement, and analysis tool of parallel and distributed applications. ZENTURIO is a general purpose experiment management tool with advanced support for multi-experiment performance analysis and parameter studies. AKSUM provides semi-automatic high-level performance bottleneck detection through a special-purpose performance property specification language. The PerformanceProphet enables the user to model and predict the performance of parallel applications at the early stages of development. In this paper we describe the overall architecture of the ASKALON tool set and outline the basic functionality of the four constituent tools. The structure of each tool is based on the composition and sharing of remote Grid services, thus enabling tool interoperability. In addition, a data repository allows the tools to share the common application performance and output data that have been derived by the individual tools. A service repository is used to store common portable Grid service implementations. A general-purpose Factory service is employed to create service instances on arbitrary remote Grid sites. Discovering and dynamically binding to existing remote services is achieved through registry services. The ASKALON visualization diagrams support both online and post-mortem visualization of performance and output data. We demonstrate the usefulness and effectiveness of ASKALON by applying the tools to real-world applications.

  • 20.
    Grzonka, Daniel
    et al.
    Cracow University of Technology, Poland.
    Jakobik, Agnieszka
    Cracow University of Technology, Poland.
    Kołodziej, Joanna
    Research and Academic Computer Network (NASK), Poland.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Using a Multi-Agent System and Artificial Intelligence for Monitoring and Improving the Cloud Performance and Security. 2018. In: Future generations computer systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 86, pp. 1106-1117. Journal article (Peer-reviewed)
    Abstract [en]

    Cloud Computing is one of the most intensively developed solutions for large-scale distributed processing. Effective use of such environments, management of their high complexity and ensuring appropriate levels of Quality of Service (QoS) require advanced monitoring systems. Such monitoring systems have to support the scalability, adaptability and reliability of Cloud. Most existing monitoring systems do not incorporate any Artificial Intelligence (AI) algorithms for supporting the change inside the task stream or environment itself. They focus only on monitoring or enabling the control of the system as a part of a separated service. An effective monitoring system for the Cloud environment should gather information about all stages of tasks processing and should actively control the monitored environment. In this paper, we present a novel Multi-Agent System based Cloud Monitoring (MAS-CM) model that supports the performance and security of tasks gathering, scheduling and execution processes in large-scale service-oriented environments. Such models are explicitly designed to control the performance and security objectives of the environment. In our work, we focus on prevention of unauthorized task injection and modification, optimization of scheduling process and maximization of resource usage. We evaluate the effectiveness of MAS-CM empirically using an evolutionary driven implementation of Independent Batch Scheduler and FastFlow framework. The obtained results demonstrate the effectiveness of the proposed approach and the performance improvement.

  • 21.
    Huber, Florian
    et al.
    SYNYO GmbH, Austria.
    Jäger, Bernhard
    SYNYO GmbH, Austria.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Hrdlicka, Zdenek
    University of Chemistry and Technology, Czech Republic.
    Mettouris, Christos
    University of Cyprus, Cyprus.
    Papadopoulos, George
    University of Cyprus, Cyprus.
    Matevc, Tamara
    Jožef Stefan Institute, Slovenia.
    Ocsovszky, Zsófia
    BioTalentum, Hungary.
    Hajdu, Edina
    BioTalentum, Hungary.
    Gary, Chris
    Kinderbüro Universität Wien, Austria.
    Smith, Phil
    Teacher Scientist Network, UK.
    Pushing Stem-Education through a Social-Media-Based Contest Format - Experiences and Lessons-Learned from the H2020-Project SciChallenge. 2017. In: INTED2017 Proceedings: 11th International Technology, Education and Development Conference, Valencia, Spain: IATED, 2017, pp. 334-344. Conference paper (Refereed)
    Abstract [en]

    Science education is tremendously shaping the present and future of modern societies. Thus, Europe needs all its talents to increase creativity and competitiveness. Especially young boys and girls have to be engaged to pursue careers in Science, Technology, Engineering and Mathematics (STEM). However, statistics still show that enrolment rates in STEM-based degree programs are decreasing. In the long run, this will lead to a workforce problem in the research and development based economy as well as in the scientific sector of all EU member states. But how can we get young people more interested in STEM? The EU-funded research project SciChallenge (project.scichallenge.eu) addresses this challenge by proposing a social-media-based STEM contest for young people between 10 and 20 years of age. The contest pilot is currently running (until April 30th 2017). With its multi-level approach, SciChallenge aims at increasing the attractiveness of science education and careers among young girls and boys on a pan-European level. In the first part, the paper introduces the project and highlights the main steps of the preparation of the contest. This includes the development of the contest concept and the processual framework as well as the main steps that were taken to prepare the contest. It also presents the resources that are provided for the participants. The second part of the paper highlights the idea, design and implementation of the digital contest platform (www.scichallenge.eu), which serves as the core of the contest. It will present, for example, the novel submission and rating system that utilizes the power of social networking platforms such as Facebook, as well as the contest dashboards, a convenient, easy-to-use informational map for users to observe the status of the contest and related information. Furthermore, it will show how intelligent social media syndication tools can support awareness creation. The third part of the paper will provide a status update on the currently running contest pilot. It will provide a summary of the experiences that the consortium made with this novel approach. It will also elaborate on the main obstacles the consortium was facing and present the lessons learned for a future implementation, before drawing preliminary conclusions in the final part regarding the question of whether such an approach can be a way to increase the interest of young people in STEM education and careers.

  • 22.
    Kessler, Christoph
    et al.
    Linköping University.
    Dastgeer, Usman
    Linköping University.
    Majeed, Mudassar
    Linköping University.
    Furmento, Nathalie
    University of Bordeaux.
    Thibault, Samuel
    University of Bordeaux.
    Namyst, Raymond
    University of Bordeaux.
    Benkner, Siegfried
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Träff, Jesper Larsson
    Vienna University of Technology.
    Wimmer, Martin
    Leveraging PEPPHER Technology for Performance Portable Supercomputing. 2012. In: Proceedings 2012 SC Companion: High Performance Computing, Networking Storage and Analysis SC Companion 2012, 2012, pp. 1395-1396. Conference paper (Refereed)
    Abstract [en]

    PEPPHER is a 3-year EU FP7 project that develops a novel approach and framework to enhance performance portability and programmability of heterogeneous multi-core systems. Its primary target is single-node heterogeneous systems, where several CPU cores are supported by accelerators such as GPUs. This poster briefly surveys the PEPPHER framework for single-node systems, and elaborates on the prospects for leveraging the PEPPHER approach to generate performance-portable code for heterogeneous multi-node systems.

  • 23.
    Kessler, Christoph
    et al.
    Linköping University.
    Dastgeer, Usman
    Majeed, Mudassar
    Furmento, Nathalie
    Thibault, Samuel
    Namyst, Raymond
    Benkner, Siegfried
    University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Träff, Jesper
    University of Vienna.
    Wimmer, Martin
    Poster: Leveraging PEPPHER Technology for Performance Portable Supercomputing. 2012. In: High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion, IEEE, 2012, p. 1397. Conference paper (Other academic)
    Abstract [en]

    PEPPHER is a 3-year EU FP7 project that develops a novel approach and framework to enhance performance portability and programmability of heterogeneous multi-core systems. Its primary target is single-node heterogeneous systems, where several CPU cores are supported by accelerators such as GPUs. This poster briefly surveys the PEPPHER framework for single-node systems, and elaborates on the prospects for leveraging the PEPPHER approach to generate performance-portable code for heterogeneous multi-node systems.

  • 24.
    Kessler, Christoph
    et al.
    Linköping University.
    Dastgeer, Usman
    Linköping University.
    Thibault, Samuel
    University of Bordeaux.
    Namyst, Raymond
    University of Bordeaux.
    Richards, Andrew
    Codeplay Software Ltd., Edinburgh.
    Dolinsky, Uwe
    Codeplay Software Ltd., Edinburgh.
    Benkner, Siegfried
    University of Vienna.
    Träff, Jesper
    Technical University of Vienna.
    Pllana, Sabri
    University of Vienna.
    Programmability and performance portability aspects of heterogeneous multi-/manycore systems. 2012. In: Proceedings Design, Automation & Test in Europe: Dresden, Germany, March 12-16, 2012, IEEE Press, 2012, pp. 1402-1408. Conference paper (Refereed)
    Abstract [en]

    We discuss three complementary approaches that can provide both portability and an increased level of abstraction for the programming of heterogeneous multicore systems. Together, these approaches also support performance portability, as currently investigated in the EU FP7 project PEPPHER. In particular, we consider (1) a library-based approach, here represented by the integration of the SkePU C++ skeleton programming library with the StarPU runtime system for dynamic scheduling and dynamic selection of suitable execution units for parallel tasks; (2) a language-based approach, here represented by the Offload-C++ high-level language extensions and Offload compiler to generate platform-specific code; and (3) a component-based approach, specifically the PEPPHER component system for annotating user-level application components with performance metadata, thereby preparing them for performance-aware composition. We discuss the strengths and weaknesses of these approaches and show how they could complement each other in an integrational programming framework for heterogeneous multicore systems.

  • 25.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Li, Lu
    Linköping University.
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Kołodziej, Joanna
    Cracow University of Technology, Poland.
    Kessler, Christoph
    Linköping University.
    Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: Programming Productivity, Performance, and Energy Consumption. 2017. In: ARMS-CC '17: Proceedings of the 2017 Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, New York, NY, USA: Association for Computing Machinery (ACM), 2017, pp. 1-6. Conference paper (Refereed)
    Abstract [en]

    Many modern parallel computing systems are heterogeneous at their node level. Such nodes may comprise general-purpose CPUs and accelerators (such as GPUs or the Intel Xeon Phi) that provide high performance with suitable energy-consumption characteristics. However, exploiting the available performance of heterogeneous architectures may be challenging. There are various parallel programming frameworks (such as OpenMP, OpenCL, OpenACC, and CUDA), and selecting the one that is suitable for a target context is not straightforward. In this paper, we study empirically the characteristics of OpenMP, OpenACC, OpenCL, and CUDA with respect to programming productivity, performance, and energy. To evaluate the programming productivity we use our homegrown tool CodeStat, which enables us to determine the percentage of code lines required to parallelize the code using a specific framework. We use our tools MeterPU and x-MeterPU to evaluate the energy consumption and the performance. Experiments are conducted using the industry-standard SPEC benchmark suite and the Rodinia benchmark suite for accelerated computing on heterogeneous systems that combine Intel Xeon E5 Processors with a GPU accelerator or an Intel Xeon Phi co-processor.

  • 26.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    A machine learning approach for accelerating DNA sequence analysis. 2018. In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 32, no. 3, pp. 363-379. Article in journal (Refereed)
    Abstract [en]

    DNA sequence analysis is a data- and compute-intensive problem and therefore demands suitable parallel computing resources and algorithms. In this paper, we describe an optimized approach for DNA sequence analysis on a heterogeneous platform that is accelerated with the Intel Xeon Phi. Such platforms commonly comprise one or two general-purpose host central processing units (CPUs) and one or more Xeon Phi devices. We present a parallel algorithm that shares the work of DNA sequence analysis between the host CPUs and the Xeon Phi device to reduce the overall analysis time. For automatic worksharing we use a supervised machine learning approach, which predicts the performance of DNA sequence analysis on the host and device and accordingly maps fractions of the DNA sequence to the host and device. We evaluate our approach empirically using real-world DNA segments of humans and various animals on a heterogeneous platform that comprises two 12-core Intel Xeon E5 CPUs and an Intel Xeon Phi 7120P device with 61 cores.

  • 27.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Accelerating DNA Sequence Analysis using Intel(R) Xeon Phi(TM). 2015. In: 2015 IEEE TRUSTCOM/BIGDATASE/ISPA, IEEE Press, 2015, Vol. 3, pp. 222-227. Conference paper (Refereed)
    Abstract [en]

    Genetic information is increasing exponentially, doubling every 18 months. Analyzing this information within a reasonable amount of time requires parallel computing resources. While considerable research has addressed DNA analysis using GPUs, so far not much attention has been paid to the Intel Xeon Phi coprocessor. In this paper we present an algorithm for large-scale DNA analysis that exploits thread-level and the SIMD parallelism of the Intel Xeon Phi. We evaluate our approach for various numbers of cores and thread allocation affinities in the context of real-world DNA sequences of mouse, cat, dog, chicken, human and turkey. The experimental results on Intel Xeon Phi show speed-ups of up to 10× compared to a sequential implementation running on an Intel Xeon processor E5.

  • 28.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Analyzing large-scale DNA Sequences on Multi-core Architectures. 2015. In: Proceedings: IEEE 18th International Conference on Computational Science and Engineering, CSE 2015 / [ed] Plessl, C; ElBaz, D; Cong, G; Cardoso, JMP; Veiga, L; Rauber, T, IEEE Press, 2015, pp. 208-215. Conference paper (Refereed)
    Abstract [en]

    Rapid analysis of DNA sequences is important in preventing the evolution of different viruses and bacteria during an early phase, early diagnosis of genetic predispositions to certain diseases (cancer, cardiovascular diseases), and in DNA forensics. However, real-world DNA sequences may comprise several Gigabytes and the process of DNA analysis demands adequate computational resources to be completed within a reasonable time. In this paper we present a scalable approach for parallel DNA analysis that is based on Finite Automata, and which is suitable for analysing very large DNA segments. We evaluate our approach for real-world DNA segments of mouse (2.7GB), cat (2.4GB), dog (2.4GB), chicken (1GB), human (3.2GB) and turkey (0.2GB). Experimental results on a dual-socket shared-memory system with 24 physical cores show speedups of up to 17.6x. Our approach is up to 3x faster than a pattern-based parallel approach that uses the RE2 library.

  • 29.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM), Institutionen för datavetenskap (DV).
    Combinatorial optimization of DNA sequence analysis on heterogeneous systems. 2017. In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 29, no. 7, article id e4037. Article in journal (Refereed)
    Abstract [en]

    Analysis of DNA sequences is a data- and compute-intensive problem, and therefore, it requires suitable parallel computing resources and algorithms. In this paper, we describe our parallel algorithm for DNA sequence analysis that determines how many times a pattern appears in the DNA sequence. The algorithm is engineered for heterogeneous platforms that comprise a host with multi-core processors and one or more many-core devices. For combinatorial optimization, we use the simulated annealing algorithm. The optimization goal is to determine the number of threads, thread affinities, and DNA sequence fractions for host and device, such that the overall execution time of DNA sequence analysis is minimized. We evaluate our approach experimentally using real-world DNA sequences of various organisms on a heterogeneous platform that comprises two Intel Xeon E5 processors and an Intel Xeon Phi 7120P co-processing device. By running only about 5% of possible experiments, our optimization method finds a near-optimal system configuration for DNA sequence analysis that yields average speedups of 1.6× and 2× compared with host-only and device-only execution, respectively.

  • 30.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Combinatorial Optimization of Work Distribution on Heterogeneous Systems. 2016. In: Proceedings of 45th International Conference on Parallel Processing Workshops (ICPPW 2016), IEEE Press, 2016, pp. 151-160. Conference paper (Refereed)
    Abstract [en]

    We describe an approach that uses combinatorial optimization and machine learning to share the work between the host and device of heterogeneous computing systems such that the overall application execution time is minimized. We propose to use combinatorial optimization to search for the optimal system configuration in the given parameter space (such as the number of threads, thread affinity, and work distribution for the host and device). For each system configuration that is suggested by combinatorial optimization, we use machine learning to evaluate the system performance. We evaluate our approach experimentally using a heterogeneous platform that comprises two 12-core Intel Xeon E5 CPUs and an Intel Xeon Phi 7120P co-processor with 61 cores. Using our approach we are able to find a near-optimal system configuration by performing only about 5% of all possible experiments.

  • 31.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    HSTREAM: A directive-based language extension for heterogeneous stream computing. 2018. In: 2018 21st IEEE International Conference on Computational Science and Engineering (CSE) / [ed] Pop, F; Negru, C; GonzalezVelez, H; Rak, J, IEEE, 2018, pp. 138-145. Conference paper (Refereed)
    Abstract [en]

    Big data streaming applications require utilization of heterogeneous parallel computing systems, which may comprise multiple multi-core CPUs and many-core accelerating devices such as NVIDIA GPUs and Intel Xeon Phis. Programming such systems requires advanced knowledge of several hardware architectures and device-specific programming models, including OpenMP and CUDA. In this paper, we present HSTREAM, a compiler directive-based language extension to support programming stream computing applications for heterogeneous parallel computing systems. The HSTREAM source-to-source compiler aims to increase programming productivity by enabling programmers to annotate the parallel regions for heterogeneous execution and generate target-specific code. The HSTREAM runtime automatically distributes the workload across CPUs and accelerating devices. We demonstrate the usefulness of the HSTREAM language extension with various applications from the STREAM benchmark. Experimental evaluation results show that HSTREAM can keep the same programming simplicity as OpenMP, and the generated code can deliver performance beyond what CPUs-only and GPUs-only executions can deliver.

  • 32.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    PAPA: A Parallel Programming Assistant Powered by IBM Watson Cognitive Computing Technology. 2018. In: Journal of Computational Science, ISSN 1877-7503, E-ISSN 1877-7511, Vol. 26, pp. 275-284. Article in journal (Refereed)
    Abstract [en]

    The efficient utilization of the available resources in modern parallel computing systems requires advanced parallel programming expertise. However, parallel programming is more difficult than sequential programming. To alleviate the difficulties of parallel programming, high-level programming frameworks, such as OpenMP, have been proposed. Yet, there is evidence that novice parallel programmers make common mistakes that may lead to performance degradation or unexpected program behavior. In this paper, we present our cognitive Parallel Programming Assistant (PAPA) that aims at educating and assisting novice parallel programmers to avoid common OpenMP mistakes. PAPA combines different IBM Watson services to provide a dialog-based interaction (through text and voice) for programmers. We use the Watson Conversation service to implement the dialog-based interaction, and the Speech-to-Text and Text-to-Speech services to enable the voice interaction. The Watson Natural Language Understanding and WordsAPI Synonyms services are used to train PAPA with OpenMP-related publications. We evaluate our approach using a user experience questionnaire with a number of novice parallel programmers at Linnaeus University.

  • 33.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    PaREM: a Novel Approach for Parallel Regular Expression Matching. 2014. In: 2014 IEEE 17th International Conference on Computational Science and Engineering (CSE), IEEE Press, 2014, pp. 690-697. Conference paper (Refereed)
    Abstract [en]

    Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for large problem sizes. In this paper, we describe a novel algorithm for parallel regular expression matching via deterministic finite automata. Furthermore, we present our tool PaREM that accepts regular expressions and finite automata as input and automatically generates the corresponding code for our algorithm that is amenable for parallel execution on shared-memory systems. We evaluate our parallel algorithm empirically by comparing it with a commonly used algorithm for sequential regular expression matching. Experiments on a dual-socket shared-memory system with 24 physical cores show speed-ups of up to 21× for 48 threads.
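
    The idea of matching with a deterministic finite automaton in parallel can be illustrated with a minimal sketch. It is not the PaREM algorithm itself, whose details the abstract does not give; it shows one common scheme, assuming a fixed-string pattern over the alphabet ACGT. Each chunk of the input is summarized for every possible start state, and the summaries are then stitched together in order. All function names are hypothetical, and Python threads are used only to show the structure (they do not give a real speed-up for CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def build_dfa(pattern, alphabet):
    """Build a KMP-style DFA; state k means 'k pattern characters matched'."""
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state reached after feeding pattern[1:j] into the DFA
    for j in range(1, m + 1):
        for c in alphabet:
            dfa[j][c] = dfa[restart][c]  # mismatch: behave like the restart state
        if j < m:
            dfa[j][pattern[j]] = j + 1   # match: advance
            restart = dfa[restart][pattern[j]]
    return dfa


def summarize_chunk(dfa, accept, chunk):
    """For every possible start state, return (end state, matches seen in chunk)."""
    summary = []
    for start in range(len(dfa)):
        state, hits = start, 0
        for ch in chunk:
            state = dfa[state][ch]
            if state == accept:
                hits += 1
        summary.append((state, hits))
    return summary


def count_matches_parallel(pattern, text, alphabet="ACGT", workers=4):
    """Count (possibly overlapping) occurrences of a non-empty pattern in text."""
    dfa = build_dfa(pattern, alphabet)
    accept = len(pattern)
    size = max(1, -(-len(text) // workers))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(lambda c: summarize_chunk(dfa, accept, c), chunks))
    # Sequential stitching: the end state of one chunk selects the
    # precomputed summary row of the next chunk.
    state, total = 0, 0
    for summary in summaries:
        state, hits = summary[state]
        total += hits
    return total
```

    The per-chunk work grows with the number of DFA states, which is the price paid for not knowing the true start state of a chunk in advance.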

  • 34.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    The Potential of Intel Xeon Phi for DNA Sequence Analysis. 2015. In: ACACES 2015: Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems, 2015, pp. 263-266. Conference paper (Other academic)
    Abstract [en]

    Genetic information is increasing exponentially, doubling every 18 months. Analyzing this information within a reasonable amount of time requires parallel computing resources. While considerable research has addressed DNA analysis using GPUs, so far not much attention has been paid to the Intel Xeon Phi coprocessor. In this paper we present an algorithm for large-scale DNA analysis that exploits the thread-level and the SIMD parallelism of the Intel Xeon Phi coprocessor. We evaluate our approach for various numbers of cores and thread allocation affinities in the context of real-world DNA sequences of mouse, cat, dog, chicken, human and turkey. The experimental results on Intel Xeon Phi show speed-ups of up to 10× compared to a sequential implementation running on an Intel Xeon processor E5.

  • 35.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Work Distribution of Data-Parallel Applications on Heterogeneous Systems. 2016. In: High Performance Computing: ISC High Performance 2016 International Workshops, ExaComm, E-MuCoCoS, HPC-IODC, IXPUG, IWOPH, P^3MA, VHPC, WOPSSS, Frankfurt, Germany, June 19–23, 2016, Revised Selected Papers / [ed] Michela Taufer, Bernd Mohr, Julian M. Kunkel, Springer, 2016, pp. 69-81. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    Heterogeneous computing systems offer high peak performance and energy efficiency, and utilizing this potential is essential to achieve extreme-scale performance. However, optimal sharing of the work among processing elements in heterogeneous systems is not straightforward. In this paper, we propose an approach that uses combinatorial optimization to search for an optimal system configuration in a given parameter space. The optimization goal is to determine the number of threads, thread affinities, and workload partitioning, such that the overall execution time is minimized. For combinatorial optimization we use simulated annealing. We evaluate our approach with a DNA sequence analysis application on a heterogeneous platform that comprises two Intel Xeon E5 processors and an Intel Xeon Phi 7120P co-processor. The obtained results demonstrate that application performance is improved by using the near-optimal system configuration determined by our simulated annealing based algorithm.
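
    The search described in this abstract can be sketched as follows. This is a generic simulated annealing loop over a small, made-up configuration space (thread count, affinity, host work fraction), not the authors' implementation. The parameter values and the `toy_cost` execution-time model are assumptions for illustration; in practice the cost would come from measurements or from a performance predictor.

```python
import math
import random

# Hypothetical parameter space (values chosen only for illustration).
THREADS = [2, 4, 8, 12, 24, 48]
AFFINITIES = ["compact", "scatter", "balanced"]
HOST_FRACTIONS = [i / 10 for i in range(11)]  # share of work given to the host


def toy_cost(config):
    """Made-up execution-time model; stands in for a real measurement."""
    threads, affinity, host_fraction = config
    host_time = 100.0 * host_fraction / threads
    device_time = 3.0 * (1.0 - host_fraction)
    penalty = {"compact": 0.4, "scatter": 0.0, "balanced": 0.2}[affinity]
    # Host and device run concurrently, so the slower side dominates.
    return max(host_time, device_time) + penalty


def neighbor(config, rng):
    """Return a configuration that differs in at most one parameter."""
    threads, affinity, host_fraction = config
    which = rng.randrange(3)
    if which == 0:
        threads = rng.choice(THREADS)
    elif which == 1:
        affinity = rng.choice(AFFINITIES)
    else:
        host_fraction = rng.choice(HOST_FRACTIONS)
    return (threads, affinity, host_fraction)


def simulated_annealing(cost, start, steps=500, t0=5.0, cooling=0.99, seed=0):
    """Minimize cost(config); returns the best configuration seen and its cost."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost
```

    A call such as `simulated_annealing(toy_cost, (2, "compact", 1.0))` evaluates 500 candidate configurations one at a time; the cooling schedule and step budget are arbitrary choices in this sketch.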

  • 36.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Binotto, Alécio
    IBM Research, Brazil.
    Kołodziej, Joanna
    Cracow University of Technology, Poland.
    Brandic, Ivona
    Vienna University of Technology, Austria.
    A Review of Machine Learning and Meta-heuristic Methods for Scheduling Parallel Computing Systems. 2018. In: Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications LOPAL 2018, New York, NY, USA: Association for Computing Machinery (ACM), 2018, article id 5. Conference paper (Refereed)
    Abstract [en]

    Optimized software execution on parallel computing systems demands consideration of many parameters at run-time. Determining the optimal set of parameters in a given execution context is a complex task, and therefore to address this issue researchers have proposed different approaches that use heuristic search or machine learning. In this paper, we undertake a systematic literature review to aggregate, analyze and classify the existing software optimization methods for parallel computing systems. We review approaches that use machine learning or meta-heuristics for scheduling parallel computing systems. Additionally, we discuss challenges and future research directions. The results of this study may help to better understand the state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of scheduling parallel computing systems. Furthermore, it may aid in understanding the limitations of existing approaches and identification of areas for improvement.

  • 37.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Binotto, Alécio
    IBM Research, Brazil.
    Kołodziej, Joanna
    Cracow University of Technology, Poland.
    Brandic, Ivona
    Vienna University of Technology, Austria.
    Using meta-heuristics and machine learning for software optimization of parallel computing systems: a systematic literature review. 2019. In: Computing, ISSN 0010-485X, E-ISSN 1436-5057, Vol. 101, no. 8, pp. 893-936. Article in journal (Refereed)
    Abstract [en]

    While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models. Furthermore, optimized software execution on parallel computing systems demands consideration of many parameters at compile-time and run-time. Determining the optimal set of parameters in a given execution context is a complex task, and therefore to address this issue researchers have proposed different approaches that use heuristic search or machine learning. In this paper, we undertake a systematic literature review to aggregate, analyze and classify the existing software optimization methods for parallel computing systems. We review approaches that use machine learning or meta-heuristics for software optimization at compile-time and run-time. Additionally, we discuss challenges and future research directions. The results of this study may help to better understand the state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of software optimization for parallel computing systems. Furthermore, it may aid in understanding the limitations of existing approaches and identification of areas for improvement.

  • 38.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ferati, Mexhid
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för informatik (IK).
    Kurti, Arianit
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Jusufi, Ilir
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    IoTutor: How Cognitive Computing Can Be Applied to Internet of Things Education. 2019. Conference paper (Refereed)
    Abstract [en]

    We present IoTutor, a cognitive computing solution for educating students in the IoT domain. We implement the IoTutor as a platform-independent web-based application that is able to interact with users via text or speech using natural language. We train the IoTutor with selected scientific publications relevant to IoT education. To investigate users' experience with the IoTutor, we ask a group of students taking an IoT master-level course at Linnaeus University to use the IoTutor for a period of two weeks. We ask students to express their opinions with respect to the attractiveness, perspicuity, efficiency, stimulation, and novelty of the IoTutor. The evaluation results show a trend that students express an overall positive attitude towards the IoTutor, with the majority of the aspects rated higher than the neutral value.

  • 39.
    Memeti, Suejb
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Kołodziej, Joanna
    Cracow University of Technology, Poland.
    Optimal Worksharing of DNA Sequence Analysis on Accelerated Platforms. 2016. In: Resource Management for Big Data Platforms: Algorithms, Modelling, and High-Performance Computing Techniques, Springer, 2016, pp. 279-309. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    In this chapter, we describe an optimized approach for DNA sequence analysis on a heterogeneous platform that is accelerated with the Intel Xeon Phi. Such platforms commonly comprise one or two general purpose CPUs and one (or more) Xeon Phi coprocessors. Our parallel DNA sequence analysis algorithm is based on Finite Automata and finds patterns in large-scale DNA sequences. To determine the optimal worksharing (that is, DNA sequence fractions for the host and accelerating device) we propose a solution that combines combinatorial optimization and machine learning. The objective function that we aim to minimize is the execution time of the DNA sequence analysis. We use combinatorial optimization to efficiently explore the system configuration space and determine with machine learning the near-optimal system configuration for execution of the DNA sequence analysis. We evaluate our approach empirically using real-world DNA segments of various organisms. For experimentation, we use an accelerated platform that comprises two 12-core Intel Xeon E5 CPUs and an Intel Xeon Phi 7120P accelerator with 61 cores.

  • 40.
    Perez, David
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Memeti, Suejb
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pllana, Sabri
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    A simulation study of a smart living IoT solution for remote elderly care. 2018. In: 2018 Third International Conference on Fog and Mobile Edge Computing (FMEC), Barcelona, Spain: IEEE, 2018, pp. 227-232. Conference paper (Peer-reviewed)
    Abstract [en]

    We report a simulation study of a smart living IoT solution for elderly people living in their own houses. Our study was conducted in the context of the BoIT project in Sweden, which investigates the use of various IoT devices for remote housing and care-giving services. We focus on a carephone device that enables establishing a voice connection via IP with caregivers or relatives. We have developed a simulation model to study the IoT solution for elderly care in the Växjö municipality in Sweden. The simulation model can be used to address various issues, such as determining the lack or excess of resources or long waiting times, and to study the system behavior when the number of alarms is increased. Simulation results indicate that a 15% increase in the arrival rate would cause unacceptably long waiting times for patients to receive care.

  • 41.
    Pllana, Sabri
    University of Vienna.
    European Multicore Processing Projects: PEPPHER. 2010. In: IEEE Micro, ISSN 0272-1732, E-ISSN 1937-4143, Vol. 30, no. 5, p. 99. Journal article (Peer-reviewed)
  • 42.
    Pllana, Sabri
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap (DV).
    Barhen, Jacob
    Oak Ridge Natl Lab, Oak Ridge, TN USA.
    Introduction to the computing special issue: performance portability and tuning for multi-core and many-core computing systems. 2013. In: Computing, ISSN 0010-485X, E-ISSN 1436-5057, Vol. 96, no. 12 SI, pp. 1113-1114. Journal article (Peer-reviewed)
  • 43.
    Pllana, Sabri
    et al.
    University of Vienna.
    Barhen, Jacob
    Oak Ridge National Laboratory.
    MuCoCoS 2012: 5th International Workshop on Multi-Core Computing Systems: Performance Portability and Tuning. 2012. Conference proceedings (Peer-reviewed)
  • 44.
    Pllana, Sabri
    et al.
    University of Vienna.
    Barolli, Leonard
    Xhafa, Fatos
    2010 International Workshop on Multi-Core Computing Systems (MuCoCoS 2010). 2010. Conference proceedings (Peer-reviewed)
  • 45.
    Pllana, Sabri
    et al.
    University of Vienna, Austria.
    Barolli, Leonard
    Fukuoka Institute of Technology, Japan.
    Xhafa, Fatos
    Technical University of Catalonia, Spain.
    2011 International Workshop on Multi-Core Computing Systems (MuCoCoS 2011). 2011. Conference proceedings (Peer-reviewed)
  • 46.
    Pllana, Sabri
    et al.
    University of Vienna.
    Benkner, Siegfried
    Recent Developments in Multi-Core Computing Systems: Special issue of the journal "Scalable Computing: Practice and Experience". 2008. Collection/Anthology (Peer-reviewed)
  • 47.
    Pllana, Sabri
    et al.
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Mehofer, Eduard
    University of Vienna.
    Natvig, Lasse
    NTNU.
    Xhafa, Fatos
    University of London.
    Agent-supported Programming of Multi-core Computing Systems. 2010. In: Complex Intelligent Systems and Their Applications, Springer, 2010, pp. 207-224. Chapter in book, part of anthology (Peer-reviewed)
    Abstract [en]

    In this chapter we argue that an intelligent program development environment that proactively supports the user helps a mainstream programmer to overcome the difficulties of programming multicore computing systems. We propose a programming environment based on intelligent software agents that enables users to work at a high level of abstraction while automating low-level implementation activities. The programming environment supports program composition in a model-driven development fashion using parallel building blocks and proactively assists the user during major phases of program development and performance tuning. We highlight the potential benefits of using such a programming environment with usage scenarios. An experiment with a parallel building block on a Sun UltraSPARC T2 Plus processor shows how the system may assist the programmer in achieving performance improvements.

  • 48.
    Pllana, Sabri
    et al.
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Mehofer, Eduard
    University of Vienna.
    Natvig, Lasse
    NTNU.
    Xhafa, Fatos
    UPC.
    Towards an Intelligent Environment for Programming Multi-core Computing Systems. 2009. In: Euro-Par 2008 Workshop - Parallel Processing: VHPC 2008, UNICORE 2008, HPPC 2008, SGS 2008, PROPER 2008, ROIA 2008, and DPA 2008, Las Palmas de Gran Canaria, Spain, August 25-26, 2008, Revised Selected Papers, Springer, 2009, pp. 141-151. Conference paper (Peer-reviewed)
    Abstract [en]

    In this position paper we argue that an intelligent program development environment that proactively supports the user helps a mainstream programmer to overcome the difficulties of programming multi-core computing systems. We propose a programming environment based on intelligent software agents that enables users to work at a high level of abstraction while automating low-level implementation activities. The programming environment supports program composition in a model-driven development fashion using parallel building blocks and proactively assists the user during major phases of program development and performance tuning. We highlight the potential benefits of using such a programming environment with usage scenarios. An experiment with a parallel building block on a Sun UltraSPARC T2 Plus processor shows how the system may assist the programmer in achieving performance improvements.

  • 49.
    Pllana, Sabri
    et al.
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Xhafa, Fatos
    Barolli, Leonard
    A Novel Approach for Hybrid Performance Modeling and Prediction of Large-Scale Computing Systems. 2009. In: International Journal of Grid and Utility Computing (IJGUC), ISSN 1741-8488, Vol. 1, no. 4, pp. 316-327. Journal article (Peer-reviewed)
    Abstract [en]

    We present a novel approach for hybrid performance modelling and prediction of large-scale parallel and distributed computing systems, which combines Mathematical Modelling (MathMod) and Discrete-Event Simulation (DES). We use MathMod to develop parameterised performance models for components of the system. Thereafter, we use DES to describe the structure of the system and the interaction among its components. As a result, we obtain a high-level performance model that combines the evaluation speed of mathematical models with the structure awareness and fidelity of the simulation model. We empirically evaluate our approach with a real-world material science program that comprises more than 15,000 lines of code.

  • 50.
    Pllana, Sabri
    et al.
    University of Vienna.
    Benkner, Siegfried
    University of Vienna.
    Xhafa, Fatos
    Polytechnic University of Catalonia.
    Barolli, Leonard
    Fukuoka Institute of Technology.
    Automatic Performance Model Transformation from a Human-intuitive to a Machine-efficient Form. 2009. In: Scalable Computing: Practice and Experience (SCPE), ISSN 1895-1767, Vol. 10, no. 1, pp. 35-47. Journal article (Peer-reviewed)
    Abstract [en]

    We address the issue of developing performance models for programs that may be executed on large-scale computing systems. The commonly used approaches apply non-standard notations for model specification and often require that the software engineer has a thorough understanding of the underlying performance modeling technique. We propose to bridge the gap between performance modeling and software engineering by incorporating UML. In our approach, we aim to permit the graphical specification of the performance model in a human-intuitive fashion on the one hand, and a machine-efficient model evaluation on the other. The user graphically specifies the performance model using UML. Thereafter, the transformation of the performance model from the human-usable UML representation to the machine-efficient C++ representation is done automatically. We describe our methodology and illustrate it with the automatic transformation of a sample performance model. Furthermore, we demonstrate the usefulness of our approach by modeling and simulating a real-world material science program.
