Please use this identifier to cite or link to this item:
http://rudar.ruc.dk/handle/1800/7445
| Title: | Parallel programming with CUDA |
| Other Titles: | Parallelprogrammering med CUDA |
| Authors: | Nielsen, Jon; Mosegaard, Truls |
| Advisor: | Helsgaun, Keld |
| Keywords: | parallel programming; programmering; cuda; gpu; c; nbody; n-body; barnes-hut; barnes hut; openmp |
| Examination Date: | 22-Mar-2012 |
| Issue Date: | 10-Apr-2012 |
| Abstract: | This report documents our master's thesis project on parallel programming with CUDA, the NVIDIA GPU architecture with support for general-purpose computing.
The purpose of the thesis is to assess CUDA as a parallel computing platform by determining the possibilities and limitations of its ability to handle different types of algorithms. We examine this through a case study of two algorithms used in the computationally intensive field of n-body simulations. In our report we present the topics of our thesis through chapters containing overviews of the relevant theory. Based on this, we investigate how CUDA performs on the embarrassingly parallel n-body all-pairs algorithm, as well as on the Barnes-Hut algorithm, which is partially irregular with regard to parallelization due to its data structure.
We have found that CUDA performs exceptionally well on n-body all-pairs: we observe up to a 100x speed-up for an optimized GPU implementation compared to an implementation running on a computer with 16 CPU cores. The CUDA implementation of the Barnes-Hut algorithm also shows increased performance, as the most costly part of the algorithm is parallelizable. We find that although it is possible to implement an irregular algorithm in CUDA, doing so successfully requires an understanding of CUDA programming and the CUDA model of parallelism.
We conclude that CUDA performs well on massively parallel problems and can be useful for irregular problems as well. Programming for it can be complex when optimizing or when the algorithm is not easily parallelized. The platform offers good performance and the potential to accelerate suitable applications. |
| URI: | http://rudar.ruc.dk/handle/1800/7445 |
| Subject: | Thesis |
| Education: | Datalogi / Computer Science - Master thesis |
| Appears in Collections: | Datalogi rapporter / Computer Science Projects; Projektrapporter og specialer / Projectreports and master thesis |
This item is protected by original copyright
Items in RUDAR are protected by copyright, with all rights reserved, unless otherwise indicated.