Modern graphics processing units (GPUs) are powerful, multi-core parallel processing devices found in most computers today. The increase in the GPU's processing resources, coupled with improvements in the flexibility of its programming frameworks, has increased interest in general-purpose programming on the GPU (GPGPU).

In this thesis, we investigate how the GPU architecture and its processing capabilities can be utilised in general-purpose applications using the NVIDIA Compute Unified Device Architecture (CUDA) framework. With the large number of CUDA applications being developed, we investigate how CUDA applications can share the GPU resource and what challenges arise when concurrent applications execute on the GPU. As a basis for our investigation, we implement the Advanced Encryption Standard (AES) to learn how to use the framework and how to increase the performance of a CUDA application.

Our results show that there is little support for concurrency in the CUDA framework at present: GPU accesses are serialised, with no support for preemption or sharing of the resource. We have implemented a static scheduler to see if it could improve concurrency, but due to the hardware abstraction CUDA offers, we could not control the scheduling as we wanted. When developing the AES application, we saw the importance of memory optimisations in CUDA applications, and we present ways of optimising memory accesses together with ways of optimising concurrent execution between the CPU and GPU.