Hybrid Sorting Algorithm on CPU and GPU with Application to Collision Detection

This thesis created a novel adaptive sorting algorithm that runs on CPU as well as GPU. It utilizes advantages from different underlying previously existing algorithms.


The field of adaptive sorting algorithms, that is, taking advantage of presortedness, is rather unexplored on GPUs. Hence, the main focus of this work is to create an adaptive sorting algorithm that would also work on GPU using NVIDIA's CUDA. Sorting can be found in numerous applications, such as databases, string processing, operations research, simulations, priority queues and graph algorithms. This work focuses on the field of collision detection in computer graphic animations. More specifically, the array of Axis Aligned Bounding Boxes (AABBs) projected onto a single axis, the x-axis, is being sorted. Unlike the field of theoretical informatics, this thesis is not interested in complexity classes, but in actual clock times, milliseconds of running times.

Considering the characteristics of animations, it is obvious that there is more information than single snapshots of data. The basic idea of the newly developed algorithm is to use information about sortedness from frame to frame, time-based. Areas of little movement between bounding boxes (rough idea: the blue edges in the false color screenshot above), which means high presortedness, can be sorted by an algorithm that has a fast best-case complexity. Areas of high movement (idea: the green, yellow and red area in the screenshot), which means relatively high unsortedness, can be sorted by an algorithm with a fast worst-case complexity. Merging these sorted subsets, we obtain the fully sorted input array.


Since the new algorithm, which is called AdaptiveFrameSort, is optimized for fast practical sorting times, it is useful to compare times with established algorithms. On CPU, it is being compared to STL's stable sort and Timsort, on GPU, it is being compared to Thrust's stable sort.
The three scenes that were examined, clothball, funnel, and clothcar, involve cloth structures that require fast triangle-based collision detection.

On CPU, AdaptiveFrameSort sorts many frames faster than STL's algorithm and in general even faster than the adaptive Timsort algorithm. In scenes with a high amount of polygons and little triangle movement (as the clothcar scene), through its adaptivity the algorithm even can be sorting faster on GPU than Thrust's algorithm, though only on older graphics cards. Thrust sorting is highly optimized and on newer hardware it takes only around 10 ms in order to sort around 1.3 million items in the collision detection scenario. This speed leaves little time for algorithm overhead and at the end it was not possible to sort the clothcar scene faster in that case.
All in all, the developed algorithm generally succeeds as a new adaptive hybrid algorithm.


Full version of the diploma thesis


This original work is copyright by University of Bremen.
Any software of this work is covered by the European Union Public Licence v1.2. To view a copy of this license, visit eur-lex.europa.eu.
Any other assets (3D models, movies, documents, etc.) are covered by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit creativecommons.org.
If you use any of the assets or software to produce a publication, then you must give credit and put a reference in your publication.
If you would like to use our software in proprietary software, you can obtain an exception from the above license (aka. dual licensing). Please contact zach at cs.uni-bremen dot de.