Joey Thornberry
2024-09-19 04:00:00 UTC

Understanding Quicksort

tl,dr: Quicksort sorts a list by recursively dividing it into 'big' and 'small' sections, using very little code to do so.

Quicksort is a very competitive sorting algorithm that requires surprisingly little code to implement. Essentially, it requires two procedures: a partition function that takes a list and puts all the 'big' elements on the right and all the 'small' elements on the left, and a quicksort procedure that takes each of those sections and partitions them again recursively. As you might expect, partitioning each part of a list into 'big' and 'small' sections eventually sorts the entire list!

Step 1: Partitioning a List

So, how are we going to organize a list into 'big' and 'small' sections? In quicksort, we do it this way: we choose an arbitrary element, the 'pivot' element, that we will use to define what 'big' and 'small' are. If an element is less than the pivot element, we'll say that it belongs on the left side of the list, and if it's equal or greater, we'll say that it belongs on the right.

Now that we've defined where elements should go, it's time to put them in their place. Let's write some C.

Here's the uncommented code, just so you can see how few lines this function actually requires:

int partition(long * inputv, int p, int r) {
    int i = p - 1; 
    int j = r + 1;
    long x = inputv[p];

    while (1) {
        do i++; while (inputv[i] < x);
        do j--; while (inputv[j] > x);

        if ( i < j ) { 
            long tmp = inputv[i];
            inputv[i] = inputv[j];
            inputv[j] = tmp;
        } else return j;
    }
}

Now let's try and understand it. The code above is the standard partition strategy used in Quicksort: two index variables, one, i, that moves left to right, and one, j, that moves right to left. The two indexes, i and j, look for 'big' values and 'small' values, respectively, that aren't on the side they should be on. Once both i and j have locked in on misplaced elements, they just switch their elements! Because j moves right to left and i moves left to right, this will end up putting small elements on the left and large elements on the right. Here's the code again, but with comments explaining each step.

int partition(long * inputv, int p, int r) {
    /** This function arranges the input 
     *  list into two sections, where elements
     *  smaller than x (the first element) go
     *  in the left section, and bigger and 
     *  equal elements go in the right 
     *  section **/

    int i = p - 1; 
    // this index will move left to right
    // looking for big elements
    
    int j = r + 1;
    // this index will move right to left
    // looking for small elements
    
    long x = inputv[p];
    // this is the element we compare things
    // to (it's arbitrarily chosen to be the
    // element that is in the first position)

    while (1) { // loop until loop is broken

        // move i to the right until it
        // finds an element not smaller
        // than x
        do i++; while (inputv[i] < x);

        // move j to the left until it
        // finds an element not bigger 
        // than x
        do j--; while (inputv[j] > x);

        if ( i < j ) { 
        // if i and j have not met, swap
        // the elements they stopped at
            long tmp = inputv[i];
            inputv[i] = inputv[j];
            inputv[j] = tmp;
        } else return j;
        // if i and j have met, we're done
        // return j so further code knows 
        // where the partitions are
    }
}

Let's try this out.

It may not seem like much happened, but if you take a closer look you'll notice that our code put all the elements equal to or greater than 5, the original first element, on the right side, and all the elements smaller than 5 on the left. This is all we need! Imagine if I called partition again, but on just the last three elements:

By simply partitioning the right side of the subarray, we ended up sorting it! This should give us a clue on how to proceed.

Step 2: Recursion

Once we've wrapped our heads around the partition function, there isn't much farther to go! We just need a function, which we can call quick_sort, that recursively calls partition until the list is sorted. The quick_sort function will end up looking a lot like the merge_sort function from our merge-sort implementation, in that it splits the input list into two sections. The quick_sort function differs from merge_sort, however, in that it first calls partition and then splits the list, as opposed to merge_sort, which splits the list and sorts its sublists before merging them. Here's the code:

void quick_sort(long * inputv, int p, int r) {
    
    if (p >= r) return;
    // no need to sort a subarray of length 1

    int q = partition(inputv, p, r);
    // partition the array into big and small sections
    // q will be set to the last index of the small section

    quick_sort(inputv, p, q);
    // sort the small (left) section
    quick_sort(inputv, q + 1, r);
    // sort the big (right) section
}

And that's it! Let's put it in a program. (You can find my helper code on Github.)

#include "../clilonglist.h" // my command-line input utility functions
#include <stdlib.h>

int partition(long * inputv, int p, int r) {
    ...
}

int quick_sort(long * inputv, int p, int r) {
    ...
}

int main(int argc, char * argv[]) {

    int inputc = argc - 1;
    long * inputv = extract_long_list(inputc, argv);
    // use my utility functions to grab the 
    // command line arguments as an array of longs

    quick_sort(inputc, inputv);
    print_long_list(inputc, inputv);

    free(inputv);
}

Running it...

Sorted!

Conclusion

Althought Quicksort's theoretical worst-case running time is O(n2), its average, and usually expected, running time is O(n lg n), which makes it competitive with algorithms like Merge Sort and Heap Sort. Even better, Quicksort requires much less code than most other sorting algorithms, which makes it pretty useful!