Sorting

Sorting algorithms arrange elements in a specified order (ascending or descending). Sorting is one of the most fundamental operations in computing and is a prerequisite for many other algorithms (binary search, merge, kd-tree construction).

Comparison-Based Sorting

These algorithms compare elements pairwise. The optimal lower bound for comparison-based sorting is $\Omega(n \log n)$.

Proof sketch: there are $n!$ possible orderings of $n$ elements. A decision tree of comparisons must have at least $n!$ leaves. A binary tree with $n!$ leaves has height $\geq \log_2(n!) = \Theta(n \log n)$.

Merge Sort

Strategy: divide and conquer. Split the array in half; recursively sort each half; merge the two sorted halves.

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

Time: $O(n \log n)$ in all cases.

Space: $O(n)$ auxiliary.

Stable: equal elements maintain their relative order.

Applications: external sorting (when data doesn’t fit in memory), sorting linked lists.

Quick Sort

Strategy: partition the array around a pivot; recursively sort each partition.

def quicksort(arr, lo=0, hi=None):
    if hi is None: hi = len(arr) - 1
    if lo < hi:
        p = partition(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)

def partition(arr, lo, hi):
    pivot = arr[hi]
    i = lo - 1
    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i+1], arr[hi] = arr[hi], arr[i+1]
    return i + 1

Time: $O(n \log n)$ average; $O(n^2)$ worst case (always picks smallest/largest as pivot).

Space: $O(\log n)$ stack space.

Not stable (standard implementation).

Optimizations:

  • Random pivot selection: avoids worst case in practice.
  • Median-of-three pivot: choose median of first, middle, last elements.
  • 3-way partition (Dutch National Flag): handles many duplicate keys efficiently.
  • Introsort (C++ std::sort): quicksort, switch to heapsort if depth exceeds $2 \log n$; switch to insertion sort for small subarrays.

Heap Sort

Strategy: build a max-heap; repeatedly extract the maximum.

Time: $O(n \log n)$ guaranteed.

Space: $O(1)$ (in-place).

Not stable.

Less cache-friendly than quicksort due to non-sequential memory access.

Insertion Sort

Strategy: build the sorted array one element at a time by inserting each new element into its correct position.

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j+1] = arr[j]
            j -= 1
        arr[j+1] = key

Time: $O(n^2)$ worst case; $O(n)$ if already nearly sorted.

Space: $O(1)$.

Stable.

Use case: very small arrays (typically $n \leq 16$); nearly sorted data. Used as the base case in timsort and introsort.

Selection Sort

Find the minimum element and swap it to position 0; repeat for position 1, 2, …

Time: $O(n^2)$ always.

Not stable (standard implementation).

Use case: rarely; minimizes number of swaps (useful when swapping is expensive).

Bubble Sort

Repeatedly swap adjacent out-of-order elements.

Time: $O(n^2)$ worst case; $O(n)$ if already sorted (with early termination).

Stable. Rarely used in practice.

Counting Sort

For integer keys in range $[0, k)$: count occurrences of each key; compute prefix sums; place elements.

Time: $O(n + k)$.

Space: $O(n + k)$.

Stable.

Use case: large arrays of small integers.

Radix Sort

Sort integers digit by digit, from least significant to most significant (LSD radix sort), using counting sort for each digit.

Time: $O(d(n + b))$ where $d$ = number of digits, $b$ = base (typically 10 or 256).

For fixed-width integers (32-bit, 64-bit): $O(n)$.

Stable. Space: $O(n + b)$.

Timsort

The hybrid algorithm used in Python (sorted, list.sort) and Java (for objects).

Strategy: identify natural runs (already sorted subsequences); if a run is too short, extend it with insertion sort; merge runs using a merge sort variant with galloping mode for uneven merges.

Time: $O(n \log n)$ worst case; $O(n)$ best case (already sorted).

Stable. Space: $O(n)$.

Algorithm Comparison

Algorithm Best Average Worst Space Stable
Merge sort $O(n \log n)$ $O(n \log n)$ $O(n \log n)$ $O(n)$ Yes
Quick sort $O(n \log n)$ $O(n \log n)$ $O(n^2)$ $O(\log n)$ No
Heap sort $O(n \log n)$ $O(n \log n)$ $O(n \log n)$ $O(1)$ No
Insertion sort $O(n)$ $O(n^2)$ $O(n^2)$ $O(1)$ Yes
Timsort $O(n)$ $O(n \log n)$ $O(n \log n)$ $O(n)$ Yes
Counting sort $O(n+k)$ $O(n+k)$ $O(n+k)$ $O(n+k)$ Yes
Radix sort $O(dn)$ $O(dn)$ $O(dn)$ $O(n+b)$ Yes