排序（二）：归并排序与快速排序

归并排序：永远 O(n log n)

归并排序一字不差地套用分治：把数组对半拆开，递归排好每一半，再归并两个有序半段。拆分大约有 log2(n) 层，每层归并做 O(n) 的工作，于是在任何情况下都是 O(n log n)——最好、平均、最坏皆然。

核心是归并步骤：给定两段有序序列，反复取出较小的那个队首元素。它需要一个大小为 n 的额外数组——归并排序使用 O(n) 额外空间——而且它是稳定的（平局时优先取左段）。

// merge sorted a[lo..mid] and a[mid+1..hi] into a, using buffer tmp
void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; // <= keeps it stable
    while (i <= mid) tmp[k++] = a[i++];   // leftovers, left run
    while (j <= hi)  tmp[k++] = a[j++];   // leftovers, right run
    for (int t = lo; t <= hi; t++) a[t] = tmp[t];
}

归并步骤：齐头走过两段有序序列，总是输出较小的队首。

快速排序：围绕基准划分

快速排序的拆分方式不同。选一个元素作基准（pivot），然后划分：重排数组，使比基准小的都在它左边、比它大的都在它右边。此时基准已落到最终位置；再对左右两部分递归。无需归并——合并是免费的。

Partition [ 3  7  1  6  2 ] with pivot = 2 (last element):

  scan, swapping elements < pivot to the front:
  [ 1  3  7  6 | 2 ]   ( 1 < 2, moved left )
  place pivot after the small ones:
  [ 1 | 2 | 3  7  6 ]
        ^pivot now in final spot; recurse on [1] and [3 7 6]

划分围绕基准切开；基准落到它的最终位置。

// Lomuto partition: pivot = a[hi]; returns pivot's final index
int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) std::swap(a[i++], a[j]); // small ones to front
    std::swap(a[i], a[hi]);                         // pivot into place
    return i;
}
void quicksort(int a[], int lo, int hi) {
    if (lo >= hi) return;                  // base case
    int p = partition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}

快速排序原地排序——只有递归栈是额外空间。

基准的选择至关重要

如果基准能可靠地把数组切在中点附近，快速排序平均为 O(n log n)——而它紧凑的内层循环让它在实践中很快。但一个糟糕的基准（比如总取最小值）每次只切下一个元素，造成 n 层嵌套，最坏为 O(n^2)。平衡时快速排序使用 O(log n) 栈空间。

Good pivot (near middle)        Bad pivot (always smallest)

   [........n........]            [........n........]
     /           \                pivot | [....n-1....]
  [n/2]        [n/2]                       pivot | [..n-2..]
  / \           / \                              pivot | [n-3]
 ...           ...                                    ...
 ~log n levels, O(n) each       n levels, O(n) each
   => O(n log n)                  => O(n^2)

均衡的拆分给出 O(n log n)；极度不均的拆分退化为 O(n^2)。

选哪个——以及 std::sort 做了什么

需要保证 O(n log n) 且要稳定（如链表、外部/多键排序）？选归并排序。
排序内存中的数组、想要原始速度且额外空间少？快速排序（配好基准）通常胜出。
在实践中，直接调用 std::sort——它是 introsort：快速排序，但会回退到堆排序以躲开 O(n^2)，并在极小子数组上回退到插入排序。

#include <algorithm>
#include <vector>
std::vector<int> v = {5, 2, 8, 1};
std::sort(v.begin(), v.end());          // O(n log n), introsort
std::stable_sort(v.begin(), v.end());   // O(n log n), stable (merge-based)

优先用标准库；当平局必须保持顺序时改用 std::stable_sort。

这个道理比排序本身更宽广：了解一个算法如何达到它的复杂度——以及它在哪里会失败——能让你挑对工具，并放心把其余的交给库。下一篇我们问更深的问题：O(n log n) 是排序所能达到的极限吗？