理想的并行排序算法

一个知乎问题，如果某CPU有无限个核心，那么时间复杂度最小的排序算法是怎么样的？ (opens new window)

# 1. 最优结论

不需要无限个，只要 $O(n)$ 个核心，时间复杂度最小是 $O(\log n)$ 。

摘要：

The purpose of this paper is to describe a sorting network of size 0(n log n) and depth 0(log n).

A natural way of sorting is through consecutive halvings: determine the upper and lower halves of the set, proceed similarly within the halves, and so on. Unfortunately, while one can halve a set using only 0(n) comparisons, this cannot be done in less than log n (parallel) time, and it is known that a halving network needs (½)n log n comparisons.

It is possible, however, to construct a network of 0(n) comparisons which halves in constant time with high accuracy. This procedure (ε-halving) and a derived procedure (ε-nearsort) are described below, and our sorting network will be centered around these elementary steps.

摘要提到，排序网络 "size 0(n log n) and depth 0(log n)"，"0(n) comparisons"，核心数应该就是比较器数 $O(n)$ 个，深度也就是时间复杂度 $O(\log n)$ 。

但是具体怎么做，俺就不细看了，就当指个路。

另外时间复杂度一定不小于 $O(\log n)$ ，因为小于这个复杂度我们无法判断这个数组是否有序，而排序肯定比这个任务要强。

（感觉是这样的，不知道怎么严谨证明）

# 2. 时间复杂度不是最优

如果时间复杂度放宽一些， $O(\log^2n)$ 复杂度，这个其他回答提到过，就是双调排序（Bitonic sort），很好理解。

# 3. 核心数不是最优

如果核心数放宽一些， $O(n^2)$ 核心数，那么可以这么做：

假设核心排列成一个 $n\times n$ 矩阵，第 i 行第 j 列判断 $a_i<a_j$ ，然后第 i 行的信息 all-reduce（一种通讯算法，复杂度 $O(\log n)$ 就行）一下，就能得到 $a_i$ 的排名了，将 $a_i$ 写到排名位置即可。

这样总复杂度还是 $O(\log n)$ 。

← 有没有计算时间复杂度的算法 Codeforces 1305 题解 (A-G)→