計算機科學 1965

一種機器計算複數傅立葉級數的演算法

詹姆斯·庫利與約翰·圖基

把本要重複的求和只算一次——傅立葉變換便從 N² 降到 N log N。

Choose your version

In depth · the introduction

正是這個技巧，把一項慢得不可能的計算，變成你的手機一秒鐘做上千次的小事——並悄悄地，開啟了聲音、影像與無線的數位時代。

把這個想法拆開看

傅立葉變換，會拿過一個訊號——一段聲波、一束無線電脈衝——弄清楚它由哪些純粹的頻率構成。精確做到這一點的「配方」已有幾百年，卻慢得叫人受不了：要分析 N 個取樣，你得做大約 N × N 次乘法。資料翻一倍，工作量就翻四倍。真實訊號裡的 N 很大，而在 1960 年代初的計算機上，這幾乎是沒指望的。

庫利與圖基找到了一條捷徑。這項計算裡滿是重複——同樣的小乘積，被一算再算。把訊號對半分，再對半分，並複用其中共享的部分，他們就把代價從 N² 降到了大約 N × log N。對一千個取樣而言，這是「一百萬步」與「幾千步」之差。

它從哪裡來

1965 年，IBM 研究院的程式設計師詹姆斯·庫利，與普林斯頓的統計學家約翰·圖基，在《Mathematics of Computation》上發表了一篇短文。據庫利本人後來的回憶，這個想法是在一次美國政府科學顧問委員會的會上，由圖基隨手勾勒出來的：當時的一個緊要問題，是從微弱的地震儀記錄裡偵測遠方的核試驗——這意味著要快速地對很長的記錄做傅立葉變換。庫利寫下了程式，論文隨之而來。

時機就是一切。同樣的數學技巧此前早被發現過——高斯在約 1805 年就有了——卻一再被遺忘，因為在快速計算機出現以前，沒人迫切地需要它。庫利與圖基的發表，恰好在機器與問題都已大到使它不可或缺的那一刻——這一次，它留下來了。

它為何重要

幾乎任何與聲音、影像或無線電有關的數位事物，都在某處倚賴傅立葉變換——而 FFT，正是讓它快到「彷彿免費」的那一環。你的音樂能串流播放、照片能壓縮、手機能和基地台對話、醫院的掃描儀能把微弱的無線電回聲變成一幅影像，背後都有它的份。一套只是把算術重新排布的演算法，最終成了現代技術裡一面不聲不響的承重牆。

一個精確的類比

想像給一場龐大的循環賽計分，每名選手都和其餘每名選手交過手。笨辦法是把同樣的小計，一遍遍重新加。聰明的記分員會把選手分組，每組的小計只算一次，再在所有需要它的地方反覆取用。FFT 就是這位記分員：它看出暴力法反覆重算的那些和，每一個只算一次，再用區區幾輪——log N 輪，而非 N 輪——把它們合起來。

它在故事裡的位置

這背後的連續數學，要追溯到約瑟夫·傅立葉——他在 1822 年論證：任何訊號都能由簡單的波疊加而成（見本館的《熱的解析理論》）。而把這個想法變成計算機能存下的取樣，則倚賴夏農 1948 年資訊理論核心處的取樣定理。FFT 是那座讓這一切落地的橋——六十年過去，它依舊是地球上被執行得最多的演算法之一。

The original document

Original source text

J. W. Cooley & J. W. Tukey · Mathematics of Computation 19 (1965), pp. 297–301

Introduction

An efficient method for the calculation of the interactions of a 2^m factorial experiment was introduced by Yates and is widely known by his name.

The paper opens not with signals but with statistics: the same combinatorial bookkeeping that speeds up factorial experiments, generalized by I. J. Good, is what speeds up the Fourier transform, and the authors place their work squarely in that lineage.

In their full generality, Good's methods are applicable to certain problems in which one must multiply an N-vector by an N × N matrix which can be factored into m sparse matrices, where m is proportional to log N.

That sentence is the whole idea: the dense N × N transform matrix is rewritten as a product of about log N sparse matrices, so the matrix–vector multiply costs on the order of N log N operations rather than N².

The algorithm

The body derives the factorization for a composite length N = r₁·r₂·…, re-indexing inputs and outputs in mixed radix so the long sum becomes nested shorter sums tied together by twiddle factors W = e^(−2πi/N). Applied to a power of two, N = 2^m, this is the familiar radix-2 transform of m stages.

[ … ]

Numerical examples

Short numerical sections report measured running times on a contemporary IBM computer, confirming that the operation count grows like N log N in practice and that transforms previously too large to attempt finish in a fraction of a second. The complete five-page paper — derivation, the program's logic, and the timings — is available at the source below.

IBM Research · Mathematics of Computation, April 1965