Compact and efficient hardware architectures are described for implementing
lifting-based discrete wavelet transforms (DWTs), including 1-D and 2-D
versions of recursive and dual scan architectures. The 1-D recursive
architecture exploits
interdependencies among the wavelet coefficients by interleaving the
calculation of higher-order coefficients with that of the first-stage
coefficients on alternate clock cycles, using the same datapath
hardware. The resulting hardware utilization exceeds 90% in the
typical case of a 5-stage 1-D DWT operating on 1024 samples. The 1-D dual
scan architecture achieves 100% datapath hardware utilization by
processing two independent data streams together using shared functional
blocks. The 2-D recursive architecture is roughly 25% faster than
conventional implementations, and it requires a buffer that stores only a
few rows of the data array instead of a fixed fraction (typically 25% or
more) of the entire array. The 2-D dual scan architecture processes the
column and row transforms simultaneously, and its memory buffer size is
comparable to that of existing architectures. The recursive and dual scan
architectures can be readily extended to the N-D case.
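
As a rough plausibility check on the utilization figure quoted for the 1-D recursive architecture, the following Python sketch counts how many clock cycles the shared datapath would be busy under a deliberately idealized model. The model is an assumption made here for illustration, not the authors' analysis: one input sample arrives per clock cycle, each lifting operation consumes one pair of samples and occupies the datapath for one cycle, and pipeline latency and boundary handling are ignored. The function name recursive_utilization is hypothetical.

```python
def recursive_utilization(n_samples: int, stages: int) -> float:
    """Idealized datapath utilization of an interleaved multi-stage 1-D DWT.

    Assumed model (illustrative only): one sample arrives per clock cycle,
    every lifting operation processes one pair of samples in one cycle, and
    pipeline latency and boundary effects are ignored.
    """
    if n_samples <= 0 or stages <= 0:
        raise ValueError("n_samples and stages must be positive")
    if n_samples % (1 << stages) != 0:
        raise ValueError("n_samples must be divisible by 2**stages")

    busy_cycles = 0
    length = n_samples
    for _ in range(stages):
        # Each stage turns `length` samples into length/2 low-pass and
        # length/2 high-pass coefficients: one operation per input pair.
        busy_cycles += length // 2
        # Only the low-pass half is passed on to the next stage.
        length //= 2

    # The input stream itself lasts n_samples cycles.
    return busy_cycles / n_samples


print(recursive_utilization(1024, 1))  # 0.5
print(recursive_utilization(1024, 5))  # 0.96875
```

Under these assumptions a single stage keeps the datapath busy on only half of the cycles, while interleaving stages 2 to 5 into the idle cycles raises the figure to about 97% for 1024 samples, which is consistent with the "exceeds 90%" claim. A real implementation would lose some of this to pipeline fill and drain cycles.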