Mathematical technique used in data compression and analysis
An example of the 2D discrete wavelet transform that is used in JPEG 2000.
For broader coverage of this topic, see Wavelet.
In mathematics, a wavelet series is a representation of a square-integrable (real- or complex-valued) function by a certain orthonormal series generated by a wavelet. This article provides a formal, mathematical definition of an orthonormal wavelet and of the integral wavelet transform.[1][2][3][4]
A function ψ ∈ L 2 ( R ) {\displaystyle \psi \,\in \,L^{2}(\mathbb {R} )} is called an orthonormal wavelet if it can be used to define a Hilbert basis, that is, a complete orthonormal system for the Hilbert space of square-integrable functions on the real line.
The Hilbert basis is constructed as the family of functions { ψ j k : j , k ∈ Z } {\displaystyle \{\psi _{jk}:\,j,\,k\,\in \,\mathbb {Z} \}} by means of dyadic translations and dilations of ψ {\displaystyle \psi \,} , ψ j k ( x ) = 2 j 2 ψ ( 2 j x − k ) , {\displaystyle \psi _{jk}(x)=2^{\frac {j}{2}}\psi \left(2^{j}x-k\right),} for integers j , k ∈ Z {\displaystyle j,\,k\,\in \,\mathbb {Z} } .
This family is an orthonormal system if its members are mutually orthonormal under the standard inner product on L 2 ( R ) {\displaystyle L^{2}\left(\mathbb {R} \right)} , ⟨ f , g ⟩ = ∫ − ∞ ∞ f ( x ) g ( x ) ¯ d x , {\displaystyle \langle f,g\rangle =\int _{-\infty }^{\infty }f(x){\overline {g(x)}}dx,} that is, if ⟨ ψ j k , ψ l m ⟩ = ∫ − ∞ ∞ ψ j k ( x ) ψ l m ( x ) ¯ d x = δ j l δ k m , {\displaystyle {\begin{aligned}\langle \psi _{jk},\psi _{lm}\rangle &=\int _{-\infty }^{\infty }\psi _{jk}(x){\overline {\psi _{lm}(x)}}dx\\&=\delta _{jl}\delta _{km},\end{aligned}}} where δ j l {\displaystyle \delta _{jl}\,} is the Kronecker delta.
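The orthonormality condition can be checked numerically for a concrete wavelet. The sketch below assumes the Haar wavelet as ψ (the text above does not fix a particular wavelet) and approximates the inner product with a midpoint rule on a finite interval; the helper names `haar`, `psi_jk` and `inner` are illustrative only.

```python
def haar(x):
    """Haar mother wavelet (assumed example): +1 on [0, 1/2), -1 on [1/2, 1), else 0."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def psi_jk(j, k, x):
    """Dyadic dilation and translation: 2^(j/2) * psi(2^j * x - k)."""
    return 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k)


def inner(j, k, l, m, lo=-4.0, hi=4.0, n=8192):
    """Midpoint-rule approximation of <psi_jk, psi_lm> on [lo, hi] (real-valued case)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += psi_jk(j, k, x) * psi_jk(l, m, x)
    return total * h


# Same indices should give about 1, different indices about 0.
for pair in [((0, 0), (0, 0)), ((1, 0), (1, 0)), ((0, 0), (1, 0)), ((0, 0), (0, 1))]:
    (j, k), (l, m) = pair
    print(pair, round(inner(j, k, l, m), 6))
```

The grid is chosen so that the jump points of the tested wavelets fall on cell boundaries, which makes the midpoint rule exact for these piecewise-constant functions; other wavelets would need a finer grid or a different quadrature.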
Completeness is satisfied if every function f ∈ L 2 ( R ) {\displaystyle f\,\in \,L^{2}\left(\mathbb {R} \right)} may be expanded in the basis as {\displaystyle f(x)=\sum _{j,k=-\infty }^{\infty }c_{jk}\psi _{jk}(x)}
with convergence of the series understood to be convergence in norm. Such a representation of f {\displaystyle f} is known as a wavelet series. This implies that an orthonormal wavelet is self-dual.
The integral wavelet transform is the integral transform defined as [ W ψ f ] ( a , b ) = 1 | a | ∫ − ∞ ∞ ψ ( x − b a ) ¯ f ( x ) d x {\displaystyle \left[W_{\psi }f\right](a,b)={\frac {1}{\sqrt {|a|}}}\int _{-\infty }^{\infty }{\overline {\psi \left({\frac {x-b}{a}}\right)}}f(x)dx\,} The wavelet coefficients c j k {\displaystyle c_{jk}} are then given by c j k = [ W ψ f ] ( 2 − j , k 2 − j ) {\displaystyle c_{jk}=\left[W_{\psi }f\right]\left(2^{-j},k2^{-j}\right)}
Here, a = 2 − j {\displaystyle a=2^{-j}} is called the binary dilation or dyadic dilation, and b = k 2 − j {\displaystyle b=k2^{-j}} is the binary or dyadic position.
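As a rough numerical illustration of the relation between the integral wavelet transform and the coefficients, the sketch below evaluates the transform at a = 2^(-j), b = k 2^(-j) and compares it with the inner product of f with ψ_jk. The Haar wavelet and the test function f(x) = x on [0, 1) are assumptions made for the example, as are all function names.

```python
import math


def haar(x):
    """Haar mother wavelet (assumed example wavelet, real-valued)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def f(x):
    """Hypothetical test function: a ramp on [0, 1), zero elsewhere."""
    return x if 0.0 <= x < 1.0 else 0.0


def integrate(g, lo=-2.0, hi=2.0, n=4096):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def wavelet_transform(func, a, b):
    """[W_psi func](a, b) for a real wavelet, so the complex conjugate is omitted."""
    return integrate(lambda x: haar((x - b) / a) * func(x)) / math.sqrt(abs(a))


def coefficient(func, j, k):
    """c_jk obtained by sampling the transform at dyadic dilation and position."""
    a = 2.0 ** (-j)
    return wavelet_transform(func, a, k * a)


def inner_with_psi_jk(func, j, k):
    """<func, psi_jk> computed directly from the definition of psi_jk."""
    return integrate(lambda x: func(x) * 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k))


for j, k in [(0, 0), (1, 0), (1, 1)]:
    print(j, k, coefficient(f, j, k), inner_with_psi_jk(f, j, k))
```

For this choice of f, the coefficient c_00 works out by hand to 1/8 − 3/8 = −1/4, which the two numerical routes should both reproduce.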
The fundamental idea of wavelet transforms is that the transformation should allow only changes in time extension, but not shape, imposing a restriction on choosing suitable basis functions. Changes in the time extension are expected to conform to the corresponding analysis frequency of the basis function. Based on the uncertainty principle of signal processing, {\displaystyle \Delta t\,\Delta \omega \geq {\frac {1}{2}},}
where t {\displaystyle t} represents time and ω {\displaystyle \omega } angular frequency ( ω = 2 π f {\displaystyle \omega =2\pi f} , where f {\displaystyle f} is ordinary frequency).
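The higher the required resolution in time, the lower the resolution in frequency has to be. The larger the extension of the analysis window is chosen, the larger the value of Δ t {\displaystyle \Delta t} .
When Δ t {\displaystyle \Delta t} is large: poor time resolution, good frequency resolution (low frequency, large scaling factor).
When Δ t {\displaystyle \Delta t} is small: good time resolution, poor frequency resolution (high frequency, small scaling factor).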
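The trade-off between the two resolutions can be illustrated numerically. The sketch below assumes a Gaussian window g(t) = exp(−t²/(2σ²)), takes Δt and Δω as root-mean-square widths of |g|² and of its spectrum, and uses Parseval's relation so that Δω can be obtained from the derivative of g without computing a Fourier transform. These definitions and the function name `spreads` are assumptions of the example.

```python
import math


def spreads(sigma, n=4001):
    """Return (delta_t, delta_omega) for the Gaussian window with width sigma."""
    half_width = 10.0 * sigma
    h = 2.0 * half_width / (n - 1)

    def g(t):
        return math.exp(-t * t / (2.0 * sigma * sigma))

    energy = t2 = d2 = 0.0
    for i in range(n):
        t = -half_width + i * h
        dg = (g(t + h) - g(t - h)) / (2.0 * h)  # central-difference derivative
        energy += g(t) ** 2
        t2 += (t * g(t)) ** 2
        d2 += dg ** 2
    # The common grid factor h cancels in both ratios.
    return math.sqrt(t2 / energy), math.sqrt(d2 / energy)


for sigma in (0.5, 1.0, 2.0):
    dt, dw = spreads(sigma)
    print(f"sigma={sigma}: delta_t={dt:.4f} delta_omega={dw:.4f} product={dt * dw:.4f}")
```

Widening the window increases Δt and decreases Δω, while the product stays close to 1/2, the lower bound that the Gaussian attains.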
In other words, the basis function ψ {\displaystyle \psi } can be regarded as the impulse response of a system with which the function x ( t ) {\displaystyle x(t)} has been filtered. The transformed signal provides information about both time and frequency. Therefore, the wavelet transform contains information similar to the short-time Fourier transform, but with additional special properties of the wavelets, which show up in the time resolution at higher analysis frequencies of the basis function. The difference in time resolution at ascending frequencies for the Fourier transform and the wavelet transform is shown below. Note, however, that the frequency resolution decreases for increasing frequencies while the temporal resolution increases. This consequence of the Fourier uncertainty principle is not correctly displayed in the figure.
This shows that the wavelet transform offers good time resolution at high frequencies, while for slowly varying functions it offers good frequency resolution.
Another example: the analysis of three superposed sinusoidal signals y ( t ) = sin ( 2 π f 0 t ) + sin ( 4 π f 0 t ) + sin ( 8 π f 0 t ) {\displaystyle y(t)\;=\;\sin(2\pi f_{0}t)\;+\;\sin(4\pi f_{0}t)\;+\;\sin(8\pi f_{0}t)} with the STFT and the wavelet transform.
Wavelet compression
Wavelet compression is a form of data compression well suited for image compression (sometimes also video compression and audio compression). Notable implementations are JPEG 2000, DjVu and ECW for still images, JPEG XS, CineForm, and the BBC's Dirac. The goal is to store image data in as little space as possible in a file. Wavelet compression can be either lossless or lossy.[5]
First a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e., there is no compression yet since it is only a transform). These coefficients can then be compressed more easily because the information is statistically concentrated in just a few coefficients. This principle is called transform coding. After that, the coefficients are quantized and the quantized values are entropy encoded and/or run length encoded.
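The transform coding steps described above can be sketched on a single row of pixels. The example assumes an orthonormal 1D Haar transform (real codecs use 2D transforms and other filters), a uniform quantizer with an arbitrary step of 4, and made-up pixel values; the entropy coding stage is left out.

```python
import math


def haar_forward(signal):
    """Full multi-level orthonormal Haar DWT; len(signal) must be a power of two."""
    coeffs = [float(v) for v in signal]
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = approx + detail  # approximations first, details after them
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out


def quantize(coeffs, step):
    """Uniform scalar quantizer: map each coefficient to an integer index."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    return [q * step for q in indices]


pixels = [100, 102, 101, 99, 100, 100, 180, 182]  # hypothetical image row
coeffs = haar_forward(pixels)          # as many coefficients as pixels
indices = quantize(coeffs, step=4.0)   # most detail coefficients collapse to zero
reconstructed = haar_inverse(dequantize(indices, 4.0))

print("indices:", indices)
print("zeros:", indices.count(0), "of", len(indices))
print("max error:", max(abs(a - b) for a, b in zip(reconstructed, pixels)))
```

The transform itself is invertible and saves nothing; the saving comes from the quantized indices, most of which are zero and therefore cheap to entropy code or run-length encode, at the cost of a small reconstruction error.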
A few 1D and 2D applications of wavelet compression use a technique called "wavelet footprints".[6][7]
Requirement for image compression
For most natural images, the spectral density is higher at lower frequencies.[8] As a result, information of the low frequency signal (reference signal) is generally preserved, while the information in the detail signal is discarded. From the perspective of image compression and reconstruction, a wavelet should meet the following criteria while performing image compression:
A wavelet image compression system involves filters and decimation, so it can be described as a linear shift-variant system. A typical wavelet transformation diagram is displayed below:
The transformation system contains two analysis filters (a low pass filter h 0 ( n ) {\displaystyle h_{0}(n)} and a high pass filter h 1 ( n ) {\displaystyle h_{1}(n)} ), a decimation process, an interpolation process, and two synthesis filters ( g 0 ( n ) {\displaystyle g_{0}(n)} and g 1 ( n ) {\displaystyle g_{1}(n)} ). The compression and reconstruction system generally involves the low frequency components, that is, the analysis filter h 0 ( n ) {\displaystyle h_{0}(n)} for image compression and the synthesis filter g 0 ( n ) {\displaystyle g_{0}(n)} for reconstruction. To evaluate such a system, we can input an impulse δ ( n − n i ) {\displaystyle \delta (n-n_{i})} and observe its reconstruction h ( n − n i ) {\displaystyle h(n-n_{i})} ; the optimal wavelets are those that bring minimum shift variance and sidelobes to h ( n − n i ) {\displaystyle h(n-n_{i})} . Even though a wavelet with strict shift invariance is not realistic, it is possible to select a wavelet with only slight shift variance. For example, we can compare the shift variance of two filters:[9]
Biorthogonal filters for wavelet image compression
Filter set | Filter | Length | Filter coefficients | Regularity
Wavelet filter 1 | H0 | 9 | .852699, .377402, -.110624, -.023849, .037828 | 1.068
Wavelet filter 1 | G0 | 7 | .788486, .418092, -.040689, -.064539 | 1.701
Wavelet filter 2 | H0 | 6 | .788486, .047699, -.129078 | 0.701
Wavelet filter 2 | G0 | 10 | .615051, .133389, -.067237, .006989, .018914 | 2.068
By observing the impulse responses of the two filters, we can conclude that the second filter is less sensitive to the input location (i.e. it is less shift variant).
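The dependence of the reconstruction on the impulse position can be reproduced with a small experiment. The sketch below does not use the filters from the table; as a stand-in it assumes the LeGall 5/3 low-pass pair, and it models only the low-pass branch (filter, decimate by two, interpolate by two, filter) with zero padding at the borders. All function names are illustrative.

```python
H0 = [-1 / 8, 2 / 8, 6 / 8, 2 / 8, -1 / 8]  # assumed analysis low-pass (LeGall 5/3)
G0 = [1 / 2, 1.0, 1 / 2]                    # assumed synthesis low-pass (LeGall 5/3)


def conv_same(x, h):
    """Zero-padded convolution whose output is aligned with the centre tap of h."""
    c = len(h) // 2
    y = [0.0] * len(x)
    for n in range(len(x)):
        for k, hk in enumerate(h):
            i = n - k + c
            if 0 <= i < len(x):
                y[n] += hk * x[i]
    return y


def lowpass_branch(x):
    """One analysis/synthesis level with the detail signal set to zero."""
    v = conv_same(x, H0)
    r = v[::2]             # decimation: keep the even-indexed samples
    u = [0.0] * len(x)
    u[::2] = r             # interpolation: re-insert zeros between samples
    return conv_same(u, G0)


def impulse_response(n_i, length=32):
    """Reconstruction of the impulse delta(n - n_i)."""
    x = [0.0] * length
    x[n_i] = 1.0
    return lowpass_branch(x)


even = impulse_response(8)
odd = impulse_response(9)
two = impulse_response(10)

print("n_i = 8 :", even[5:12])
print("n_i = 9 :", odd[6:13])
print("n_i = 10:", two[7:14])
```

Moving the impulse by one sample changes the shape of the response (the peak drops from 0.75 to 0.25 with these filters), whereas moving it by two samples only shifts the response, which is the shift variance caused by decimation.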
Another important issue for image compression and reconstruction is the system's oscillatory behavior, which might lead to severe undesired artifacts in the reconstructed image. To avoid such artifacts, the wavelet filters should have a large peak-to-sidelobe ratio.
So far we have discussed the one-dimensional transformation of the image compression system. The analysis can be extended to two dimensions, for which a more general term, shiftable multiscale transforms, has been proposed.[10]
Derivation of impulse response
As mentioned earlier, impulse response can be used to evaluate the image compression/reconstruction system.
For the input sequence x ( n ) = δ ( n − n i ) {\displaystyle x(n)=\delta (n-n_{i})} , the reference signal r 1 ( n ) {\displaystyle r_{1}(n)} after one level of decomposition is obtained by decimating x ( n ) ∗ h 0 ( n ) {\displaystyle x(n)*h_{0}(n)} by a factor of two, where h 0 ( n ) {\displaystyle h_{0}(n)} is a low pass filter. Similarly, the next reference signal r 2 ( n ) {\displaystyle r_{2}(n)} is obtained by decimating r 1 ( n ) ∗ h 0 ( n ) {\displaystyle r_{1}(n)*h_{0}(n)} by a factor of two. After L levels of decomposition (and decimation), the analysis response is obtained by retaining one out of every 2 L {\displaystyle 2^{L}} samples: h A ( L ) ( n , n i ) = f h 0 ( L ) ( n − n i / 2 L ) {\displaystyle h_{A}^{(L)}(n,n_{i})=f_{h0}^{(L)}(n-n_{i}/2^{L})} .
On the other hand, to reconstruct the signal x(n), we can consider a reference signal r L ( n ) = δ ( n − n j ) {\displaystyle r_{L}(n)=\delta (n-n_{j})} . If the detail signals d i ( n ) {\displaystyle d_{i}(n)} are equal to zero for 1 ≤ i ≤ L {\displaystyle 1\leq i\leq L} , then the reference signal at the previous stage ( L − 1 {\displaystyle L-1} stage) is r L − 1 ( n ) = g 0 ( n − 2 n j ) {\displaystyle r_{L-1}(n)=g_{0}(n-2n_{j})} , which is obtained by interpolating r L ( n ) {\displaystyle r_{L}(n)} and convolving with g 0 ( n ) {\displaystyle g_{0}(n)} . Similarly, the procedure is iterated to obtain the reference signal r ( n ) {\displaystyle r(n)} at stages L − 2 , L − 3 , . . . . , 1 {\displaystyle L-2,L-3,....,1} . After L iterations, the synthesis impulse response is calculated: h s ( L ) ( n , n j ) = f g 0 ( L ) ( n / 2 L − n j ) {\displaystyle h_{s}^{(L)}(n,n_{j})=f_{g0}^{(L)}(n/2^{L}-n_{j})} , which relates the reference signal r L ( n ) {\displaystyle r_{L}(n)} and the reconstructed signal.
To obtain the overall L level analysis/synthesis system, the analysis and synthesis responses are combined as below:
h A S ( L ) ( n , n i ) = ∑ k f h 0 ( L ) ( k − n i / 2 L ) f g 0 ( L ) ( n / 2 L − k ) {\displaystyle h_{AS}^{(L)}(n,n_{i})=\sum _{k}f_{h0}^{(L)}(k-n_{i}/2^{L})f_{g0}^{(L)}(n/2^{L}-k)} .
Finally, the peak to first sidelobe ratio and the average second sidelobe of the overall impulse response h A S ( L ) ( n , n i ) {\displaystyle h_{AS}^{(L)}(n,n_{i})} can be used to evaluate the wavelet image compression performance.
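A minimal sketch of such an evaluation is given below. It iterates the low-pass analysis step L times and the low-pass synthesis step L times to obtain the overall response to an impulse at position n_i, then computes a peak-to-sidelobe ratio. Two things are assumptions of the example rather than part of the text: the LeGall 5/3 low-pass pair is used as the filter pair, and the sidelobe is taken to be the largest sample outside the run of same-sign samples around the peak, which is only one possible reading of "first sidelobe".

```python
H0 = [-1 / 8, 2 / 8, 6 / 8, 2 / 8, -1 / 8]  # assumed analysis low-pass
G0 = [1 / 2, 1.0, 1 / 2]                    # assumed synthesis low-pass


def conv_same(x, h):
    """Zero-padded convolution aligned with the centre tap of h."""
    c = len(h) // 2
    y = [0.0] * len(x)
    for n in range(len(x)):
        for k, hk in enumerate(h):
            i = n - k + c
            if 0 <= i < len(x):
                y[n] += hk * x[i]
    return y


def analysis(x, levels):
    """Filter with H0 and decimate by two, repeated `levels` times."""
    r = list(x)
    for _ in range(levels):
        r = conv_same(r, H0)[::2]
    return r


def synthesis(r, levels):
    """Interpolate by two and filter with G0, repeated `levels` times (details are zero)."""
    y = list(r)
    for _ in range(levels):
        up = [0.0] * (2 * len(y))
        up[::2] = y
        y = conv_same(up, G0)
    return y


def overall_response(n_i, levels, length=64):
    """Overall analysis/synthesis response to delta(n - n_i)."""
    x = [0.0] * length
    x[n_i] = 1.0
    return synthesis(analysis(x, levels), levels)


def peak_to_sidelobe(y):
    """Peak magnitude divided by the largest magnitude outside the main lobe."""
    p = max(range(len(y)), key=lambda n: abs(y[n]))
    sign = 1.0 if y[p] > 0 else -1.0
    lo = hi = p
    while lo > 0 and y[lo - 1] * sign > 0:
        lo -= 1
    while hi < len(y) - 1 and y[hi + 1] * sign > 0:
        hi += 1
    side = max([abs(v) for v in y[:lo] + y[hi + 1:]], default=0.0)
    return abs(y[p]) / side if side > 0 else float("inf")


for n_i in range(16, 20):
    resp = overall_response(n_i, levels=2)
    print(n_i, "peak", max(resp), "ratio", peak_to_sidelobe(resp))
```

With L levels the response repeats only under shifts by multiples of 2^L samples, so the spread of the peak and of the ratio across the 2^L impulse positions gives a rough picture of the shift variance of a filter pair.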
Using a wavelet transform, the wavelet compression methods are adequate for representing transients, such as percussion sounds in audio, or high-frequency components in two-dimensional images, for example an image of stars on a night sky. This means that the transient elements of a data signal can be represented by a smaller amount of information than would be the case if some other transform, such as the more widespread discrete cosine transform, had been used.
While wavelet transforms offer theoretical advantages, their practical limitations have effectively limited wavelet compression to analyzing localized changes and transient signals. Despite decades of research, wavelet-based compression systems for common multimedia like audio and video do not consistently match the efficiency and perceptual quality of current Discrete Cosine Transform-based systems.[11]
For one-dimensional data like audio or ECGs, wavelets excel at representing and compressing transient signals: sudden, isolated events such as a drum hit in music or the sharp peaks in a heart rhythm. For example, the discrete wavelet transform has been successfully applied for the compression of electrocardiograph (ECG) signals.[12] However, for smooth, periodic signals, which make up much of typical audio, harmonic analysis in the frequency domain with Fourier-related transforms achieves better compression and sound quality. Compressing data that has both transient and periodic characteristics may be done with hybrid techniques that use wavelets along with traditional harmonic analysis. For example, the Vorbis audio codec primarily uses the modified discrete cosine transform to compress audio (which is generally smooth and periodic), but allows the addition of a hybrid wavelet filter bank for improved reproduction of transients.[13]
For higher-dimensional data, wavelet compression faces significant challenges. In video, for instance, modern compression techniques such as intra coding and motion compensation (predicting parts of an image based on what's next to it spatially and temporally) and mixed and dynamic block sizes become incredibly complex with wavelets because of their overlapping nature. This complexity translates to more processing power and slower speed, making them less practical for widespread use. Furthermore, while wavelets might score well on traditional measures such as PSNR, DCT blocks create a perception of sharpness that wavelets often lack, requiring higher bitrates to achieve similar subjective quality.[11]
Comparison with Fourier transform and time-frequency analysis
Wavelets have some slight benefits over Fourier transforms in reducing computations when examining specific frequencies. However, they are rarely more sensitive, and indeed, the common Morlet wavelet is mathematically identical to a short-time Fourier transform using a Gaussian window function.[14] The exception is when searching for signals of a known, non-sinusoidal shape (e.g., heartbeats); in that case, using matched wavelets can outperform standard STFT/Morlet analyses.[15]
Other practical applications
The wavelet transform provides both the frequencies present in a signal and the times associated with those frequencies, which makes it convenient for applications in numerous fields. Examples include signal processing of accelerations for gait analysis,[16] fault detection,[17] the analysis of seasonal displacements of landslides,[18] the design of low-power pacemakers, and ultra-wideband (UWB) wireless communications.[19][20][21]
The following discretization of frequency and time is applied:
This leads to wavelets of the following form, the discrete formula for the basis wavelet:
Such discrete wavelets can be used for the transformation:
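As is apparent from the wavelet transform representation {\displaystyle \left[W_{\psi }y\right](c,\tau )={\frac {1}{\sqrt {|c|}}}\int _{-\infty }^{\infty }{\overline {\psi \left({\frac {t-\tau }{c}}\right)}}y(t)\,dt,}
where c {\displaystyle c} is the scaling factor and τ {\displaystyle \tau } is the time shift factor,
and as already mentioned in this context, the wavelet transform corresponds to a convolution of a function y ( t ) {\displaystyle y(t)} and a wavelet function. A convolution can be implemented as a multiplication in the frequency domain. This leads to the following implementation approach: the signal is Fourier-transformed; then, for each discrete scaling factor, the wavelet function is scaled by that factor and Fourier-transformed, the two spectra are multiplied, and the product is transformed back into the time domain, giving the transform for all discrete values of τ {\displaystyle \tau } at that scale.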
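The frequency-domain approach can be sketched as follows. The example makes several assumptions that are not part of the text: a real Mexican hat wavelet (unnormalized), periodic boundary handling, unit sample spacing, and a plain O(N²) DFT standing in for an FFT routine so that the code has no dependencies. A direct summation, `cwt_direct`, is included only as a cross-check.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform; a stand-in for an FFT routine."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def mexican_hat(t):
    """Assumed example wavelet (real and symmetric, not normalized)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def wrapped(m, n):
    """Map a circular index to a signed offset centred on zero."""
    return m if m <= n // 2 else m - n


def cwt_fft(y, scales, dt=1.0):
    """Transform at each scale c for all shifts tau via spectral multiplication."""
    n = len(y)
    spectrum = dft(y)                                   # transform the signal once
    rows = []
    for c in scales:
        w = [mexican_hat(wrapped(m, n) * dt / c) / math.sqrt(c) for m in range(n)]
        w_spec = dft(w)                                 # transform the scaled wavelet
        prod = [a * b.conjugate() for a, b in zip(spectrum, w_spec)]
        rows.append([v.real * dt for v in dft(prod, inverse=True)])
    return rows


def cwt_direct(y, c, tau, dt=1.0):
    """Same quantity by direct summation over the periodic signal."""
    n = len(y)
    total = 0.0
    for m in range(n):
        total += y[m] * mexican_hat(wrapped((m - tau) % n, n) * dt / c)
    return total * dt / math.sqrt(c)


y = [math.sin(2.0 * math.pi * 3 * m / 32) for m in range(32)]  # hypothetical test signal
scales = [1.0, 2.0, 4.0]
rows = cwt_fft(y, scales)
print(max(abs(rows[1][tau] - cwt_direct(y, 2.0, tau)) for tau in range(32)))
```

The signal spectrum is computed once and reused for every scale; only the wavelet spectrum and one inverse transform are needed per scale, which is where the saving over direct summation comes from when a fast transform is used.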
There are many different types of wavelet transforms for specific purposes. See also a full list of wavelet-related transforms; the common ones are the Mexican hat wavelet, the Haar wavelet, the Daubechies wavelet, and the triangular wavelet.
For processing temporal signals in real time, it is essential that the wavelet filters do not access signal values from the future and that minimal temporal latencies can be obtained. Time-causal wavelet representations have been developed by Szu et al.[24] and Lindeberg,[25] with the latter method also involving a memory-efficient time-recursive implementation.
Synchro-squeezed transform
The synchro-squeezed transform can significantly enhance the temporal and frequency resolution of the time-frequency representation obtained using a conventional wavelet transform.[26][27]
Vorbis I is a forward-adaptive monolithic transform CODEC based on the Modified Discrete Cosine Transform. The codec is structured to allow addition of a hybrid wavelet filterbank in Vorbis II to offer better transient response and reproduction using a transform better suited to localized time events.