Inverse Short-Time Fourier Transform algorithm described in words

Signal processing | 2021-01-03 | 编程黑洞网

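I am trying to understand conceptually what happens when the forward and inverse Short-Time Fourier Transforms (STFT) are applied to a discrete time-domain signal. I have found the classic paper by Allen and Rabiner (1977), as well as a Wikipedia article (link). I believe that another good article can also be found here.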

I am interested in computing the Gabor transform, which is nothing more than the STFT with a Gaussian window.

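This is what I understand about the forward STFT:

A sub-sequence, made up of time-domain elements, is selected from the signal.
The sub-sequence is multiplied by a window function using point-by-point multiplication in the time domain.
The multiplied sub-sequence is taken into the frequency domain using the FFT.
By selecting successive sub-sequences and repeating the procedure above, we obtain a matrix with m rows and n columns. Each column is the transformed sub-sequence computed at a given time.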
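The forward steps listed above can be sketched in a few lines of MATLAB. This is only a minimal illustration under stated assumptions: the names `x`, `w`, `N`, `M` and `S`, the test signal, and the truncated Gaussian window are all chosen here for the example and do not come from the papers cited in the question.

```matlab
% Minimal forward STFT sketch; all names and values are illustrative assumptions
x = sin(2*pi*0.05*(0:255)).';                 % column test signal, 256 samples
N = 32;                                       % window length and FFT size
M = 8;                                        % step size between sub-sequences
n = (0:N-1).';
w = exp(-0.5*((n - (N-1)/2) / (N/6)).^2);     % truncated Gaussian window

numFrames = floor((length(x) - N) / M) + 1;   % frames that fit without padding
S = zeros(N, numFrames);                      % one column per sub-sequence
for k = 1:numFrames
    first = (k-1)*M + 1;                      % first sample of the k-th sub-sequence
    seg = x(first:first+N-1);                 % select the sub-sequence
    S(:,k) = fft(seg .* w);                   % window it, then take the FFT
end
```

Each column of `S` holds the FFT of one windowed sub-sequence, so `abs(S).^2` would give a spectrogram-style picture.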

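For the inverse STFT, however, the papers talk about a summation over the overlapping analysis sections. I find it very hard to visualize what is really going on here. What do I have to do to compute the inverse STFT (in step-by-step order, as above)?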

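Forward STFT

I have created a drawing showing what I think is going on in the forward STFT. What I do not understand is how to assemble the sub-sequences so that I get back the original time sequence. Could someone either modify this drawing or give an equation that shows how the sub-sequences are added?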

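Inverse Transform

Here is what I understand about the inverse transform. Each successive window is taken back into the time domain using the IFFT. Each window is then shifted by the step size and added to the result of the previous shift. The following diagram shows this process. The summed output is the time-domain signal.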
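The shift-and-add step can also be written as an equation. The notation is an assumption made for this sketch: $X_k[m]$ is the $k$-th column of the STFT matrix, $N$ is the FFT size, $M$ is the step size, $w[n]$ is the analysis window (zero outside $0 \le n \le N-1$), the $k$-th frame starts at sample $kM$, and nothing is modified in the frequency domain.

```latex
\hat{x}_k[n] = \frac{1}{N}\sum_{m=0}^{N-1} X_k[m]\, e^{j 2\pi m n / N} = x[n + kM]\, w[n], \qquad 0 \le n \le N-1

y[n] = \sum_{k} \hat{x}_k[n - kM] = x[n] \sum_{k} w[n - kM]

x[n] = \frac{y[n]}{\sum_{k} w[n - kM]}
```

With a step size of one sample, the denominator is simply the sum of all window samples, which is why the test code in this question divides the summed output by `wt_sum`.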

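Code example

The Matlab test code listed in this question exercises the STFT process and shows that the inverse is the dual of the forward transform to within numerical round-off error. The beginning and end of the signal are zero-padded so that the center of the window can sit on the first and last elements of the time-domain signal. According to Allen and Rabiner (1977), if a multiplication takes place in the frequency domain to alter the frequency response, the length of the analysis window must be equal to or greater than $N + N_0 - 1$ points, where $N_0$ is the length of the filter response. The length is extended by zero-padding, which is needed to prevent circular convolution. The test code applies no such modification; it only shows that the inverse is the dual of the forward transform.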
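The zero-padding requirement can be illustrated on a single short segment, separately from the test code that follows. This is a minimal sketch with assumed values: the segment `x` and the filter `h` are made up for the example, and `N` here simply denotes the segment length.

```matlab
% Circular vs. linear convolution for one segment; x and h are hypothetical
x = [1 2 3 4].';                % one windowed segment, N = 4 points
h = [1 -1 0.5].';               % filter response, N0 = 3 points
N  = length(x);
N0 = length(h);
L  = N + N0 - 1;                % minimum FFT length that avoids wrap-around

yLinear   = conv(x, h);                             % reference linear convolution
yPadded   = real(ifft(fft(x, L) .* fft(h, L)));     % FFT zero-padded to L points
yCircular = real(ifft(fft(x, N) .* fft(h, N)));     % FFT too short: the tail wraps

errPadded   = max(abs(yPadded - yLinear));          % round-off only
errCircular = max(abs(yCircular - yLinear(1:N)));   % clearly non-zero
```

The padded product reproduces `conv(x, h)`, while the unpadded one folds the tail of the filtered segment back onto its first samples.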

% The code computes the STFT (Gabor transform) with step size = 1
% This is most useful when modifications of the signal are required in
% the frequency domain

% The Gabor transform is a STFT with a Gaussian window (w_t in the code)

% written by Nicholas Kinar

% Reference:
% [1] J. B. Allen and L. R. Rabiner, 
% “A unified approach to short-time Fourier analysis and synthesis,” 
% Proceedings of the IEEE, vol. 65, no. 11, pp. 1558 – 1564, Nov. 1977.

% generate the signal
mm = 8192;                  % signal points
t = linspace(0,1,mm);       % time axis

dt = t(2) - t(1);           % timestep t
wSize = 101;                % window size


% generate time-domain test function
% See pg. 156
% J. S. Walker, A Primer on Wavelets and Their Scientific Applications, 
% 2nd ed., Updated and fully rev. Boca Raton: Chapman & Hall/CRC, 2008.
% http://www.uwec.edu/walkerjs/primer/Ch5extract.pdf
term1 = exp(-400 .* (t - 0.2).^2);
term2 = sin(1024 .* pi .* t);
term3 = exp(-400.*(t- 0.5).^2);
term4 = cos(2048 .* pi .* t);
term5 = exp(-400 .* (t-0.7).^2);
term6 = sin(512.*pi.*t) - cos(3072.*pi.*t);
u = term1.*term2  + term3.*term4 + term5.*term6; % time domain signal
u = u';

figure;
plot(u)

Nmid = (wSize - 1) / 2 + 1;    % midway point in the window
hN = Nmid - 1;                 % number on each side of center point       


% stores the output of the Gabor transform in the frequency domain
% each column is the FFT output
Umat = zeros(wSize, mm);     


% generate the Gaussian window 
% [1] Y. Wang, Seismic inverse Q filtering. Blackwell Pub., 2008.
% pg. 123.
T = dt * hN;                    % half-width
sp = linspace(dt, T, hN); 
targ = [-sp(end:-1:1) 0 sp];    % this is t - tau
term1 = -((2 .* targ) ./ T).^2;
term2 = exp(term1);
term3 = 2 / (T * sqrt(pi));
w_t = term3 .* term2;
wt_sum = sum ( w_t ); % sum of the window


% sliding window code
% NOTE that the beginning and end of the sequence
% are padded with zeros 
for Ntau = 1:mm

    % case #1: pad the beginning with zeros
    if( Ntau <= Nmid )
        diff = Nmid - Ntau;
        u_sub = [zeros(diff,1); u(1:hN+Ntau)];
    end

    % case #2: simply extract the window in the middle
    if (Ntau < mm-hN+1 && Ntau > Nmid)
        u_sub = u(Ntau-hN:Ntau+hN);
    end

    % case #3: less than the end
    if(Ntau >= mm-hN+1)
        diff = mm - Ntau;
        adiff = hN - diff;
        u_sub = [ u(Ntau-hN:Ntau+diff);  zeros(adiff,1)]; 
    end   

    % windowed trace segment
    % multiplication in time domain with
    % Gaussian window  function
    u_tau_omega = u_sub .* w_t';

    % segment in Fourier domain
    % NOTE that this must be padded to prevent
    % circular convolution if some sort of multiplication
    % occurs in the frequency domain
    U = fft( u_tau_omega );

    % make an assignment to each trace
    % in the output matrix
    Umat(:,Ntau) = U;

end

% By here, Umat contains the STFT (Gabor transform)

% Notice how the Fourier transform is symmetrical 
% (we only need the first N/2+1
% points, but I've plotted the full transform here)
figure;
imagesc( (abs(Umat)).^2 )


% now let's try to get back the original signal from the transformed
% signal

% use IFFT on matrix along the cols
us = zeros(wSize,mm);
for i = 1:mm 
    us(:,i) = ifft(Umat(:,i));
end

figure;
imagesc( us );

% create a vector that is the same size as the original signal,
% but allows for the zero padding at the beginning and the end of the time
% domain sequence
Nuu = hN + mm + hN;
uu = zeros(1, Nuu);

% add each one of the windows to each other, progressively shifting the
% sequence forward 
cc = 1; 
for i = 1:mm
   uu(cc:cc+wSize-1) = us(:,i) + uu(cc:cc+wSize-1)';
   cc = cc + 1;
end

% trim the beginning and end of uu 
% NOTE that this could probably be done in a more efficient manner
% but it is easiest to do here

% Divide by the sum of the window 
% see Equation 4.4 of paper by Allen and Rabiner (1977)
% We don't need to divide by L, the FFT transform size since 
% Matlab has already taken care of it 
uu2 = uu(hN+1:end-hN) ./ (wt_sum); 

figure;
plot(uu2)

% Compare the differences between the original and the reconstructed
% signals.  There will be some small difference due to round-off error
% since floating point numbers are not exact
dd = u - uu2';

figure;
plot(dd);

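Great question. But how did you make those diagrams so quickly, on the fly?...

I used Adobe Illustrator for the diagrams and Mathtype for the Greek characters.

"I am interested in computing the Gabor transform, which is nothing more than the STFT with a Gaussian window." Keep in mind that the Gabor transform is a continuous integral and that Gaussian windows extend to infinity. Typical implementations of the STFT use discrete overlapped chunks and must use finite-length windows.

Thanks for pointing this out, endolith. I tend to think in a very discrete fashion when doing signal processing.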

Answer #1

The STFT transform pair can be characterized by 4 different parameters:

FFT size (N)
Step size (M)
Analysis window (size N)
Synthesis window (size N)

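The process is as follows:

Take N (FFT size) samples from the current input location
Apply the analysis window
Do the FFT
Do whatever you want to do in the frequency domain
Inverse FFT
Apply the synthesis window
Add to the output at the current output location
Advance the input and output locations by M (step size) samples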
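The loop described above might look roughly like this in MATLAB. Treat it as a sketch under assumptions rather than a reference implementation: the periodic Hann analysis window, the all-ones synthesis window, the padding, and the final division by the accumulated window product `wsum` are choices made for this example, and every variable name is illustrative.

```matlab
% Analysis / modification / synthesis loop; all names are illustrative assumptions
N = 64;                                % FFT size
M = 16;                                % step size
n = (0:N-1).';
wa = 0.5 - 0.5*cos(2*pi*n/N);          % analysis window (periodic Hann, assumed)
ws = ones(N, 1);                       % synthesis window (all ones, assumed)

x  = randn(1024, 1);                   % test input
xp = [zeros(N,1); x; zeros(N,1)];      % pad so every input sample is fully overlapped
y    = zeros(size(xp));                % output accumulator
wsum = zeros(size(xp));                % accumulated wa.*ws, used to normalize

pos = 1;
while pos + N - 1 <= length(xp)
    idx = (pos:pos+N-1).';
    X = fft(xp(idx) .* wa);            % take N samples, apply analysis window, FFT
    % frequency-domain processing would go here (none in this sketch)
    yframe = real(ifft(X)) .* ws;      % inverse FFT, apply synthesis window
    y(idx) = y(idx) + yframe;          % add to the output at the current location
    wsum(idx) = wsum(idx) + wa .* ws;  % track how much window weight landed here
    pos = pos + M;                     % advance by the step size
end

valid = (N+1 : N+length(x)).';         % region corresponding to the original input
xrec = y(valid) ./ wsum(valid);        % normalize by the summed window product
err = max(abs(xrec - x));              % round-off level when nothing is modified
```

For this particular window and step size the accumulated weight is the constant 2 over the valid region, so the division is just a fixed gain correction.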

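The overlap-add algorithm is a good example of this. In this case the step size is N, the FFT size is 2*N, the analysis window is rectangular with N ones followed by N zeros, and the synthesis window is simply all ones.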
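As a hedged illustration of that overlap-add configuration, the sketch below filters a signal block by block and compares the result with direct convolution. The filter `h`, the input `x`, and the block length are made-up values; the filter is assumed to be no longer than N+1 taps so that each filtered block fits in 2*N points.

```matlab
% Overlap-add fast convolution: step N, FFT size 2*N; h and x are hypothetical
N = 8;                                     % step size (block length)
h = [0.5 0.3 -0.2 0.1].';                  % FIR filter, at most N+1 taps
x = (1:40).';                              % test input, a multiple of N samples long
H = fft(h, 2*N);                           % filter response at the FFT size

y = zeros(length(x) + N, 1);               % extra room for the last block's tail
for pos = 1:N:length(x)
    block = [x(pos:pos+N-1); zeros(N,1)];  % analysis window: N ones, then N zeros
    yblock = real(ifft(fft(block) .* H));  % filtering done in the frequency domain
    idx = (pos:pos+2*N-1).';
    y(idx) = y(idx) + yblock;              % synthesis window all ones: just add
end

yref = conv(x, h);                         % direct linear convolution
err = max(abs(y(1:length(yref)) - yref));  % should be at round-off level
```

Each output sample here receives contributions from two consecutive blocks, which matches FFT size divided by step size.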

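There are many other choices, and under certain conditions the forward/inverse transform pair is perfectly reconstructing (i.e. you can get the original signal back). The key point is that each output sample typically receives additive contributions from more than one inverse FFT, so the output needs to be accumulated over multiple frames. The number of contributing frames is simply the FFT size divided by the step size (rounded up, if necessary).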
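One common way to state the reconstruction condition is shown below. The symbols $w_a$ (analysis window) and $w_s$ (synthesis window) are notation assumed for this sketch, both zero outside $0 \le n \le N-1$, and the spectrum is assumed to be left unmodified between the FFT and the inverse FFT.

```latex
y[n] = x[n] \sum_{k} w_a[n - kM]\, w_s[n - kM]

\sum_{k} w_a[n - kM]\, w_s[n - kM] = 1 \quad \text{for all } n
```

When the second line holds, $y[n] = x[n]$. At most $\lceil N/M \rceil$ terms of the sum are non-zero for any given $n$, which is the number of contributing frames mentioned above. If the sum equals some other constant, dividing the output by that constant restores the signal.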

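Thank you very much for the insightful answer. I understand the overlap-add method. What should be used for the synthesis window? Is there an equation? If I know the analysis window function (e.g. a Gaussian window), how do I calculate the synthesis window? I understand how overlap-add is used for convolution, but I do not understand how it is used for the STFT. If the step size is 1, how do I add the frames together? Is there an equation?
– Nicholas Kinar
Jul 2, 2012 at 4:01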

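If the analysis window function is centered over each sample with a step size of 1, do I zero-pad the beginning and end of the time-domain sequence so that the middle of the window can be centered over every sample (including the first and last samples of the time-domain sequence)?
– Nicholas Kinar
Jul 2, 2012 at 4:02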

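You can pick the step size, FFT size, and analysis and synthesis windows based on the specific needs of your application. One example is step size N, FFT size 2*N, analysis window Hanning, synthesis window all ones. You can modify that to analysis sqrt(Hanning) and synthesis sqrt(Hanning). Either one will work. It comes down to what you do in the frequency domain and what type of artifacts you may be creating, such as time-domain aliasing.
– Hilmar
Jul 2, 2012 at 13:37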
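Both window configurations from this comment can be checked numerically. The sketch below assumes a periodic Hann window of length 2*N, written out by hand so that no toolbox function is needed; the variable names are illustrative.

```matlab
% Check that both window pairs sum to one at step N, window length 2*N
N = 16;                                   % step size
L = 2*N;                                  % FFT size and window length
n = (0:L-1).';
hannWin = 0.5 - 0.5*cos(2*pi*n/L);        % periodic Hann (assumed form)

prod1 = hannWin .* ones(L, 1);            % analysis Hann, synthesis all ones
prod2 = sqrt(hannWin) .* sqrt(hannWin);   % sqrt(Hann) for analysis and synthesis

% with step N, exactly two frames overlap at every sample
sum1 = prod1(1:N) + prod1(N+1:L);
sum2 = prod2(1:N) + prod2(N+1:L);
```

In both cases the overlapped product of analysis and synthesis windows is 1 at every sample, which is why either choice reconstructs the signal when the spectrum is left untouched.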

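@Hilmar: I need to be able to make frequency-domain modifications to the signal and then take the IFFT to obtain a time-domain signal. I would like to minimize time-domain aliasing. I still do not understand how to take each sub-sequence back into the time domain and then add them together.
– Nicholas Kinar
Jul 2, 2012 at 18:18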

I have written some test code and updated the original question.
– Nicholas Kinar
Jul 2, 2012 at 23:52

Answer #2

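Seven years after this question was first asked, I ran into the same kind of confusion as @Nicholas Kinar. Here I would like to offer some "informal" and "not fully guaranteed to be correct" personal intuitions and explanations.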

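The headings of the statements below are exaggerated for ease of understanding.
The forward process of the STFT is not really meant to preserve the original signal. With a non-trivial window, what goes into the FFT is a skewed/stretched version of the original signal fragment.
This is useful for feature extraction, where useless or redundant data can be filtered out. In syllable detection, for example, not all of the temporal data is needed to detect certain tones in speech.
The peaks in the window vector mark the few positions in the audio signal that the algorithm should pay attention to.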

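So the raw result of the inverse STFT may not be what we intuitively expect: each inverse FFT returns the windowed signal fragment rather than the original one.

To obtain the original, un-windowed signal fragment, an inverse window can be applied to the raw output of the ifft.

It is easy to design a mapping function that undoes the effect of a Hann/Hamming window.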
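A minimal sketch of such an inverse window is shown below, assuming a Hamming window and no frequency-domain modification. A Hamming window never drops below 0.08, so the division is safe; a Hann window reaches zero at its ends, so those samples would need separate handling. The fragment `frag` is random stand-in data.

```matlab
% Undo a Hamming analysis window after the inverse FFT; frag is stand-in data
N = 32;
n = (0:N-1).';
w = 0.54 - 0.46*cos(2*pi*n/(N-1));     % Hamming window, minimum value 0.08
frag = randn(N, 1);                    % one un-windowed signal fragment

raw = real(ifft(fft(frag .* w)));      % raw inverse-FFT output: windowed fragment
recovered = raw ./ w;                  % apply the inverse window
err = max(abs(recovered - frag));      % round-off level
```

If the spectrum has been modified, the output is no longer exactly `frag .* w`, and dividing by small window values can amplify whatever error was introduced near the fragment edges.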

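A synthesis window then comes in to handle the overlap between time fragments.

Since the original, un-windowed signal fragments can be regarded as already recovered, any "transition weighting" can be used to interpolate the overlapping parts. ... signal, then there may be a way to design a corresponding synthesis window.
A simple synthesis-window generation algorithm can also be given by applying the following principles:

If the analysis window value at this position is high, give this fragment a higher weight than the other fragments that overlap the position.
If the analysis window value at this position is low, give this fragment a lower weight and let the other overlapping fragments with larger analysis window values take over the position.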
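One possible reading of these two principles is to weight each recovered fragment by its own analysis window value divided by the sum of the analysis window values of all fragments covering the same sample. The sketch below tries that interpretation; it is an assumption made for illustration and not necessarily the scheme the answer has in mind. The Hamming window, sizes, and names are illustrative.

```matlab
% Weights proportional to the analysis window, normalized over overlapping frames
N = 32; M = 8;                             % frame length and step size (assumed)
n = (0:N-1).';
w = 0.54 - 0.46*cos(2*pi*n/(N-1));         % analysis window (Hamming, assumed)
x = randn(200, 1);                         % stand-in signal
numFrames = floor((length(x) - N)/M) + 1;

wsum = zeros(size(x));                     % total analysis weight on each sample
for k = 1:numFrames
    idx = (k-1)*M + (1:N).';
    wsum(idx) = wsum(idx) + w;
end

y = zeros(size(x));
for k = 1:numFrames
    idx = (k-1)*M + (1:N).';
    raw = real(ifft(fft(x(idx) .* w)));    % raw inverse-FFT output
    frag = raw ./ w;                       % un-windowed fragment
    weight = w ./ wsum(idx);               % high where this frame's window is high
    y(idx) = y(idx) + weight .* frag;      % weighted overlap-add
end

covered = wsum > 0;                        % samples touched by at least one frame
err = max(abs(y(covered) - x(covered)));   % round-off level
```

Because un-windowing and then re-weighting by `w ./ wsum` cancels to `raw ./ wsum`, this interpretation reduces to plain overlap-add followed by division by the summed analysis window.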

These are interesting statements that can definitely help encourage thinking about the STFT.
– Nicholas Kinar
Nov 17, 2019 at 21:25