Chapter 8
Acoustic Array Processing for Speech Enhancement
Markus Buck, Eberhard Hänsler, Mohamed Krini, Gerhard Schmidt, Tobias Wolff
Eberhard Hänsler
Technische Universität Darmstadt, Darmstadt, Germany
Gerhard Schmidt
Harman/Becker Automotive Systems, Acoustic Signal Processing Research, Ulm, Germany
Book Editor(s): Simon Haykin, K. J. Ray Liu
Simon Haykin
Department of Electrical Engineering, McMaster University, Hamilton, Ontario, Canada
K. J. Ray Liu
Department of Electrical & Computer Engineering, University of Maryland, College Park, MD, USA
Summary
This chapter contains sections titled:
- Introduction
- Signal Processing in Subband Domain
- Multichannel Echo Cancellation
- Speaker Localization
- Beamforming
- Sensor Calibration
- Postprocessing
- Conclusions
- References
REFERENCES
- E. Hänsler and G. Schmidt, Acoustic Echo and Noise Control: A Practical Approach, Wiley, Hoboken, NJ, 2004. doi:10.1002/0471678406
- R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing, Prentice Hall, Englewood Cliffs, NJ, 1983. doi:10.1016/0165-1684(83)90013-0
- P. Vary, “Noise suppression by spectral magnitude estimation—Mechanism and theoretical limits,” Signal Process., vol. 8, no. 4, pp. 387–400, 1985.
- P. Vary, “An adaptive filterbank equalizer for speech enhancement,” Signal Process., vol. 86, pp. 1206–1214, June 2006.
- A. Sugiyama, T. P. Hua, M. Kato, and M. Serizawa, “Noise suppression with synthesis windowing and pseudo noise injection,” Proc. IEEE ICASSP '02, pp. 545–548, 2002.
- “Transmission planning aspects of the speech service in the GSM public land mobile network (PLMN) system,” ETS 300 903 (GSM 03.50), European Telecommunications Standards Institute, France, 1999.
- A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice Hall, Englewood Cliffs, NJ, 1989.
- S. Haykin, Adaptive Filter Theory, 4th ed., Prentice Hall, Englewood Cliffs, NJ, 2002.
- S. Gay and S. Tavathia, “The fast affine projection algorithm,” Proc. ICASSP '95, vol. 3, pp. 3023–3027, 1995.
- A. H. Sayed, Fundamentals of Adaptive Filtering, Wiley, Hoboken, NJ, 2003.
- G. Enzner and P. Vary, “Robust and elegant, purely statistical adaptation of acoustic echo canceler and postfilter,” Proc. IWAENC '03, pp. 43–46, 2003.
- Y. Joncour, A. Sugiyama, and A. Hirano, “DSP implementations and performance evaluation of a stereo echo canceller with pre-processing,” Proc. EUSIPCO '98, vol. 2, pp. 981–984, 1998.
- A. Sugiyama, Y. Joncour, and A. Hirano, “A stereo echo canceller with correct echo-path identification based on an input-sliding technique,” IEEE Trans. Signal Process., vol. 49, no. 11, pp. 2577–2587, 2001.
- A. Gilloire and V. Turbin, “Using auditory properties to improve the behaviour of stereophonic acoustic echo cancellation,” Proc. ICASSP '98, vol. 6, pp. 3681–3684, 1998.
- M. M. Sondhi and D. R. Morgan, “Stereophonic acoustic echo cancellation—An overview of the fundamental problem,” IEEE Signal Process. Lett., vol. 2, no. 8, pp. 148–151, 1995.
- Y. Huang, J. Benesty, and G. W. Elko, “Microphone arrays for video camera steering,” in Acoustic Signal Processing for Telecommunication, S. L. Gay and J. Benesty (Eds.), Kluwer Academic, Boston, 2001, pp. 239–260.
- J. H. DiBiase, H. F. Silverman, and M. S. Brandstein, “Robust source localization in reverberant rooms,” in Microphone Arrays, M. S. Brandstein and D. Ward (Eds.), Springer, Berlin, 2001, pp. 157–180. doi:10.1007/978-3-662-04619-7_8
- R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans. Antennas Propagat., vol. AP-34, no. 3, Mar. 1986.
- R. Kumaresan, “Spectral analysis,” in Handbook for Digital Signal Processing, S. K. Mitra and J. F. Kaiser (Eds.), Wiley, Hoboken, NJ, 1993, pp. 1143–1242.
- C. H. Knapp and G. C. Carter, “The generalized correlation method for estimation of time delay,” IEEE Trans. Acoust. Speech Signal Process., vol. 24, no. 4, pp. 320–327, 1976.
- J. Benesty, “Adaptive eigenvalue decomposition algorithm for passive acoustic source localization,” J. Acoust. Soc. Am., vol. 107, no. 1, pp. 384–391, Jan. 2000.
- G. Doblinger, “Localization and tracking of acoustical sources,” in Topics in Acoustic Echo and Noise Control, E. Hänsler and G. Schmidt (Eds.), Springer, Berlin, 2006, pp. 91–122.
- T. Wolff, M. Buck, and G. Schmidt, “A subband based source localization system for reverberant environments,” Proc. ITG '08, 2008.
- D. H. Johnson and D. E. Dudgeon, Array Signal Processing—Concepts and Techniques, Prentice Hall, Englewood Cliffs, NJ, 1993.
- J. L. Flanagan, D. A. Berkley, G. W. Elko, J. E. West, and M. M. Sondhi, “Autodirective microphone systems,” Acustica, vol. 73, pp. 58–71, 1991.
- J. Bitzer and K. U. Simmer, “Superdirective microphone arrays,” in Microphone Arrays, M. Brandstein and D. Ward (Eds.), Springer, Berlin, 2001, pp. 19–38. doi:10.1007/978-3-662-04619-7_2
- H. Cox, R. M. Zeskind, and M. M. Owen, “Robust adaptive beamforming,” IEEE Trans. Acoust. Speech Signal Process., vol. 35, no. 10, pp. 1365–1375, 1987.
- E. N. Gilbert and S. P. Morgan, “Optimum design of directive antenna arrays subject to random variation,” Bell Syst. Tech. J., vol. 34, pp. 637–663, 1955.
- O. L. Frost, III, “An algorithm for linearly constrained adaptive array processing,” Proc. IEEE, vol. 60, no. 8, pp. 926–935, 1972.
- L. J. Griffiths and C. W. Jim, “An alternative approach to linearly constrained adaptive beamforming,” IEEE Trans. Antennas Propagat., vol. 30, no. 1, pp. 24–34, 1982.
- C. W. Jim, “A comparison of two LMS constrained optimal array structures,” Proc. IEEE, vol. 65, no. 12, pp. 1730–1731, 1977.
- O. Hoshuyama, A. Sugiyama, and A. Hirano, “A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters,” IEEE Trans. Signal Process., vol. 47, no. 10, pp. 2677–2684, 1999.
- B. Widrow, K. M. Duvall, R. P. Gooch, and W. C. Newman, “Signal cancellation phenomena in adaptive antennas: Causes and cures,” IEEE Trans. Antennas Propagat., vol. 30, no. 3, pp. 469–478, 1982.
- D. Van Compernolle, “Switching adaptive filters for enhancing noisy and reverberant speech from microphone array recordings,” Proc. ICASSP '90, vol. 2, pp. 833–836, 1990.
- W. Herbordt, S. Nakamura, and W. Kellermann, “Joint optimization of LCMV beamforming and acoustic echo cancellation for automatic speech recognition,” Proc. ICASSP '05, vol. 3, pp. 77–80, 2005.
- W. H. Neo and B. Farhang-Boroujeny, “Robust microphone arrays using subband adaptive filters,” Proc. ICASSP '01, vol. 6, pp. 3721–3724, 2001.
- S. Gannot, D. Burshtein, and E. Weinstein, “Signal enhancement using beamforming and nonstationarity with applications to speech,” IEEE Trans. Signal Process., vol. 49, no. 8, pp. 1614–1626, 2001.
- M. Buck, T. Haulick, and H.-J. Pfleiderer, “Self-calibrating microphone arrays for speech signal acquisition: A systematic approach,” Signal Process., vol. 86, no. 6, pp. 1230–1238, 2006.
- W. Herbordt and W. Kellermann, “GSAEC—Acoustic echo cancellation embedded into the generalized sidelobe canceller,” Proc. EUSIPCO '00, vol. 3, pp. 1843–1846, 2000.
- W. Kellermann, “Acoustic echo cancellation for beamforming microphone arrays,” in Microphone Arrays, M. Brandstein and D. Ward (Eds.), Springer, Berlin, 2001, pp. 281–306. doi:10.1007/978-3-662-04619-7_13
- S. Doclo, M. Moonen, and E. De Clippel, “Combined acoustic echo and noise reduction using GSVD-based optimal filtering,” Proc. ICASSP '00, vol. 2, pp. 1051–1054, 2000.
- W. Herbordt, W. Kellermann, and S. Nakamura, “Joint optimization of acoustic echo cancellation and adaptive beamforming,” in Topics in Acoustic Echo and Noise Control, E. Hänsler and G. Schmidt (Eds.), Springer, Berlin, 2006.
- Z. Liu, M. L. Seltzer, A. Acero, I. Tashev, Z. Zhang, and M. Sinclair, “A compact multi-sensor headset for hands-free communication,” Proc. WASPAA '05, pp. 138–141, 2005.
- S. Nordholm, I. Claesson, and M. Dahl, “Adaptive microphone array employing calibration signals: An analytical evaluation,” IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 241–252, 1999.
- X. Zhang and J. H. L. Hansen, “CSA-BF: Novel constrained switched adaptive beamforming for speech enhancement and recognition in real car environments,” Proc. ICASSP '03, vol. 2, pp. 125–128, 2003.
- T. P. Hua, A. Sugiyama, and G. Faucon, “A new self-calibration technique for adaptive microphone arrays,” Proc. IWAENC '05, pp. 237–240, 2005.
- P. Oak and W. Kellermann, “A calibration algorithm for robust generalized sidelobe cancelling beamformers,” Proc. IWAENC '05, pp. 97–100, 2005.
- M. Buck, T. Haulick, and H.-J. Pfleiderer, “Microphone calibration for multi-channel signal processing,” in Topics in Speech and Audio Processing in Adverse Environments, E. Hänsler and G. Schmidt (Eds.), Springer, Berlin, 2008.
- G. W. Elko, “Superdirectional microphone arrays,” in Acoustic Signal Processing for Telecommunication, S. L. Gay and J. Benesty (Eds.), Kluwer, Boston, MA, 2000, pp. 181–237. doi:10.1007/978-1-4419-8644-3_10
- S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust. Speech Signal Process., vol. 27, no. 2, pp. 113–120, 1979.
- Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Trans. Acoust. Speech Signal Process., vol. 32, no. 6, pp. 1109–1121, 1984.
- T. Lotter and P. Vary, “Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model,” EURASIP J. Appl. Signal Process., pp. 1110–1126, July 2005.
- K. Linhard and T. Haulick, “Spectral noise subtraction with recursive gain curves,” Proc. ICSLP '98, vol. 4, pp. 1479–1482, 1998.
- R. Martin, “Noise power spectral density estimation based on optimal smoothing and minimum statistics,” IEEE Trans. Speech Audio Process., vol. 9, no. 5, pp. 504–512, 2001.
- E. Habets, “Multi-channel speech dereverberation based on a statistical model of late reverberation,” Proc. ICASSP '05, vol. 4, pp. 173–176, 2005.
- K. Lebart and J. M. Boucher, “A new method based on spectral subtraction for speech dereverberation,” Acustica, vol. 87, pp. 359–366, 2001.
- I. Tashev and D. Allred, “Reverberation reduction for improved speech recognition,” Proc. HSCMA '05, pp. 18–19, 2005.
- H. Kuttruff, Room Acoustics, 4th ed., Spon Press, London, 2000.
- M. Buck and A. Wolf, “Model-based dereverberation of single-channel speech signals,” Proc. DAGA '08, 2008.
- J. C. Junqua, “The influence of acoustics on speech production: A noise-induced stress phenomenon known as the Lombard reflex,” Speech Commun., vol. 20, no. 1, pp. 13–22, 1996.
- K. U. Simmer, J. Bitzer, and C. Marro, “Post-filtering techniques,” in Microphone Arrays, M. Brandstein and D. Ward (Eds.), Springer, Berlin, 2001, pp. 39–60. doi:10.1007/978-3-662-04619-7_3
- I. Cohen, S. Gannot, and B. Berdugo, “An integrated real-time beamforming and post-filtering system for nonstationary noise environments,” EURASIP J. Appl. Signal Process., pp. 1064–1073, Nov. 2003.
- T. Wolff and M. Buck, “Spatial maximum a posteriori post-filtering for arbitrary beamforming,” Proc. HSCMA '08, pp. 53–56, 2008.
- E. Zavarehei, S. Vaseghi, and Q. Yan, “Noisy speech enhancement using harmonic-noise model and codebook-based post-processing,” IEEE Trans. Speech Audio Process., vol. 15, no. 4, pp. 1194–1203, 2007.
- P. Vary and R. Martin, Digital Speech Transmission, Wiley, Hoboken, NJ, 2006. doi:10.1002/0470031743
- Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer design,” IEEE Trans. Commun., vol. COM-28, no. 1, pp. 84–95, 1980.
- W. Hess, Pitch Determination of Speech Signals, Springer, Berlin, 1983. doi:10.1007/978-3-642-81926-1
- M. R. Schroeder, “Period histogram and product spectrum: New methods for fundamental frequency measurements,” J. Acoust. Soc. Am., vol. 43, no. 4, pp. 829–834, 1968.
- M. Krini and G. Schmidt, “Spectral refinement and its application to fundamental frequency estimation,” Proc. IEEE WASPAA '07, pp. 251–254, 2007.
- H. Puder and O. Soffke, “An approach for an optimized voice-activity detector for noisy speech signals,” Proc. EUSIPCO '02, pp. 243–246, 2002.
- D. Hartmann, “Noise and voice quality in VoIP environments,” in Noise Reduction in Speech Applications, G. M. Davis (Ed.), CRC Press, Boca Raton, FL, 2002, pp. 277–304.
- S. J. Leese, “Echo cancellation,” in Noise Reduction in Speech Applications, G. M. Davis (Ed.), CRC Press, Boca Raton, FL, 2002, pp. 199–216.
- D. Van Compernolle and S. Van Gerven, “Beamforming with microphone arrays,” in Digital Signal Processing to Telecommunications, A. R. Figueiras-Vidal (Ed.), Cost 229, 1995, pp. 107–131.
- G. W. Elko, “Microphone array systems for hands-free telecommunication,” Speech Commun., vol. 20, pp. 229–240, 1996.
- W. Herbordt and W. Kellermann, “Computationally efficient frequency-domain robust generalized sidelobe canceller,” Proc. IWAENC '01, pp. 51–55, 2001.
- P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, 1992.