Electroglottographic Measures Based on GCI and GOI Detection Using Multiscale Product
Keywords:
wavelet transform, multi-scale product, electroglottographic signal, glottal closing instant, glottal opening instant
Abstract
This paper deals with the estimation of glottal parameters, such as local pitch and open quotient, from the electroglottographic (EGG) signal. The estimation is based on the glottal closing instants and glottal opening instants determined by a multi-scale product of this signal. The wavelet transform of the EGG signal is computed with a quadratic spline function. The wavelet coefficients, calculated at different dyadic scales, show modulus maxima at localized discontinuities of the EGG signal. The detected maxima and minima correspond to the glottal opening instants (GOIs) and glottal closing instants (GCIs). To improve the precision of the estimate, we compute the multi-scale product of the wavelet transform coefficients at three successive dyadic scales. This processing enhances edge detection. The multi-scale product is a nonlinear combination of successive scales; it reduces noise and spurious peaks. We apply a cubic root to the amplitude of the product to improve the representation of weak amplitudes. The method gives a good representation of GCIs and a better detection of GOIs. It was tested on the Keele University database; it is effective and robust in multiple cases, even for a typical signal showing undetermined GOIs and multiple peaks at GCIs. Finally, precise measurement of these instants allows accurate estimation of prosodic parameters such as local pitch and open quotient.
References
D. G. Childers, A. M. Smith and G. P. Moore, Relationships Between Electroglottograph, Speech, and Vocal Cord Contact, Folia Phoniatr., Vol. 36, pp. 105-118, 1984. http://dx.doi.org/10.1159/000265727
S. Mallat, A Wavelet Tour of Signal Processing, Second Edition, Academic Press, San Diego 1999.
A. Bouzid, and N. Ellouze, Local Regularity Analysis at Glottal Opening and Closing Instants in Electroglottogram Signal Using Wavelet Transform Modulus Maxima, in Proc. Eurospeech 2003, Geneve, pp. 2837-2840, 2003.
N. Henrich, C. d'Alessandro, M. Castellengo, On the Use of the Derivative of Electroglottographic Signals for Characterization of Non-Pathological Phonation, Journal of Acoustical Society of America, Vol. 115, pp. 1321-1332, 2004. http://dx.doi.org/10.1121/1.1646401
B. M. Sadler, T. Pham, and L. C. Sadler, Optimal and Wavelet Based Shock Wave Detection and Estimation, Journal of Acoustical Society of America, Vol. 104, no. 2, pp. 955-963, 1998. http://dx.doi.org/10.1121/1.423312
B. M. Sadler, and A. Swami, Analysis of Multiscale Products for Step Detection and Estimation, IEEE Trans. Inform. Theory, Vol. 45, no. 3, pp. 1043-1051, 1999. http://dx.doi.org/10.1109/18.761341
L. Zhang, and P. Bao, Edge Detection by Scale Multiplication in Wavelet Domain, Pattern Recognition Letters, Vol. 23, no. 14, pp. 1771-1784, 2002. http://dx.doi.org/10.1016/S0167-8655(02)00151-4
P. Bao, L. Zhang, and X. Wu, Canny Edge Detection Enhancement by Scale Multiplication, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 27, no. 9, pp. 1485-1490, 2005. http://dx.doi.org/10.1109/TPAMI.2005.173
Y. Xu, J. B. Weaver, D. M. Healy, and J. Lu, Wavelet Transform Domain Filters: A Spatially Selective Noise Filtration Technique, IEEE Trans. Image Processing, Vol. 3, no. 6, pp. 747-758, 1994. http://dx.doi.org/10.1109/83.336245
M. Rothenberg, and J. J. Mahshie, Monitoring Vocal Fold Abduction through Vocal Fold Contact Area, Journal of Speech and Hearing Research, Vol. 31, pp. 338-351, 1988. http://dx.doi.org/10.1044/jshr.3103.338
D. M. Howard, Variation of Electrolaryngographically Derived Closed Quotient for Trained and Untrained Adult Female Singers, Journal of Voice, Vol. 9, no. 2, pp. 1212-1223, 1995. http://dx.doi.org/10.1016/S0892-1997(05)80250-4
D. M. Howard, G. A. Lindsey, and B. Allen, Toward the Quantification of Vocal Efficiency, Journal of Voice, Vol. 4, no. 3, pp. 205-212, 1990. http://dx.doi.org/10.1016/S0892-1997(05)80015-3
D. G. Childers, and A. K. Krishnamurthy, A Critical Review of Electroglottography, CRC Critical Reviews in Biomedical Engineering, Vol. 12, pp. 131-161, 1985.
D. G. Childers, D. M. Hooks, G. P. Moore, L. Eskenazi, and A. L. Lalwani, Electroglottography and Vocal Fold Physiology, Journal of Speech Hearing Research, Vol. 33, pp. 245-254, 1990. http://dx.doi.org/10.1044/jshr.3302.245
D. G. Childers, and J. N. Lara, Electroglottography for Laryngeal Function Assessment and Speech Analysis, IEEE Trans. on Biomedical Engineering BME, Vol. 31, No. 12, pp. 807-817, 1985.
S. Anastaplo, and M. P. Karnell, Synchronized Videoscopic and Electroglottographic Examination of Glottal Opening, Journal of Acoustical Society of America, Vol. 83, no. 5, pp. 1883-1890, 1988. http://dx.doi.org/10.1121/1.396472
M. H. Hess, and M. Ludwigs, Strobophotoglottographic Transillumination as a Method for the Analysis of Vocal Fold Vibration Patterns, Journal of Voice, Vol. 14, no. 2, pp. 255-271, 2000. http://dx.doi.org/10.1016/S0892-1997(00)80034-X
F. Plante, G. F. Meyer, and W. A. Ainsworth, A Pitch Extraction Reference Database, in Proc. Eurospeech 1995, pp. 837-840, 1995.
A. Bouzid, N. Ellouze, Contribution à la Détection des Instants d'Ouverture et de Fermeture de la Glotte sur les Signaux de Parole Voisé par Transformée en Ondelettes, Thèse de doctorat, ENIT, Juillet 2004.
A. Witkin, Scale-Space Filtering, Proc. Int. Joint Conf. Artif. Intell., pp. 1019-1021, 1983.
J. Pérez, and A. Bonafonte, Automatic Voice-Source Parametrization of Natural Speech, in Proc. ICSLP 2005, Lisboa, Portugal, 2005.
A. Rosenfeld, A Non Linear Edge Detection, Proc. IEEE, Vol. 58, pp. 814-816, 1970. http://dx.doi.org/10.1109/PROC.1970.7756
S. Mallat, and S. Zhong, Characterization of Signals from Multiscale Edges, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14, no. 7, pp. 710-732, 1992. http://dx.doi.org/10.1109/34.142909
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free, under the terms of the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.