Convolutional Transformer With Adaptive Position Embedding For Covid-19 Detection From Cough Sounds

Authored by
Tianhao Yan, Hao Meng, Shuo Liu, Emilia Parada-Cabaleiro, Zhao Ren, Björn W. Schuller

Covid-19 has caused a major health crisis worldwide over the past two years. Although early detection of the virus through nucleic acid screening can considerably reduce its spread, the efficiency of this diagnostic process is limited by its complexity and cost. Hence, an effective and inexpensive method for the early detection of Covid-19 is still needed. Considering that the cough of an infected person contains a large amount of information, we propose an algorithm for the automatic recognition of Covid-19 from cough signals. Our approach generates static log-Mel spectrograms with deltas and delta-deltas from the cough signal and subsequently extracts feature maps through a Convolutional Neural Network (CNN). Following the advances in transformers in the realm of deep learning, our proposed architecture exploits a novel adaptive position embedding structure which can learn the position information of the features from the CNN output. This lets the transformer structure rapidly lock onto the locations of the attention features by overlaying the learned position information with the CNN output, which yields better classification. The effectiveness of the proposed architecture is shown by the improvement over the baseline in our experimental results on the INTERSPEECH 2021 Computational Paralinguistics Challenge CCS (Coughing Sub Challenge) database, where it reached 72.6 % UAR (Unweighted Average Recall).
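The UAR metric reported above is the mean of the per-class recalls, so a small positive class counts as much as a large negative one. The following is a minimal sketch of that computation, not code from the paper; the function name `unweighted_average_recall` and the toy labels are assumptions made purely for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall of a class = correct predictions / samples of that class.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not data from the paper): 8 negative, 2 positive.
y_true = ["negative"] * 8 + ["positive"] * 2
y_pred = ["negative"] * 8 + ["negative", "positive"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case the negative class has a recall of 1.0 and the positive class a recall of 0.5, so the UAR is 0.75 while plain accuracy is 0.90. This gap is why UAR tends to be preferred over accuracy when the classes are imbalanced.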

L3S Research Center
External organisation(s)
Harbin Engineering University
Universität Augsburg
Johannes Kepler Universität Linz (JKU)
Imperial College London
Paper in conference proceedings
Number of pages
ASJC Scopus subject areas
Software, Signal Processing, Electrical and Electronic Engineering
Electronic version(s) (Access: Closed)