Publication date: Available online 14 April 2018
Source: Computer Speech & Language
Author(s): Constantin Spille, Birger Kollmeier, Bernd T. Meyer
Former comparisons of human speech recognition (HSR) and automatic speech recognition (ASR) have shown that humans outperform ASR systems in nearly all speech recognition tasks. However, recent progress in ASR has led to substantial improvements in recognition accuracy, and it is therefore unclear how large the task-dependent human-machine gap remains. This paper investigates this gap between HSR and ASR based on deep neural networks (DNNs) in different acoustic conditions, with the aim of comparing differences and identifying processing strategies that should be considered in ASR. We find that DNN-based ASR reaches human performance for single-channel, small-vocabulary tasks in the presence of speech-shaped noise and in multi-talker babble noise, which is an important difference from previous human-machine comparisons: the speech reception threshold (SRT), i.e., the signal-to-noise ratio at which the word recognition rate is 50%, is about -7 to -8 dB for both HSR and ASR. However, in more complex spatial scenes with diffuse noise and moving talkers, the SRT gap amounts to approximately 12 dB. Based on cross comparisons that use oracle knowledge (e.g., the speakers' true position), incorrect responses are attributed to localization errors or to missing pitch information needed to distinguish between speakers of different gender. In terms of the SRT, localization errors and missing spectral information amount to 2.1 and 3.2 dB, respectively. The comparison hence identifies specific components in ASR that can profit from learning from auditory signal processing.
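To make the SRT definition concrete, here is a minimal sketch of how an SRT could be read off a set of measurements: it linearly interpolates the signal-to-noise ratio at which the word recognition rate crosses 50%. The function name `estimate_srt` and the sample values are hypothetical and chosen only for illustration; they are not the paper's data, and the abstract does not say how the authors fitted their psychometric functions.

```python
def estimate_srt(snrs_db, word_rates, target=0.5):
    """Return the SNR (dB) at which the word recognition rate crosses `target`.

    Uses linear interpolation between the two measurements that bracket the
    target rate. This is an illustrative simplification, not the fitting
    procedure used in the paper.
    """
    points = sorted(zip(snrs_db, word_rates))  # order measurements by SNR
    for (snr_lo, rate_lo), (snr_hi, rate_hi) in zip(points, points[1:]):
        if rate_lo <= target <= rate_hi and rate_hi != rate_lo:
            frac = (target - rate_lo) / (rate_hi - rate_lo)
            return snr_lo + frac * (snr_hi - snr_lo)
    raise ValueError("target rate is not bracketed by the measurements")


# Hypothetical measurements (invented for this example only).
snrs = [-12.0, -9.0, -6.0, -3.0]
listener_rates = [0.00, 0.25, 0.75, 1.00]
system_rates = [0.00, 0.00, 0.25, 0.75]

srt_listener = estimate_srt(snrs, listener_rates)
srt_system = estimate_srt(snrs, system_rates)
print(f"SRT gap: {srt_system - srt_listener:.1f} dB")
```

A lower SRT means the words are still recognized at a worse signal-to-noise ratio, so a positive gap here means the hypothetical system needs a more favorable SNR than the hypothetical listener to reach the same 50% rate.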
from #ORL-AlexandrosSfakianakis via ola Kala on Inoreader https://ift.tt/2qwFdYO