New AI model detects early neurological disorders through speech

A research team led by Prof. Li Hai at the Institute of Health and Medical Technology, Hefei Institutes of Physical Science of the Chinese Academy of Sciences, has developed a novel deep learning framework that significantly improves both the accuracy and the interpretability of detecting neurological disorders through speech.

"A slight change in the way we speak might be more than just a slip of the tongue-it could be a warning sign from the brain," said Prof. Li Hai, who led the team, "Our new model can detect early symptoms of neurological diseases like Parkinson' s, Huntington' s, and Wilson disease-by analyzing voice recordings."

The study was recently published in Neurocomputing.

Dysarthria is a common early symptom of various neurological disorders. Given that these speech abnormalities often reflect underlying neurodegenerative processes, voice signals have emerged as promising non-invasive biomarkers for early screening and continuous monitoring of such conditions. Automated speech analysis offers high efficiency, low cost, and non-invasiveness. However, current mainstream methods often suffer from over-reliance on handcrafted features, limited capacity to model temporal-variable interactions, and poor interpretability.

To address these challenges, the team proposed Cross-Time and Cross-Axis Interactive Transformer (CTCAIT) for multivariate time series analysis. This framework first employs a large-scale audio model to extract high-dimensional temporal features from speech, representing them as multidimensional embeddings along time and feature axes. It then leverages the Inception Time network to capture multi-scale and multi-level patterns within the time series. By integrating cross-time and cross-channel multi-head attention mechanisms, CTCAIT effectively captures pathological speech signatures embedded across different dimensions.
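The paper's framework combines learned feature extraction with cross-time and cross-channel multi-head attention. The sketch below illustrates only the core idea of that dual attention: given a speech-feature map laid out along time and channel axes, attention is applied once along each axis and the two views are fused. It is a minimal numpy illustration with single-head attention and identity projections; the shapes, the additive fusion, and all function names are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (seq_len, dim). Scaled dot-product self-attention with
    # identity Q/K/V projections, kept single-head for brevity.
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1) @ x

def cross_axis_attention(emb):
    # emb: (T, C) feature map — T time frames by C feature channels.
    time_mixed = self_attention(emb)      # each channel attends across time steps
    chan_mixed = self_attention(emb.T).T  # each time step attends across channels
    return time_mixed + chan_mixed        # fuse the two interaction views

# Toy input standing in for embeddings from a pretrained audio model
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 16))  # 50 frames, 16 feature channels
out = cross_axis_attention(emb)
print(out.shape)  # (50, 16)
```

In the actual CTCAIT model these attention blocks are multi-head, use learned projections, and operate on much higher-dimensional embeddings produced by the large-scale audio model and the Inception Time network.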

The method achieved a detection accuracy of 92.06% on a Mandarin Chinese dataset and 87.73% on an external English dataset, demonstrating strong cross-linguistic generalizability.

Furthermore, the team conducted interpretability analyses of the model's internal decision-making processes and systematically compared the effectiveness of different speech tasks, offering valuable insights for its potential clinical deployment.

These efforts provide important guidance for potential clinical applications of the method in early diagnosis and monitoring of neurological disorders.

Journal reference:

Zhang, Z., et al. (2025). Multivariate time series approach integrating cross-temporal and cross-channel attention for dysarthria detection from speech. Neurocomputing. doi.org/10.1016/j.neucom.2025.130708
