Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters
Depression remains one of the most prevalent mental health disorders worldwide, yet it is frequently underdiagnosed in primary care settings. Timely identification of depression is crucial for effective treatment and improved patient outcomes. Recent advancements in digital scribing technologies have facilitated the recording of clinical encounters, providing a unique opportunity to harness naturalistic dialogue for the detection of depression.
The study, “Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters,” posted on arXiv, explores whether automated systems can identify depression from audio recordings of primary care visits. The researchers analyzed 1,108 audio-recorded primary care encounters from the Establishing Focus study, with depression status determined by the Patient Health Questionnaire-9 (PHQ-9). Of these encounters, 253 (about 23%) were classified as depressed and 855 as non-depressed.
Methodology
To evaluate the effectiveness of automated depression detection, the researchers compared three supervised learning approaches:
- Sentence-BERT combined with Logistic Regression (LR)
- LIWC (Linguistic Inquiry and Word Count) with Logistic Regression
- ModernBERT
Additionally, they included a zero-shot approach using GPT-OSS. Model performance was measured with the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC).
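To make the two metrics concrete, the following is a minimal pure-Python sketch of how AUROC and AUPRC can be computed from labels and model scores. AUPRC is approximated here by average precision, one common estimator; the study's exact computation is not described in this summary. The function names and the toy labels and scores are illustrative assumptions, not data from the paper.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Simplification: tied scores are ranked in input order rather than
    being handled as a group.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy example (made-up values): 1 = depressed, 0 = non-depressed.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print(f"AUROC: {auroc(labels, scores):.3f}")
print(f"AUPRC (average precision): {average_precision(labels, scores):.3f}")
```

AUROC is insensitive to class balance, while AUPRC depends on how rare the positive class is, which is why both are useful when only about a quarter of encounters are labeled depressed.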
Findings
Zero-shot GPT-OSS achieved the strongest overall performance, with an AUPRC of 0.510 and an AUROC of 0.774. Among the supervised models, LIWC+LR was competitive, with an AUPRC of 0.500 and an AUROC of 0.742.
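AUPRC values are easier to interpret against a chance baseline. A classifier that scores encounters at random has an expected AUPRC roughly equal to the prevalence of the positive class, so the short calculation below derives that baseline from the class counts given earlier and expresses the reported AUPRCs as a multiple of it. This is an interpretive aid computed from the figures in this article; the paper may report its own baselines differently.

```python
# Class counts reported for the study cohort.
n_depressed = 253
n_total = 1108

# Expected AUPRC of a random classifier is approximately the prevalence.
prevalence = n_depressed / n_total

# AUPRC values quoted in this article.
reported = {"GPT-OSS (zero-shot)": 0.510, "LIWC+LR": 0.500}

# Ratio of each reported AUPRC to the chance baseline.
lift = {name: auprc / prevalence for name, auprc in reported.items()}

print(f"Chance-level AUPRC (prevalence): {prevalence:.3f}")
for name, ratio in lift.items():
    print(f"{name}: {ratio:.2f}x the chance baseline")
```

Both models land at a little over twice the chance-level AUPRC, while the chance level for AUROC is 0.5 regardless of prevalence.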
One noteworthy finding was that combined dyadic transcripts, which capture the dialogue between patient and provider, outperformed single-speaker configurations. The analysis indicated that providers often mirrored patients' linguistic cues during encounters involving depression, suggesting that the interaction carries signal that is not evident when either party's speech is analyzed in isolation.
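The three input configurations can be pictured with a small sketch. It assumes a transcript is available as a list of (speaker, utterance) turns with the speaker labels "patient" and "provider"; that data format, the function name `build_views`, and the sample dialogue are hypothetical and are not taken from the study.

```python
def build_views(turns):
    """Return patient-only, provider-only, and combined text from speaker turns."""
    patient = " ".join(text for speaker, text in turns if speaker == "patient")
    provider = " ".join(text for speaker, text in turns if speaker == "provider")
    # The combined view keeps turn order and marks who is speaking.
    combined = " ".join(f"{speaker.upper()}: {text}" for speaker, text in turns)
    return {"patient": patient, "provider": provider, "combined": combined}


# Made-up dialogue for illustration only.
turns = [
    ("provider", "How have you been sleeping?"),
    ("patient", "Not well, I wake up most nights."),
    ("provider", "That sounds exhausting."),
]

views = build_views(turns)
for name, text in views.items():
    print(f"{name}: {text}")
```

Only the combined view preserves turn-taking, which is where any mirroring between provider and patient language would be visible to a model.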
Furthermore, the study found that meaningful detection of depression could be achieved from the first 128 patient tokens alone, yielding an AUPRC of 0.356 and an AUROC of 0.675. Although this is below the best overall results, it points to the potential for in-the-moment clinical decision support, because a usable linguistic signal emerges early in the encounter.
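The early-window setup can be sketched as taking a fixed token budget from the start of the patient's speech. The example below assumes whitespace tokenization and a transcript stored as (speaker, utterance) turns; the study presumably counts tokens with a model tokenizer, so the function `first_patient_tokens`, the data format, and the sample dialogue are illustrative assumptions only.

```python
def first_patient_tokens(turns, budget=128):
    """Collect up to `budget` whitespace tokens from patient turns, in order."""
    tokens = []
    for speaker, text in turns:
        if speaker != "patient":
            continue  # provider speech is excluded from this view
        tokens.extend(text.split())
        if len(tokens) >= budget:
            break  # enough tokens collected; stop reading further turns
    return tokens[:budget]


# Made-up dialogue for illustration only.
turns = [
    ("provider", "What brings you in today?"),
    ("patient", "I have been feeling tired all the time"),
    ("provider", "How long has that been going on?"),
    ("patient", "A few months and I am not sleeping well"),
]

early = first_patient_tokens(turns, budget=5)
print(" ".join(early))
print(len(first_patient_tokens(turns)))  # shorter than 128, so all patient tokens
```

In a live setting, the same function could be called as the transcript grows, with scoring triggered once the budget is reached.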
Conclusion
These findings support the use of passively collected clinical audio as a low-burden complement to existing depression screening workflows. With automated analysis of encounter language, primary care providers may be able to identify depression that would otherwise go undetected, which could in turn improve patient care and outcomes.
The study encourages further exploration of automated linguistic analysis in clinical settings, highlighting its role in bridging gaps in mental health diagnosis and treatment.
