Generative AI–Based Multilingual Multimodal Framework for Depression Detection
Keywords:
Depression Detection, Artificial Intelligence (AI), Generative AI, Multimodal Models, Multilingual Analysis
Abstract
Depression is a prevalent psychological condition that is difficult to detect at an early stage because its clinical manifestations are multifaceted and clinical diagnosis is subjective. Recent advances in generative artificial intelligence, deep learning, and machine learning have made it feasible to identify depression digitally from how the illness manifests in speech, facial expressions, text, and physiological and behavioral characteristics. This review examines the use of large language models, multimodal models, and unimodal models for digital depression identification in multilingual contexts. According to the reviewed results, multimodal systems consistently outperform unimodal systems, and transformer and cross-attention architectures are better at capturing cross-modal relationships, a crucial component of clinical decision support. Large language models show potential for few-shot learning, multilingual analysis, transcript-based severity estimation, data generation for simulation, and transparent clinical decision support systems. The review aims to provide a systematic overview of current methodologies and to identify areas for future research; however, much work remains before such applications are widespread, culture-independent, and able to reflect ordinal severity levels.
License
Copyright (c) 2026 Southern Journal of Computer Science

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.