Temporal Dropout Risk in Learning Analytics: A Harmonized Survival Benchmark Across Dynamic and Early-Window Representations
arXiv:2604.08870v1 (cross-listed)
Abstract
Student dropout is a persistent concern in Learning Analytics, yet comparative studies often evaluate predictive models under heterogeneous protocols that prioritize discrimination over temporal interpretability and calibration. This study introduces a survival-oriented benchmark for modeling temporal dropout risk on the Open University Learning Analytics Dataset (OULAD).
Methodology
The study compares two harmonized arms: a dynamic weekly arm, whose models operate on a person-period representation (one row per student per week at risk), and a comparable continuous-time arm with an expanded roster of modeling families. The continuous-time families include:
- Tree-based survival models
- Parametric models
- Neural models
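The person-period representation used by the dynamic weekly arm can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names `to_person_period` and `weekly_hazard`, the field names, and the toy cohort are all assumptions introduced here, and no OULAD data is used.

```python
from collections import defaultdict

def to_person_period(records):
    """Expand one row per student into one row per student-week at risk.

    records: iterable of (student_id, last_week, dropped), where last_week is
    the last observed week (1-based) and dropped is True if the student
    withdrew in that week, False if the student was censored.
    """
    rows = []
    for student_id, last_week, dropped in records:
        for week in range(1, last_week + 1):
            # The event flag is 1 only in the week the withdrawal occurs.
            event = int(dropped and week == last_week)
            rows.append({"student_id": student_id, "week": week, "event": event})
    return rows

def weekly_hazard(rows):
    """Empirical discrete hazard: withdrawals in a week / students at risk that week."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        at_risk[row["week"]] += 1
        events[row["week"]] += row["event"]
    return {week: events[week] / at_risk[week] for week in sorted(at_risk)}

# Hypothetical toy cohort (not OULAD data).
cohort = [("s1", 3, True), ("s2", 2, False), ("s3", 4, True), ("s4", 4, False)]
rows = to_person_period(cohort)
hazard = weekly_hazard(rows)
```

Each student contributes rows only for the weeks in which they are still enrolled, so censored students (here `s2` and `s4`) add to the at-risk count without ever producing an event.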
Evaluation Protocol
The evaluation protocol consists of four analytical layers that assess:
- Predictive performance
- Ablation
- Explainability
- Calibration
Results are reported separately for each arm, because a single cross-arm ranking is not methodologically warranted.
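To make the predictive-performance and calibration layers concrete, the sketch below computes a horizon-specific Brier score and a simple integrated Brier score. It assumes every student's enrollment status is known at each horizon (no censoring before the horizon); real survival benchmarks typically need censoring-aware weighting, which is omitted here. The function names and the toy numbers are hypothetical.

```python
def brier_at_horizon(event_times, surv_probs, horizon):
    """Mean squared gap between predicted survival and observed status at `horizon`.

    event_times: dropout week per student (float('inf') for students who never drop out).
    surv_probs: predicted probability of still being enrolled after `horizon`.
    Assumes no student is censored before `horizon`.
    """
    total = 0.0
    for t, s in zip(event_times, surv_probs):
        still_enrolled = 1.0 if t > horizon else 0.0
        total += (still_enrolled - s) ** 2
    return total / len(event_times)

def integrated_brier(horizons, scores):
    """Trapezoidal average of horizon-specific scores over the horizon range."""
    area = 0.0
    for k in range(1, len(horizons)):
        width = horizons[k] - horizons[k - 1]
        area += 0.5 * (scores[k] + scores[k - 1]) * width
    return area / (horizons[-1] - horizons[0])

# Hypothetical toy data: dropout weeks and predicted survival at weeks 4, 8, 12.
event_times = [5, float("inf"), 12, float("inf")]
predictions = {
    4: [0.9, 0.9, 0.9, 0.9],
    8: [0.5, 0.9, 0.7, 0.9],
    12: [0.2, 0.8, 0.4, 0.8],
}
horizons = sorted(predictions)
scores = [brier_at_horizon(event_times, predictions[h], h) for h in horizons]
ibs = integrated_brier(horizons, scores)
```

Lower values are better for both quantities. The horizon-specific scores show where in the course a model is accurate, while the integrated score summarizes accuracy over the whole horizon range.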
Key Findings
Within the comparable continuous-time arm, the Random Survival Forest led on discrimination and on horizon-specific Brier scores. Within the dynamic weekly arm, the Poisson Piecewise-Exponential model led narrowly on integrated Brier score within a tight cluster of five families. No-refit bootstrap sampling variability qualifies these positions as directional signals rather than claims of absolute superiority.
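The idea behind a no-refit bootstrap can be sketched as follows: the fitted models and their held-out predictions stay fixed, and only the evaluation students are resampled. This is an illustrative assumption about the general technique, not the study's exact procedure; `bootstrap_difference`, the resample count, and the toy predictions are hypothetical.

```python
import random

def squared_errors(outcomes, predictions):
    """Per-student squared error between a binary outcome and a predicted risk."""
    return [(y - p) ** 2 for y, p in zip(outcomes, predictions)]

def bootstrap_difference(errors_a, errors_b, n_boot=2000, seed=0):
    """95% percentile interval for mean(errors_a) - mean(errors_b), paired by student.

    Models are not refit: only the held-out students are resampled with replacement.
    """
    rng = random.Random(seed)
    n = len(errors_a)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(errors_a[i] - errors_b[i] for i in idx) / n)
    diffs.sort()
    lower = diffs[round(0.025 * n_boot)]
    upper = diffs[round(0.975 * n_boot) - 1]
    return lower, upper

# Hypothetical held-out outcomes (1 = dropped out) and risk predictions from two models.
outcomes = [0, 1, 0, 1, 1, 0, 0, 1]
model_a = [0.1, 0.8, 0.3, 0.7, 0.6, 0.2, 0.4, 0.9]
model_b = [0.2, 0.7, 0.3, 0.6, 0.7, 0.3, 0.4, 0.8]
errors_a = squared_errors(outcomes, model_a)
errors_b = squared_errors(outcomes, model_b)
lower, upper = bootstrap_difference(errors_a, errors_b)
```

If the resulting interval for the score difference spans zero, the ordering of the two models is best read as a directional signal. Because the models are never refit, the interval reflects evaluation-sample variability only, not variability from training.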
Ablation and Explainability
Ablation and explainability analyses, each under its own operational definition, converged across model families: the dominant predictive signal was temporal and behavioral rather than demographic or structural. This result diverges from traditional expectations that treat demographic or structural factors as the primary predictors of dropout risk.
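One common way to run a feature-group ablation is to neutralize a group of inputs and measure the drop in discrimination. The sketch below does this with Harrell's concordance index and a hand-written scoring function. Everything here is an assumption for illustration: the study's operational definition of ablation may differ, `risk_score` is a hypothetical stand-in for a fitted model, and the toy data are constructed so that the behavioral features carry the signal, so the output demonstrates the mechanics and is not evidence.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the earlier dropout has higher risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when student i is observed to drop out before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def ablate_group(rows, columns):
    """Replace the given columns with their cohort mean, removing their signal."""
    means = {c: sum(r[c] for r in rows) / len(rows) for c in columns}
    return [{**r, **means} for r in rows]

def risk_score(row):
    """Hypothetical stand-in for a fitted model (coefficients are made up)."""
    return -0.3 * row["weekly_clicks"] - 1.0 * row["submitted"] + 0.2 * row["age_band"]

# Hypothetical toy cohort (not OULAD data).
rows = [
    {"weekly_clicks": 2, "submitted": 0, "age_band": 1},
    {"weekly_clicks": 10, "submitted": 1, "age_band": 0},
    {"weekly_clicks": 5, "submitted": 1, "age_band": 2},
    {"weekly_clicks": 12, "submitted": 1, "age_band": 1},
]
times = [3, 20, 9, 20]
events = [1, 0, 1, 0]

def c_index_for(data):
    return concordance_index(times, events, [risk_score(r) for r in data])

c_full = c_index_for(rows)
c_no_behavior = c_index_for(ablate_group(rows, ["weekly_clicks", "submitted"]))
c_no_demographic = c_index_for(ablate_group(rows, ["age_band"]))
```

Comparing `c_full` against each ablated value shows how much discrimination each feature group contributes for this scoring function. A large drop after neutralizing a group indicates that the model relies on it.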
Calibration Insights
Calibration analyses corroborated this pattern in the better-discriminating models. The exception was the XGBoost Accelerated Failure Time (AFT) model, which exhibited systematic bias in its predictions.
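A basic calibration check compares predicted dropout risk with observed dropout frequency within groups of students ranked by predicted risk. The sketch below assumes a fixed horizon at which every student's status is known (no censoring). The function name, bin count, and toy numbers are hypothetical, and nothing here reproduces the study's results or the direction of the bias it reports.

```python
def calibration_table(outcomes, predicted_risk, n_bins=3):
    """Sort students by predicted risk, split into bins, and compare the mean
    predicted risk with the observed dropout rate in each bin.

    Returns a list of (mean_predicted, observed_rate, gap) tuples, where
    gap = mean_predicted - observed_rate.
    """
    pairs = sorted(zip(predicted_risk, outcomes))
    size = len(pairs) // n_bins
    table = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, observed, mean_pred - observed))
    return table

# Hypothetical toy data: 1 = dropped out by the horizon.
outcomes = [0, 0, 0, 1, 1, 0]
predicted_risk = [0.2, 0.3, 0.5, 0.6, 0.8, 0.9]
table = calibration_table(outcomes, predicted_risk)
same_sign = all(gap > 0 for _, _, gap in table) or all(gap < 0 for _, _, gap in table)
```

Gaps that scatter around zero suggest reasonable calibration. Gaps that share one sign in every bin, as in this toy example, are the signature of a systematic bias.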
Conclusion
These findings underscore the value of a harmonized, multi-dimensional benchmark in Learning Analytics. They also position dropout risk as a temporal-behavioral process rather than a function of static background attributes. The study is intended to inform future research and practical applications in Learning Analytics aimed at improving predictive accuracy and reducing dropout.
