A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout
Summary: arXiv:2604.08874v1
This study proposes a temporal modeling framework with a counterfactual policy-simulation layer for analyzing student dropout in higher education. Using Learning Management System (LMS) engagement data together with administrative withdrawal records, it treats dropout as a time-to-event outcome at the enrollment level.
Key Findings
The study operationalizes dropout as a weekly risk in discrete time and estimates it with penalized, class-balanced logistic regression fitted to person-period rows. Under a late-event temporal holdout, the model achieves the following row-level Area Under the Curve (AUC) scores:
- Training AUC: 0.8350
- Testing AUC: 0.8405
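To make the person-period layout concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each enrollment is described by a last observed week and a withdrawal flag; the names `Enrollment`, `to_person_period`, and `balanced_class_weights` are hypothetical. The weighting uses the common n / (2 * n_class) heuristic, which is one possible reading of "class-balanced"; the source does not say which scheme was used.

```python
from typing import NamedTuple


class Enrollment(NamedTuple):
    # Hypothetical record layout; the source does not specify field names.
    enrollment_id: str
    last_week: int   # last week observed (1-based)
    withdrew: bool   # True if an administrative withdrawal was recorded


def to_person_period(enrollments):
    """Expand each enrollment into one row per week at risk.

    The event flag is 1 only on the final observed week of an enrollment
    that ended in withdrawal; every other row is 0.
    """
    rows = []
    for e in enrollments:
        for week in range(1, e.last_week + 1):
            event = int(e.withdrew and week == e.last_week)
            rows.append((e.enrollment_id, week, event))
    return rows


def balanced_class_weights(rows):
    """Per-class weights n / (2 * n_class), an assumed balancing heuristic."""
    n = len(rows)
    n_pos = sum(event for _, _, event in rows)
    n_neg = n - n_pos
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


# Illustrative data only: A withdraws in week 3, B is censored after week 4.
rows = to_person_period([
    Enrollment("A", 3, True),
    Enrollment("B", 4, False),
])
weights = balanced_class_weights(rows)
```

Each tuple in `rows` is one training row for the weekly risk model, so rare event rows receive a larger weight than the many non-event rows.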
Aggregate calibration is acceptable, but the highest-risk bins are sparsely supported, so calibration estimates in those bins rest on few observations.
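Sparse support in high-risk bins can be seen by tabulating how many rows fall into each bin alongside the predicted and observed rates. The sketch below assumes equal-width probability bins, which the source does not specify; the function name `calibration_table` and the example values are illustrative and are not results from the study.

```python
def calibration_table(probs, events, n_bins=10):
    """Return (lower, upper, count, mean_predicted, observed_rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, events):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    table = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins instead of dividing by zero
        count = len(members)
        mean_pred = sum(p for p, _ in members) / count
        obs_rate = sum(y for _, y in members) / count
        table.append((i / n_bins, (i + 1) / n_bins, count, mean_pred, obs_rate))
    return table


# Made-up predictions: most rows are low risk, only one sits in the top bin.
probs = [0.02, 0.03, 0.05, 0.08, 0.12, 0.15, 0.31, 0.92]
events = [0, 0, 0, 0, 0, 1, 0, 1]
table = calibration_table(probs, events, n_bins=5)
```

In this toy input the top bin holds a single row, so its observed rate is either 0 or 1 and says little about calibration there.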
Methodology and Analysis
Ablation analyses show that performance is sensitive to the composition of the feature set. This underscores the importance of temporal engagement signals: the timing and nature of student interactions with coursework may play a pivotal role in predicting dropout risk.
Counterfactual Policy Simulation
The framework adds a scenario-indexed policy layer that generates survival contrasts, denoted $\Delta S(T)$, under an explicit trigger and schedule contract. The findings are as follows:
- Positive contrasts are primarily observed within the shock branch, with the following values at $T_{\rm policy}=18$:
  - 0.0102
  - 0.0260
  - 0.0819
- In contrast, the mechanism-aware branch displays negative values:
  - $\Delta S_{\rm mech}(18) = -0.0078$
  - $\Delta S_{\rm mech}(38) = -0.0134$
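A survival contrast of this kind can be computed from weekly hazards using the standard discrete-time identity $S(T) = \prod_{t \le T} (1 - h_t)$. The sketch below assumes that the contrast is the scenario survival minus the baseline survival, averaged over enrollments. That sign convention and the averaging are assumptions, since the summary does not spell out the definition, and the hazard values are invented for illustration.

```python
from math import prod


def survival(hazards, T):
    """S(T): product of (1 - h_t) over weeks 1..T for one enrollment."""
    return prod(1.0 - h for h in hazards[:T])


def survival_contrast(baseline, scenario, T):
    """Mean over enrollments of S_scenario(T) - S_baseline(T) (assumed definition)."""
    diffs = [survival(s, T) - survival(b, T) for b, s in zip(baseline, scenario)]
    return sum(diffs) / len(diffs)


# Invented weekly hazards for two enrollments over three weeks.
baseline = [[0.10, 0.10, 0.10], [0.20, 0.20, 0.20]]
# Hypothetical scenario that halves the hazard from week 2 onward.
scenario = [[0.10, 0.05, 0.05], [0.20, 0.10, 0.10]]

delta = survival_contrast(baseline, scenario, T=3)
```

Under this convention a positive value means the scenario retains more enrollments through week $T$ than the baseline, and a negative value means it retains fewer.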
Subgroup Analysis
A subgroup analysis by gender uses bootstrap methods to quantify scenario-induced survival gaps. The contrasts are directionally stable but small in magnitude.
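One way to attach uncertainty to a subgroup gap is a percentile bootstrap over enrollments. The sketch below assumes per-enrollment survival contrasts are available for two groups and resamples each group independently. The summary does not describe the resampling scheme actually used, so the function `bootstrap_gap_ci`, its defaults, and the numbers are illustrative only.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def bootstrap_gap_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    gaps = []
    for _ in range(n_boot):
        resample_a = rng.choices(group_a, k=len(group_a))  # with replacement
        resample_b = rng.choices(group_b, k=len(group_b))
        gaps.append(mean(resample_a) - mean(resample_b))
    gaps.sort()
    lo = gaps[int((alpha / 2) * n_boot)]
    hi = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return mean(group_a) - mean(group_b), lo, hi


# Invented per-enrollment contrasts for two subgroups.
group_a = [0.012, 0.008, 0.015, 0.010, 0.009, 0.011]
group_b = [0.007, 0.009, 0.006, 0.010, 0.008, 0.005]

point, lo, hi = bootstrap_gap_ci(group_a, group_b)
```

A gap is directionally stable in this sense when most resampled gaps share the sign of the point estimate, even if the magnitude is small.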
Conclusion
The results are not causally identified. They nonetheless demonstrate the framework’s potential for internal structural scenario comparison within the constraints of observational data, which may be useful to higher-education stakeholders working to understand and reduce student dropout.
