Reduce biases in reward models using causally motivated inference-time interventions to improve alignment with human preferences without losing performance...
Explore pitfalls and remedies in using generative synthetic data for causal inference, and improve accuracy with a hybrid framework and robust diagnostics.