Inclusion-of-Thoughts: Stabilizing LLM Decisions by Filtering

Inclusion-of-Thoughts: Mitigating Preference Instability via Purifying the Decision Space

Source: arXiv:2604.04944v1 (announce type: cross)

Abstract: Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs). However, LLMs remain vulnerable to the presence of plausible distractors. This often diverts attention toward irrelevant choices, resulting in unstable oscillation between correct and incorrect answers.

Introduction

Large language models (LLMs) are now used across many applications, which makes evaluating them increasingly important. Multiple-choice questions (MCQs) are a common evaluation format because they are structured and easy to score. Plausible distractors in these questions pose a significant problem, however: they can pull the model's attention toward irrelevant choices. The result is erratic decision-making, with the LLM vacillating between correct and incorrect options.

Proposed Solution: Inclusion-of-Thoughts (IoT)

To address this challenge, we introduce Inclusion-of-Thoughts (IoT), a progressive self-filtering strategy that reduces the cognitive load distractors place on an LLM. The core idea is to reconstruct each MCQ so that only the plausible options are presented to the model.

Key Features of IoT

Self-Filtering Mechanism: IoT filters out irrelevant options so that the model can focus on the most plausible answers.
Comparative Judgments: The reduced option set gives the model a controlled setting in which to compare the remaining candidates, which fosters better comparative judgments and stabilizes the model’s internal reasoning.
Transparency and Interpretability: The filtering process is explicitly documented, which makes the model’s decision-making more transparent and easier to interpret.
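
The filter-then-answer idea can be illustrated with a minimal Python sketch. It is one possible reading of "progressive self-filtering", not the paper's actual procedure: the number of filtering rounds, the `ask_model` callable, the `FilterResult` type, and the toy model are all assumptions introduced here for illustration. The sketch asks the model repeatedly, sets aside each pick as a plausible candidate, rebuilds the question with only those candidates, and logs every step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface (not from the paper): given a question and a
# label -> text mapping, return the label the model picks.
AskModel = Callable[[str, Dict[str, str]], str]


@dataclass
class FilterResult:
    answer: str                                      # final pick on the reduced question
    kept: List[str]                                  # labels that survived filtering
    trace: List[str] = field(default_factory=list)   # human-readable log of each step


def filter_then_answer(question: str, options: Dict[str, str],
                       ask_model: AskModel, keep: int = 2) -> FilterResult:
    """Collect `keep` plausible options, then re-ask with only those options."""
    if not 1 <= keep <= len(options):
        raise ValueError("keep must be between 1 and the number of options")
    remaining = dict(options)
    kept: List[str] = []
    trace: List[str] = []
    for round_no in range(1, keep + 1):
        pick = ask_model(question, remaining)
        if pick not in remaining:
            raise ValueError(f"model returned unknown label {pick!r}")
        kept.append(pick)
        trace.append(f"round {round_no}: picked {pick} from {sorted(remaining)}")
        del remaining[pick]  # hide this pick so the next round surfaces another candidate
    # Reconstruct the question with only the candidates that were picked.
    reduced = {label: options[label] for label in sorted(kept)}
    answer = ask_model(question, reduced)
    if answer not in reduced:
        raise ValueError(f"model returned unknown label {answer!r}")
    trace.append(f"final: picked {answer} from {sorted(reduced)}")
    return FilterResult(answer, kept, trace)


def toy_model(question: str, options: Dict[str, str]) -> str:
    """Contrived stand-in for an LLM: distracted by D when many options are shown."""
    if len(options) > 2 and "D" in options:
        return "D"
    return "B" if "B" in options else sorted(options)[0]


options = {"A": "3", "B": "4", "C": "5", "D": "22"}
result = filter_then_answer("What is 2 + 2?", options, toy_model)
for line in result.trace:
    print(line)
```

In this toy run the stand-in model picks the distractor D on the full question but settles on B once only two candidates remain. That behavior is built into `toy_model` and says nothing about how a real LLM would respond; the point is only the shape of the pipeline and of the recorded trace.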

Empirical Evaluation

We evaluated IoT empirically across several domains, including arithmetic, commonsense reasoning, and educational benchmarks. The results show substantial improvements in chain-of-thought performance with minimal computational overhead, indicating that IoT helps LLMs arrive at correct answers by reducing the influence of distractors.
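
A comparison of this kind could be scored along the following lines. This is a hedged sketch, not the evaluation protocol from the paper: the source does not state which metrics were used, so `accuracy` and `stability` (the average share of repeated runs that agree with the majority answer) are assumed definitions, and the answer lists below are made-up placeholders rather than reported results.

```python
from collections import Counter
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of questions answered correctly in a single run."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def stability(runs: Sequence[Sequence[str]]) -> float:
    """Average, over questions, of the share of runs agreeing with the majority answer."""
    if not runs or not runs[0]:
        raise ValueError("need at least one non-empty run")
    n_questions = len(runs[0])
    if any(len(run) != n_questions for run in runs):
        raise ValueError("all runs must cover the same questions")
    total = 0.0
    for answers in zip(*runs):  # answers to one question across all runs
        majority_count = Counter(answers).most_common(1)[0][1]
        total += majority_count / len(answers)
    return total / n_questions


# Placeholder data for illustration only: three questions, three runs per setting.
gold = ["B", "A", "C"]
baseline_runs = [["B", "A", "D"], ["D", "A", "C"], ["B", "C", "C"]]
filtered_runs = [["B", "A", "C"], ["B", "A", "C"], ["B", "A", "D"]]

print("baseline stability:", round(stability(baseline_runs), 3))
print("filtered stability:", round(stability(filtered_runs), 3))
print("baseline accuracy (run 1):", round(accuracy(baseline_runs[0], gold), 3))
print("filtered accuracy (run 1):", round(accuracy(filtered_runs[0], gold), 3))
```

A stability of 1.0 means every run gave the same answer to every question; lower values indicate the kind of oscillation between options described above.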

Conclusion

The Inclusion-of-Thoughts strategy offers a way to make LLM decisions on multiple-choice questions more reliable. By reducing the cognitive load that plausible distractors impose, IoT improves the stability of decision-making and makes the model's choices easier to interpret. As LLMs are used in more diverse and complex environments, methods like IoT can help them operate effectively.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
