Efficient Long-Document QA with Chain-of-Structured-Thought

Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs

Source: arXiv:2603.29232v1 (cross-listed)

Large language models (LLMs) are increasingly used for data analytics over documents. However, reasoning directly over long, noisy documents remains brittle and error-prone. To address this, the paper proposes an approach to document question answering (QA) that consolidates dispersed evidence into a structured output, such as a table, a graph, or a set of organized chunks, to support reliable and verifiable QA.

The LiteCoST Framework

The proposed solution, the LiteCoST framework, rests on two pillars that together target high accuracy and low latency with small language models (SLMs).

Pillar 1: Chain-of-Structured-Thought (CoST)

The first pillar introduces a Chain-of-Structured-Thought (CoST) template. This schema-aware instruction guides a strong LLM to generate both a step-wise CoST trace and the corresponding structured output. The trace works through the following steps:

  • Inducing a minimal structure to the data
  • Normalizing entities and units for consistency
  • Aligning records to ensure accuracy
  • Serializing the output for systematic representation
  • Verifying and refining the output to yield auditable supervision
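
The five steps above can be sketched on a toy example. This is only an illustration, not the LiteCoST implementation: in the framework an LLM produces the trace, whereas here each step is a small hand-written function. The raw records, the schema, the alias map, and the unit table are all hypothetical.

```python
import json

# Hypothetical raw records, as if extracted from passages scattered
# across a long document. Names and values are invented for illustration.
RAW = [
    {"entity": "Acme Corp.", "year": "2022", "revenue": "1.2", "unit": "billion USD"},
    {"entity": "ACME Corporation", "year": "2023", "revenue": "1,500", "unit": "million USD"},
    {"entity": "Acme Corp.", "year": "2022", "revenue": "1200", "unit": "million USD"},
]

# Step 1: induce a minimal structure (this schema is an assumption).
SCHEMA = ("entity", "year", "revenue_musd")

# Lookup tables used by step 2 (also assumptions).
ALIASES = {"acme corp.": "Acme", "acme corporation": "Acme"}
UNIT_TO_MUSD = {"billion USD": 1000.0, "million USD": 1.0}


def normalize(rec):
    """Step 2: map entity aliases and units onto canonical forms."""
    amount = float(rec["revenue"].replace(",", "")) * UNIT_TO_MUSD[rec["unit"]]
    return {
        "entity": ALIASES[rec["entity"].lower()],
        "year": int(rec["year"]),
        "revenue_musd": round(amount, 2),
    }


def align(records):
    """Step 3: merge records that share the same (entity, year) key."""
    table = {}
    for rec in records:
        # Keep the first record seen for each key; later duplicates are dropped.
        table.setdefault((rec["entity"], rec["year"]), rec)
    return [table[key] for key in sorted(table)]


def serialize(rows):
    """Step 4: emit the table as JSON rows in a fixed column order."""
    return json.dumps([[row[col] for col in SCHEMA] for row in rows])


def verify(rows):
    """Step 5: check that every row matches the schema and is plausible."""
    return all(
        set(row) == set(SCHEMA) and row["revenue_musd"] >= 0 for row in rows
    )


rows = align([normalize(rec) for rec in RAW])
assert verify(rows)
table_json = serialize(rows)
print(table_json)
```

A question such as "What was Acme's revenue in 2023?" can then be answered by a lookup in the serialized table, and each intermediate result can be inspected, which is what makes the supervision auditable.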

Pillar 2: SLM Fine-Tuning

The second pillar focuses on the fine-tuning of compact models. This process involves training the SLMs on LLM-generated CoST data through two distinct stages:

  • Supervised Fine-Tuning: This stage focuses on achieving structural alignment.
  • Group Relative Policy Optimization (GRPO): This stage incorporates triple rewards aimed at enhancing answer quality, output format, and process consistency.
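
The second stage can be illustrated with a minimal sketch of how three reward terms might be combined and turned into group-relative advantages. The reward definitions, the weights, and the JSON completion format below are assumptions made for illustration; the source does not specify them. The only part taken from GRPO itself is that each sampled completion's reward is standardized against the mean and standard deviation of its own group (implementations differ in details such as the exact standard-deviation estimator).

```python
import json
import statistics

# Assumed weights for the three reward terms; not taken from the source.
W_ANSWER, W_FORMAT, W_PROCESS = 0.6, 0.2, 0.2
REQUIRED_KEYS = {"trace", "table", "answer"}  # hypothetical output format


def format_reward(raw_text):
    """1.0 if the completion is a JSON object with the required keys."""
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError:
        return 0.0
    ok = isinstance(parsed, dict) and REQUIRED_KEYS <= parsed.keys()
    return 1.0 if ok else 0.0


def score(raw_text, gold):
    """Weighted sum of answer, format, and process-consistency rewards."""
    fmt = format_reward(raw_text)
    if fmt == 0.0:
        return 0.0  # unparseable output earns nothing in this sketch
    out = json.loads(raw_text)
    answer = 1.0 if out["answer"] == gold else 0.0
    # Process consistency (assumed definition): the final answer must
    # appear in the table that the trace produced.
    process = 1.0 if any(out["answer"] in row for row in out["table"]) else 0.0
    return W_ANSWER * answer + W_FORMAT * fmt + W_PROCESS * process


def group_relative_advantages(rewards):
    """Standardize each reward against its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        return [0.0] * len(rewards)  # identical rewards carry no signal
    return [(r - mean) / std for r in rewards]


# Four hypothetical completions sampled for the same question.
group = [
    json.dumps({"trace": "...", "table": [["Acme", "2023", "1500"]], "answer": "1500"}),
    json.dumps({"trace": "...", "table": [["Acme", "2023", "1200"]], "answer": "1200"}),
    "not valid json",
    json.dumps({"trace": "...", "table": [["Acme", "2023", "n/a"]], "answer": "1500"}),
]

rewards = [score(text, gold="1500") for text in group]
advantages = group_relative_advantages(rewards)
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f} advantage={advantage:+.2f}")
```

In this sketch a completion with a correct answer, valid format, and consistent table receives the highest advantage, while a completion that reaches the right answer without supporting it in the table is ranked below it.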

Performance and Efficiency

By distilling structure-first behavior into SLMs, LiteCoST achieves quality comparable to that of much larger LLMs on multi-domain long-document QA tasks, using models with only 3B and 7B parameters. The framework also delivers 2 to 4 times lower latency than GPT-4o and DeepSeek-R1 (671B parameters).

Conclusion

The LiteCoST framework combines structured reasoning with the efficiency of small language models: a strong LLM supplies auditable CoST supervision, and a fine-tuned 3B or 7B model reproduces the structure-first behavior at lower latency.

The code for LiteCoST is available on GitHub at HKUSTDial/LiteCoST.

Lazarus Omolua (https://richlyai.com/blog)