Two-Phase Fine-Tuning for Efficient Text-to-SQL Models


Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale

arXiv:2603.24023v1 (cross-listed)

Abstract: The application of large, proprietary API-based language models to text-to-SQL tasks presents significant challenges for the industry. The reliance on massive, schema-heavy prompts leads to prohibitive per-token API costs and high latency, which ultimately hinder scalable production deployment. To address these challenges, we introduce a specialized, self-hosted 8B-parameter model designed for a conversational bot in CriQ, a sister app to Dream11, India’s largest fantasy sports platform boasting over 250 million users. This bot effectively answers user queries regarding cricket statistics.

Key Innovations

Our research introduces a two-phase supervised fine-tuning approach that lets the model internalize the entire database schema. This eliminates the need for long-context prompts, cutting input from a baseline of about 17,000 tokens to fewer than 100, a reduction of more than 99%. It also replaces costly external API calls with local inference.
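To make the contrast concrete, the sketch below builds both kinds of prompt for a toy schema. The table names, the helper functions (`schema_to_ddl`, `build_schema_prompt`, `build_short_prompt`), and the whitespace-based token count are illustrative assumptions, not details from the paper; a real tokenizer and the production schema would give different numbers.

```python
# Hypothetical toy schema; the real CriQ schema is not given in the source.
TOY_SCHEMA = {
    "players": ["player_id", "name", "country", "role"],
    "matches": ["match_id", "venue", "match_date", "format"],
    "batting_innings": ["match_id", "player_id", "runs", "balls_faced"],
}


def schema_to_ddl(schema):
    """Render the schema as CREATE TABLE statements for use in a prompt."""
    statements = []
    for table, columns in schema.items():
        cols = ", ".join(columns)
        statements.append(f"CREATE TABLE {table} ({cols});")
    return "\n".join(statements)


def build_schema_prompt(question, schema):
    """Baseline style: the full schema travels with every request."""
    return (
        "Given the following database schema, write a SQL query.\n\n"
        f"{schema_to_ddl(schema)}\n\n"
        f"Question: {question}\nSQL:"
    )


def build_short_prompt(question):
    """Fine-tuned style: the model is assumed to know the schema already."""
    return f"Question: {question}\nSQL:"


def rough_token_count(text):
    """Crude proxy for a token count: whitespace-separated pieces."""
    return len(text.split())


question = "Who scored the most runs in T20 matches in 2023?"
long_prompt = build_schema_prompt(question, TOY_SCHEMA)
short_prompt = build_short_prompt(question)

print(rough_token_count(long_prompt), rough_token_count(short_prompt))
```

With a production schema of many tables, the first prompt grows with the schema while the second stays the size of the question, which is the effect the token figures above describe.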

Performance Metrics

The resulting system achieves:

  • 98.4% execution success
  • 92.5% semantic accuracy

These figures exceed those of a prompt-engineered baseline using Google’s Gemini Flash 2.0, which recorded 95.6% execution success and 89.4% semantic accuracy, by 2.8 and 3.1 percentage points respectively. The fine-tuned model therefore improves on the baseline's accuracy while using far shorter prompts.
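The two metrics can be illustrated with a small evaluation loop. This is a sketch under assumptions: the summary does not define either metric, so here execution success means the generated query runs without a database error, and semantic accuracy means its result rows match those of a reference query. The in-memory SQLite table and the example queries are hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Return (ok, rows); ok is False if the database rejects the query."""
    try:
        return True, conn.execute(sql).fetchall()
    except sqlite3.Error:
        return False, None


def evaluate(conn, pairs):
    """pairs: list of (generated_sql, reference_sql). Returns two rates."""
    executed = 0
    correct = 0
    for generated, reference in pairs:
        ok, rows = run_query(conn, generated)
        if not ok:
            continue
        executed += 1
        _, expected = run_query(conn, reference)
        # Compare as sorted lists so row order does not matter.
        if expected is not None and sorted(rows) == sorted(expected):
            correct += 1
    total = len(pairs)
    return executed / total, correct / total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batting (player TEXT, runs INTEGER)")
conn.executemany(
    "INSERT INTO batting VALUES (?, ?)",
    [("A", 50), ("B", 75), ("A", 30)],
)

pairs = [
    # Runs and matches the reference result.
    ("SELECT player, SUM(runs) FROM batting GROUP BY player",
     "SELECT player, SUM(runs) FROM batting GROUP BY player"),
    # Runs, but answers a different question than the reference.
    ("SELECT player, MAX(runs) FROM batting GROUP BY player",
     "SELECT player, SUM(runs) FROM batting GROUP BY player"),
    # Fails to execute: the column does not exist.
    ("SELECT player, wickets FROM batting",
     "SELECT player, runs FROM batting"),
]

execution_success, semantic_accuracy = evaluate(conn, pairs)
print(execution_success, semantic_accuracy)
```

Under these definitions a query can execute and still be wrong, which is why semantic accuracy cannot exceed execution success.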

Implications for the Industry

Our findings suggest a practical pathway for deploying domain-specialized, self-hosted language models in large-scale production environments. Cutting input tokens this sharply lowers operational costs and speeds up query processing, which is critical in applications that need real-time responses.
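As a rough illustration of the cost argument, the arithmetic below compares input-token volume for the two prompt sizes reported above. The daily query count and the per-token price are placeholders chosen for the example, not figures from the paper, and a self-hosted model incurs hardware costs that this sketch ignores.

```python
BASELINE_TOKENS = 17_000      # per-query input size reported for the baseline
FINE_TUNED_TOKENS = 100       # upper bound from the text ("fewer than 100")
QUERIES_PER_DAY = 1_000_000   # hypothetical traffic level
PRICE_PER_MILLION = 0.10      # hypothetical API price per million input tokens


def daily_input_tokens(tokens_per_query, queries_per_day):
    """Total input tokens sent per day."""
    return tokens_per_query * queries_per_day


def api_input_cost(tokens, price_per_million):
    """Input-side API cost for a given token volume."""
    return tokens / 1_000_000 * price_per_million


baseline_tokens = daily_input_tokens(BASELINE_TOKENS, QUERIES_PER_DAY)
fine_tuned_tokens = daily_input_tokens(FINE_TUNED_TOKENS, QUERIES_PER_DAY)

# Fraction of input tokens removed by dropping the schema from the prompt.
reduction = 1 - FINE_TUNED_TOKENS / BASELINE_TOKENS

baseline_cost = api_input_cost(baseline_tokens, PRICE_PER_MILLION)

print(f"token reduction: {reduction:.1%}")
print(f"baseline daily input cost at the assumed price: {baseline_cost:.2f}")
```

Only the token reduction follows from the reported numbers; the absolute cost scales linearly with whatever traffic and price actually apply.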

Conclusion

The two-phase fine-tuning method addresses the main drawbacks of API-based text-to-SQL systems: long schema-heavy prompts, per-token costs, and latency. By using a self-hosted model that has internalized the database schema, it offers a more efficient, scalable, and cost-effective way to generate SQL queries from natural language.


Lazarus Omolua