TikZilla: Advanced Text-to-TikZ with RL & Quality Data

TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

Large language models (LLMs) now assist with many scientific workflows, from drafting text and analyzing data to creating figures. Generating high-quality figures from textual descriptions remains difficult, particularly when the figures are expressed as TikZ programs that are compiled into scientific images. A recent paper, arXiv:2603.03072v2, introduces TikZilla, an approach that addresses the limitations of existing datasets and modeling techniques for Text-to-TikZ generation.

Challenges in Existing Approaches

Prior work on Text-to-TikZ generation has proposed several datasets and modeling strategies. However, the existing datasets are too small and noisy to capture the complexity of TikZ, which produces mismatches between the textual descriptions and the rendered figures. In addition, earlier methods relied on supervised fine-tuning (SFT) alone. SFT trains only on the TikZ source code, so the model is never exposed to the rendered semantics of the figures. This leads to errors such as:

  • Looping issues in generated figures
  • Inclusion of irrelevant content
  • Incorrect spatial relations between elements
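
As an illustration of the first error type, the sketch below flags TikZ source that repeats the same line many times in a row. It assumes that "looping" refers to degenerate repetition in the generated code, which the article does not state explicitly. The function names and the `max_run` threshold are hypothetical and are not taken from the paper.

```python
def longest_repeated_run(tikz_source: str) -> int:
    """Return the length of the longest run of identical consecutive non-blank lines."""
    longest = 0
    current = 0
    previous = None
    for raw_line in tikz_source.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines neither extend nor break a run
        if line == previous:
            current += 1
        else:
            current = 1
            previous = line
        longest = max(longest, current)
    return longest


def looks_like_looping(tikz_source: str, max_run: int = 5) -> bool:
    """Heuristic only: treat more than `max_run` identical lines in a row as looping.

    The threshold is an assumed value for illustration, not one from the paper.
    """
    return longest_repeated_run(tikz_source) > max_run
```

A source consisting of the same `\draw` command eight times in a row would be flagged, while a short, well-formed `tikzpicture` would not. A check like this only catches exact repetition; irrelevant content and wrong spatial relations require looking at the rendered image.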

Introducing DaTikZ-V4 Dataset

To overcome these challenges, the authors of TikZilla developed the DaTikZ-V4 dataset, which is more than four times larger than its predecessor, DaTikZ-V3, and of substantially higher quality. The dataset is enriched with figure descriptions generated by LLMs, giving a more robust foundation for training. With this larger, cleaner dataset, TikZilla aims to improve the accuracy and fidelity of the generated TikZ figures.

Training the TikZilla Model

TikZilla is a family of small open-source Qwen models, in 3B and 8B variants, trained with a two-stage pipeline. The first stage applies supervised fine-tuning (SFT) to establish baseline performance. The second stage refines the models with reinforcement learning (RL), using an image encoder trained via inverse graphics to provide semantically faithful reward signals.
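
To make the idea of an image-based reward concrete, here is a minimal sketch, not the authors' implementation. It assumes the image encoder has already turned the rendered output and the reference figure into embedding vectors, and it scores them by cosine similarity. The function names, the zero reward for output that fails to render, and the clipping to [0, 1] are all assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def image_reward(
    generated_embedding: Optional[Sequence[float]],
    reference_embedding: Sequence[float],
) -> float:
    """Hypothetical reward in [0, 1] for one generated TikZ program.

    `generated_embedding` is None when the program did not compile or render,
    in which case the reward is 0.0 (an assumed convention).
    """
    if generated_embedding is None:
        return 0.0
    # Clip negative similarities so the reward stays non-negative.
    return max(0.0, cosine_similarity(generated_embedding, reference_embedding))
```

Under these assumptions, a rendering whose embedding matches the reference receives a reward near 1, while an unrelated or non-compiling output receives 0. This is the kind of signal SFT cannot supply, because it depends on what the figure looks like rather than on the tokens of the source.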

Evaluation and Results

Human evaluation with over 1,000 judgments shows a substantial improvement: TikZilla scores 1.5 to 2 points higher than its base models on a 5-point scale. It also surpasses GPT-4o by 0.5 points and matches GPT-5 in image-based evaluations, despite operating at much smaller model sizes.

Availability

The authors have committed to making the code, data, and models publicly available, which should make the work accessible to other researchers and support further development in the Text-to-TikZ domain.


Lazarus Omolua