A PennyLane-Centric Dataset to Enhance LLM-based Quantum Code Generation using RAG
arXiv:2503.02497v4
Abstract: Large Language Models (LLMs) offer powerful capabilities in code generation, natural language understanding, and domain-specific reasoning. Their application to quantum software development remains limited, in part because of the lack of high-quality datasets both for LLM training and as dependable knowledge sources. To bridge this gap, we introduce PennyLang, an off-the-shelf, high-quality dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions, curated from textbooks, official documentation, and open-source repositories.
Introduction
The growth of quantum computing has created demand for better tools and methods for quantum software development. LLMs have shown promise in many domains, yet their use in quantum programming has been held back by a shortage of high-quality datasets. This article presents PennyLang, a curated dataset of PennyLane code samples intended to improve LLM-based quantum code generation.
Key Contributions
This work makes three contributions:
- Creation of PennyLang: We developed and released an open-source dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions.
- Framework for Automated Dataset Construction: We established a systematic approach for curating, annotating, and formatting quantum code datasets so that they are directly usable by LLMs.
- Baseline Evaluation: We evaluated the dataset across several open-source and commercial models, including ablation studies within a retrieval-augmented generation (RAG) pipeline.
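To make the curation and formatting step more concrete, the following is a minimal sketch and not the authors' actual pipeline. It assumes raw samples arrive as dictionaries with hypothetical `code`, `description`, and `source` keys, and it keeps only samples that mention PennyLane, parse as valid Python, and are not duplicates after whitespace normalization.

```python
import ast
import hashlib


def normalize(code: str) -> str:
    """Drop trailing whitespace and blank lines so trivial variants compare equal."""
    lines = [line.rstrip() for line in code.strip("\n").splitlines()]
    return "\n".join(line for line in lines if line)


def is_valid_python(code: str) -> bool:
    """Return True if the snippet parses; the code is never executed."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def curate(raw_samples):
    """Filter, deduplicate, and format raw samples (field names are hypothetical)."""
    seen = set()
    records = []
    for sample in raw_samples:
        code = normalize(sample["code"])
        # Keep only snippets that reference PennyLane and are syntactically valid.
        if "pennylane" not in code or not is_valid_python(code):
            continue
        # Deduplicate on a hash of the normalized code.
        digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        records.append(
            {
                "description": sample["description"].strip(),
                "code": code,
                "source": sample.get("source", "unknown"),
            }
        )
    return records
```

A syntax check is a cheap filter only; it does not confirm that a snippet runs correctly against a given PennyLane version.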
PennyLang Dataset
The PennyLang dataset is intended for researchers and developers working on quantum software. It is meant to serve both as training data for LLMs and as a dependable knowledge source for retrieval. The dataset includes:
- Code snippets demonstrating various quantum algorithms and applications.
- Contextual descriptions that provide insights into the functionality and intended use of each code sample.
- Annotations that facilitate better understanding and learning for users of all skill levels.
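As an illustration of how a code sample might be paired with its description, here is a hypothetical record serialized as one JSON line. The field names `description` and `code` are assumptions, and the entry is an invented example rather than an actual row from PennyLang. The embedded snippet uses standard PennyLane calls (`qml.device`, `qml.qnode`, `qml.Hadamard`, `qml.CNOT`, `qml.probs`) and is only syntax-checked here, not executed.

```python
import ast
import json

# Hypothetical record layout: a natural-language description plus the code it describes.
record = {
    "description": "Prepare a two-qubit Bell state and return measurement probabilities.",
    "code": (
        "import pennylane as qml\n"
        "\n"
        "dev = qml.device('default.qubit', wires=2)\n"
        "\n"
        "@qml.qnode(dev)\n"
        "def bell_state():\n"
        "    qml.Hadamard(wires=0)\n"
        "    qml.CNOT(wires=[0, 1])\n"
        "    return qml.probs(wires=[0, 1])\n"
    ),
}

# Check that the stored snippet is syntactically valid Python without running it.
ast.parse(record["code"])

# One JSON object per line is a common layout for LLM training and retrieval corpora.
line = json.dumps(record)
```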
Performance Improvement with RAG
Using PennyLang as the knowledge source in a retrieval-augmented generation (RAG) pipeline substantially improved code generation performance. For instance:
- The success rate of Qwen 7B increased from 8.7% without retrieval to 41.7% with full-context augmentation, a gain of 33.0 percentage points.
- LLaMa 4 improved from 78.8% to 84.8%, a gain of 6.0 percentage points.
- Hallucinations decreased and the correctness of the generated quantum code increased, which supports the dataset's value as a retrieval source.
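The retrieval-augmentation idea can be sketched as follows. This is a toy illustration under stated assumptions: the token-overlap scoring is a stand-in for whatever retriever the actual pipeline uses, the record fields and prompt wording are hypothetical, and no model call is made. It shows only how retrieved samples could be placed into the prompt ahead of the task.

```python
import re


def tokenize(text):
    """Lowercase word-level tokens; a crude stand-in for an embedding model."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query, records, k=2):
    """Return up to k records ranked by token overlap with the query."""
    query_tokens = tokenize(query)
    scored = []
    for rec in records:
        overlap = len(query_tokens & tokenize(rec["description"] + " " + rec["code"]))
        scored.append((overlap, rec))
    # Sort on the score only, so record dictionaries are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for overlap, rec in scored[:k] if overlap > 0]


def build_prompt(query, retrieved):
    """Place the retrieved samples ahead of the task as reference context."""
    context = "\n\n".join(f"# {rec['description']}\n{rec['code']}" for rec in retrieved)
    return (
        "Reference PennyLane examples:\n"
        f"{context}\n\n"
        f"Task: {query}\n"
        "Answer with PennyLane code only."
    )
```

The resulting prompt string would then be sent to the model in place of the bare task description; the no-retrieval baseline corresponds to sending the task alone.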
Conclusion
In summary, PennyLang provides a PennyLane-specific resource that supports both LLM training and retrieval-augmented generation, and it lays groundwork for AI-assisted quantum programming tools. By moving beyond the conventional focus on Qiskit, this work aims to extend the capabilities of LLMs to the PennyLane ecosystem.
