JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents
Large language models (LLMs) are increasingly augmented with external tools, but reliability problems arise as the number of tools grows, particularly in domain-specific settings. The paper “JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents” proposes an approach for making tool calling in LLM agents more reliable.
Background
LLM agents often struggle to choose among a large set of tools, especially when tool descriptions are ambiguous and agent instructions are underspecified. This can lead to:
- Incorrect tool selection
- Improper slot/value instantiation
- Overall inefficiency in task execution
The authors of the study hypothesize that these issues stem from two primary causes:
- The use of generic, one-size-fits-all prompts that overlook tool-specific details.
- Underspecified tool schemas that fail to give clear guidance on when to use each tool and how to format its parameters.
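To make the second cause concrete, here is a minimal sketch that contrasts an underspecified tool schema with a more explicit one. The tool name search_orders, its parameters, and the JSON-Schema-like layout are all invented for illustration and do not come from the paper.

```python
# Hypothetical tool schemas in a JSON-Schema-like style.
# The tool name, parameters, and wording are assumptions for illustration.

underspecified = {
    "name": "search_orders",
    "description": "Search orders.",
    "parameters": {
        "type": "object",
        "properties": {
            "date": {"type": "string"},
            "status": {"type": "string"},
        },
    },
}

refined = {
    "name": "search_orders",
    "description": (
        "Look up existing customer orders by date and status. "
        "Use this for order history, not for placing new orders."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "date": {
                "type": "string",
                "description": "Order date in ISO 8601 format, e.g. 2024-03-15.",
            },
            "status": {
                "type": "string",
                "enum": ["pending", "shipped", "delivered"],
                "description": "Current fulfilment status of the order.",
            },
        },
        "required": ["date"],
    },
}


def undocumented_params(schema):
    """Return the names of parameters that carry no description."""
    props = schema["parameters"]["properties"]
    return sorted(name for name, spec in props.items() if "description" not in spec)
```

In the first schema the agent has to guess the date format and the allowed status values. The second states both, and also says when the tool should not be used.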
Introducing JTPRO
The Joint Tool-Prompt Reflective Optimization (JTPRO) framework addresses these challenges systematically. Using rollout-driven reflection, JTPRO co-optimizes the global instructions together with per-tool schema and argument descriptions. The goal is accurate tool selection and argument instantiation, especially in environments with large tool inventories.
Key features of JTPRO include:
- Preservation of only the tool-local cues necessary for effective disambiguation and slot filling.
- Iterative optimization that enhances both the global instructions and specific tool schemas.
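The general shape of a rollout-driven reflection loop can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: AgentConfig, optimize, toy_agent, and toy_reflect are hypothetical names, the agent and reflector are keyword-matching stand-ins for LLM calls, and the accept-if-better rule is a simple greedy choice made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    instructions: str                                       # global prompt
    tool_descriptions: dict = field(default_factory=dict)   # per-tool text


def run_rollouts(config, tasks, agent):
    """Run the agent on every task and record whether the call was correct."""
    traces = []
    for task in tasks:
        call = agent(config, task["query"])
        traces.append({"task": task, "call": call, "ok": call == task["expected"]})
    return traces


def score(traces):
    """Fraction of rollouts that selected the expected tool."""
    return sum(t["ok"] for t in traces) / len(traces)


def optimize(config, tasks, agent, reflect, rounds=3):
    """Greedy loop: keep a reflected candidate only if its score improves."""
    best, best_score = config, score(run_rollouts(config, tasks, agent))
    for _ in range(rounds):
        failures = [t for t in run_rollouts(best, tasks, agent) if not t["ok"]]
        if not failures:
            break
        # Reflection may edit both the instructions and the tool text.
        candidate = reflect(best, failures)
        cand_score = score(run_rollouts(candidate, tasks, agent))
        if cand_score > best_score:
            best, best_score = candidate, cand_score
    return best, best_score


def toy_agent(config, query):
    """Stand-in for an LLM: pick the tool whose description overlaps the query most."""
    words = set(query.lower().split())
    best_tool, best_overlap = None, 0
    for name, desc in config.tool_descriptions.items():
        overlap = len(words & set(desc.lower().split()))
        if overlap > best_overlap:
            best_tool, best_overlap = name, overlap
    return best_tool


def toy_reflect(config, failures):
    """Stand-in for LLM reflection: patch tool text and the global instructions."""
    new_desc = dict(config.tool_descriptions)
    for f in failures:
        tool = f["task"]["expected"]
        new_desc[tool] = new_desc[tool] + " " + f["task"]["query"]
    hint = " Match the query against each tool description."
    instructions = config.instructions
    if hint not in instructions:
        instructions += hint  # the toy agent ignores this; a real agent would read it
    return AgentConfig(instructions, new_desc)
```

A short usage example with two invented tasks and tools:

```python
tasks = [
    {"query": "refund my order", "expected": "issue_refund"},
    {"query": "track my parcel", "expected": "track_shipment"},
]
start = AgentConfig(
    "Pick one tool.",
    {"issue_refund": "Handles payments.", "track_shipment": "Shipping status."},
)
best, best_score = optimize(start, tasks, toy_agent, toy_reflect)
```

The point of the sketch is that a single reflection step sees failures from actual rollouts and is allowed to change the global instructions and the per-tool text together, rather than tuning one while holding the other fixed.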
Evaluation and Results
JTPRO was evaluated on multi-tool benchmarks with tool inventories of varying size. The evaluation used three key metrics:
- Tool Selection Accuracy (TSA)
- Slot Filling Accuracy (SFA)
- Overall Success Rate (OSR) – a composite measure that accounts for correct tool selection, slot filling, and value instantiation.
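The three metrics can be illustrated with a small scoring function. The exact definitions used in the paper are not given here, so this sketch makes assumptions: a slot counts as filled correctly only when the tool is also correct and the argument names match the reference, and OSR additionally requires every argument value to match. The function name evaluate and the record layout are hypothetical.

```python
def evaluate(predictions, references):
    """Score tool calls. Each item is {"tool": str, "args": {slot_name: value}}."""
    n = len(references)
    tool_ok = slots_ok = all_ok = 0
    for pred, ref in zip(predictions, references):
        right_tool = pred["tool"] == ref["tool"]
        # Slots (assumed definition): the argument names match the reference.
        right_slots = right_tool and set(pred["args"]) == set(ref["args"])
        # Values: every argument also carries the reference value.
        right_values = right_slots and pred["args"] == ref["args"]
        tool_ok += right_tool
        slots_ok += right_slots
        all_ok += right_values
    return {"TSA": tool_ok / n, "SFA": slots_ok / n, "OSR": all_ok / n}
```

Under these assumptions OSR can never exceed SFA, and SFA can never exceed TSA, which matches the description of OSR as a composite of the other requirements.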
Results indicate that JTPRO consistently outperformed leading baselines, including Chain-of-Thought (CoT) style agents and reflective prompt optimizers such as GEPA. Specifically, JTPRO improved OSR over these alternatives by 5% to 20% in relative terms.
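Because the gains are reported as relative improvements, it helps to separate them from absolute percentage points. The short calculation below uses made-up OSR values purely to show the arithmetic; they are not results from the paper, and relative_gain is a hypothetical helper name.

```python
def relative_gain(baseline, improved):
    """Relative improvement of improved over baseline, as a fraction."""
    return (improved - baseline) / baseline


# Hypothetical OSR values, not taken from the paper.
baseline_osr = 0.50
improved_osr = 0.55

# Roughly 5 percentage points in absolute terms.
absolute_points = (improved_osr - baseline_osr) * 100
# Roughly 10 percent in relative terms.
relative_percent = relative_gain(baseline_osr, improved_osr) * 100
```

The same relative gain corresponds to a smaller absolute change when the baseline is low and a larger one when the baseline is high.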
Conclusion
JTPRO is aimed at improving the reliability of tool calling in language agents. The findings suggest that jointly optimizing instructions and tool schemas is more effective and robust than optimizing either component in isolation. As agents integrate larger and more complex tool sets, frameworks like JTPRO may play an important role in improving the efficiency and accuracy of tool use.
