InquireMobile: Safe VLM Mobile Agents via Reinforcement Tuning

InquireMobile: Teaching VLM-based Mobile Agent to Request Human Assistance via Reinforcement Fine-Tuning

Researchers have introduced InquireMobile, a model designed to help Vision-Language Model (VLM)-based mobile agents ask users for help when they need it. The work targets a safety problem with fully autonomous agents: they may act incorrectly in complex real-world scenarios that they cannot fully comprehend or reason about.

The paper, available on arXiv (arXiv:2508.19679v2), describes a strategy for training mobile agents to seek human assistance at critical decision points. The researchers emphasize that human oversight matters most when the agent faces ambiguous or complex tasks.

The Challenge of Autonomous Decision Making

As VLMs continue to evolve, their integration into mobile agents has enabled these systems to perceive and interact with dynamic environments based on human instructions. However, reliance on fully autonomous decision-making can lead to safety risks, particularly when agents encounter scenarios beyond their training data or reasoning capabilities. To mitigate these risks, the researchers propose a new approach that encourages proactive inquiry from mobile agents.

Introducing InquireBench

At the core of this research is InquireBench, a benchmark that assesses how well mobile agents interact safely and proactively ask users for input. InquireBench is divided into five categories and 22 sub-categories, reflecting the variety of challenges that VLM-based agents currently face. Many existing models show near-zero performance in these areas, which underscores the need for better training methods.

Evaluation Categories:
- Understanding Ambiguity
- Contextual Awareness
- User Intent Recognition
- Safety Protocols
- Proactive Communication

Sub-Categories (five of the 22 are listed here):
- Real-Time Decision Making
- Complex Query Handling
- Feedback Integration
- Task Prioritization
- Safety Compliance Checks

Development of InquireMobile

To build a mobile agent that can effectively request human assistance, the researchers developed InquireMobile, a model inspired by reinforcement learning and trained with a two-stage strategy. It incorporates an interactive pre-action reasoning mechanism: before executing a critical task, the agent reasons about the action and asks the user for confirmation. This improves the agent's decision-making and keeps the user involved in the steps that carry the most risk.
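The idea of asking before acting can be illustrated with a minimal sketch. This is not the authors' implementation: in InquireMobile the decision to inquire is learned by the model, whereas the sketch below uses a fixed rule. The names `Action`, `CRITICAL_ACTIONS`, `execute_with_inquiry`, `ask_user`, and `perform` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds treated as critical.
# The article does not enumerate which actions count as critical.
CRITICAL_ACTIONS = {"pay", "delete", "send_message", "grant_permission"}


@dataclass
class Action:
    kind: str    # e.g. "tap", "pay", "delete"
    target: str  # UI element or object the action applies to


def execute_with_inquiry(
    action: Action,
    ask_user: Callable[[str], bool],
    perform: Callable[[Action], None],
) -> str:
    """Run `action`, asking the user for confirmation first if it is critical."""
    if action.kind in CRITICAL_ACTIONS:
        question = f"About to {action.kind} '{action.target}'. Proceed?"
        if not ask_user(question):
            # The user refused, so the action is never performed.
            return "declined"
    perform(action)
    return "executed"


if __name__ == "__main__":
    performed = []
    # Stand-in for a real prompt: this fake user always refuses.
    always_no = lambda question: False
    print(execute_with_inquiry(Action("tap", "Search box"), always_no, performed.append))
    print(execute_with_inquiry(Action("pay", "Checkout"), always_no, performed.append))
```

In this sketch a routine tap runs without interruption, while a payment is blocked until the user agrees. A learned policy would replace the fixed `CRITICAL_ACTIONS` lookup with the model's own judgment about when to ask.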

Performance and Future Directions

On InquireBench, InquireMobile achieved a 46.8% improvement in inquiry success rate over existing baseline models. It also achieved the highest overall success rate among the models compared.
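The article does not say whether the 46.8% figure is an absolute gain (percentage points) or a relative one. The sketch below shows how the two readings differ for a success-rate metric. The function names and the episode outcomes are made up for illustration and are not taken from the paper.

```python
def success_rate(outcomes):
    """Fraction of episodes marked successful (True)."""
    if not outcomes:
        raise ValueError("no episodes to score")
    return sum(outcomes) / len(outcomes)


def improvement(baseline, model):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = (model - baseline) * 100
    # A relative gain is undefined for a zero baseline; report infinity instead.
    relative_pct = (model - baseline) / baseline * 100 if baseline > 0 else float("inf")
    return absolute_pp, relative_pct


if __name__ == "__main__":
    # Illustrative outcomes only: True means the agent inquired when it should have.
    baseline_outcomes = [True, False, False, False]
    model_outcomes = [True, True, True, False]
    gain_pp, gain_pct = improvement(
        success_rate(baseline_outcomes), success_rate(model_outcomes)
    )
    print(f"absolute: {gain_pp:.1f} points, relative: {gain_pct:.1f}%")
```

With a baseline near zero, as reported for many existing models, a relative gain becomes very large or undefined, so the distinction matters when comparing results.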

To support further research, the authors have committed to open-sourcing all datasets, models, and evaluation code. They aim to foster collaboration between academia and industry and to improve the safety and efficacy of VLM-based mobile agents in real-world applications.

InquireMobile is a step toward more reliable and safer AI systems that integrate human judgment into the agent's decision-making.
