EE-MCP: Self-Evolving GUI Agents with Automated Learning

EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning

Source: arXiv:2604.09815v1

Abstract

Computer-use agents that combine GUI interaction with structured API calls via the Model Context Protocol (MCP) show promise for automating software tasks. However, existing approaches lack a principled understanding of how agents should balance these two modalities and how to enable iterative self-improvement across diverse applications.

We formulate MCP-GUI interplay as a unified hybrid policy learning problem where the agent learns when each modality provides complementary advantages. Our findings indicate that distillation and experience augmentation target fundamentally different failure modes, necessitating application-aware mechanism selection.
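To make the idea of a hybrid policy concrete, here is a minimal sketch of an action space that covers both modalities. Everything in it is an illustrative assumption rather than the paper's implementation: the `McpCall` and `GuiAction` types, the `choose_action` function, and the hard-coded "use a tool if one exists" rule, which merely stands in for the learned policy.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class McpCall:
    """A structured API call made through an MCP tool (hypothetical shape)."""
    tool: str
    arguments: dict


@dataclass
class GuiAction:
    """A low-level GUI interaction such as a click (hypothetical shape)."""
    kind: str    # e.g. "click" or "type"
    target: str  # e.g. a widget label


# The hybrid action space: at every step the agent emits one or the other.
Action = Union[McpCall, GuiAction]


def choose_action(step: str, available_tools: dict[str, str]) -> Action:
    """Toy rule standing in for a learned policy.

    Assumption: use an MCP tool when one covers the step,
    otherwise fall back to interacting with the GUI.
    """
    if step in available_tools:
        return McpCall(tool=available_tools[step], arguments={})
    return GuiAction(kind="click", target=step)


tools = {"export_pdf": "document.export"}
print(choose_action("export_pdf", tools))   # handled by an MCP call
print(choose_action("drag_slider", tools))  # handled by a GUI action
```

A learned policy would replace the `if` test with a model that scores both kinds of action in context, which is where the balance between modalities would actually be learned.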

Proposed Framework

Building on this formulation, we propose a self-evolving framework whose fully automatic pipeline orchestrates four stages without manual intervention:

Automatic environment generation and validation
Trajectory collection
Gap-driven task synthesis
Quality-filtered training
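The four stages can be pictured as functions applied in order to a shared state. The sketch below only illustrates that control flow: the stage bodies are placeholders, all names (`PipelineState`, `run_round`, and the stage functions) are hypothetical, and reading "gap-driven" as "synthesize new tasks from failed attempts" is an assumption, not something the abstract spells out.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    environments: list[str] = field(default_factory=list)
    trajectories: list[dict] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)
    training_set: list[dict] = field(default_factory=list)


def generate_environments(state: PipelineState) -> PipelineState:
    # Placeholder: pretend two environments were generated and validated.
    state.environments = ["env-0", "env-1"]
    return state


def collect_trajectories(state: PipelineState) -> PipelineState:
    # Placeholder rollouts: even-indexed environments pass, odd ones fail.
    state.trajectories = [
        {"env": env, "task": "seed-task", "passed": i % 2 == 0}
        for i, env in enumerate(state.environments)
    ]
    return state


def synthesize_gap_tasks(state: PipelineState) -> PipelineState:
    # Assumption: a "gap" is a failed attempt, so new tasks target failures.
    state.tasks = [
        f"retry:{t['task']}@{t['env']}"
        for t in state.trajectories
        if not t["passed"]
    ]
    return state


def filter_for_training(state: PipelineState) -> PipelineState:
    # Assumption: the quality filter keeps only passing trajectories.
    state.training_set = [t for t in state.trajectories if t["passed"]]
    return state


STAGES = [
    generate_environments,
    collect_trajectories,
    synthesize_gap_tasks,
    filter_for_training,
]


def run_round(state: PipelineState) -> PipelineState:
    """Run one self-evolution round with no manual step in between."""
    for stage in STAGES:
        state = stage(state)
    return state


result = run_round(PipelineState())
print(result.tasks)
print(len(result.training_set))
```

Calling `run_round` repeatedly on the same state would model the iterative part: each round's synthesized tasks feed the next round's trajectory collection.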

Key Innovations

A key innovation of our approach is the experience bank, which accumulates rules that large language models (LLMs) derive by comparing trajectories. Because the rules are applied at inference time, the agent can improve without fine-tuning.
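As a rough illustration of how such a bank could work, the sketch below stores text rules per application and prepends them to the task prompt at inference time. It is built on assumptions: `ExperienceBank`, `compare_trajectories`, and the prompt layout are hypothetical, and the comparison function is a deterministic stand-in for what would be an LLM call in the described system.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceBank:
    """Per-application store of natural-language rules (hypothetical design)."""
    rules: dict[str, list[str]] = field(default_factory=dict)

    def add_rule(self, app: str, rule: str) -> None:
        bucket = self.rules.setdefault(app, [])
        if rule not in bucket:  # skip exact duplicates
            bucket.append(rule)

    def build_prompt(self, app: str, task: str) -> str:
        """Prepend stored rules to the task; no model weights change."""
        stored = self.rules.get(app, [])
        if not stored:
            return f"Task: {task}"
        listing = "\n".join(f"- {rule}" for rule in stored)
        return f"Rules from earlier attempts:\n{listing}\n\nTask: {task}"


def compare_trajectories(failed: list[str], succeeded: list[str]) -> str:
    """Stand-in for the LLM: report the first step where the runs diverge."""
    for bad, good in zip(failed, succeeded):
        if bad != good:
            return f"Prefer '{good}' over '{bad}'."
    return "No divergence found."


bank = ExperienceBank()
rule = compare_trajectories(
    failed=["open menu", "click Export"],
    succeeded=["open menu", "call export tool"],
)
bank.add_rule("editor", rule)
print(bank.build_prompt("editor", "Export the document as PDF"))
```

The design point this illustrates is that the improvement lives in the prompt rather than in the model, which is why no fine-tuning is needed.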

Cross-Application Analysis

Our systematic cross-application analysis across three desktop applications reveals that the optimal strategy for agent performance depends on the MCP-GUI composition:

Distillation achieves a 77.8% pass rate on MCP-dominant tasks, an improvement of 17.8 percentage points.
The experience bank excels on GUI-intensive tasks, yielding a gain of 10.0 percentage points.
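Two things follow from these numbers and can be sketched in a few lines. First, a pass rate of 77.8% after a 17.8-point gain implies a 60.0% starting point for distillation on MCP-dominant tasks; the starting point for the experience bank cannot be recovered the same way because its absolute pass rate is not stated here. Second, the finding suggests choosing the mechanism from a task mix. The function names and the 0.5 threshold below are illustrative assumptions, not values from the paper.

```python
def implied_baseline(pass_rate: float, gain_points: float) -> float:
    """Baseline pass rate implied by a final rate and a gain in percentage points."""
    return round(pass_rate - gain_points, 1)


def relative_gain(pass_rate: float, gain_points: float) -> float:
    """Gain expressed relative to the implied baseline, in percent."""
    baseline = pass_rate - gain_points
    return round(gain_points / baseline * 100, 1)


def pick_mechanism(mcp_steps: int, gui_steps: int, threshold: float = 0.5) -> str:
    """Toy application-aware selection.

    Assumption: treat a task mix as MCP-dominant when MCP steps exceed
    `threshold` of all steps; the paper does not give this cutoff.
    """
    total = mcp_steps + gui_steps
    if total == 0:
        raise ValueError("need at least one step to classify the task mix")
    mcp_share = mcp_steps / total
    return "distillation" if mcp_share > threshold else "experience_bank"


print(implied_baseline(77.8, 17.8))  # 60.0
print(relative_gain(77.8, 17.8))     # 29.7
print(pick_mechanism(mcp_steps=8, gui_steps=2))
print(pick_mechanism(mcp_steps=1, gui_steps=9))
```

The distinction between percentage points and relative gain matters when comparing the two results: 17.8 points on a 60.0% baseline is a relative improvement of about 29.7%.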

Conclusion

The results show that how a task mixes MCP calls and GUI actions determines which improvement mechanism helps most. Combining automated environment generation with experience learning lets agents improve iteratively across applications, and the fully automatic pipeline removes manual effort from training and refining them.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
