EE-MCP: Self-Evolving GUI Agents with Automated Learning

EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning

Summary: arXiv:2604.09815v1

Abstract

Computer-use agents that combine GUI interaction with structured API calls via the Model Context Protocol (MCP) show promise for automating software tasks. However, existing approaches lack a principled understanding of how agents should balance these two modalities and how to enable iterative self-improvement across diverse applications.

We formulate MCP-GUI interplay as a unified hybrid policy learning problem where the agent learns when each modality provides complementary advantages. Our findings indicate that distillation and experience augmentation target fundamentally different failure modes, necessitating application-aware mechanism selection.
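As a rough illustration of what a unified hybrid policy could look like, the sketch below puts MCP calls and GUI actions into one candidate set and picks whichever scores highest in the current state. The `Action` type, the action names, and the scores are hypothetical and are not taken from the paper; a learned policy would produce the scores.

```python
from dataclasses import dataclass
from typing import Literal

Modality = Literal["mcp", "gui"]


@dataclass(frozen=True)
class Action:
    modality: Modality
    name: str     # hypothetical MCP tool name or GUI primitive
    score: float  # hypothetical policy score for this action in the current state


def select_action(candidates: list[Action]) -> Action:
    """Pick the highest-scoring action, regardless of modality."""
    if not candidates:
        raise ValueError("no candidate actions")
    return max(candidates, key=lambda a: a.score)


# Both modalities compete in the same action space (illustrative values).
candidates = [
    Action("mcp", "set_cell_value", 0.82),
    Action("gui", "click(412, 230)", 0.41),
]
chosen = select_action(candidates)
print(chosen.modality, chosen.name)
```

The point of the single `max` over both modalities is that the agent does not commit to MCP or GUI up front; the choice is made per step.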

Proposed Framework

Built on this formulation, we propose a self-evolving framework with a fully automatic pipeline that orchestrates the following steps, all without manual intervention:

  • Automatic environment generation and validation
  • Trajectory collection
  • Gap-driven task synthesis
  • Quality-filtered training
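The four stages above can be sketched as one round of a loop. This is a minimal sketch under stated assumptions, not the paper's implementation: `Trajectory`, `run_round`, the `is_valid_env` and `run_agent` callables, the string-based task synthesis, and the `min_quality` threshold are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Trajectory:
    task: str
    passed: bool
    quality: float  # hypothetical quality score in [0, 1]


def run_round(
    tasks: list[str],
    is_valid_env: Callable[[str], bool],
    run_agent: Callable[[str], Trajectory],
    min_quality: float = 0.5,  # hypothetical filter threshold
) -> tuple[list[Trajectory], list[str]]:
    # 1. Environment validation: keep only tasks whose environment checks out.
    runnable = [t for t in tasks if is_valid_env(t)]
    # 2. Trajectory collection.
    trajectories = [run_agent(t) for t in runnable]
    # 3. Gap-driven task synthesis: failures seed new tasks (stubbed as a string rewrite).
    new_tasks = [f"variant of: {tr.task}" for tr in trajectories if not tr.passed]
    # 4. Quality filtering: only passing, high-quality trajectories go to training.
    training_set = [
        tr for tr in trajectories if tr.passed and tr.quality >= min_quality
    ]
    return training_set, new_tasks


def fake_agent(task: str) -> Trajectory:
    # Stand-in for a real agent rollout: it only "solves" rename tasks.
    return Trajectory(task, passed="rename" in task, quality=0.9)


training_set, new_tasks = run_round(
    ["rename sheet", "build pivot table", "broken env"],
    is_valid_env=lambda t: "broken" not in t,
    run_agent=fake_agent,
)
print([tr.task for tr in training_set], new_tasks)
```

Feeding `new_tasks` back in as the next round's `tasks` is what would make the loop self-evolving in this toy version.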

Key Innovations

A key innovation of our approach is the experience bank. It accumulates rules that an LLM derives by comparing trajectories, and it applies those rules at inference time, so the agent improves without any fine-tuning.
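One way to picture an experience bank is as a per-application list of short rules that get prepended to the agent's prompt. The sketch below assumes that design; the class, its methods, and the `compare` callable (which stands in for an LLM call that contrasts a successful and a failed trajectory) are hypothetical, not the paper's API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExperienceBank:
    # Maps an application name to the rules learned for it.
    rules: dict[str, list[str]] = field(default_factory=dict)

    def learn(
        self,
        app: str,
        success: str,
        failure: str,
        compare: Callable[[str, str], str],  # stand-in for an LLM comparison call
    ) -> None:
        rule = compare(success, failure)
        bucket = self.rules.setdefault(app, [])
        if rule not in bucket:  # skip exact duplicates
            bucket.append(rule)

    def build_prompt(self, app: str, task: str) -> str:
        rules = self.rules.get(app, [])
        if not rules:
            return f"Task: {task}"
        lessons = "\n".join(f"- {r}" for r in rules)
        return f"Lessons from earlier attempts:\n{lessons}\n\nTask: {task}"


def fake_compare(success: str, failure: str) -> str:
    # A real system would ask an LLM to explain why `success` beat `failure`.
    return "Confirm the target cell is selected before typing."


bank = ExperienceBank()
bank.learn("spreadsheet", "selected B2, typed 42", "typed 42 into A1", fake_compare)
bank.learn("spreadsheet", "selected C3, typed 7", "typed 7 into A1", fake_compare)
prompt = bank.build_prompt("spreadsheet", "Enter 10 in cell D4")
print(prompt)
```

Because the rules only change the prompt, the model weights stay untouched, which is the sense in which the improvement happens at inference time.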

Cross-Application Analysis

Our systematic cross-application analysis across three desktop applications reveals that the best improvement strategy depends on each application's MCP-GUI composition:

  • Distillation achieves a 77.8% pass rate on MCP-dominant tasks, an improvement of 17.8 percentage points.
  • The experience bank excels on GUI-intensive tasks, with a gain of 10.0 percentage points.
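The figures above support a small worked example. A 77.8% pass rate reached through a 17.8-point gain implies a 60.0% starting point for the MCP-dominant tasks; the GUI-intensive result is reported only as a gain, so its absolute pass rate cannot be recovered here. The selector below is a hypothetical illustration of application-aware mechanism selection: the `mcp_fraction` input and the 0.5 threshold are assumptions, not values from the paper.

```python
def implied_baseline(pass_rate: float, gain_pp: float) -> float:
    """Pass rate before the improvement, in percent."""
    return round(pass_rate - gain_pp, 1)


def choose_mechanism(mcp_fraction: float, threshold: float = 0.5) -> str:
    """Pick a mechanism from the share of task steps handled via MCP.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    if not 0.0 <= mcp_fraction <= 1.0:
        raise ValueError("mcp_fraction must be between 0 and 1")
    return "distillation" if mcp_fraction >= threshold else "experience_bank"


baseline = implied_baseline(77.8, 17.8)
print(baseline)               # 60.0
print(choose_mechanism(0.8))  # distillation
print(choose_mechanism(0.2))  # experience_bank
```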

Conclusion

The research shows that MCP calls and GUI interaction are complementary modalities, and that the right improvement mechanism depends on how an application mixes them. By combining automated environment generation with experience learning, the self-evolving framework lets agents improve iteratively across applications without manual intervention.


Lazarus Omolua, https://richlyai.com/blog