Memory-Augmented LLM System for Auto Feature Generation


Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data

Source: arXiv:2604.20261v1

Abstract: Automated feature generation extracts informative features from raw tabular data without manual intervention and is crucial for accurate, generalizable machine learning. Traditional methods rely on predefined operator libraries and cannot leverage task semantics, limiting their ability to produce diverse, high-value features for complex tasks. Recent Large Language Model (LLM)-based approaches introduce richer semantic signals, but still suffer from a restricted feature space due to fixed generation patterns and from the absence of feedback from the learning objective. To address these challenges, we propose a Memory-Augmented LLM-based Multi-Agent System (MALMAS) for automated feature generation. MALMAS decomposes the generation process into agents with distinct responsibilities, and a Router Agent activates an appropriate subset of agents per iteration, further broadening exploration of the feature space. We further integrate a memory module comprising procedural memory, feedback memory, and conceptual memory, enabling iterative refinement that adaptively guides subsequent feature generation and improves feature quality and diversity. Extensive experiments on multiple public datasets against state-of-the-art baselines demonstrate the effectiveness of our approach. The code is available at https://github.com/fxdong24/MALMAS.

Introduction

Feature generation turns raw tabular columns into inputs a model can learn from, and doing it automatically, without manual intervention, can make machine learning models more accurate and more generalizable. Existing automated methods, however, have limitations that restrict the features they can produce.

Challenges with Existing Methods

Traditional automated feature generation approaches are limited by:

  • Predefined operator libraries that restrict the diversity of generated features.
  • An inability to leverage task semantics, which limits their ability to produce high-value features for complex tasks.

Recent LLM-based approaches add richer semantic signals but still suffer from:

  • Fixed generation patterns that restrict the feature space.
  • An absence of feedback from the learning objective to guide generation.
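The operator-library limitation can be made concrete with a minimal sketch. The operator set, column names, and the generate_features helper below are hypothetical and chosen only for illustration; they are not taken from the paper or from any specific library.

```python
from itertools import combinations

# Hypothetical fixed operator library: every name here is illustrative.
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "ratio": lambda a, b: a / b if b != 0 else 0.0,  # guard against division by zero
}

def generate_features(rows, columns):
    """Apply every operator to every unordered pair of columns."""
    features = {}
    for left, right in combinations(columns, 2):
        for name, op in OPERATORS.items():
            key = f"{name}({left},{right})"
            features[key] = [op(row[left], row[right]) for row in rows]
    return features

rows = [
    {"income": 50000.0, "debt": 10000.0, "age": 30.0},
    {"income": 82000.0, "debt": 0.0, "age": 45.0},
]
features = generate_features(rows, ["income", "debt", "age"])
print(len(features))  # 3 column pairs x 4 operators = 12 candidates
```

Every candidate comes from the same four operators, and the generator treats income minus age exactly like income minus debt. It has no notion of which combinations are meaningful for the task.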

The MALMAS Approach

To address these challenges, the paper's authors propose the Memory-Augmented LLM-based Multi-Agent System (MALMAS). MALMAS decomposes feature generation into agents with distinct responsibilities. Key components include:

  • Router Agent: Activates an appropriate subset of agents in each iteration, which broadens exploration of the feature space.
  • Memory Module: Combines procedural, feedback, and conceptual memory to support iterative refinement that guides subsequent feature generation.
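A rough sketch of how a router and a three-part memory might interact in one loop is shown below. It is not the authors' implementation: the agent names, the score-weighted routing rule, the random placeholder for evaluation, and the contents of each memory store are all assumptions, and the LLM-backed agents are replaced with stubs that only return feature names.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Memory:
    # Stores are named after the memory types in the abstract; what they
    # hold and how they are updated here are assumptions for illustration.
    procedural: list = field(default_factory=list)  # agents activated per iteration
    feedback: dict = field(default_factory=dict)    # feature name -> score
    conceptual: list = field(default_factory=list)  # distilled notes for later rounds

def make_agent(prefix):
    # Stand-in for an LLM-backed agent: proposes one named feature per call.
    def agent(iteration, memory):
        return f"{prefix}_feature_{iteration}"
    return agent

# Hypothetical agent roles; the paper's actual roles may differ.
AGENTS = {name: make_agent(name)
          for name in ("arithmetic", "aggregation", "temporal", "domain")}

def route(memory, rng, k=2):
    """Pick k distinct agents, favouring those whose past features scored well."""
    def avg_score(agent_name):
        scores = [s for f, s in memory.feedback.items() if f.startswith(agent_name)]
        return sum(scores) / len(scores) if scores else 0.5  # neutral prior if unseen
    names = list(AGENTS)
    weights = [avg_score(n) + 0.1 for n in names]  # +0.1 keeps some exploration
    chosen = set()
    while len(chosen) < k:
        chosen.add(rng.choices(names, weights=weights)[0])
    return sorted(chosen)

def evaluate(feature_name, rng):
    # Placeholder for feedback from the learning objective (e.g. a metric gain).
    return rng.random()

def run(iterations=3, seed=0):
    rng = random.Random(seed)
    memory = Memory()
    for it in range(iterations):
        active = route(memory, rng)
        memory.procedural.append(active)
        for name in active:
            feature = AGENTS[name](it, memory)
            memory.feedback[feature] = evaluate(feature, rng)
        best = max(memory.feedback, key=memory.feedback.get)
        memory.conceptual.append(f"best so far: {best}")
    return memory

memory = run()
print(memory.procedural)
```

The point of the sketch is the control flow: routing reads from memory, generation and evaluation write back to it, and later iterations are shaped by earlier results.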

Benefits of MALMAS

According to the abstract, MALMAS offers several advantages over existing methods:

  • Broader exploration of the feature space, aimed at more diverse, high-value features.
  • Iterative refinement, which improves feature quality and diversity across iterations.
  • Adaptive guidance from the memory module, which steers subsequent feature generation.
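To illustrate what a feedback signal for refinement could look like, the sketch below scores candidate features against the target and separates those worth keeping from those to avoid in later rounds. Absolute Pearson correlation is used only as a simple stand-in for feedback from the learning objective; the paper's actual signal is not described in the abstract, and the refine helper, threshold, and data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0

def refine(candidates, target, threshold=0.5):
    """Split candidates into kept and rejected by |correlation| with the target."""
    kept, rejected = {}, {}
    for name, values in candidates.items():
        score = abs(pearson(values, target))
        (kept if score >= threshold else rejected)[name] = score
    return kept, rejected

target = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "doubled": [2.0, 4.0, 6.0, 8.0],    # moves in lockstep with the target
    "constant": [5.0, 5.0, 5.0, 5.0],   # zero variance, carries no signal
    "zigzag": [1.0, -1.0, -1.0, 1.0],   # uncorrelated with the target
}
kept, rejected = refine(candidates, target)
print(sorted(kept), sorted(rejected))
```

In a memory-augmented loop, both outcomes are useful: kept features show what worked, and rejected ones record patterns a later iteration should not repeat.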

Conclusion

MALMAS combines a multi-agent architecture with memory augmentation to address the limitations of both traditional and recent LLM-based feature generation for tabular data. The authors report that extensive experiments on multiple public datasets against state-of-the-art baselines demonstrate its effectiveness. The code is available at https://github.com/fxdong24/MALMAS.


Lazarus Omolua
https://richlyai.com/blog
My mission is to make sure that people in Africa are not left behind in the global AI revolution. RichlyAI exists to give everyone — students, founders, creators, and businesses — the tools to compete globally.