AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow
High-fidelity numerical simulation of subsurface flow is computationally expensive, and the cost is especially pronounced in many-query tasks such as uncertainty quantification and data assimilation. Deep learning (DL) surrogates can alleviate this burden by accelerating forward simulations. However, constructing them typically requires substantial machine learning (ML) expertise, including architecture design and hyperparameter tuning, which many domain scientists lack. The process remains largely manual and dependent on heuristic choices, creating an expertise gap that hinders broader adoption of DL surrogate techniques.
To address these challenges, we introduce AutoSurrogate, a large-language-model (LLM)-driven multi-agent framework that enables practitioners without ML expertise to construct high-quality surrogates for subsurface flow problems from natural-language instructions.
Key Features of AutoSurrogate
AutoSurrogate comprises four specialized agents that collaborate to automate surrogate construction. Their tasks include:
- Data Profiling: Analyzing simulation data to extract relevant features and characteristics.
- Architecture Selection: Choosing appropriate model architectures from a curated model zoo based on the specific problem.
- Bayesian Hyperparameter Optimization: Systematically tuning hyperparameters to enhance the model’s performance.
- Model Training and Quality Assessment: Training the model and assessing its quality against user-defined thresholds.
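The staged workflow listed above can be sketched as a chain of functions that pass a shared state object along. This is a minimal illustration under stated assumptions, not AutoSurrogate's actual implementation: `PipelineState`, `MODEL_ZOO`, the size-based selection rule, and the fixed hyperparameters are all hypothetical, and the tuning stage is a placeholder where a real system would run Bayesian optimization.

```python
from dataclasses import dataclass, field

# Every name here is hypothetical and only illustrates the staged workflow.


@dataclass
class PipelineState:
    samples: list                      # one flattened input field per simulation run
    error_threshold: float             # user-defined quality target
    profile: dict = field(default_factory=dict)
    architecture: str = ""
    hyperparameters: dict = field(default_factory=dict)
    validation_error: float = float("inf")
    accepted: bool = False


# Made-up model zoo: architecture name -> minimum number of grid cells it suits.
MODEL_ZOO = {"mlp": 0, "cnn": 64, "unet": 4096}


def profile_data(state):
    """Collect basic statistics about the simulation data."""
    values = [v for sample in state.samples for v in sample]
    state.profile = {
        "n_samples": len(state.samples),
        "n_cells": len(state.samples[0]),
        "min": min(values),
        "max": max(values),
    }
    return state


def select_architecture(state):
    """Pick the zoo entry with the largest size requirement the data meets."""
    n_cells = state.profile["n_cells"]
    eligible = [name for name, min_cells in MODEL_ZOO.items() if n_cells >= min_cells]
    state.architecture = max(eligible, key=lambda name: MODEL_ZOO[name])
    return state


def tune_hyperparameters(state):
    """Placeholder: a real system would run Bayesian optimization here."""
    state.hyperparameters = {"learning_rate": 1e-3, "batch_size": 8}
    return state


def make_train_and_assess(train_fn):
    """Wrap a user-supplied training function that returns a validation error."""
    def stage(state):
        state.validation_error = train_fn(state.architecture, state.hyperparameters)
        state.accepted = state.validation_error <= state.error_threshold
        return state
    return stage


def run_pipeline(state, train_fn):
    stages = (profile_data, select_architecture, tune_hyperparameters,
              make_train_and_assess(train_fn))
    for stage in stages:
        state = stage(state)
    return state
```

Here `train_fn` stands in for the actual training run. Keeping each stage as a separate function that reads and writes one state object makes it straightforward to rerun a single stage when a later check fails.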
AutoSurrogate also handles common failure modes autonomously. If numerical instabilities occur, the system restarts training with an adjusted configuration. If predictive accuracy falls short of the target, it switches to an alternative architecture.
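The two recovery behaviors described here can be sketched as a nested loop. This is an illustrative sketch only: the function name, the use of a non-finite validation error as the instability signal, and the choice of a tenfold learning-rate reduction as the "adjusted configuration" are assumptions, not details taken from AutoSurrogate.

```python
import math


def build_with_recovery(train_fn, architectures, error_threshold,
                        learning_rate=1e-2, max_restarts=3):
    """Try architectures in order; restart on instability, switch on low accuracy.

    train_fn(architecture, learning_rate) is a hypothetical callable that
    returns a validation error, or NaN/inf if training diverged.
    """
    log = []
    for arch in architectures:
        lr = learning_rate
        for _ in range(max_restarts + 1):
            error = train_fn(arch, lr)
            if not math.isfinite(error):
                # Numerical instability: restart with an adjusted configuration
                # (assumed here to mean a 10x smaller learning rate).
                log.append((arch, lr, "unstable"))
                lr *= 0.1
                continue
            if error <= error_threshold:
                log.append((arch, lr, "accepted"))
                return arch, lr, error, log
            # Stable but not accurate enough: move on to the next architecture.
            log.append((arch, lr, "below_target"))
            break
    return None, None, None, log
```

The function returns the first architecture that meets the threshold together with a log of every attempt, or `None` values if no candidate succeeds, so the caller can report why earlier attempts were abandoned.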
Efficiency and Effectiveness
AutoSurrogate turns a single natural-language sentence into a deployment-ready surrogate model, with minimal human intervention along the way. Its utility is demonstrated on a 3D geological carbon storage modeling task, in which the surrogate maps permeability fields to pressure and CO2 saturation fields over 31 timesteps.
In the reported evaluation, AutoSurrogate outperforms both expert-designed baselines and domain-agnostic AutoML methods without any manual tuning, which supports its potential for practical deployment where subsurface flow modeling is critical.
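To make the input-output mapping concrete, the sketch below computes plausible tensor shapes for such a surrogate. Only the 31 timesteps and the two output fields come from the text; the channel layout, the function name, and the grid size are placeholders chosen for illustration.

```python
from math import prod

N_TIMESTEPS = 31                                  # stated in the text
OUTPUT_FIELDS = ("pressure", "co2_saturation")    # stated in the text


def surrogate_shapes(n_samples, grid):
    """Return (input_shape, output_shape) for a batch of simulations.

    The axis ordering is an assumed convention, not the paper's layout.
    """
    nx, ny, nz = grid
    input_shape = (n_samples, 1, nx, ny, nz)      # one permeability channel
    output_shape = (n_samples, N_TIMESTEPS, len(OUTPUT_FIELDS), nx, ny, nz)
    return input_shape, output_shape


grid = (32, 32, 16)                               # placeholder grid size
input_shape, output_shape = surrogate_shapes(100, grid)
outputs_per_sample = prod(output_shape[1:])       # values predicted per simulation
print(input_shape, output_shape, outputs_per_sample)
```

Each input field therefore yields 62 output fields of the same spatial size (31 timesteps times 2 variables), which is why a surrogate that predicts them in one forward pass can be much cheaper than repeated simulator runs.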
Conclusion
In summary, AutoSurrogate uses LLM-driven agents to automate the construction of DL surrogates for subsurface flow problems. By letting practitioners with minimal ML expertise build surrogates from natural-language instructions, it narrows the existing expertise gap and paves the way for broader adoption of DL surrogate techniques in subsurface flow applications.
