Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity
arXiv:2509.09794v4
Abstract: Computational models have emerged as powerful tools for energy modeling research at the building and urban scales, supporting data-driven analysis of energy systems at both levels. However, these models require large amounts of building parameter data that is often inaccessible, expensive to collect, or subject to privacy constraints.
In response to these challenges, the authors introduce a modular, multimodal generative Artificial Intelligence (AI) framework for producing synthetic residential building datasets. The framework integrates image, tabular, and simulation-based components and generates its data from publicly available county records and images. This article presents an end-to-end pipeline that demonstrates the framework.
Key Features of the Framework
- Modularity: The framework’s modular design allows for flexibility in its components, making it adaptable to various data generation needs.
- Multimodal Integration: By combining image, tabular, and simulation data, the framework provides a comprehensive approach to generating diverse datasets.
- Data Accessibility: Utilizing publicly available records reduces reliance on costly or restricted data sources, making the research process more efficient.
- Occlusion-based Visual Focus Analysis: The pipeline’s vision-language component is assessed with occlusion-based visual focus analysis, which systematically evaluates how the model processes building images.
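The modular design described in the list above can be illustrated with a minimal sketch. It assumes each component is a plain function that takes a record and returns new fields. The stage names (`describe_image`, `merge_county_record`, `build_simulation_input`) and all field names are hypothetical stand-ins for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[Record], Record]


@dataclass
class Pipeline:
    """Runs stages in order; each stage adds fields to the record."""
    stages: List[Stage] = field(default_factory=list)

    def run(self, record: Record) -> Record:
        for stage in self.stages:
            # Merge the stage output into the record so later stages see it.
            record = {**record, **stage(record)}
        return record


def describe_image(record: Record) -> Record:
    # Hypothetical stand-in for a vision-language model call.
    return {"stories": 2 if record["image"] == "two_story.jpg" else 1}


def merge_county_record(record: Record) -> Record:
    # Hypothetical tabular step: copy a field from a county record.
    return {"floor_area_m2": record["county"]["floor_area_m2"]}


def build_simulation_input(record: Record) -> Record:
    # Hypothetical simulation step: assemble inputs for an energy model.
    return {"sim_input": {"stories": record["stories"],
                          "floor_area_m2": record["floor_area_m2"]}}


pipeline = Pipeline([describe_image, merge_county_record, build_simulation_input])
result = pipeline.run({"image": "two_story.jpg",
                       "county": {"floor_area_m2": 150.0}})
```

Because every stage shares the same signature, one component (for example the image stage) could be swapped for another without touching the rest, which is the kind of flexibility the modularity bullet refers to.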
Evaluation of Model Performance
The authors evaluated the pipeline’s components using occlusion-based visual focus analysis, comparing their selected vision-language model against a GPT-based alternative. The findings indicate that the selected model demonstrates significantly stronger visual focus, which supports its use for processing building images.
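As a rough illustration of the general technique, occlusion analysis masks one image region at a time and records how much a model score drops. The sketch below is generic and makes several assumptions: `score_fn` is a hypothetical stand-in for any model that returns a scalar score, and `focus_ratio` is one possible way to summarize focus (the share of score drop that falls on building regions). The source does not specify the exact metric the authors used.

```python
import numpy as np


def occlusion_map(image, score_fn, patch=8, fill=0.0):
    """Return a grid of score drops, one per occluded patch."""
    h, w = image.shape[:2]
    base = score_fn(image)
    rows, cols = h // patch, w // patch
    drops = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            occluded = image.copy()
            # Blank out one patch and measure how far the score falls.
            occluded[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            drops[i, j] = base - score_fn(occluded)
    return drops


def focus_ratio(drops, mask_grid):
    """Share of positive score drop on cells marked as building (assumed metric)."""
    pos = np.clip(drops, 0, None)
    total = pos.sum()
    return float(pos[mask_grid].sum() / total) if total > 0 else 0.0
```

With a toy `score_fn` that only looks at the image center, all of the score drop lands on the central patches, so a mask covering the center yields a focus ratio of 1. A model that attends to sky or vegetation instead of the building would score lower under this assumed metric.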
Realism and Data Overlap
To assess the realism of the synthetic datasets generated by the framework, the researchers compared their results against a national reference dataset. The analysis revealed that:
- The synthetic data overlaps with the reference dataset by more than 65% for every evaluated parameter.
- For three of the four parameters assessed, the overlap exceeds 90%, showcasing the high fidelity of the synthetic data.
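To make the idea of distributional overlap concrete, the sketch below computes a histogram intersection between a synthetic sample and a reference sample for one parameter. This is an assumption chosen for illustration: the source does not state which overlap measure the authors used, and the function name `histogram_overlap` and the bin count are hypothetical.

```python
import numpy as np


def histogram_overlap(synthetic, reference, bins=20):
    """Histogram intersection in [0, 1]; 1 means identical binned distributions."""
    synthetic = np.asarray(synthetic, dtype=float)
    reference = np.asarray(reference, dtype=float)
    lo = min(synthetic.min(), reference.min())
    hi = max(synthetic.max(), reference.max())
    if hi == lo:
        # Every value is identical, so the distributions coincide.
        return 1.0
    # Shared bin edges so both samples are compared on the same grid.
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(synthetic, bins=edges)
    q, _ = np.histogram(reference, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.minimum(p, q).sum())
```

Under this measure, a value above 0.9 would mean that more than 90% of the probability mass of the two binned distributions coincides. The result depends on the bin count, so any comparison across parameters should keep the binning fixed.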
Impact on Energy Research and Modeling
This work lowers the barriers to building-scale energy research and Machine Learning (ML)-driven urban energy modeling. By providing scalable synthetic datasets, the framework enables downstream tasks such as:
- Energy modeling
- Retrofit analysis
- Urban-scale simulation
Overall, the multimodal generative AI pipeline offers a promising way to address data scarcity in residential building research and to support more effective modeling of urban energy systems.
