SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation
arXiv:2604.13236v1
Semiconductor failure analysis (FA) has traditionally demanded extensive expertise and time from engineers. A typical case involves examining inspection images, correlating equipment telemetry, consulting historical defect records, and compiling a structured report. Done by hand, this can take several hours per case.
To address this, researchers have introduced SemiFA, an agentic multi-modal framework that autonomously generates structured FA reports from semiconductor inspection images in less than one minute.
Overview of SemiFA
SemiFA is built as a four-agent LangGraph pipeline in which each agent handles one stage of the analysis:
- DefectDescriber: This agent classifies and narrates defect morphology using DINOv2 and LLaVA-1.6.
- RootCauseAnalyzer: This agent fuses SECS/GEM equipment telemetry with historically similar defects retrieved from a Qdrant vector database to identify potential root causes of failures.
- SeverityClassifier: This component assigns severity ratings and estimates yield impacts based on the analyzed data.
- RecipeAdvisor: The final agent in the pipeline proposes corrective process adjustments to mitigate identified issues.
A fifth node assembles the outputs of the four agents into a PDF report.
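The control flow described above can be sketched in plain Python. This is a minimal stand-in, not the authors' code: it deliberately avoids LangGraph, the models, and Qdrant so that it runs without dependencies. The state fields, the class label "scratch", the telemetry parameter names, and every rule inside the nodes are hypothetical placeholders; only the node order comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FAState:
    """Shared state handed from node to node (all field names are hypothetical)."""
    image_path: str
    telemetry_deviation: Dict[str, float] = field(default_factory=dict)
    defect_class: str = ""
    description: str = ""
    root_cause: str = ""
    severity: str = ""
    recommendation: str = ""
    report: str = ""


def defect_describer(state: FAState) -> FAState:
    # Stand-in for the DINOv2 classifier and LLaVA-1.6 narration.
    state.defect_class = "scratch"  # hypothetical class label
    state.description = f"{state.defect_class} defect observed in {state.image_path}"
    return state


def root_cause_analyzer(state: FAState) -> FAState:
    # Stand-in for telemetry fusion and similar-defect retrieval:
    # simply flag the parameter with the largest deviation.
    deviations = state.telemetry_deviation
    suspect = max(deviations, key=lambda k: deviations[k]) if deviations else "unknown"
    state.root_cause = f"possible drift in {suspect}"
    return state


def severity_classifier(state: FAState) -> FAState:
    # Placeholder threshold rule; the real agent also estimates yield impact.
    worst = max(state.telemetry_deviation.values(), default=0.0)
    state.severity = "high" if worst > 2.0 else "low"
    return state


def recipe_advisor(state: FAState) -> FAState:
    # Placeholder for the proposed corrective process adjustment.
    state.recommendation = f"review recipe settings related to: {state.root_cause}"
    return state


def report_assembler(state: FAState) -> FAState:
    # The real fifth node renders a PDF; plain text keeps this sketch dependency-free.
    state.report = "\n".join([
        f"Defect: {state.defect_class}",
        f"Description: {state.description}",
        f"Root cause: {state.root_cause}",
        f"Severity: {state.severity}",
        f"Recommendation: {state.recommendation}",
    ])
    return state


# Four agents followed by the report node, run strictly in sequence.
PIPELINE: List[Callable[[FAState], FAState]] = [
    defect_describer,
    root_cause_analyzer,
    severity_classifier,
    recipe_advisor,
    report_assembler,
]


def run_pipeline(state: FAState) -> FAState:
    for node in PIPELINE:
        state = node(state)
    return state


example = run_pipeline(
    FAState("wafer_001.png", {"chamber_pressure": 3.1, "rf_power": 0.4})
)
print(example.report)
```

In a LangGraph version, each function would presumably become a graph node and the list order would become edges; how SemiFA actually wires its graph is not specified in this summary.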
Dataset and Performance
The researchers also introduce SemiFA-930, a dataset of 930 annotated semiconductor defect images paired with structured FA narratives across nine defect classes. It draws on sources including procedural synthesis, WM-811K, and MixedWM38.
The DINOv2-based classifier reaches 92.1% accuracy and a macro F1 score of 0.917 on a validation set of 140 images. The full SemiFA pipeline produces a complete FA report in 48 seconds on an NVIDIA A100-SXM4-40GB GPU.
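For readers less familiar with the two classifier metrics, the sketch below computes accuracy and macro F1 from label lists. The labels "a", "b", "c" and the six toy predictions are made up for illustration and have nothing to do with SemiFA-930. The final line is only an arithmetic check: on 140 images, 129 correct predictions would round to the reported 92.1%, although the exact count is not stated here.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    # Fraction of predictions that match the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    # Per-class F1 = 2*TP / (2*TP + FP + FN), then an unweighted mean over classes.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy data (hypothetical): one "b" sample is misclassified as "c".
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "c", "c"]

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
print(f"accuracy={toy_accuracy:.3f} macro_f1={toy_macro_f1:.3f}")

# Arithmetic check on the reported figure: 129 / 140 rounds to 0.921.
print(round(129 / 140, 3))
```

Macro F1 weights every class equally, so it can sit below accuracy when a rare class is classified poorly, which is why the two numbers are usually reported together.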
Multi-Modal Fusion and Its Impact
To assess the contribution of each input, the authors ran an ablation study across four modality conditions, with GPT-4o acting as the judge. Multi-modal fusion improved root cause reasoning by +0.86 composite points (on a 1-5 scale) over an image-only baseline. Of the added modalities, equipment telemetry contributed more to this gain.
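The comparison behind such an ablation can be sketched as follows. Everything here is an assumption for illustration: the four condition names, the use of three judge criteria per condition, the definition of the composite as their mean, and all score values. None of these numbers are results from the paper; the sketch only shows how a composite gain over the image-only baseline would be computed.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical judge scores on a 1-5 scale, three criteria per condition.
judge_scores: Dict[str, List[float]] = {
    "image_only": [3.0, 3.0, 3.0],
    "image_plus_history": [3.0, 4.0, 3.0],
    "image_plus_telemetry": [4.0, 4.0, 3.0],
    "full_fusion": [4.0, 4.0, 4.0],
}


def composite(scores: List[float]) -> float:
    # Assumed composite: unweighted mean of the per-criterion scores.
    return mean(scores)


def gains_over_baseline(
    scores_by_condition: Dict[str, List[float]], baseline: str = "image_only"
) -> Dict[str, float]:
    # Composite-score difference of every other condition against the baseline.
    base = composite(scores_by_condition[baseline])
    return {
        name: composite(vals) - base
        for name, vals in scores_by_condition.items()
        if name != baseline
    }


gains = gains_over_baseline(judge_scores)
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```

Comparing the two single-modality gains against each other is what supports a statement that one modality is more influential than the other.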
Conclusion
SemiFA is presented as the first known system to integrate SECS/GEM equipment telemetry into a vision-language model pipeline for autonomous generation of semiconductor failure analysis reports. It reduces a task that can take engineers several hours per case to a report produced in under a minute.
