Seeing the Intangible: Survey of Image Classification into High-Level and Abstract Categories
arXiv:2308.10562v2 (announce type: cross)
Abstract
The field of Computer Vision (CV) is increasingly shifting towards “high-level” visual sensemaking tasks, yet the exact nature of these tasks remains unclear and tacit. This survey paper addresses this ambiguity by systematically reviewing research on high-level visual understanding, focusing particularly on Abstract Concepts (ACs) in automatic image classification.
Key Contributions
Our survey contributes in three main ways:
- Clarification of High-Level Semantics: We provide a multidisciplinary analysis that clarifies the tacit understanding of high-level semantics in CV, categorizing these into distinct clusters, including commonsense, emotional, aesthetic, and inductive interpretative semantics.
- Identification of CV Tasks: We identify and categorize computer vision tasks associated with high-level visual sensemaking, offering insights into the diverse research areas within this domain.
- Examination of Abstract Concepts: Our survey examines how abstract concepts such as values and ideologies are handled in CV, revealing challenges and opportunities in AC-based image classification.
Challenges and Opportunities
Our survey of AC image classification tasks highlights two persistent findings: massive datasets have shown limited efficacy, and integrating supplementary information and mid-level features remains important. Relying on large datasets alone may therefore not be sufficient to classify images accurately by abstract concept.
The Role of Hybrid AI Systems
We emphasize the growing relevance of hybrid AI systems for AC image classification. Because these tasks are multifaceted, systems that combine several methodologies, rather than relying on a single approach, are better placed to capture the complexities of high-level visual reasoning.
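As an illustration only, and not a method taken from the survey or from any surveyed work, the idea of combining approaches can be sketched as a late fusion of two signals: the probabilities of a learned classifier and a score derived from mid-level cues detected in the image. Everything named here is a hypothetical assumption made for the sketch: the `CUE_WEIGHTS` table, the concept and cue names, the helper functions, and the mixing weight `alpha`.

```python
import math
from typing import Dict, List

# Hypothetical, hand-written associations between mid-level cues
# (e.g. detected objects or scene attributes) and abstract concepts.
# These values are invented for illustration; they are not from any dataset.
CUE_WEIGHTS: Dict[str, Dict[str, float]] = {
    "freedom": {"bird": 0.6, "open_sky": 0.8, "chain": -0.7},
    "danger": {"fire": 0.9, "weapon": 0.8, "open_sky": -0.2},
}


def knowledge_score(concept: str, cues: List[str]) -> float:
    """Sum the cue weights for a concept and squash the sum into (0, 1)."""
    weights = CUE_WEIGHTS.get(concept, {})
    raw = sum(weights.get(cue, 0.0) for cue in cues)
    return 1.0 / (1.0 + math.exp(-raw))  # logistic function


def hybrid_scores(
    model_probs: Dict[str, float], cues: List[str], alpha: float = 0.5
) -> Dict[str, float]:
    """Blend classifier probabilities with cue-based scores, then renormalize.

    alpha = 1.0 uses only the classifier; alpha = 0.0 uses only the cues.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    fused = {
        concept: alpha * prob + (1.0 - alpha) * knowledge_score(concept, cues)
        for concept, prob in model_probs.items()
    }
    total = sum(fused.values())
    if total == 0.0:
        return fused  # nothing to normalize
    return {concept: score / total for concept, score in fused.items()}


if __name__ == "__main__":
    # The classifier alone is undecided; the detected cues break the tie.
    probs = {"freedom": 0.5, "danger": 0.5}
    print(hybrid_scores(probs, ["bird", "open_sky"]))
```

In this sketch the cue table stands in for whatever supplementary information a real system might draw on. The weighted average is only one of many possible fusion strategies and was chosen here for brevity.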
Conclusion
Overall, this survey deepens the understanding of high-level visual reasoning in CV and lays the groundwork for future research. By addressing the ambiguities and challenges in the field, we hope to inform ongoing work and guide researchers and practitioners toward more effective methods for integrating high-level semantics into image classification.
