Discover GeoGuide, a novel framework that enhances open-vocabulary 3D semantic segmentation through hierarchical geometric guidance for improved accuracy and con...
Discover ARTA, a mixed-resolution vision transformer that boosts dense feature extraction efficiency with adaptive token allocation and reduced computation...
Explore how Object-DINO uncovers distributed object-centric features in self-supervised Vision Transformers to boost unsupervised object discovery and redu...