Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lee, Minhyeok | - |
| dc.contributor.author | Cho, Suhwan | - |
| dc.contributor.author | Lee, Jungho | - |
| dc.contributor.author | Yang, Sunghun | - |
| dc.contributor.author | Choi, Heeseung | - |
| dc.contributor.author | Kim, Ig-Jae | - |
| dc.contributor.author | Lee, Sangyoun | - |
| dc.date.accessioned | 2025-12-30T02:00:52Z | - |
| dc.date.available | 2025-12-30T02:00:52Z | - |
| dc.date.created | 2025-11-25 | - |
| dc.date.issued | 2025-06-10 | - |
| dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/153920 | - |
| dc.description.abstract | Open-vocabulary semantic segmentation aims to assign pixel-level labels to images across an unlimited range of classes. Traditional methods address this by sequentially connecting a powerful mask proposal generator, such as the Segment Anything Model (SAM), with a pre-trained vision-language model like CLIP. However, these two-stage approaches often suffer from high computational costs and memory inefficiencies. In this paper, we propose ESC-Net, a novel one-stage open-vocabulary segmentation model that leverages the SAM decoder blocks for class-agnostic segmentation within an efficient inference framework. By embedding pseudo prompts generated from image-text correlations into SAM’s promptable segmentation framework, ESC-Net achieves refined spatial aggregation for accurate mask predictions. Additionally, a Vision-Language Fusion (VLF) module enhances the final mask prediction through image and text guidance. ESC-Net achieves superior performance on standard benchmarks, including PASCAL-Context, outperforming prior methods in both efficiency and accuracy. Comprehensive ablation studies further demonstrate its robustness across challenging conditions. | - |
| dc.publisher | IEEE | - |
| dc.title | Effective SAM Combination for Open-Vocabulary Semantic Segmentation | - |
| dc.type | Conference | - |
| dc.identifier.doi | 10.1109/cvpr52734.2025.02429 | - |
| dc.description.journalClass | 1 | - |
| dc.identifier.bibliographicCitation | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.26081 - 26090 | - |
| dc.citation.title | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | - |
| dc.citation.startPage | 26081 | - |
| dc.citation.endPage | 26090 | - |
| dc.citation.conferencePlace | US | - |
| dc.citation.conferencePlace | Nashville, TN, USA | - |
| dc.citation.conferenceDate | 2025-06-10 | - |
| dc.relation.isPartOf | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | - |