Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Tae-Min | - |
dc.contributor.author | Yoon, Inug | - |
dc.contributor.author | Kim, Jong-Hwan | - |
dc.contributor.author | Park, Ju youn | - |
dc.date.accessioned | 2024-12-12T08:30:06Z | - |
dc.date.available | 2024-12-12T08:30:06Z | - |
dc.date.created | 2024-12-11 | - |
dc.date.issued | 2024-11-25 | - |
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/151350 | - |
dc.identifier.uri | https://bmvc2024.org/proceedings/85/ | - |
dc.description.abstract | Open-vocabulary object detection (OVD) is a computer vision task that detects and classifies objects from categories not seen during training. While recent OVD methods primarily focus on aligning region embeddings with visual-language pre-trained models like CLIP for classification, object detection requires effective localization as well. However, existing methods often use a proposal generator biased toward the training data, which creates a bottleneck in performance improvement. To address this challenge, we introduce the Textual Attention Region Proposal Network (TA-RPN). This network enhances proposal generation by integrating visual and textual features from the CLIP text encoder, utilizing pixel-wise attention for a comprehensive fusion across the image space. Our approach also incorporates prompt learning to optimize textual features for better localization. Evaluated on the COCO and LVIS benchmarks, TA-RPN outperforms existing state-of-the-art methods, demonstrating its effectiveness in detecting novel object categories. | - |
dc.language | English | - |
dc.publisher | The British Machine Vision Association and Society for Pattern Recognition | - |
dc.title | Textual Attention RPN for Open-Vocabulary Object Detection | - |
dc.type | Conference | - |
dc.description.journalClass | 1 | - |
dc.identifier.bibliographicCitation | The 35th British Machine Vision Conference | - |
dc.citation.title | The 35th British Machine Vision Conference | - |
dc.citation.conferencePlace | UK | - |
dc.citation.conferencePlace | Glasgow, UK | - |
dc.citation.conferenceDate | 2024-11-25 | - |
dc.relation.isPartOf | The 35th British Machine Vision Conference | - |