Full metadata record

DC Field | Value | Language
dc.contributor.author | Choi, Tae-Min | -
dc.contributor.author | Yoon, Inug | -
dc.contributor.author | Kim, Jong-Hwan | -
dc.contributor.author | Park, Ju youn | -
dc.date.accessioned | 2024-12-12T08:30:06Z | -
dc.date.available | 2024-12-12T08:30:06Z | -
dc.date.created | 2024-12-11 | -
dc.date.issued | 2024-11-25 | -
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/151350 | -
dc.identifier.uri | https://bmvc2024.org/proceedings/85/ | -
dc.description.abstract | Open-vocabulary object detection (OVD) is a computer vision task that detects and classifies objects from categories not seen during training. While recent OVD methods primarily focus on aligning region embeddings with visual-language pre-trained models like CLIP for classification, object detection requires effective localization as well. However, existing methods often use a proposal generator biased toward the training data, which creates a bottleneck in performance improvement. To address this challenge, we introduce the Textual Attention Region Proposal Network (TA-RPN). This network enhances proposal generation by integrating visual and textual features from the CLIP text encoder, utilizing pixel-wise attention for a comprehensive fusion across the image space. Our approach also incorporates prompt learning to optimize textual features for better localization. Evaluated on the COCO and LVIS benchmarks, TA-RPN outperforms existing state-of-the-art methods, demonstrating its effectiveness in detecting novel object categories. | -
dc.language | English | -
dc.publisher | The British Machine Vision Association and Society for Pattern Recognition | -
dc.title | Textual Attention RPN for Open-Vocabulary Object Detection | -
dc.type | Conference | -
dc.description.journalClass | 1 | -
dc.identifier.bibliographicCitation | The 35th British Machine Vision Conference | -
dc.citation.title | The 35th British Machine Vision Conference | -
dc.citation.conferencePlace | UK | -
dc.citation.conferencePlace | Glasgow, UK | -
dc.citation.conferenceDate | 2024-11-25 | -
dc.relation.isPartOf | The 35th British Machine Vision Conference | -
Appears in Collections:
KIST Conference Paper > 2024
Files in This Item:
There are no files associated with this item.
