CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning

05/26/2023
by   Zhaoheng Zheng, et al.
0

Compositionality, the ability to combine existing concepts and generalize towards novel compositions, is a key functionality for intelligent entities. Here, we study the problem of Compositional Zero-Shot Learning (CZSL), which aims at recognizing novel attribute-object compositions. Recent approaches build their systems on top of large-scale Vision-Language Pre-trained (VLP) models, e.g. CLIP, and observe significant improvements. However, these methods treat CLIP as a black box and focus on pre- and post-CLIP operations. Here, we propose to dive deep into the architecture and insert adapters, a parameter-efficient technique proven to be effective among large language models, to each CLIP encoder layer. We further equip adapters with concept awareness so that concept-specific features of "object", "attribute" and "composition" can be extracted. We name our method CAILA, Concept-Aware Intra-Layer Adapters. Quantitative evaluations performed on three popular CZSL datasets, MIT-States, C-GQA, and UT-Zappos, reveal that CAILA achieves double-digit relative improvements against the current state-of-the-art on all benchmarks.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro