Journal: JIPS (Journal of Information Processing Systems), Vol. 16, No. 6, pp. 1250-1260
· Title: “Multimodal Context Embedding for Scene Graph Generation”
· Authors: 정가영, 김인철
· Abstract: This study proposes a novel deep neural network model that can accurately detect objects and their relationships in an image and represent them as a scene graph. The proposed model utilizes several multimodal features, including linguistic features and visual context features, to accurately detect objects and relationships. In addition, in the proposed model, context features are embedded using graph neural networks to depict the dependencies between two related objects in the context feature vector. This study demonstrates the effectiveness of the proposed model through comparative experiments using the Visual Genome benchmark dataset.