			 ·  
				Journal: Mathematical Problems in Engineering (MPE), Vol. 2018 (2018), Article ID 3125879, 8 pages.
			
			 ·  
				Paper title: “Multimodal Feature Learning for Video Captioning”
			
			 ·  
				Authors: Sujin Lee (이수진), Incheol Kim (김인철)
			
			 ·  
				Abstract: Video captioning refers to the task of generating a natural language sentence that explains the content of the input video clips.
				This study proposes a deep neural network model for effective video captioning. Apart from visual features, the proposed model
				additionally learns semantic features that describe the video content effectively. In our model, visual features of the input video
				are extracted using convolutional neural networks such as C3D and ResNet, while semantic features are obtained using recurrent
				neural networks such as LSTM. In addition, our model includes an attention-based caption generation network to generate the
				correct natural language captions based on the multimodal video feature sequences. Various experiments, conducted with the two
				large benchmark datasets, Microsoft Video Description (MSVD) and Microsoft Research Video-to-Text (MSR-VTT), demonstrate
				the performance of the proposed model.
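
				The abstract does not say which form of attention the caption generation network uses, so the following is only a minimal sketch of generic dot-product soft attention over a fused feature sequence, not the authors' model. The function names (`fuse`, `attend`), the concatenation-based fusion, and all numbers are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(visual, semantic):
    """Concatenate per-step visual and semantic vectors (assumed fusion scheme)."""
    if len(visual) != len(semantic):
        raise ValueError("feature sequences must have equal length")
    return [v + s for v, s in zip(visual, semantic)]


def attend(features, query):
    """Dot-product soft attention: returns (weights, context vector)."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


# Toy input: 3 time steps, 2-d visual and 1-d semantic features (made-up values).
visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
semantic = [[0.5], [0.1], [0.9]]
features = fuse(visual, semantic)

query = [1.0, 1.0, 1.0]  # stand-in for the caption decoder's hidden state
weights, context = attend(features, query)
```

				In a real system the visual vectors would come from networks such as C3D or ResNet and the query from the decoder's recurrent state. Here the weights only show how one decoding step could focus on the most relevant time step of the multimodal sequence.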