Guiding the flowing of semantics: Interpretable video captioning via POS tag

X Xiao, L Wang, B Fan, S Xiang… - Proceedings of the 2019 …, 2019 - aclanthology.org

In current video captioning models, the video frames are collected in one network and
the semantics are mixed into one feature, which not only increases the difficulty of caption
decoding but also decreases the interpretability of the captioning models. To address these
problems, we propose an Adaptive Semantic Guidance Network (ASGN), which instantiates
the whole video semantics to different POS-aware semantics under the supervision of
part-of-speech (POS) tags. In the encoding process, the POS tag activates the related neurons and …

Cited by 11
