


28th ACM Multimedia 2020: Virtual Event (Seattle, WA), USA
- Chang Wen Chen, Rita Cucchiara, Xian-Sheng Hua, Guo-Jun Qi, Elisa Ricci, Zhengyou Zhang, Roger Zimmermann:

MM '20: The 28th ACM International Conference on Multimedia, Virtual Event / Seattle, WA, USA, October 12-16, 2020. ACM 2020, ISBN 978-1-4503-7988-5
Oral Session A1: Deep Learning for Multimedia
- Jin Wang, Chen Wang, Qingming Huang, Yunhui Shi, Jian-Feng Cai, Qing Zhu, Baocai Yin:

Image Inpainting Based on Multi-frequency Probabilistic Inference Model. 1-9 - Jianzhe Lin, Lichao Mou, Tianze Yu, Xiaoxiang Zhu, Z. Jane Wang:

Dual Adversarial Network for Unsupervised Ground/Satellite-to-Aerial Scene Adaptation. 10-18 - Yadan Luo, Zi Huang, Zijian Wang, Zheng Zhang, Mahsa Baktashmotlagh:

Adversarial Bipartite Graph Learning for Video Domain Adaptation. 19-27 - Peng Wang, Dongyang Liu, Hui Li, Qi Wu:

Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge. 28-36 - Weijiang Yu, Jian Liang, Lu Li, Nong Xiao:

Single Image De-noising via Staged Memory Network. 37-45 - Xuanchi Ren, Haoran Li, Zijian Huang, Qifeng Chen:

Self-supervised Dance Video Synthesis Conditioned on Music. 46-54
Oral Session B1: Deep Learning for Multimedia
- Fanfan Ye, Shiliang Pu, Qiaoyong Zhong, Chao Li, Di Xie, Huiming Tang:

Dynamic GCN: Context-enriched Topology Learning for Skeleton-based Action Recognition. 55-63 - Peike Li, Yunchao Wei, Yi Yang:

Meta Parsing Networks: Towards Generalized Few-shot Scene Parsing with Adaptive Metric Learning. 64-72 - Wei Li, Zhenting Wang, Xiao Wu, Ji Zhang, Qiang Peng, Hongliang Li:

CODAN: Counting-driven Attention Network for Vehicle Detection in Congested Scenes. 73-82 - Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang:

Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph. 83-91 - Zeren Sun, Xian-Sheng Hua, Yazhou Yao, Xiu-Shen Wei, Guosheng Hu, Jian Zhang:

CRSSC: Salvage Reusable Samples from Noisy Data for Robust Learning. 92-101 - Jen-Chun Lin, Wen-Li Wei, Yen-Yu Lin, Tyng-Luh Liu, Hong-Yuan Mark Liao:

Learning From Music to Visual Storytelling of Shots: A Deep Interactive Learning Mechanism. 102-110
Oral Session C1: Deep Learning for Multimedia
- Fangfang Wang, Yifeng Chen, Fei Wu, Xi Li:

TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection. 111-119 - Peng Lu, Jiahui Liu, Xujun Peng, Xiaojie Wang:

Weakly Supervised Real-time Image Cropping based on Aesthetic Distributions. 120-128 - Yuting Liu, Zheng Wang, Miaojing Shi, Shin'ichi Satoh, Qijun Zhao, Hongyu Yang:

Towards Unsupervised Crowd Counting via Regression-Detection Bi-knowledge Transfer. 129-137 - Yanlu Wei, Renshuai Tao, Zhangjie Wu, Yuqing Ma, Libo Zhang, Xianglong Liu:

Occluded Prohibited Items Detection: An X-ray Security Inspection Benchmark and De-occlusion Attention Module. 138-146 - Hsuan-Kai Kao, Li Su:

Temporally Guided Music-to-Body-Movement Generation. 147-155 - Yixiong Zou, Shanghang Zhang, Ke Chen, Yonghong Tian, Yaowei Wang, José M. F. Moura:

Compositional Few-Shot Recognition with Primitive Discovery and Enhancing. 156-164
Oral Session D1: Deep Learning for Multimedia
- Chen Gao, Si Liu, Defa Zhu, Quan Liu, Jie Cao, Haoqian He, Ran He, Shuicheng Yan:

InteractGAN: Learning to Generate Human-Object Interaction. 165-173 - Shijie Wang, Zhihui Wang, Haojie Li, Wanli Ouyang:

Category-specific Semantic Coherency Learning for Fine-grained Image Recognition. 174-183 - Che Sun, Yunde Jia, Yao Hu, Yuwei Wu:

Scene-Aware Context Reasoning for Unsupervised Abnormal Event Detection in Videos. 184-192 - Jing Jin, Junhui Hou, Jie Chen, Sam Kwong, Jingyi Yu:

Light Field Super-resolution via Attention-Guided Fusion of Hybrid Lenses. 193-201 - Wei-Cheng Lai, Zi-Xiang Xia, Hao-Siang Lin, Lien-Feng Hsu, Hong-Han Shuai, I-Hong Jhuo, Wen-Huang Cheng:

Trajectory Prediction in Heterogeneous Environment via Attended Ecology Embedding. 202-210 - Liang Sun, Xiang Guan, Yang Yang, Lei Zhang:

Text-Embedded Bilinear Model for Fine-Grained Visual Recognition. 211-219
Oral Session E1: Deep Learning for Multimedia
- Zhiheng Ma, Xing Wei, Xiaopeng Hong, Yihong Gong:

Learning Scales from Points: A Scale-aware Probabilistic Model for Crowd Counting. 220-228 - Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu:

Learning Global Structure Consistency for Robust Object Tracking. 229-237 - Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang:

Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene. 238-246 - Jun-Hyuk Kim, Soobeom Jang, Jun-Ho Choi, Jong-Seok Lee:

Instability of Successive Deep Image Compression. 247-255 - Akash Gupta, Abhishek Aich, Amit K. Roy-Chowdhury:

ALANET: Adaptive Latent Attention Network for Joint Video Deblurring and Interpolation. 256-264 - Shaotian Yan, Chen Shen, Zhongming Jin, Jianqiang Huang, Rongxin Jiang, Yaowu Chen, Xian-Sheng Hua:

PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation. 265-273
Oral Session F1: Deep Learning for Multimedia
- Peixi Peng, Yonghong Tian, Yangru Huang, Xiangqian Wang, Huilong An:

Discriminative Spatial Feature Learning for Person Re-Identification. 274-283 - Xiangping Wu, Qingcai Chen, Wei Li, Yulun Xiao, Baotian Hu:

AdaHGNN: Adaptive Hypergraph Neural Networks for Multi-Label Image Classification. 284-293 - Dawei Zhang, Zhonglong Zheng, Minglu Li, Xiaowei He, Tianxiang Wang, Liyuan Chen, Riheng Jia, Feilong Lin:

Reinforced Similarity Learning: Siamese Relation Networks for Robust Object Tracking. 294-303 - Ruoxi Deng, Shengjun Liu:

Deep Structural Contour Detection. 304-312 - Saurabh Sahu, Palash Goyal, Shalini Ghosh, Chul Lee:

Cross-modal Non-linear Guided Attention and Temporal Coherence in Multi-modal Deep Video Models. 313-321 - Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang:

IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning. 322-330
Oral Session G1: Deep Learning for Multimedia
- Xin Wang, Wei Huang, Qi Liu, Yu Yin, Zhenya Huang, Le Wu, Jianhui Ma, Xue Wang:

Fine-Grained Similarity Measurement between Educational Videos and Exercises. 331-339 - Mengli Cheng, Minghui Qiu, Xing Shi, Jun Huang, Wei Lin:

One-shot Text Field labeling using Attention and Belief Propagation for Structure Information Extraction. 340-348 - Yunzhuo Liu, Bo Jiang, Tian Guo, Ramesh K. Sitaraman, Don Towsley, Xinbing Wang:

Grad: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding. 349-357 - Yat Hong Lam, Alireza Zare, Francesco Cricri, Jani Lainema, Miska M. Hannuksela:

Efficient Adaptation of Neural Network Filter for Video Compression. 358-366 - Naoki Kimura, Keisuke Shiro, Yota Takakura, Hiromi Nakamura, Jun Rekimoto:

SonoSpace: Visual Feedback of Timbre with Unsupervised Learning. 367-374 - Bo Pang, Deming Zhai, Junjun Jiang, Xianming Liu:

Single Image Deraining via Scale-space Invariant Attention Neural Network. 375-383
Oral Session H1: Emerging Multimedia Applications
- Kaihao Zhang, Wenhan Luo, Björn Stenger, Wenqi Ren, Lin Ma, Hongdong Li:

Every Moment Matters: Detail-Aware Networks to Bring a Blurry Image Alive. 384-392 - Weiqing Min, Linhu Liu, Zhiling Wang, Zhengdong Luo, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang:

ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network. 393-401 - Tianyu Zhang, Weiqing Min, Ying Zhu, Yong Rui, Shuqiang Jiang:

An Egocentric Action Anticipation Framework via Fusing Intuition and Analysis. 402-410 - Diangang Li, Jianquan Liu, Shoji Nishimura, Yuka Hayashi, Jun Suzuki, Yihong Gong:

Multi-Person Action Recognition in Microwave Sensors. 411-420 - Qi Jia, Xin Fan, Meiyu Yu, Yuqing Liu, Dingrong Wang, Longin Jan Latecki:

Coupling Deep Textural and Shape Features for Sketch Recognition. 421-429 - Huaizheng Zhang, Yong Luo, Qiming Ai, Yonggang Wen, Han Hu:

Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning. 430-438
Oral Session A2: Emerging Multimedia Applications
- Komal Chugh, Parul Gupta, Abhinav Dhall, Ramanathan Subramanian:

Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization. 439-447 - Kai Cheng, Xin Liu, Yiu-ming Cheung, Rui Wang, Xing Xu, Bineng Zhong:

Hearing like Seeing: Improving Voice-Face Interactions and Associations via Adversarial Deep Semantic Matching Network. 448-455 - Ramit Sawhney, Puneet Mathur, Ayush Mangal, Piyush Khanna, Rajiv Ratn Shah, Roger Zimmermann:

Multimodal Multi-Task Financial Risk Forecasting. 456-465 - Jiahang Wang, Tong Sha, Wei Zhang, Zhoujun Li, Tao Mei:

Down to the Last Detail: Virtual Try-on with Fine-grained Details. 466-474 - Yifeng Zhou, Xing Xu, Fumin Shen, Lianli Gao, Huimin Lu, Heng Tao Shen:

Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency. 475-483 - K. R. Prajwal, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:

A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild. 484-492
Oral Session B2: Emotional and Social Signals in Multimedia
- Guangyao Shen, Xin Wang, Xuguang Duan, Hongzhi Li, Wenwu Zhu:

MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos. 493-502 - Dong Zhang, Weisheng Zhang, Shoushan Li, Qiaoming Zhu, Guodong Zhou:

Modeling both Intra- and Inter-modal Influence for Real-Time Emotion Detection in Conversations. 503-511 - Xincheng Ju, Dong Zhang, Junhui Li, Guodong Zhou:

Transformer-based Label Set Generation for Multi-modal Multi-label Emotion Detection. 512-520 - Kaicheng Yang, Hua Xu, Kai Gao:

CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis. 521-528 - Xingkun Zuo, Jiyi Li, Qili Zhou, Jianjun Li, Xiaoyang Mao:

AffectI: A Game for Diverse, Reliable, and Efficient Affective Image Annotation. 529-537 - Shi Yin, Shangfei Wang, Xiaoping Chen, Enhong Chen, Cong Liang:

Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking. 538-546
Oral Session C2: Media Interpretation
- Xiaobin Liu, Shiliang Zhang:

Domain Adaptive Person Re-Identification via Coupling Optimization. 547-555 - Peipei Li, Yinglu Liu, Hailin Shi, Xiang Wu, Yibo Hu, Ran He, Zhenan Sun:

Dual-Structure Disentangling Variational Generation for Data-Limited Face Parsing. 556-564 - Chunhui Zhang, Shiming Ge, Kangkai Zhang, Dan Zeng:

Accurate UAV Tracking with Distance-Injected Overlap Maximization. 565-573 - Hongru Liang, Wenqiang Lei, Paul Yaozhu Chan, Zhenglu Yang, Maosong Sun, Tat-Seng Chua:

PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music. 574-582 - Guang Yu, Siqi Wang, Zhiping Cai, En Zhu, Chuanfu Xu, Jianping Yin, Marius Kloft:

Cloze Test Helps: Effective Video Anomaly Detection via Learning to Complete Video Events. 583-591 - Qian Bao, Wu Liu, Jun Hong, Lingyu Duan, Tao Mei:

Pose-native Network Architecture Search for Multi-person Human Pose Estimation. 592-600
Oral Session D2: Media Interpretation
- Xiruo Shi, Liutong Xu, Pengfei Wang, Yuanyuan Gao, Haifang Jian, Wu Liu:

Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification. 601-609 - Hao Tang, Zechao Li, Zhimao Peng, Jinhui Tang:

BlockMix: Meta Regularization and Self-Calibrated Inference for Metric-Based Meta-Learning. 610-618 - Dechao Meng, Liang Li, Shuhui Wang, Xingyu Gao, Zheng-Jun Zha, Qingming Huang:

Fine-grained Feature Alignment with Part Perspective Transformation for Vehicle ReID. 619-627 - Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Qiang Liu, Xiaojun Hu:

Compact Bilinear Augmented Query Structured Attention for Sport Highlights Classification. 628-636 - Jiacheng Li, Zhiwei Xiong, Dong Liu, Xuejin Chen, Zheng-Jun Zha:

Semantic Image Analogy with a Conditional Single-Image GAN. 637-645 - Yangchun Zhu, Zheng-Jun Zha, Tianzhu Zhang, Jiawei Liu, Jiebo Luo:

A Structured Graph Attention Network for Vehicle Re-Identification. 646-654
Oral Session E2: Media Interpretation
- Baoyu Fan, Li Wang, Runze Zhang, Zhenhua Guo, Yaqian Zhao, Rengang Li, Weifeng Gong:

Contextual Multi-Scale Feature Learning for Person Re-Identification. 655-663 - Zeyu Xiao, Zhiwei Xiong, Xueyang Fu, Dong Liu, Zheng-Jun Zha:

Space-Time Video Super-Resolution Using Temporal Profiles. 664-672 - Boqiang Xu, Lingxiao He, Xingyu Liao, Wu Liu, Zhenan Sun, Tao Mei:

Black Re-ID: A Head-shoulder Descriptor for the Challenging Problem of Person Re-Identification. 673-681 - Haoran Lv, Qin Yang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong:

SalGCN: Saliency Prediction for 360-Degree Images Based on Spherical Graph Convolutional Networks. 682-690 - Sai Praneeth Reddy Sunkesula, Rishabh Dabral, Ganesh Ramakrishnan:

LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos. 691-699 - Zhengqing Fang, Kun Kuang, Yuxiao Lin, Fei Wu, Yu-Feng Yao:

Concept-based Explanation for Fine-grained Images and Its Application in Infectious Keratitis Classification. 700-708
Oral Session F2: Mobile Multimedia & Multimedia HCI and Quality of Experience
- Yuanqiang Cai, Dawei Du, Libo Zhang, Longyin Wen, Weiqiang Wang, Yanjun Wu, Siwei Lyu:

Guided Attention Network for Object Detection and Counting on Drones. 709-717 - Jingchen Sun, Jiming Chen, Tao Chen, Jiayuan Fan, Shibo He:

PIDNet: An Efficient Network for Dynamic Pedestrian Intrusion Detection. 718-726 - Xing Cai, Lanqing Zhang, Chengyuan Li, Ge Li, Thomas H. Li:

VONAS: Network Design in Visual Odometry using Neural Architecture Search. 727-735 - Wenbo Zheng, Lan Yan, Fei-Yue Wang, Chao Gou:

Learning from the Past: Meta-Continual Learning with Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition. 736-743 - Zijie Ye, Haozhe Wu, Jia Jia, Yaohua Bu, Wei Chen, Fanbo Meng, Yanfeng Wang:

ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit. 744-752 - Qiushi Li, Wenwu Zhu, Chao Wu, Xinglin Pan, Fan Yang, Yuezhi Zhou, Yaoxue Zhang:

InvisibleFL: Federated Learning over Non-Informative Intermediate Updates against Multimedia Privacy Leakages. 753-762 - Shu Zhao, Dayan Wu, Wanqian Zhang, Yu Zhou, Bo Li, Weiping Wang:

Asymmetric Deep Hashing for Efficient Hash Code Compression. 763-771
Oral Session G2: Multimedia HCI and Quality of Experience
- Yuen-Jen Lin, Hsuan-Kai Kao, Yih-Chih Tseng, Ming Tsai, Li Su:

A Human-Computer Duet System for Music Performance. 772-780 - Yujia Wang, Sifan Hou, Bing Ning, Wei Liang:

Photo Stand-Out: Photography with Virtual Character. 781-788 - Dingquan Li, Tingting Jiang, Ming Jiang:

Norm-in-Norm Loss with Faster Convergence and Better Performance for Image Quality Assessment. 789-797 - Munan Xu, Jia-Xing Zhong, Yurui Ren, Shan Liu, Ge Li:

Context-aware Attention Network for Predicting Image Aesthetic Subjectivity. 798-806 - Nikolas Wehner, Michael Seufert, Sebastian Egger-Lampl, Bruno Gardlo, Pedro Casas, Raimund Schatz:

Scoring High: Analysis and Prediction of Viewer Behavior and Engagement in the Context of 2018 FIFA WC Live Streaming. 807-815 - Jingwen Hou, Sheng Yang, Weisi Lin:

Object-level Attention for Aesthetic Rating Distribution Prediction. 816-824 - Zhaohui Zhang, Haichao Zhu, Qian Zhang:

ARSketch: Sketch-Based User Interface for Augmented Reality Glasses. 825-833
Oral Session H2: Multimedia HCI and Quality of Experience & Multimedia Search and Recommendation
- Pengfei Chen, Leida Li, Lei Ma, Jinjian Wu, Guangming Shi:

RIRNet: Recurrent-In-Recurrent Network for Video Quality Assessment. 834-842 - Yiru Wang, Shen Huang, Gongfu Li, Qiang Deng, Dongliang Liao, Pengda Si, Yujiu Yang, Jin Xu:

Cognitive Representation Learning of Self-Media Online Article Quality. 843-851 - Jakub Nawala, Lucjan Janowski, Bogdan Cmiel, Krzysztof Rusek:

Describing Subjective Experiment Consistency by p-Value P-P Plot. 852-861 - Leonardo Galteri, Marco Bertini, Lorenzo Seidenari, Tiberio Uricchio, Alberto Del Bimbo:

Increasing Video Perceptual Quality with GANs and Semantic Coding. 862-870 - Yongxin Wang, Xin Luo, Xin-Shun Xu:

Label Embedding Online Hashing for Cross-Modal Retrieval. 871-879 - Zhaopeng Li, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang:

Quaternion-Based Knowledge Graph Network for Recommendation. 880-888
Oral Session A3: Multimedia Search and Recommendation
- Yongguo Ling, Zhun Zhong, Zhiming Luo, Paolo Rota, Shaozi Li, Nicu Sebe:

Class-Aware Modality Mix and Center-Guided Metric Learning for Visible-Thermal Person Re-Identification. 889-897 - Da Cao, Yawen Zeng, Xiaochi Wei, Liqiang Nie, Richang Hong, Zheng Qin:

Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization. 898-906 - Xinchen Liu, Wu Liu, Jinkai Zheng, Chenggang Yan, Tao Mei:

Beyond the Parts: Learning Multi-view Cross-part Correlation for Vehicle Re-identification. 907-915 - Lu Jin, Zechao Li, Yonghua Pan, Jinhui Tang:

Weakly-Supervised Image Hashing through Masked Visual-Semantic Graph-based Reasoning. 916-924 - Heyu Zhou, Weizhi Nie, Dan Song, Nian Hu, Xuanya Li, An-An Liu:

Semantic Consistency Guided Instance Feature Alignment for 2D Image-Based 3D Shape Retrieval. 925-933 - Niluthpol Chowdhury Mithun, Karan Sikka, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar:

RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization. 934-954
Oral Session B3: Multimedia Systems and Middleware & Media Transport and Delivery
- Weiming Zhuang, Yonggang Wen, Xuesen Zhang, Xin Gan, Daiying Yin, Dongzhan Zhou, Shuai Zhang, Shuai Yi:

Performance Optimization of Federated Person Re-identification via Benchmark Analysis. 955-963 - Hung-Min Hsu, Yizhou Wang, Jenq-Neng Hwang:

Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model. 964-972 - Jie Wu, Tianshui Chen, Lishan Huang, Hefeng Wu, Guanbin Li, Ling Tian, Liang Lin:

Active Object Search. 973-981 - Jun Yi, Md Reazul Islam, Shivang Aggarwal, Dimitrios Koutsonikolas, Y. Charlie Hu, Zhisheng Yan:

An Analysis of Delay in Live 360° Video Streaming Systems. 982-990 - Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha:

DeepFacePencil: Creating Face Images from Freehand Sketches. 991-999 - Peilin Chen, Wenhan Yang, Long Sun, Shiqi Wang:

When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding. 1000-1008 - Gang Yan, Jian Li:

RL-Bélády: A Unified Learning Framework for Content Caching. 1009-1017
Oral Session C3: Multimodal Analysis and Description & Summarization, Analytics, and Storytelling
- Zhizhong Han, Chao Chen, Yu-Shen Liu, Matthias Zwicker:

ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences. 1018-1027 - Xing Wei, Diangang Li, Xiaopeng Hong, Wei Ke, Yihong Gong:

Co-Attentive Lifting for Infrared-Visible Person Re-Identification. 1028-1037 - Zhiwei Wu, Changmeng Zheng, Yi Cai, Junying Chen, Ho-fung Leung, Qing Li:

Multimodal Representation with Embedded Visual Guiding Objects for Named Entity Recognition in Social Media Posts. 1038-1046 - Leigang Qu, Meng Liu, Da Cao, Liqiang Nie, Qi Tian:

Context-Aware Multi-View Summarization Network for Image-Text Matching. 1047-1055 - Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras:

Performance over Random: A Robust Evaluation Protocol for Video Summarization Methods. 1056-1064 - Pravin Nagar, Mansi Khemka, Chetan Arora:

Concept Drift Detection for Multivariate Data Streams and Temporal Segmentation of Daylong Egocentric Videos. 1065-1074 - Shuyue Lan, Zhilu Wang, Amit K. Roy-Chowdhury, Ermin Wei, Qi Zhu:

Distributed Multi-agent Video Fast-forwarding. 1075-1084
Oral Session D3: Multimodal Fusion and Embedding
- Yitian Yuan, Lin Ma, Jingwen Wang, Wenwu Zhu:

Controllable Video Captioning with an Exemplar Sentence. 1085-1093 - Qing Lin, Bo Yan, Jichun Li, Weimin Tan:

MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting. 1094-1102 - Yiheng Liu, Wengang Zhou, Mao Xi, Sanjing Shen, Houqiang Li:

Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation. 1103-1111 - Beichen Zhang, Liang Li, Li Su, Shuhui Wang, Jincan Deng, Zheng-Jun Zha, Qingming Huang:

Structural Semantic Adversarial Active Learning for Image Captioning. 1112-1121 - Devamanyu Hazarika, Roger Zimmermann, Soujanya Poria:

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. 1122-1131 - Liangming Pan, Jingjing Chen, Jianlong Wu, Shaoteng Liu, Chong-Wah Ngo, Min-Yen Kan, Yu-Gang Jiang, Tat-Seng Chua:

Multi-modal Cooking Workflow Construction for Food Recipes. 1132-1141 - Yuqian Fu, Li Zhang, Junke Wang, Yanwei Fu, Yu-Gang Jiang:

Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition. 1142-1151 - David Semedo, João Magalhães:

Adaptive Temporal Triplet-loss for Cross-modal Embedding Learning. 1152-1161
Oral Session E3: Music, Speech and Audio Processing in Multimedia & Social Media
- Yujia Wang, Wei Liang, Wanwan Li, Dingzeyu Li, Lap-Fai Yu:

Scene-Aware Background Music Synthesis. 1162-1170 - Xutong Jin, Sheng Li, Tianshu Qu, Dinesh Manocha, Guoping Wang:

Deep-Modal: Real-Time Impact Sound Synthesis for Arbitrary Shapes. 1171-1179 - Yu-Siang Huang, Yi-Hsuan Yang:

Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions. 1180-1188 - Zhejing Hu, Yan Liu, Gong Chen, Sheng-hua Zhong, Aiwei Zhang:

Make Your Favorite Music Curative: Music Style Transfer for Anxiety Reduction. 1189-1197 - Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu:

PopMAG: Pop Music Accompaniment Generation. 1198-1206 - Run Wang, Felix Juefei-Xu, Yihao Huang, Qing Guo, Xiaofei Xie, Lei Ma, Yang Liu:

DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices. 1207-1216 - Yihao Huang, Felix Juefei-Xu, Run Wang, Qing Guo, Lei Ma, Xiaofei Xie, Jianwen Li, Weikai Miao, Yang Liu, Geguang Pu:

FakePolisher: Making DeepFakes More Detection-Evasive by Shallow Reconstruction. 1217-1226
Oral Session F3: Vision and Language
- Guohao Li, Xin Wang, Wenwu Zhu:

Boosting Visual Question Answering with Context-aware Knowledge Aggregation. 1227-1235 - Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, Zheng-Jun Zha, Meng Wang:

Memory-Augmented Relation Network for Few-Shot Learning. 1236-1244 - Yiyi Zhou, Rongrong Ji, Xiaoshuai Sun, Gen Luo, Xiaopeng Hong, Jinsong Su, Xinghao Ding, Ling Shao:

K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering. 1245-1254 - Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin:

Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition. 1255-1264 - Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu:

KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue. 1265-1273 - Gen Luo, Yiyi Zhou, Rongrong Ji, Xiaoshuai Sun, Jinsong Su, Chia-Wen Lin, Qi Tian:

Cascade Grouped Attention Network for Referring Expression Segmentation. 1274-1282
Oral Session G3: Vision and Language
- Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin:

Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos. 1283-1291 - Shengyu Zhang, Ziqi Tan, Jin Yu, Zhou Zhao, Kun Kuang, Jie Liu, Jingren Zhou, Hongxia Yang, Fei Wu:

Poet: Product-oriented Video Captioner for E-commerce. 1292-1301 - Lisai Zhang, Qingcai Chen, Baotian Hu, Shuoran Jiang:

Text-Guided Neural Image Inpainting. 1302-1310 - Keyang Wang, Lei Zhang:

Single-Shot Two-Pronged Detector with Rectified IoU Loss. 1311-1319 - Huan Lin, Fandong Meng, Jinsong Su, Yongjing Yin, Zhengyuan Yang, Yubin Ge, Jie Zhou, Jiebo Luo:

Dynamic Context-guided Capsule Network for Multimodal Machine Translation. 1320-1329 - Shitong Luo, Wei Hu:

Differentiable Manifold Reconstruction for Point Cloud Denoising. 1330-1338
Oral Session H3: Vision and Language
- Hongyi Zheng, Wangmeng Zuo, Lei Zhang:

BS-MCVR: Binary-sensing based Mobile-cloud Visual Recognition. 1339-1347 - Jingjing Li, Mengmeng Jing, Lei Zhu, Zhengming Ding, Ke Lu, Yang Yang:

Learning Modality-Invariant Latent Representations for Generalized Zero-shot Learning. 1348-1356 - Yahui Liu, Marco De Nadai, Deng Cai, Huayang Li, Xavier Alameda-Pineda, Nicu Sebe, Bruno Lepri:

Describe What to Change: A Text-guided Unsupervised Image-to-image Translation Approach. 1357-1365 - Advaith Sridhar, Rohith Gandhi Ganesan, Pratyush Kumar, Mitesh M. Khapra:

INCLUDE: A Large Scale Dataset for Indian Sign Language Recognition. 1366-1375 - Run Wang, Felix Juefei-Xu, Qing Guo, Yihao Huang, Xiaofei Xie, Lei Ma, Yang Liu:

Amora: Black-box Adversarial Morphing Attack. 1376-1385 - Fan Yu, Haonan Wang, Tongwei Ren, Jinhui Tang, Gangshan Wu:

Visual Relation of Interest Detection. 1386-1394
Poster Session A1: Deep Learning for Multimedia
- Zhedong Zheng, Yunchao Wei, Yi Yang:

University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization. 1395-1403 - Tao Dai, Yan Feng, Dongxian Wu, Bin Chen, Jian Lu, Yong Jiang, Shu-Tao Xia:

DIPDefend: Deep Image Prior Driven Defense against Adversarial Examples. 1404-1412 - Peng Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu:

TRIE: End-to-End Text Reading and Information Extraction for Document Understanding. 1413-1422 - Jiaming Zhang, Jitao Sang, Xian Zhao, Xiaowen Huang, Yanfeng Sun, Yongli Hu:

Adversarial Privacy-preserving Filter. 1423-1431 - Wei Peng, Jingang Shi, Zhaoqiang Xia, Guoying Zhao:

Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition. 1432-1440 - Lizhao Liu, Junyi Cao, Minqian Liu, Yong Guo, Qi Chen, Mingkui Tan:

Dynamic Extension Nets for Few-shot Semantic Segmentation. 1441-1449 - Feifan Lv, Bo Liu, Feng Lu:

Fast Enhancement for Non-Uniform Illumination Images using Light-weight CNNs. 1450-1458 - Zili Yi, Qiang Tang, Vishnu Sanjay Ramiya Srinivasan, Zhan Xu:

Animating Through Warping: An Efficient Method for High-Quality Facial Expression Animation. 1459-1468 - Liang Han, Pichao Wang, Zhaozheng Yin, Fan Wang, Hao Li:

Exploiting Better Feature Aggregation for Video Object Detection. 1469-1477 - Chongyi Li, Huazhu Fu, Runmin Cong, Zechao Li, Qianqian Xu:

NuI-Go: Recursive Non-Local Encoder-Decoder Network for Retinal Image Non-Uniform Illumination Removal. 1478-1487 - Jie Zhao, Kenan Dai, Dong Wang, Huchuan Lu, Xiaoyun Yang:

Online Filtering Training Samples for Robust Visual Tracking. 1488-1496 - Junfu Pu, Wengang Zhou, Hezhen Hu, Houqiang Li:

Boosting Continuous Sign Language Recognition via Cross Modality Augmentation. 1497-1505 - Chen Zhao, Bernard Ghanem:

ThumbNet: One Thumbnail Image Contains All You Need for Recognition. 1506-1514 - Kaihua Zhang, Long Wang, Dong Liu, Bo Liu, Qingshan Liu, Zhu Li:

Dual Temporal Memory Network for Efficient Video Object Segmentation. 1515-1523
Poster Session B1: Deep Learning for Multimedia
- Zeyuan Wang, Yifan Zhao, Jia Li, Yonghong Tian:

Cooperative Bi-path Metric for Few-shot Learning. 1524-1532 - Yu Han, Shuai Yang, Wenjing Wang, Jiaying Liu:

From Design Draft to Real Attire: Unaligned Fashion Image Translation. 1533-1541 - Fei Zhao, Ting Zhang, Chao Ma, Ming Tang, Jinqiao Wang, Xiaobo Wang:

Siamese Attentive Graph Tracking. 1542-1550 - Lingbo Yang, Shanshe Wang, Siwei Ma, Wen Gao, Chang Liu, Pan Wang, Peiran Ren:

HiFaceGAN: Face Renovation via Collaborative Suppression and Replenishment. 1551-1560 - Zhaohui Yang, Yunhe Wang, Chang Xu, Peng Du, Chao Xu, Chunjing Xu, Qi Tian:

Discernible Image Compression. 1561-1569 - Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan:

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation. 1570-1578 - Xiaojun Jia, Xingxing Wei, Xiaochun Cao, Xiaoguang Han:

Adv-watermark: A Novel Watermark Perturbation for Adversarial Examples. 1579-1587 - Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe:

Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild. 1588-1596 - Gang Li, Jian Li, Shanshan Zhang, Jian Yang:

Learning Hierarchical Graph for Occluded Pedestrian Detection. 1597-1605 - Taotao Jing, Haifeng Xia, Zhengming Ding:

Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation. 1606-1614 - Jinpeng Li, Shengcai Liao, Hangzhi Jiang, Ling Shao:

Box Guided Convolution for Pedestrian Detection. 1615-1624 - Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang:

Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition. 1625-1633 - Xia Du, Chi-Man Pun:

Adversarial Image Attacks Using Multi-Sample and Most-Likely Ensemble Methods. 1634-1642 - Cong Wang, Xiaoying Xing, Yutong Wu, Zhixun Su, Junyang Chen:

DCSFN: Deep Cross-scale Fusion Network for Single Image Rain Removal. 1643-1651
Poster Session C1: Deep Learning for Multimedia
- Yumeng Zhang, Gaoguo Jia, Li Chen, Mingrui Zhang, Junhai Yong:

Self-Paced Video Data Augmentation by Generative Adversarial Networks with Insufficient Samples. 1652-1660 - Xin Wen, Zhizhong Han, Geunhyuk Youk, Yu-Shen Liu:

CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention. 1661-1669 - Yunan Liu, Liang Zhao, Shanshan Zhang, Jian Yang:

Hybrid Resolution Network Using Edge Guided Region Mutual Information Loss for Human Parsing. 1670-1678 - Xiongwei Wu, Doyen Sahoo, Steven C. H. Hoi:

Meta-RCNN: Meta Learning for Few-Shot Object Detection. 1679-1687 - Ke Yang, Peng Zhang, Peng Qiao, Zhiyuan Wang, Dongsheng Li, Yong Dou:

Objectness Consistent Representation for Weakly Supervised Object Detection. 1688-1696 - Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong:

Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network. 1697-1705 - Xierong Zhu, Jiawei Liu, Haoze Wu, Meng Wang, Zheng-Jun Zha:

ASTA-Net: Adaptive Spatio-Temporal Attention Network for Person Re-Identification in Videos. 1706-1715 - Dan Zeng, Han Liu, Hui Lin, Shiming Ge:

Talking Face Generation with Expression-Tailored Generative Adversarial Network. 1716-1724 - Tianyu Yu, Tianrui Hui, Zhihao Yu, Yue Liao, Sansi Yu, Faxi Zhang, Si Liu:

Cross-Modal Omni Interaction Modeling for Phrase Grounding. 1725-1734 - Yazhou Yao, Xiansheng Hua, Guanyu Gao, Zeren Sun, Zhibin Li, Jian Zhang:

Bridging the Web Data and Fine-Grained Visual Recognition via Alleviating Label Noise and Domain Mismatch. 1735-1744 - Jiawei Zhao, Yifan Zhao, Jia Li, Xiaowu Chen:

Is Depth Really Necessary for Salient Object Detection? 1745-1754 - Nobukatsu Kajiura, Satoshi Kosugi, Xueting Wang, Toshihiko Yamasaki:

Self-Play Reinforcement Learning for Fast Image Retargeting. 1755-1763 - Ahmed Fares, Sheng-hua Zhong, Jianmin Jiang:

Brain-media: A Dual Conditioned and Lateralization Supported GAN (DCLS-GAN) towards Visualization of Image-evoked Brain Activities. 1764-1772 - Guangming Yao, Yi Yuan, Tianjia Shao, Kun Zhou:

Mesh Guided One-shot Face Reenactment Using Graph Convolutional Networks. 1773-1781
Poster Session D1: Deep Learning for Multimedia
- Weihao Xia, Yujiu Yang, Jing-Hao Xue, Wensen Feng:

Controllable Continuous Gaze Redirection. 1782-1790 - Xinxiao Wu, Jialu Chen:

Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer. 1791-1799 - Qinjie Xiao, Xiangjun Tang, You Wu, Leyang Jin, Yong-Liang Yang, Xiaogang Jin:

Deep Shapely Portraits. 1800-1808 - Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, Baopu Li:

Depth Super-Resolution via Deep Controllable Slicing Network. 1809-1818 - Chengcheng Ma, Weiliang Meng, Baoyuan Wu, Shibiao Xu, Xiaopeng Zhang:

Efficient Joint Gradient Based Attack Against SOR Defense for 3D Point Cloud Classification. 1819-1827 - Xiaofeng Cong, Jie Gui, Kai-Chao Miao, Jun Zhang, Bing Wang, Peng Chen:
Discrete Haze Level Dehazing Network. 1828-1836 - Shikang Gan, Yong Luo, Yonggang Wen, Tongliang Liu, Han Hu:
Deep Heterogeneous Multi-Task Metric Learning for Visual Recognition and Retrieval. 1837-1845 - Meng Wei, Chun Yuan, Xiaoyu Yue, Kuo Zhong:

HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation. 1846-1854 - Lijian Lin, Haosheng Chen, Honglun Zhang, Jun Liang, Yu Li, Ying Shan, Hanzi Wang:
Dual Semantic Fusion Network for Video Object Detection. 1855-1863 - Xiaodan Li, Yining Lang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Shuhui Wang, Hui Xue, Quan Lu:
Sharp Multiple Instance Learning for DeepFake Video Detection. 1864-1872 - Gang Fu, Qing Zhang, Qifeng Lin, Lei Zhu, Chunxia Xiao:

Learning to Detect Specular Highlights from Real-world Images. 1873-1881 - Jianping Luo, Shaofei Huang, Yuan Yuan:

Video Super-Resolution using Multi-scale Pyramid 3D Convolutional Networks. 1882-1890 - Hao Dou, Chen Chen, Xiyuan Hu, Zuxing Xuan, Zhisen Hu, Silong Peng:
PCA-SRGAN: Incremental Orthogonal Projection Discrimination for Face Super-resolution. 1891-1899 - Yizhi Wang, Zhouhui Lian:

Exploring Font-independent Features for Scene Text Recognition. 1900-1920
Poster Session E1: Deep Learning for Multimedia
- Zhangxuan Gu, Siyuan Zhou, Li Niu, Zihan Zhao, Liqing Zhang:
Context-aware Feature Generation For Zero-shot Semantic Segmentation. 1921-1929 - Wenqing Liu, Miaojing Shi, Teddy Furon, Li Li:

Defending Adversarial Examples via DNN Bottleneck Reinforcement. 1930-1938 - Xun Yang, Xueliang Liu, Meng Jian, Xinjian Gao, Meng Wang:

Weakly-Supervised Video Object Grounding by Exploring Spatio-Temporal Contexts. 1939-1947 - Chon-Hou Sio, Yu-Jen Ma, Hong-Han Shuai, Jun-Cheng Chen, Wen-Huang Cheng:
S2SiamFC: Self-supervised Fully Convolutional Siamese Network for Visual Tracking. 1948-1957 - Daniel Rotman, Yevgeny Yaroker, Elad Amrani, Udi Barzelay, Rami Ben-Ari:

Learnable Optimal Sequential Grouping for Video Scene Detection. 1958-1966 - Penghao Zhou, Chong Zhou, Pai Peng, Junlong Du, Xing Sun, Xiaowei Guo, Feiyue Huang:
NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination. 1967-1975 - Chuangchuang Tan, Guanghua Gu, Tao Ruan, Shikui Wei, Yao Zhao:

Dual-Gradients Localization Framework for Weakly Supervised Object Localization. 1976-1984 - Weicong Chen, Xu Tan, Yingce Xia, Tao Qin, Yu Wang, Tie-Yan Liu:
DualLip: A System for Joint Lip Reading and Generation. 1985-1993 - Hao Tang, Song Bai, Nicu Sebe:
Dual Attention GANs for Semantic Image Synthesis. 1994-2002 - Renwang Chen, Xuanhong Chen, Bingbing Ni, Yanhao Ge:
SimSwap: An Efficient Framework For High Fidelity Face Swapping. 2003-2011 - Jialian Wu, Chunluan Zhou, Qian Zhang, Ming Yang, Junsong Yuan:
Self-Mimic Learning for Small-scale Pedestrian Detection. 2012-2020 - Chuan Guo, Xinxin Zuo, Sen Wang, Shihao Zou, Qingyao Sun, Annan Deng, Minglun Gong, Li Cheng:
Action2Motion: Conditioned Generation of 3D Human Motions. 2021-2029 - Hui Zhang, Chuan Wang, Nenglun Chen, Jue Wang, Wenping Wang:
Skin Textural Generation via Blue-noise Gabor Filtering based Generative Adversarial Network. 2030-2038 - Jiapeng Li, Ping Wei, Yongchi Zhang, Nanning Zheng:
A Slow-I-Fast-P Architecture for Compressed Video Action Recognition. 2039-2047
Poster Session F1: Deep Learning for Multimedia
- Peisong Wen, Ruolin Yang, Qianqian Xu, Chen Qian, Qingming Huang, Runmin Cong, Jianlou Si:

DMVOS: Discriminative Matching for Real-time Video Object Segmentation. 2048-2056 - Zhensheng Shi, Liangjie Cao, Cheng Guan, Ju Liang, Qianqian Li, Zhaorui Gu, Haiyong Zheng, Bing Zheng:

Multi-Group Multi-Attention: Towards Discriminative Spatiotemporal Representation. 2057-2066 - Wei Yan, Ruonan Zhang, Jing Wang, Shan Liu, Thomas H. Li, Ge Li:

Vaccine-style-net: Point Cloud Completion in Implicit Continuous Function Space. 2067-2075 - Yumeng Zhang, Li Chen, Yufeng Liu, Wen Zheng, Junhai Yong:
Adaptive Wasserstein Hourglass for Weakly Supervised RGB 3D Hand Pose Estimation. 2076-2084 - Weide Liu, Chi Zhang, Guosheng Lin, Tzu-Yi Hung, Chunyan Miao:
Weakly Supervised Segmentation with Maximum Bipartite Graph Matching. 2085-2094 - Daksh Thapar, Aditya Nigam, Chetan Arora:
Recognizing Camera Wearer from Hand Gestures in Egocentric Videos: https://egocentricbiometric.github.io/. 2095-2103 - Zijian Wang, Yadan Luo, Zi Huang, Mahsa Baktashmotlagh:
Prototype-Matching Graph Network for Heterogeneous Domain Adaptation. 2104-2112 - Huanrong Zhang, Zhi Jin, Xiaojun Tan, Xiying Li:
Towards Lighter and Faster: Learning Wavelets Progressively for Image Super-Resolution. 2113-2121 - Zhen Huang, Xu Shen, Xinmei Tian, Houqiang Li, Jianqiang Huang, Xian-Sheng Hua:

Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition. 2122-2130 - Wenheng Chen, He Wang, Yi Yuan, Tianjia Shao, Kun Zhou:
Dynamic Future Net: Diversified Human Motion Generation. 2131-2139 - Xing Lan, Qinghao Hu, Fangzhou Xiong, Cong Leng, Jian Cheng:

ATF: Towards Robust Face Alignment via Leveraging Similarity and Diversity across Different Datasets. 2140-2148 - Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew:
Dual Gaussian-based Variational Subspace Disentanglement for Visible-Infrared Person Re-Identification. 2149-2158 - Chong Mou, Xin Zhang:

Attention Based Dual Branches Fingertip Detection Network and Virtual Key System. 2159-2165 - Md. Moniruzzaman, Zhaozheng Yin, Zhihai He, Ruwen Qin, Ming C. Leu:

Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization. 2166-2174
Poster Session G1: Deep Learning for Multimedia
- Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury:

Adversarial Knowledge Transfer from Unlabeled Data. 2175-2183 - Xiaoqing Liang, Xu Zhao, Chaoyang Zhao, Nanfei Jiang, Ming Tang, Jinqiao Wang:

Task Decoupled Knowledge Distillation For Lightweight Face Detectors. 2184-2192 - Li Tao, Xueting Wang, Toshihiko Yamasaki:

Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework. 2193-2201 - Jie Liu, Minqiang Zou, Jie Tang, Gangshan Wu:
Memory Recursive Network for Single Image Super-Resolution. 2202-2210 - Ying Chen, Lifeng Huang, Chengying Gao, Ning Liu:
Scale-aware Progressive Optimization Network. 2211-2219 - Junguang Jiang, Ximei Wang, Mingsheng Long, Jianmin Wang:
Resource Efficient Domain Adaptation. 2220-2228 - Lina Wang, Kang Yang, Wenqi Wang, Run Wang, Aoshuang Ye:
MGAAttack: Toward More Query-efficient Black-box Attack by Microbial Genetic Algorithm. 2229-2236 - Ling Lei, Jianfeng Li, Tong Chen, Shigang Li:
A Novel Graph-TCN with a Graph Structured Representation for Micro-expression Recognition. 2237-2245 - Mengyue Geng, Peixi Peng, Yangru Huang, Yonghong Tian:

Masked Face Recognition with Generative Data Augmentation and Domain Constrained Ranking. 2246-2254 - Junhua Liao, Haihan Duan, Xin Li, Haoran Xu, Yanbing Yang, Wei Cai, Yanru Chen, Liangyin Chen:
Occlusion Detection for Automatic Video Editing. 2255-2263 - Yi Zheng, Yifan Zhao, Mengyuan Ren, He Yan, Xiangju Lu, Junhui Liu, Jia Li:

Cartoon Face Recognition: A Benchmark Dataset. 2264-2272 - Xiquan Guan, Huamin Feng, Weiming Zhang, Hang Zhou, Jie Zhang, Nenghai Yu:
Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication. 2273-2280 - Feifei Ding, Peixi Peng, Yangru Huang, Mengyue Geng, Yonghong Tian:

Masked Face Recognition with Latent Part Detection. 2281-2289 - Chunyan Zhang, Songhua Xu, Zongfang Li:

PanelNet: A Novel Deep Neural Network for Predicting Collective Diagnostic Ratings by a Panel of Radiologists for Pulmonary Nodules. 2290-2298
Poster Session H1: Deep Learning for Multimedia
- Xuan-Son Vu, Duc-Trong Le, Christoffer Edlund, Lili Jiang, Hoang D. Nguyen:

Privacy-Preserving Visual Content Tagging using Graph Transformer Networks. 2299-2307 - Youngjoong Kwon, Stefano Petrangeli, Dahun Kim, Haoliang Wang, Henry Fuchs, Viswanathan Swaminathan:

Rotationally-Consistent Novel View Synthesis for Humans. 2308-2316 - Minhao Fan, Wenjing Wang, Wenhan Yang, Jiaying Liu:
Integrating Semantic Segmentation and Retinex Model for Low-Light Image Enhancement. 2317-2325 - Xixia Xu, Qi Zou, Xue Lin:

Alleviating Human-level Shift: A Robust Domain Adaptation Method for Multi-person Pose Estimation. 2326-2335 - Lei Zhao, Sihuan Lin, Ailin Li, Huaizhong Lin, Wei Xing, Dongming Lu:

SpatialGAN: Progressive Image Generation Based on Spatial Recursive Adversarial Expansion. 2336-2344 - Li-Ming Zhan, Bo Liu, Lu Fan, Jiaxin Chen, Xiao-Ming Wu:
Medical Visual Question Answering via Conditional Reasoning. 2345-2354 - Jing Zhang, Yang Cao, Zheng-Jun Zha, Dacheng Tao:
Nighttime Dehazing with a Synthetic Benchmark. 2355-2363 - Chenru Jiang, Kaizhu Huang, Shufei Zhang, Xinheng Wang, Jimin Xiao:
Pay Attention Selectively and Comprehensively: Pyramid Gating Network for Human Pose Estimation without Pre-training. 2364-2371 - Chuanyi Zhang, Yazhou Yao, Xiangbo Shu, Zechao Li, Zhenmin Tang, Qi Wu:

Data-driven Meta-set Based Fine-Grained Visual Recognition. 2372-2381 - Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, Yu-Gang Jiang:
WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection. 2382-2390 - Ce Zheng, Yecheng Lyu, Ming Li, Ziming Zhang:
LodoNet: A Deep Neural Network with 2D Keypoint Matching for 3D LiDAR Odometry Estimation. 2391-2399 - Weitao Wang, Ruyang Liu, Meng Wang, Sen Wang, Xiaojun Chang, Yang Chen:
Memory-Based Network for Scene Graph with Unbalanced Relations. 2400-2408 - Haotian Wang, Wenjing Yang, Ji Wang, Ruxin Wang, Long Lan, Mingyang Geng:

Pairwise Similarity Regularization for Adversarial Domain Adaptation. 2409-2418 - Mingyao Hong, Guorong Li, Xinfeng Zhang, Qingming Huang:

Generalized Zero-Shot Video Classification via Generative Adversarial Networks. 2419-2426
Poster Session A2: Deep Learning for Multimedia
- Maciej Tomczak, Masataka Goto, Jason Hockman:
Drum Synthesis and Rhythmic Transformation with Adversarial Autoencoders. 2427-2435 - Guibiao Liao, Wei Gao, Qiuping Jiang, Ronggang Wang, Ge Li:
MMNet: Multi-Stage and Multi-Scale Fusion Network for RGB-D Salient Object Detection. 2436-2444 - Songhua Liu, Hao Wu, Shoutong Luo, Zhengxing Sun:
Stable Video Style Transfer Based on Partial Convolution with Depth-Aware Supervision. 2445-2453 - Yimeng Zhang, Xiao-Yang Liu, Bo Wu, Anwar Walid:
Video Synthesis via Transform-Based Tensor Neural Network. 2454-2462 - Ziming Wang, Yuexian Zou, Zeming Zhang:

Cluster Attention Contrast for Video Anomaly Detection. 2463-2471 - Wolmer Bigi, Claudio Baecchi, Alberto Del Bimbo:

Automatic Interest Recognition from Posture and Behaviour. 2472-2480 - Yangfan Sun, Li Li, Zhu Li, Shan Liu:

Referenceless Rate-Distortion Modeling with Learning from Bitstream and Pixel Features. 2481-2489 - Lilang Lin, Sijie Song, Wenhan Yang, Jiaying Liu:
MS2L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition. 2490-2498 - Dang-Khoa Nguyen, Wei-Lun Tseng, Hong-Han Shuai:
Domain-Adaptive Object Detection via Uncertainty-Aware Distribution Alignment. 2499-2507 - Zhenyu Wu, Duc Hoang, Shih-Yao Lin, Yusheng Xie, Liangjian Chen, Yen-Yu Lin, Zhangyang Wang, Wei Fan:
MM-Hand: 3D-Aware Multi-Modal Guided Hand Generation for 3D Hand Pose Synthesis. 2508-2516 - Cong Wang, Yutong Wu, Zhixun Su, Junyang Chen:
Joint Self-Attention and Scale-Aggregation for Self-Calibrated Deraining Network. 2517-2525 - Ling-An Zeng, Fa-Ting Hong, Wei-Shi Zheng, Qi-Zhi Yu, Wei Zeng, Yaowei Wang, Jian-Huang Lai:
Hybrid Dynamic-static Context-aware Attention Network for Action Assessment in Long Videos. 2526-2534 - Yan Hong, Li Niu, Jianfu Zhang, Weijie Zhao, Chen Fu, Liqing Zhang:

F2GAN: Fusing-and-Filling GAN for Few-shot Image Generation. 2535-2543 - Xianggang Yu, Haolin Liu, Xiaoguang Han, Zhen Li, Zixiang Xiong, Shuguang Cui:
JAFPro: Joint Appearance Fusion and Propagation for Human Video Motion Transfer from Multiple Reference Images. 2544-2552
Poster Session B2: Deep Learning for Multimedia & Emerging Multimedia Applications
- Jakub Lokoc, Tomás Soucek, Patrik Veselý, Frantisek Mejzlík, Jiaqi Ji, Chaoxi Xu, Xirong Li:
A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval. 2553-2561 - Yucheng Hang, Qingmin Liao, Wenming Yang, Yupeng Chen, Jie Zhou:

Attention Cube Network for Image Restoration. 2562-2570 - Yu Zhou, Hongtao Xie, Shancheng Fang, Yan Li, Yongdong Zhang:

CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes. 2571-2580 - Xiaoqian Guo, Xiangyang Li, Shuqiang Jiang:
Expressional Region Retrieval. 2581-2589 - Shuyuan Li, Jianguo Li, Hanlin Tang, Rui Qian, Weiyao Lin:
ATRW: A Benchmark for Amur Tiger Re-identification in the Wild. 2590-2598 - Weiying Wang, Jieting Chen, Qin Jin:

VideoIC: A Video Interactive Comments Dataset and Multimodal Multitask Learning for Comments Generation. 2599-2607 - Jiewen Zhao, Ruize Han, Yiyang Gan, Liang Wan, Wei Feng, Song Wang:
Human Identification and Interaction Detection in Cross-View Multi-Person Videos with Wearable Cameras. 2608-2616 - Miaohui Wang, Wuyuan Xie, Maolin Cui:
Surface Reconstruction with Unconnected Normal Maps: An Efficient Mesh-based Approach. 2617-2625 - Murari Mandal, Lav Kush Kumar, Santosh Kumar Vipparthi:
MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos. 2626-2635 - Xuewen Yang, Dongliang Xie, Xin Wang, Jiangbo Yuan, Wanying Ding, Pengyun Yan:
Learning Tuple Compatibility for Conditional Outfit Recommendation. 2636-2644 - Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin:

Efficient Crowd Counting via Structured Knowledge Transfer. 2645-2654 - Yifei Huang, Chenhui Li, Xiaohu Guo, Jing Liao, Chenxu Zhang, Changbo Wang:
DeSmoothGAN: Recovering Details of Smoothed Images via Spatial Feature-wise Transformation and Full Attention. 2655-2663 - Hyewon Song, Jaeseong Park, Suwoong Heo, Jiwoo Kang, Sanghoon Lee:
PatchMatch based Multiview Stereo with Local Quadric Window. 2664-2672 - Alexander Tesch, Ralf Dörner:
Expert Performance in the Examination of Interior Surfaces in an Automobile: Virtual Reality vs. Reality. 2673-2681
Poster Session C2: Emerging Multimedia Applications
- Wentao Bao, Qi Yu, Yu Kong:
Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning. 2682-2690 - Xuan Shao, Lin Zhang, Tianjun Zhang, Ying Shen, Hongyu Li, Yicong Zhou:
A Tightly-coupled Semantic SLAM System with Visual, Inertial and Surround-view Sensors for Autonomous Indoor Parking. 2691-2699 - Yimu Wang, Shiyin Lu, Lijun Zhang:

Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy. 2700-2709 - Xin Wang, Huijun Zhang, Lei Cao, Ling Feng:

Leverage Social Media for Personalized Stress Detection. 2710-2718 - Yingying Deng, Fan Tang, Weiming Dong, Wen Sun, Feiyue Huang, Changsheng Xu:

Arbitrary Style Transfer via Multi-Adaptation Network. 2719-2727 - Jingcai Guo, Shiheng Ma, Jie Zhang, Qihua Zhou, Song Guo:
Dual-view Attention Networks for Single Image Super-Resolution. 2728-2736 - Zhongnian Li, Tao Zhang, Ruoyu Chen, Daoqiang Zhang:

MRI Measurement Matrix Learning via Correlation Reweighting. 2737-2745 - Ruize Han, Jiewen Zhao, Wei Feng, Yiyang Gan, Liang Wan, Song Wang:
Complementary-View Co-Interest Person Detection. 2746-2754 - Weidong He, Zhi Li, Dongcai Lu, Enhong Chen, Tong Xu, Baoxing Huai, Jing Yuan:
Multimodal Dialogue Systems via Capturing Context-aware Dependencies of Semantic Elements. 2755-2764 - Carlos Bermejo, Dimitris Chatzopoulos, Pan Hui:
EyeShopper: Estimating Shoppers' Gaze using CCTV Cameras. 2765-2774 - Eugene Yujun Fu, Zhongqi Yang, Hong Va Leong, Grace Ngai, Chi-Wai Do, Lily Chan:
Exploiting Active Learning in Novel Refractive Error Detection with Smartphones. 2775-2783 - Liang Han, Zhaozheng Yin, Zhurong Xia, Minqian Tang, Rong Jin:

Price Suggestion for Online Second-hand Items with Texts and Images. 2784-2792 - Xuebin Sun, Sukai Wang, Miaohui Wang, Shing Shin Cheng, Ming Liu:

An Advanced LiDAR Point Cloud Sequence Coding Scheme for Autonomous Driving. 2793-2801 - Xing Xu, Jiefu Chen, Jinhui Xiao, Zheng Wang, Yang Yang, Heng Tao Shen:
Learning Optimization-based Adversarial Perturbations for Attacking Sequential Recognition Models. 2802-2822
Poster Session D2: Emerging Multimedia Applications & Emotional and Social Signals in Multimedia
- Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha:
Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues. 2823-2832 - Delian Ruan, Yan Yan, Si Chen, Jing-Hao Xue, Hanzi Wang:

Deep Disturbance-Disentangled Learning for Facial Expression Recognition. 2833-2841 - Xinhui Song, Tianyang Shi, Zunlei Feng, Mingli Song, Jackie Lin, Chuanjie Lin, Changjie Fan, Yi Yuan:

Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer. 2842-2851 - Jingjun Liang, Ruichen Li, Qin Jin:

Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching. 2852-2861 - Songcheng Gao, Wenzhong Li, Lynda J. Song, Xiao Zhang, Mingkai Lin, Sanglu Lu:
PersonalitySensing: A Multi-View Multi-Task Learning Approach for Personality Detection based on Smartphone Usage. 2862-2870 - Hong-Xia Xie, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng:

AU-assisted Graph Attention Convolutional Network for Micro-Expression Recognition. 2871-2880 - Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia, Cheng Lu, Jiateng Liu:
DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild. 2881-2889 - Zheng Zhang, Taoyue Wang, Lijun Yin:

Region of Interest Based Graph Convolution: A Heatmap Regression Approach for Action Unit Detection. 2890-2898 - Junjie Zhu, Bingjun Luo, Sicheng Zhao, Shihui Ying, Xibin Zhao, Yue Gao:

IExpressNet: Facial Expression Recognition with Incremental Classes. 2899-2908 - Ziyu Jia, Youfang Lin, Xiyang Cai, Haobin Chen, Haijun Gou, Jing Wang:
SST-EmotionNet: Spatial-Spectral-Temporal based Attention 3D Dense Network for EEG Emotion Recognition. 2909-2917 - Connor T. Heaton, David M. Schwartz:

Language Models as Emotional Classifiers for Textual Conversation. 2918-2926 - Bin Xia, Shangfei Wang:

Occluded Facial Expression Recognition with Step-Wise Assistance from Unpaired Non-Occluded Images. 2927-2935 - Bin Xia, Weikang Wang, Shangfei Wang, Enhong Chen:
Learning from Macro-expression: a Micro-expression Recognition Framework. 2936-2944 - Sicheng Zhao, Yaxian Li, Xingxu Yao, Weizhi Nie, Pengfei Xu, Jufeng Yang, Kurt Keutzer:
Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space. 2945-2954
Poster Session E2: Emotional and Social Signals in Multimedia & Media Interpretation
- Zhiwei Xu, Shangfei Wang, Can Wang:

Exploiting Multi-Emotion Relations at Feature and Label Levels for Emotion Tagging. 2955-2963 - Linyi Zhou, Xijian Fan, Yingjie Ma, Tardi Tjahjadi, Qiaolin Ye:

Uncertainty-aware Cross-dataset Facial Expression Recognition via Regularized Conditional Alignment. 2964-2972 - Tugba Kulahcioglu, Gerard de Melo:

Fonts Like This but Happier: A New Way to Discover Fonts. 2973-2981 - Huiyuan Yang, Taoyue Wang, Lijun Yin:

Adaptive Multimodal Fusion for Facial Action Units Recognition. 2982-2990 - Shi Yin, Shangfei Wang, Xiaoping Chen, Enhong Chen:
Exploiting Self-Supervised and Semi-Supervised Learning for Facial Landmark Tracking with Unlabeled Data. 2991-2998 - Woan-Shiuan Chien, Hao-Chun Yang, Chi-Chun Lee:
Cross Corpus Physiological-based Emotion Recognition Using a Learnable Visual Semantic Graph Convolutional Network. 2999-3006 - Mengshi Qi, Jie Qin, Xiantong Zhen, Di Huang, Yi Yang, Jiebo Luo:
Few-Shot Ensemble Learning for Video Classification with SlowFast Memory Networks. 3007-3015 - Chenyu Li, Shiming Ge, Daichi Zhang, Jia Li:

Look Through Masks: Towards Masked Face Recognition with De-Occlusion Distillation. 3016-3024 - Jizhe Zhou, Chi-Man Pun, Yu Tong:
Privacy-sensitive Objects Pixelation for Live Video Streaming. 3025-3033 - Jiaxin Chen, Jie Qin, Yichao Yan, Lei Huang, Li Liu, Fan Zhu, Ling Shao:

Deep Local Binary Coding for Person Re-Identification by Delving into the Details. 3034-3043 - Hai Xu, Hongtao Xie, Zheng-Jun Zha, Sun'ao Liu, Yongdong Zhang:

March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation. 3044-3053 - Beibei Lin, Shunli Zhang, Feng Bao:

Gait Recognition with Multiple-Temporal-Scale 3D Convolutional Neural Network. 3054-3062 - Yi Li, Wenjie Pei, Zhenyu He:

SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space. 3063-3071 - Jianbo Jiao, Ying Cao, Manfred Lau, Rynson W. H. Lau:
Tactile Sketch Saliency. 3072-3080
Poster Session F2: Media Interpretation & Mobile Multimedia
- Zhengrui Ma, Zhao Kang, Guangchun Luo, Ling Tian, Wenyu Chen:

Towards Clustering-friendly Representations: Subspace Clustering via Graph Filtering. 3081-3089 - Yuyu Guo, Jingkuan Song, Lianli Gao, Heng Tao Shen:

One-shot Scene Graph Generation. 3090-3098 - Huiyuan Fu, Ting Yu, Xin Wang, Huadong Ma:

Cross-Granularity Learning for Multi-Domain Image-to-Image Translation. 3099-3107 - Rui Li, Xiantuo He, Yu Zhu, Xianjun Li, Jinqiu Sun, Yanning Zhang:

Enhancing Self-supervised Monocular Depth Estimation via Incorporating Robust Constraints. 3108-3117 - Tuo Feng, Licheng Jiao, Hao Zhu, Long Sun:
A Novel Object Re-Track Framework for 3D Point Clouds. 3118-3126 - Zixuan Su, Xindi Shang, Jingjing Chen, Yu-Gang Jiang, Zhiyong Qiu, Tat-Seng Chua:
Video Relation Detection via Multiple Hypothesis Association. 3127-3135 - Lin Huang, Jianchao Tan, Jingjing Meng, Ji Liu, Junsong Yuan:
HOT-Net: Non-Autoregressive Transformer for 3D Hand-Object Pose Estimation. 3136-3145 - Lixuan Meng, Chenggang Yan, Jun Li, Jian Yin, Wu Liu, Hongtao Xie, Liang Li:
Multi-Features Fusion and Decomposition for Age-Invariant Face Recognition. 3146-3154 - Hongshuo Tian, Ning Xu, An-An Liu, Yongdong Zhang:
Part-Aware Interactive Learning for Scene Graph Generation. 3155-3163 - Raul Gomez, Yahui Liu, Marco De Nadai, Dimosthenis Karatzas, Bruno Lepri, Nicu Sebe:
Retrieval Guided Unsupervised Multi-domain Image to Image Translation. 3164-3172 - Liuwan Zhu, Rui Ning, Cong Wang, Chunsheng Xin, Hongyi Wu:

GangSweep: Sweep out Neural Backdoors by GAN. 3173-3181 - Zhengcong Fei:

Iterative Back Modification for Faster Image Captioning. 3182-3190 - Carlos Bermejo, Tristan Braud, Ji Yang, Shayan Mirjafari, Bowen Shi, Yu Xiao, Pan Hui:
VIMES: A Wearable Memory Assistance System for Automatic Information Retrieval. 3191-3200
Poster Session G2: Multimedia -- Art and Entertainment, Cloud and Edge Computing, Data Systems, & HCI
- Tianyang Shi, Zhengxia Zou, Xinhui Song, Zheng Song, Changjian Gu, Changjie Fan, Yi Yuan:

Neutral Face Game Character Auto-Creation via PokerFace-GAN. 3201-3209 - Peng Lu, Jinbei Yu, Xujun Peng, Zhaoran Zhao, Xiaojie Wang:
Gray2ColorNet: Transfer More Colors from Reference Image. 3210-3218 - Cheng-Che Lee, Wan-Yi Lin, Yen-Ting Shih, Pei-Yi (Patricia) Kuo, Li Su:

Crossing You in Style: Cross-modal Style Transfer from Music to Visual Arts. 3219-3227 - Keyu Chen, Jianmin Zheng, Jianfei Cai, Juyong Zhang:

Modeling Caricature Expressions by 3D Blendshape and Dynamic Texture. 3228-3236 - Jia Li, Nan Gao, Tong Shen, Wei Zhang, Tao Mei, Hui Ren:
SketchMan: Learning to Create Professional Sketches. 3237-3245 - Xuanhong Chen, Xirui Yan, Naiyuan Liu, Ting Qiu, Bingbing Ni:
Anisotropic Stroke Control for Multiple Artists Style Transfer. 3246-3255 - Hao Hao, Changqiao Xu, Lujie Zhong, Gabriel-Miro Muntean:

A Multi-update Deep Reinforcement Learning Algorithm for Edge Computing Service Offloading. 3256-3264 - Zichuan Xu, Jiangkai Wu, Qiufen Xia, Pan Zhou, Jiankang Ren, Huizhi Liang:

Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds. 3265-3273 - Wanqian Zhang, Dayan Wu, Yu Zhou, Bo Li, Weiping Wang, Dan Meng:
Deep Unsupervised Hybrid-similarity Hadamard Hashing. 3274-3282 - Mengmeng Jing, Jingjing Li, Lei Zhu, Ke Lu, Yang Yang, Zi Huang:
Incomplete Cross-modal Retrieval with Dual-Aligned Variational Autoencoders. 3283-3291 - Tie Liu, Mai Xu, Shengxi Li, Rui Ding, Huaida Liu:

MRS-Net: Multi-Scale Recurrent Scalable Network for Face Quality Enhancement of Compressed Videos. 3292-3301 - Jasper R. R. Uijlings, Mykhaylo Andriluka, Vittorio Ferrari:

Panoptic Image Annotation with a Collaborative Assistant. 3302-3310 - Jari Korhonen, Yicheng Su, Junyong You:

Blind Natural Video Quality Prediction via Statistical Temporal Features and Deep Spatial Features. 3311-3319
Session H2: Multimedia HCI, Multimedia Scalability and Management, & Multimedia Search and Recommendation
- Zhiyuan Hu, Jia Jia, Bei Liu, Yaohua Bu, Jianlong Fu:

Aesthetic-Aware Image Style Transfer. 3320-3329 - Naoki Sugimoto, Yoshihito Ebine, Kiyoharu Aizawa:

Building Movie Map - A Tool for Exploring Areas in a City - and its Evaluations. 3330-3338 - Jing Li, Suiyi Ling, Junle Wang, Patrick Le Callet:

A Probabilistic Graphical Model for Analyzing the Subjective Visual Quality Assessment Data from Crowdsourcing. 3339-3347 - Linsheng Li, Bin Yang, Cathy Bao, Shuo Liu, Randy Xu, Yong Yao, Mohammad R. Haghighat, Jerry W. Hu, Shoumeng Yan, Zhengwei Qi:

DroidCloud: Scalable High Density Android™ Cloud Rendering. 3348-3356 - Jiaxin Wu, Chong-Wah Ngo:
Interpretable Embedding for Ad-Hoc Video Search. 3357-3366 - Feifei Zhang, Mingliang Xu, Qirong Mao, Changsheng Xu:

Joint Attribute Manipulation and Modality Alignment Learning for Composing Text and Image to Image Retrieval. 3367-3376 - Yangxi Li, Han Hu, Jin Li, Yong Luo, Yonggang Wen:
Semi-supervised Online Multi-Task Metric Learning for Visual Recognition and Retrieval. 3377-3385 - Yu-Wei Zhan, Xin Luo, Yongxin Wang, Xin-Shun Xu:
Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval. 3386-3394 - Weizhi Nie, Yue Zhao, An-An Liu, Zan Gao, Yuting Su:
Multi-graph Convolutional Network for Unsupervised 3D Shape Retrieval. 3395-3403 - Wenjie Yang, Dangwei Li, Xiaotang Chen, Kaiqi Huang:

Bottom-Up Foreground-Aware Feature Fusion for Person Search. 3404-3412 - Zhi Chen, Sen Wang, Jingjing Li, Zi Huang:
Rethinking Generative Zero-Shot Learning: An Ensemble Learning Perspective for Recognising Visual Patches. 3413-3421 - Yanan Wang, Shengcai Liao, Ling Shao:

Surpassing Real-World Source Training Data: Random 3D Characters for Generalizable Person Re-Identification. 3422-3430 - Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann:
Zero-Shot Multi-View Indoor Localization via Graph Location Networks. 3431-3440 - Kecheng Zheng, Wu Liu, Jiawei Liu, Zheng-Jun Zha, Tao Mei:
Hierarchical Gumbel Attention Network for Text-based Person Search. 3441-3449
Poster Session A3: Multimedia Search and Recommendation & Multimedia System and Middleware
- Jiawei Liu, Zheng-Jun Zha, Richang Hong, Meng Wang, Yongdong Zhang:

Dual Context-Aware Refinement Network for Person Search. 3450-3459 - Lei Meng, Fuli Feng, Xiangnan He, Xiaoyan Gao, Tat-Seng Chua:

Heterogeneous Fusion of Semantic and Collaborative Information for Visually-Aware Food Recommendation. 3460-3468 - Xiaoyu Du, Xiang Wang, Xiangnan He, Zechao Li, Jinhui Tang, Tat-Seng Chua:

How to Learn Item Representation for Cold-Start Multimedia Recommendation? 3469-3477 - Xuzheng Yu, Tian Gan, Yinwei Wei, Zhiyong Cheng, Liqiang Nie:

Personalized Item Recommendation for Second-hand Trading Platform. 3478-3486 - Hao Jiang, Wenjie Wang, Yinwei Wei, Zan Gao, Yinglong Wang, Liqiang Nie:

What Aspect Do You Like: Multi-scale Time-aware User Interest Modeling for Micro-video Recommendation. 3487-3495 - Yuting Su, Yuqian Li, Dan Song, Zhendong Mao, Xuanya Li, An-An Liu:

Domain-Specific Alignment Network for Multi-Domain Image-Based 3D Object Retrieval. 3496-3504 - Jun Hu, Quan Fang, Shengsheng Qian, Changsheng Xu:

Multi-modal Attentive Graph Pooling Model for Community Question Answer Matching. 3505-3513 - Tianwei Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang:

Task-distribution-aware Meta-learning for Cold-start CTR Prediction. 3514-3522 - Ziruo Sun, Xiushan Nie, Xiaoming Xi, Yilong Yin:

CFVMNet: A Multi-branch Network for Vehicle Re-identification Based on Common Field of View. 3523-3531 - Chunyuan Yuan, Qianwen Ma, Junyang Chen, Wei Zhou, Xiaodan Zhang, Xuehai Tang, Jizhong Han, Songlin Hu:
Exploiting Heterogeneous Artist and Listener Preference Graph for Music Genre Classification. 3532-3540 - Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Tat-Seng Chua:

Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback. 3541-3549 - Riddhiman Dasgupta, Francis Tom, Sudhir Kumar, Mithun Das Gupta, Yokesh Kumar, Badri N. Patro, Vinay P. Namboodiri:
Visually Precise Query. 3550-3558 - Dingjian Jin, Anke Zhang, Jiamin Wu, Gaochang Wu, Haoqian Wang, Lu Fang:
All-in-depth via Cross-baseline Light Field Camera. 3559-3567 - Mohammad Amin Arab, Puria Azadi Moghadam, Mohamed E. Hussein, Wael Abd-Almageed, Mohamed Hefeeda:
Revealing True Identity: Detecting Makeup Attacks in Face-based Biometric Systems. 3568-3576
Poster Session B3: Multimedia System and Middleware & Multimedia Telepresence and Virtual/Augmented Reality
- Negin Ghamsarian, Hadi Amirpour, Christian Timmerer, Mario Taschwer, Klaus Schöffmann:
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. 3577-3585 - Chirag Raman, Stephanie Tan, Hayley Hung:

A Modular Approach for Synchronized Wireless Multimodal Multisensor Data Acquisition in Highly Dynamic Social Settings. 3586-3594 - Shuoqian Wang, Xiaoyang Zhang, Mengbai Xiao, Kenneth Chiu, Yao Liu:

SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication. 3595-3603 - Yawen Lu, Yuxing Wang, Guoyu Lu:
Single Image Shape-from-Silhouettes. 3604-3613 - Zhongze Tang, Xianglong Feng, Yi Xie, Huy Phan, Tian Guo, Bo Yuan, Sheng Wei:
VVSec: Securing Volumetric Video Streaming via Benign Use of Adversarial Perturbation. 3614-3623 - Viktor Kelkkanen, Markus Fiedler, David Lindero:

Bitrate Requirements of Non-Panoramic VR Remote Rendering. 3624-3631 - Serhan Gül, Sebastian Bosse, Dimitri Podborski, Thomas Schierl, Cornelius Hellge:
Kalman Filter-based Head Motion Prediction for Cloud-based Mixed Reality. 3632-3641 - Chaoyang Zeng, Tiesong Zhao, Qian Liu, Yiwen Xu, Kai Wang:

Perception-Lossless Codec of Haptic Data with Low Delay. 3642-3650 - Xin Suo, Minye Wu, Yanshun Zhang, Yingliang Zhang, Lan Xu, Qiang Hu, Jingyi Yu:
Neural3D: Light-weight Neural Portrait Scanning via Context-aware Correspondence Learning. 3651-3660 - Jack Ratcliffe, Laurissa Tokarchuk:
Presence, Embodied Interaction and Motivation: Distinct Learning Phenomena in an Immersive Virtual Environment. 3661-3668 - Shishir Subramanyam, Irene Viola, Alan Hanjalic, Pablo César:
User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling. 3669-3677 - Rui-Xiao Zhang, Ming Ma, Tianchi Huang, Hanyu Li, Jiangchuan Liu, Lifeng Sun:

Leveraging QoE Heterogenity for Large-Scale Livecaset Scheduling. 3678-3686 - JongBeom Jeong, Soonbin Lee, Il-Woong Ryu, Tuan Thanh Le, Eun-Seok Ryu:
Towards Viewport-dependent 6DoF 360 Video Tiled Streaming for Virtual Reality Systems. 3687-3695
Poster Session C3: Multimedia Transport and Delivery & Multimedia Analysis and Description
- Yixiang Mao, Liyang Sun, Yong Liu, Yao Wang:
Low-latency FoV-adaptive Coding and Streaming for Interactive 360° Video Streaming. 3696-3704 - Rongqun Lin, Linwei Zhu, Shiqi Wang, Sam Kwong:
Towards Modality Transferable Visual Information Representation with Optimal Model Compression. 3705-3714 - Chao Zhou, Shuoqian Wang, Mengbai Xiao, Sheng Wei, Yao Liu:

AdaP-360: User-Adaptive Area-of-Focus Projections for Bandwidth-Efficient 360-Degree Video Streaming. 3715-3723 - Praveen Kumar Yadav, Wei Tsang Ooi:

Tile Rate Allocation for 360-Degree Tiled Adaptive Video Streaming. 3724-3733 - Lianli Gao, Junchen Zhu, Jingkuan Song, Feng Zheng, Heng Tao Shen:

Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis. 3734-3742 - Zhou Yu

, Yuhao Cui, Jun Yu, Meng Wang, Dacheng Tao
, Qi Tian:
Deep Multimodal Neural Architecture Search. 3743-3752 - Jie Wen, Zheng Zhang

, Zhao Zhang
, Zhihao Wu, Lunke Fei, Yong Xu, Bob Zhang
:
DIMC-net: Deep Incomplete Multi-view Clustering Network. 3753-3761 - Bin Zhu

, Chong-Wah Ngo, Jingjing Chen
:
Cross-domain Cross-modal Food Transfer. 3762-3770 - Li-Shuai Gao, Hua Zhang, Zan Gao, Weili Guan, Zhiyong Cheng, Meng Wang:

Texture Semantically Aligned with Visibility-aware for Partial Person Re-identification. 3771-3779 - Xuanhan Wang, Lianli Gao, Jingkuan Song, Heng Tao Shen:

KTN: Knowledge Transfer Network for Multi-person DensePose Estimation. 3780-3788 - Junwen Chen, Wentao Bao

, Yu Kong:
Activity-driven Weakly-Supervised Spatio-Temporal Grounding from Untrimmed Videos. 3789-3797 - Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang:

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis. 3798-3806 - Wenqiao Zhang, Xin Eric Wang

, Siliang Tang
, Haizhou Shi, Haochen Shi
, Jun Xiao, Yueting Zhuang, William Yang Wang:
Relational Graph Learning for Grounded Video Description Generation. 3807-3828
Poster Session D3: Multimedia Analysis and Description & Multimedia Fusion and Embedding
- Deepak Kumar, Chetan Kumar

, Chun-Wei Seah, Siyu Xia, Ming Shao
:
Finding Achilles' Heel: Adversarial Attack on Multi-modal Action Recognition. 3829-3837 - Jinxing Li, Hongwei Yong, Feng Wu, Mu Li:

Online Multi-view Subspace Learning with Mixed Noise. 3838-3846 - Qiao Liu

, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou
, Di Yuan
, Jing Li, Kai Yang
, Nana Fan, Feng Zheng:
LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark. 3847-3856 - Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian:

Towards More Explainability: Concept Knowledge Mining Network for Event Recognition. 3857-3865 - Shuang Li, Binhui Xie, Jiashu Wu

, Ying Zhao, Chi Harold Liu
, Zhengming Ding:
Simultaneous Semantic Alignment Network for Heterogeneous Domain Adaptation. 3866-3874 - Liang Li

, Shijie Yang, Li Su, Shuhui Wang, Chenggang Yan, Zhengjun Zha, Qingming Huang:
Diverter-Guider Recurrent Network for Diverse Poems Generation from Image. 3875-3883 - Ying Cheng, Ruize Wang, Zhihao Pan, Rui Feng, Yuejie Zhang:

Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning. 3884-3892 - Haoming Xu, Runhao Zeng, Qingyao Wu, Mingkui Tan, Chuang Gan:

Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization. 3893-3901 - Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao:

Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion. 3902-3910 - Ruijian Jia, Xinsheng Wang, Shanmin Pang, Jihua Zhu, Jianru Xue:

Look, Listen and Infer. 3911-3919 - Zhi Chen, Wei Yang, Zhenbo Xu, Xike Xie, Liusheng Huang:

DCNet: Dense Correspondence Neural Network for 6DoF Object Pose Estimation in Occluded Scenes. 3929-3937
Poster Session E3: Multimedia Fusion and Embedding & Music, Speech and Audio & Summarization, Analytics and Storytelling
- Xuejing Liu, Liang Li

, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang:
Transferrable Referring Expression Grounding with Concept Transfer and Context Inheritance. 3938-3946 - Yanhui Guo

, Xi Zhang
, Xiaolin Wu:
Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos. 3947-3955 - Yingying Zhang, Quan Fang, Shengsheng Qian, Changsheng Xu:

Multi-modal Multi-relational Feature Aggregation Network for Medical Knowledge Representation Learning. 3956-3965 - Wenqiao Zhang, Siliang Tang

, Yanpeng Cao, Jun Xiao, Shiliang Pu, Fei Wu, Yueting Zhuang:
Photo Stream Question Answer. 3966-3975 - Xinhang Song, Haitao Zeng, Sixian Zhang, Luis Herranz

, Shuqiang Jiang:
Generalized Zero-shot Learning with Multi-source Semantic Embeddings for Scene Recognition. 3976-3985 - Xia Du

, Chi-Man Pun, Zheng Zhang
:
A Unified Framework for Detecting Audio Adversarial Examples. 3986-3994 - Kunihiro Miyazaki, Takayuki Uchiba, Scarlett Young, Yuichi Sasaki, Kenji Tanaka

:
Emerging Topic Detection on the Meta-data of Images from Fashion Social Media. 3995-4003 - Xin Li, Tianwei Lin, Xiao Liu, Wangmeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen, Chuang Gan:

Deep Concept-wise Temporal Convolutional Networks for Action Localization. 4004-4012 - Shuang Wu, Shaojing Fan, Zhiqi Shen

, Mohan S. Kankanhalli
, Anthony K. H. Tung:
Who You Are Decides How You Tell. 4013-4022 - Junyan Wang

, Yang Bai, Yang Long, Bingzhang Hu, Zhenhua Chai, Yu Guan, Xiaolin Wei:
Query Twice: Dual Mixture Attention Meta Learning for Video Summarization. 4023-4031 - Kai Niu, Yan Huang, Liang Wang:

Textual Dependency Embedding for Person Search by Language. 4032-4040 - Chenchen Jing, Yuwei Wu, Mingtao Pei, Yao Hu, Yunde Jia, Qi Wu:

Visual-Semantic Graph Matching for Visual Grounding. 4041-4050 - Yi Zheng

, Wenda Qin, Derry Wijaya
, Margrit Betke:
LAL: Linguistically Aware Learning for Scene Text Recognition. 4051-4059
Poster Session F3: Vision and Language
- Fen Liu, Guanghui Xu, Qi Wu, Qing Du, Wei Jia, Mingkui Tan:

Cascade Reasoning Network for Text-based Visual Question Answering. 4060-4069 - Daizong Liu, Xiaoye Qu, Xiao-Yang Liu, Jianfeng Dong, Pan Zhou, Zichuan Xu:

Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization. 4070-4078 - Zijian Zhang

, Zhou Zhao, Zhu Zhang, Baoxing Huai, Jing Yuan:
Text-Guided Image Inpainting. 4079-4087 - Mohan Zhang, Qiqi Gao, Jinglu Wang

, Henrik Turbell, David Zhao, Jinhui Yu, Yan Lu:
RT-VENet: A Convolutional Network for Real-time Video Enhancement. 4088-4097 - Zhu Zhang, Zhijie Lin, Zhou Zhao, Jieming Zhu, Xiuqiang He:

Regularized Two-Branch Proposal Networks for Weakly-Supervised Moment Retrieval in Videos. 4098-4106 - Miao Zhang, Yu Zhang, Yongri Piao, Beiqi Hu, Huchuan Lu:

Feature Reintegration over Differential Treatment: A Top-down and Adaptive Fusion Network for RGB-D Salient Object Detection. 4107-4115 - Hao Wang, Zheng-Jun Zha, Xuejin Chen, Zhiwei Xiong, Jiebo Luo

:
Dual Path Interaction Network for Video Moment Localization. 4116-4124 - Guiyu Tian, Shuai Wang, Jie Feng, Li Zhou, Yadong Mu:

Cap2Seg: Inferring Semantic and Spatial Context from Captions for Zero-Shot Image Segmentation. 4125-4134 - Congcong Zhu

, Xiaoqiang Li, Jide Li, Guangtai Ding, Weiqin Tong:
Spatial-Temporal Knowledge Integration: Robust Self-Supervised Facial Landmark Tracking. 4135-4143 - Zengyi Qin

, Jinglu Wang
, Yan Lu:
Weakly Supervised 3D Object Detection from Point Clouds. 4144-4152 - Fenglin Liu, Xian Wu

, Shen Ge, Xiaoyu Zhang
, Wei Fan, Yuexian Zou:
Bridging the Gap between Vision and Language Domains for Improved Image Captioning. 4153-4161 - Da Cao, Yawen Zeng, Meng Liu, Xiangnan He, Meng Wang, Zheng Qin:

STRONG: Spatio-Temporal Reinforcement Learning for Cross-Modal Video Moment Localization. 4162-4170 - Heqian Qiu, Hongliang Li

, Qingbo Wu, Fanman Meng, Hengcan Shi, Taijin Zhao, King Ngi Ngan:
Language-Aware Fine-Grained Object Representation for Referring Expression Comprehension. 4171-4180 - Xu Yang, Chongyang Gao, Hanwang Zhang

, Jianfei Cai:
Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning. 4181-4189
Poster Session G3: Vision and Language
- Yong Wang

, Wenkai Zhang, Qing Liu, Zhengyuan Zhang, Xin Gao, Xian Sun:
Improving Intra- and Inter-Modality Visual Relation for Image Captioning. 4190-4198 - Xiaoshuai Sun, Xuying Zhang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji:

Exploring Language Prior for Mode-Sensitive Visual Attention Modeling. 4199-4207 - Jiacheng Li

, Siliang Tang
, Juncheng Li, Jun Xiao, Fei Wu, Shiliang Pu, Yueting Zhuang:
Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling. 4208-4216 - Anwen Hu, Shizhe Chen, Qin Jin:

ICECAP: Information Concentrated Entity-aware Image Captioning. 4217-4225 - Jiayi Ji

, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji, Fuhai Chen, Jianzhuang Liu, Qi Tian:
Attacking Image Captioning Towards Accuracy-Preserving Target Words Removal. 4226-4234 - Ye Liu, Junsong Yuan, Chang Wen Chen

:
ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection. 4235-4243 - Siyuan Pan, Ling Dai

, Xuhong Hou, Huating Li, Bin Sheng:
ChefGAN: Food Image Generation from Recipes. 4244-4252 - Fei Liu, Jing Liu, Xinxin Zhu, Richang Hong, Hanqing Lu:

Dual Hierarchical Temporal Convolutional Network with QA-Aware Dynamic Normalization for Video Story Question Answering. 4253-4261 - Omkar Gune, Biplab Banerjee, Subhasis Chaudhuri, Fabio Cuzzolin:

Generalized Zero-Shot Learning using Generated Proxy Unseen Samples and Entropy Separation. 4262-4270 - Zipeng Xu, Fangxiang Feng, Xiaojie Wang, Yushu Yang, Huixing Jiang, Zhongyuan Wang:

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue. 4271-4279 - Xiaoye Qu, Pengwei Tang, Zhikang Zou, Yu Cheng, Jianfeng Dong, Pan Zhou, Zichuan Xu:

Fine-grained Iterative Attention Network for Temporal Language Localization in Videos. 4280-4288 - Zhipu Liu, Lei Zhang, Yang Yang:

Hierarchical Bi-Directional Feature Perception Network for Person Re-Identification. 4289-4298 - Zhongzhou Zhang, Lei Zhang:

Hard Negative Samples Emphasis Tracker without Anchors. 4299-4308 - Yankun Xi, Guoli Yan, Jing Hua, Zichun Zhong:

JointFontGAN: Joint Geometry-Content GAN for Font Generation via Few-Shot Learning. 4309-4317
Poster Session H3: Vision and Language
- Hua Qi, Qing Guo, Felix Juefei-Xu, Xiaofei Xie

, Lei Ma, Wei Feng, Yang Liu
, Jianjun Zhao:
DeepRhythm: Exposing DeepFakes with Attentional Visual Heartbeat Rhythms. 4318-4327 - Jinglin Liu, Yi Ren, Zhou Zhao, Chen Zhang, Baoxing Huai, Jing Yuan:

FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire. 4328-4336 - Jing Wang, Jinhui Tang, Jiebo Luo

:
Multimodal Attention with Image Text Spatial Relationship for OCR-Based Image Captioning. 4337-4345 - Yi Zhang

, Jitao Sang:
Towards Accuracy-Fairness Paradox: Adversarial Example-based Data Augmentation for Visual Debiasing. 4346-4354 - Botian Shi, Lei Ji, Zhendong Niu

, Nan Duan
, Ming Zhou, Xilin Chen:
Learning Semantic Concepts and Temporal Alignment for Narrated Video Procedural Captioning. 4355-4363 - Quan Meng, Jiakai Zhang

, Qiang Hu, Xuming He, Jingyi Yu:
LGNN: A Context-aware Line Segment Detector. 4364-4372 - Shengyu Zhang, Tan Jiang, Tan Wang, Kun Kuang, Zhou Zhao, Jianke Zhu, Jin Yu, Hongxia Yang, Fei Wu:

DeVLBert: Learning Deconfounded Visio-Linguistic Representations. 4373-4382 - Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao:

Sequential Attention GAN for Interactive Image Editing. 4383-4391
Interactive Art Session
- Tiago Martins

, João Correia
, Sérgio Rebelo
, João Bicker
, Penousal Machado:
Portraits of No One: An Internet Artwork. 4392-4393 - Ruixue Liu, Shaozu Yuan, Meng Chen, Baoyang Chen

, Zhijie Qiu, Xiaodong He:
MaLiang: An Emotion-driven Chinese Calligraphy Artwork Composition System. 4394-4396 - Xiaohui Wang, Xia Liang, Miao Lu, Jingyan Qin:

First Impression: AI Understands Personality. 4397-4398 - Siyu Jin, Jingyan Qin, Wenfa Li:

Draw Portraits by Music: A Music based Image Style Transformation. 4399-4400 - Xiaohui Wang, Xiaoxue Ding, Jinke Li, Jingyan Qin:

Little World: Virtual Humans Accompany Children on Dramatic Performance. 4401-4402 - James She, Carmen Ng, Wadia Sheng:

Keep Running - AI Paintings of Horse Figure and Portrait. 4403-4404 - Siyu Hu, Bo Shui

, Siyu Jin, Xiaohui Wang:
AI Mirror: Visualize AI's Self-knowledge. 4405-4406
Brave New Ideas Session
- Tianlang Chen, Wei Xiong, Haitian Zheng, Jiebo Luo

:
Image Sentiment Transfer. 4407-4415 - Ali Rostami, Vaibhav Pandey, Nitish Nag, Vesper Wang, Ramesh C. Jain:

Personal Food Model. 4416-4424 - Christian von der Weth, Ashraf M. Abdul

, Shaojing Fan, Mohan S. Kankanhalli
:
Helping Users Tackle Algorithmic Threats on Social Media: A Multimedia Research Agenda. 4425-4434
Reproducibility Session
- Fan Yu, Dandan Wang, Haonan Wang, Tongwei Ren, Jinhui Tang, Gangshan Wu, Jingjing Chen

, Michael Riegler:
Reproducibility Companion Paper: Instance of Interest Detection. 4435-4438 - Xin Wang

, Bo Wu, Yueqi Zhong, Wei Hu, Jan Zahálka
:
Reproducibility Companion Paper: Outfit Compatibility Prediction and Diagnosis with Multi-Layered Comparison Network. 4439-4443 - Quoc-Tuan Truong, Hady W. Lauw

, Martin Aumüller, Naoko Nitta:
Reproducibility Companion Paper: Visual Sentiment Analysis for Review Images with Item-Oriented and User-Oriented CNN. 4444-4447 - Tuan Hoang, Thanh-Toan Do, Ngai-Man Cheung, Michael Riegler, Jan Zahálka

:
Reproducibility Companion Paper: Selective Deep Convolutional Features for Image Retrieval. 4448-4452
Open Source Software
- Huaizheng Zhang

, Yuanming Li, Yizheng Huang, Yonggang Wen, Jianxiong Yin, Kyle Guan:
MLModelCI: An Automatic Cloud Platform for Efficient MLaaS. 4453-4456 - Huaizheng Zhang

, Yuanming Li, Qiming Ai, Yong Luo, Yonggang Wen, Yichao Jin, Ta Nguyen Binh Duong:
Hysia: Serving DNN-Based Video-to-Retail Applications in Cloud. 4457-4460 - Benyi Hu, Ren-Jie Song, Xiu-Shen Wei, Yazhou Yao, Xian-Sheng Hua, Yuehu Liu:

PyRetri: A PyTorch-based Library for Unsupervised Image Retrieval by Deep Convolutional Neural Networks. 4461-4464 - Ralph Gasser

, Luca Rossetto
, Silvan Heller, Heiko Schuldt
:
Cottontail DB: An Open Source Database System for Multimedia Retrieval and Analysis. 4465-4468 - Joseph Bethge, Christian Bartz, Haojin Yang, Christoph Meinel:

BMXNet 2: An Open Source Framework for Low-bit Networks - Reproducing, Understanding, Designing and Showcasing. 4469-4472 - Yuhao Cheng, Wu Liu, Pengrui Duan, Jingen Liu, Tao Mei

:
PyAnomaly: A Pytorch-based Toolkit for Video Anomaly Detection. 4473-4476 - Giuseppe Ribezzo, Luca De Cicco, Vittorio Palmisano, Saverio Mascolo:

TAPAS-360°: A Tool for the Design and Experimental Evaluation of 360° Video Streaming Systems. 4477-4480 - Miroslav Kratochvíl

, Frantisek Mejzlík, Patrik Veselý, Tomás Soucek, Jakub Lokoc:
SOMHunter: Lightweight Video Search System with SOM-Guided Relevance Feedback. 4481-4484
Demo Session I
- Samah Saeed Baraheem

, Trung-Nghia Le
, Tam V. Nguyen
:
Text-to-Image Synthesis via Aesthetic Layout. 4485-4487 - Zijun Sha, Zelong Zeng, Zheng Wang, Yoichi Natori, Yasuhiro Taniguchi, Shin'ichi Satoh:

Progressive Domain Adaptation for Robot Vision Person Re-identification. 4488-4490 - Paula Viana

, Pedro Carvalho
, Maria Teresa Andrade
, Pieter P. Jonker, Vasileios Papanikolaou
, Inês N. Teixeira, Luís Vilaça, José P. Pinto, Tiago Soares da Costa:
Semantic Storytelling Automation: A Context-Aware and Metadata-Driven Approach. 4491-4493 - Yanyi Zhang, Ming Kong

, Tianqi Zhao, Wenchen Hong, Qiang Zhu, Fei Wu:
ADHD Intelligent Auxiliary Diagnosis System Based on Multimodal Information Fusion. 4494-4496 - Jounsup Park, Mingyuan Wu, Eric Lee, Klara Nahrstedt, Yash Shah, Arielle Rosenthal, John O. Murray

, Kevin Spiteri, Michael Zink
, Ramesh K. Sitaraman
:
Video 360 Content Navigation for Mobile HMD Devices. 4497-4499 - Yuanfeng Song, Di Jiang, Xiaoling Huang, Yawen Li, Qian Xu, Raymond Chi-Wing Wong, Qiang Yang:

GoldenRetriever: A Speech Recognition System Powered by Modern Information Retrieval. 4500-4502 - Andrew C. Freeman

, Ketan Mayer-Patel:
Integrating Event Camera Sensor Emulator. 4503-4505 - Alex Lee, Chang-Uk Kwak, Jeong-Woo Son, Gyeong-June Hahm

, Minho Han, Sun-Joong Kim:
Scene-segmented Video Information Annotation System V2.0. 4506-4508 - Tan Tang, Junxiu Tang

, Jiewen Lai, Lu Ying, Peiran Ren, Lingyun Yu, Yingcai Wu:
SmartShots: Enabling Automatic Generation of Videos with Data Visualizations Embedded. 4509-4511
Demo Session II
- Sha Yu, Kevin McGuinness

, Patricia Moore, David Azcona, Noel E. O'Connor
:
A Smart-Site-Survey System using Image-based 3D Metric Reconstruction and Interactive Panorama Visualization. 4512-4514 - Ning Zhang, Tong Shen, Yue Chen, Wei Zhang, Dan Zeng, Jingen Liu, Tao Mei:

AI-SAS: Automated In-match Soccer Analysis System. 4515-4517 - Maarten Sukel, Stevan Rudinac, Marcel Worring

:
Detecting Urban Issues With the Object Detection Kit. 4518-4520 - Yaohua Bu, Weijun Li

, Tianyi Ma, Shengqi Chen
, Jia Jia, Kun Li, Xiaobo Lu:
Visual-speech Synthesis of Exaggerated Corrective Feedback. 4521-4523 - Gjorgji Strezoski, Lucas Fijen, Jonathan Mitnik, Dániel László, Pieter de Marez Oyens, Yoni Schirris, Marcel Worring

:
TindART: A Personal Visual Arts Recommender. 4524-4526 - Dhruv Verma, Kshitij Gulati, Vasu Goel, Rajiv Ratn Shah

:
Fashionist: Personalising Outfit Recommendation for Cold-Start Scenarios. 4527-4529 - Xuncheng Liu

, Jingyi Wang, Weizhan Zhang, Qinghua Zheng, Xuanya Li:
EmotionTracker: A Mobile Real-time Facial Expression Tracking System with the Assistant of Public AI-as-a-Service. 4530-4532 - Xuanyu Wang

, Yang Wang, Yan Shi, Weizhan Zhang, Qinghua Zheng:
AvatarMeeting: An Augmented Reality Remote Interaction System With Personalized Avatars. 4533-4535 - Haolin Ren, Zheng Wang, Zhixiang Wang, Lixiong Chen, Shin'ichi Satoh, Daning Hu:

An Interactive Design for Visualizable Person Re-Identification. 4536-4538
Demo Session III
- Filippo Mameli, Marco Bertini, Leonardo Galteri

, Alberto Del Bimbo:
Image and Video Restoration and Compression Artefact Removal Using a NoGAN Approach. 4539-4541 - Wentao Jiang, Si Liu, Chen Gao, Ran He, Bo Li, Shuicheng Yan:

Beautify As You Like. 4542-4544 - Jiawei Zuo, Yue Chen, Linfang Wang, Yingwei Pan, Ting Yao, Ke Wang, Tao Mei

:
iDirector: An Intelligent Directing System for Live Broadcast. 4545-4547 - Ali Rostami, Bihao Xu, Ramesh C. Jain:

Multimedia Food Logger. 4548-4549 - Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Tao Mei

:
A Cross-modality and Progressive Person Search System. 4550-4552 - Teo T. Niemirepo, Marko Viitanen

, Jarno Vanne
:
Binocular Multi-CNN System for Real-Time 3D Pose Estimation. 4553-4555 - Itsuki Hashimoto, Yuanyuan Wang

, Yukiko Kawai, Kazutoshi Sumiya:
An Interaction-based Video Viewing Support System using Geographical Relationships. 4556-4558 - Feijie Wu

, Ho Yin Yuen
, Henry C. B. Chan
, Victor C. M. Leung, Wei Cai
:
Infinity Battle: A Glance at How Blockchain Techniques Serve in a Serverless Gaming System. 4559-4561 - Ekin Gedik, Hayley Hung:

ConfFlow: A Tool to Encourage New Diverse Collaborations. 4562-4564
Grand Challenge: SMP Challenge
- Xin Lai, Yihong Zhang, Wei Zhang:

HyFea: Winning Solution to Social Media Popularity Prediction for Multimedia Grand Challenge 2020. 4565-4569 - Kai Wang, Penghui Wang

, Xin Chen, Qiushi Huang, Zhendong Mao, Yongdong Zhang:
A Feature Generalization Framework for Social Media Popularity Prediction. 4570-4574 - Weilong Chen, Feng Hong, Chenghao Huang, Shaoliang Zhang, Rui Wang, Ruobing Xie, Feng Xia, Leyu Lin, Yanru Zhang, Yan Wang:

Curriculum Learning for Wide Multimedia-Based Transformer with Graph Target Detection. 4575-4579 - Kele Xu

, Zhimin Lin, Jianqiao Zhao, Peichang Shi, Wei Deng, Huaimin Wang:
Multimodal Deep Learning for Social Media Popularity Prediction With Attention Mechanism. 4580-4584 - Chih-Chung Hsu

, Wen-Hai Tseng, Hao-Ting Yang, Chia-Hsiang Lin, Chi-Hung Kao:
Rethinking Relation between Model Stacking and Recurrent Neural Networks for Social Media Prediction. 4585-4589
Grand Challenge: Video Relation understanding & Pre-training for Video Captions Challenge
- Wentao Xie, Guanghui Ren, Si Liu:

Video Relation Detection with Trajectory-aware Multi-modal Features. 4590-4594 - Zhipeng Luo, Zhiguang Zhang, Yuehan Yao:

A Strong Baseline for Multiple Object Tracking on VidOR Dataset. 4595-4599 - Yiqing Huang

, Qiuyu Cai, Siyu Xu
, Jiansheng Chen:
XlanV Model with Adaptively Multi-Modality Feature Fusing for Video Captioning. 4600-4604 - Jingwen Chen, Hongyang Chao:

VideoTRM: Pre-training for Video Captioning Challenge 2020. 4605-4609 - Lanxiao Wang, Chao Shang

, Heqian Qiu, Taijin Zhao, Benliu Qiu
, Hongliang Li
:
Multi-stage Tag Guidance Network in Video Caption. 4610-4614
Grand Challenge: Human Centric Analysis I
- Jinlong Peng, Yueyang Gu, Yabiao Wang

, Chengjie Wang
, Jilin Li, Feiyue Huang:
Dense Scene Multiple Object Tracking with Box-Plane Matching. 4615-4619 - Ancong Wu, Chengzhi Lin, Bogao Chen, Weihao Huang, Zeyu Huang, Wei-Shi Zheng:

Transductive Multi-Object Tracking in Complex Events by Interactive Self-Training. 4620-4624 - Bing Shuai, Andrew G. Berneshawi, Manchen Wang, Chunhui Liu, Davide Modolo, Xinyu Li, Joseph Tighe:

Application of Multi-Object Tracking with Siamese Track-RCNN to the Human in Events Dataset. 4625-4629 - Shuning Chang, Li Yuan, Xuecheng Nie, Ziyuan Huang

, Yichen Zhou, Yunpeng Chen
, Jiashi Feng, Shuicheng Yan:
Towards Accurate Human Pose Estimation in Videos of Crowded Scenes. 4630-4634 - Lei Yuan, Shu Zhang, Fubiao Feng, Naike Wei, Huadong Pan:

Combined Distillation Pose. 4635-4639
Grand Challenge: Deep Video Understanding & BioMedia
- Fan Yu, Dandan Wang, Beibei Zhang, Tongwei Ren:

Deep Relationship Analysis in Video with Multimodal Feature Fusion. 4640-4644 - Matthias Baumgartner, Luca Rossetto

, Abraham Bernstein:
Towards Using Semantic-Web Technologies for Multi-Modal Knowledge Graph Construction. 4645-4649 - Vishal Anand, Raksha Ramesh, Ziyin Wang, Yijing Feng, Jiana Feng, Wenfeng Lyu, Tianle Zhu, Serena Yuan, Ching-Yung Lin:

Story Semantic Relationships from Multimodal Cognitions. 4650-4654 - Steven Alexander Hicks, Vajira Thambawita, Hugo Lewi Hammer, Trine B. Haugen, Jorunn M. Andersen, Oliwia Witczak, Pål Halvorsen, Michael A. Riegler:

ACM Multimedia BioMedia 2020 Grand Challenge Overview. 4655-4658 - Ming Feng

, Kele Xu
, Yin Wang:
A Quantitative Comparison of Different Machine Learning Approaches for Human Spermatozoa Quality Prediction Using Multimodal Datasets. 4659-4663
Grand Challenge: CitySCENE
- Kun Liu, Minzhi Zhu, Huiyuan Fu, Huadong Ma, Tat-Seng Chua:

Enhancing Anomaly Detection in Surveillance Videos with Transfer Learning from Action Recognition. 4664-4668 - Jie Wu, Yingying Li, Wei Zhang, Yi Wu, Xiao Tan, Hongwu Zhang, Shilei Wen, Errui Ding, Guanbin Li:

Modularized Framework with Category-Sensitive Abnormal Filter for City Anomaly Detection. 4669-4673 - Soumil Kanwal, Vineet Mehta, Abhinav Dhall:

Large Scale Hierarchical Anomaly Detection and Temporal Localization. 4674-4678 - Hui Lv

, Chunyan Xu, Zhen Cui:
Global Information Guided Video Anomaly Detection. 4679-4683
Grand Challenge: Human Centric Analysis II
- Li Yuan, Shuning Chang, Ziyuan Huang

, Yichen Zhou, Yupeng Chen, Xuecheng Nie, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan:
A Simple Baseline for Pose Tracking in Videos of Crowed Scenes. 4684-4688 - Lumin Xu

, Ruihan Xu, Sheng Jin:
HiEve ACM MM Grand Challenge 2020: Pose Tracking in Crowded Scenes. 4689-4693 - Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang

, Yupeng Chen, Xuecheng Nie, Tao Wang
, Jiashi Feng, Shuicheng Yan:
Toward Accurate Person-level Action Recognition in Videos of Crowed Scenes. 4694-4698 - Yanbin Hao, Zi-Niu Liu, Hao Zhang, Bin Zhu

, Jingjing Chen
, Yu-Gang Jiang, Chong-Wah Ngo:
Person-level Action Recognition in Complex Events via TSD-TSM Networks. 4699-4702 - Tingtian Li

, Zixun Sun, Xiao Chen:
Group-Skeleton-Based Human Action Recognition in Complex Events. 4703-4707
Grand Challenge: AI Meets Beauty
- Jun Yu, Guochen Xie, Mengyan Li, Haonian Xie, Xinlong Hao, Fang Gao, Feng Shuang:

Attention Based Beauty Product Retrieval Using Global and Local Descriptors. 4708-4712 - Runming Yan, Yongchun Lin, Zhichao Deng, Liang Lei, Chudong Xu:

Multi-Feature Fusion Method Based on Salient Object Detection for Beauty Product Retrieval. 4713-4717 - Jingwen Hou, Sijie Ji, Annan Wang:

Attention-driven Unsupervised Image Retrieval for Beauty Products with Visual and Textual Clues. 4718-4722 - Fangxiang Feng, Tianrui Niu, Ruifan Li, Xiaojie Wang, Huixing Jiang:

Learning Visual Features from Product Title for Image Retrieval. 4723-4727 - Toan H. Vu, An Dang, Jia-Ching Wang:

Learning to Remember Beauty Products. 4728-4732 - Kele Xu

, Yuzhong Liu, Ming Feng, Jianqiao Zhao, Huaimin Wang, Hengxing Cai:
Multi-Scale Generalized Attention-Based Regional Maximum Activation of Convolutions for Beauty Product Retrieval. 4733-4737
Doctoral Symposium
- Mathieu Febvay:

Low-level Optimizations for Faster Mobile Deep Learning Inference Frameworks. 4738-4742 - Ha Thi Phuong Thao:

Deep Neural Networks for Predicting Affective Responses from Movies. 4743-4747 - Abhinav Shukla:

Learning Self-Supervised Multimodal Representations of Human Behaviour. 4748-4751 - Wen Guo

:
Multi-person Pose Estimation in Complex Physical Interactions. 4752-4755
Workshop Summaries
- Raphaël Troncy, Jorma Laaksonen

, Hamed R. Tavakoli, Lyndon J. B. Nixon, Vasileios Mezaris, Mohammad Hosseini:
AI4TV 2020: 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery. 4756-4757 - Tanaya Guha, Vlad Hosu

, Dietmar Saupe, Bastian Goldlücke, Naveen Kumar, Weisi Lin, Victor R. Martinez, Krishna Somandepalli, Shrikanth Narayanan, Wen-Huang Cheng, Kree McLaughlin, Hartwig Adam, John See
, Lai-Kuan Wong:
ATQAM/MAST'20: Joint Workshop on Aesthetic and Technical Quality Assessment of Multimedia and Media Analytics for Societal Trends. 4758-4760 - Xavier Alameda-Pineda, Miriam Redi, Jahna Otterbacher, Nicu Sebe, Shih-Fu Chang:

FATE/MM 20: 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in MultiMedia. 4761-4762 - Wu Liu, Chuang Gan, Jingkuan Song, Dingwen Zhang, Wenbing Huang, John Smith:

HUMA'20: 1st International Workshop on Human-Centric Multimedia Analysis. 4763-4764 - Rainer Lienhart, Thomas B. Moeslund

, Hideo Saito:
MMSports'20: 3rd International Workshop on Multimedia Content Analysis in Sports. 4765-4766 - Alex Hauptmann, João Magalhães, Ricardo Gamelas Sousa, João Paulo Costeira

:
MuCAI'20: 1st International Workshop on Multimodal Conversational AI. 4767-4768 - Lukas Stappen, Björn W. Schuller, Iulia Lefter, Erik Cambria

, Ioannis Kompatsiaris:
Summary of MuSe 2020: Multimodal Sentiment Analysis, Emotion-target Engagement and Trustworthiness Detection in Real-life Media. 4769-4770 - Xinbo Gao, Patrick Le Callet, Jing Li, Zhi Li, Wen Lu, Jiachen Yang:

QoEVMA'20: 1st Workshop on Quality of Experience (QoE) in Visual Multimedia Applications. 4771-4772 - Valérie Gouet-Brunet, Margarita Khokhlova, Ronak Kosti, Liming Chen, Xu-Cheng Yin:

SUMAC 2020: The 2nd Workshop on Structuring and Understanding of Multimedia heritAge Contents. 4773-4774
Tutorials
- Xin Wang, Wenwu Zhu, Yonghong Tian, Wen Gao:

Multimedia Intelligence: When Multimedia Meets Artificial Intelligence. 4775-4776 - Andrea Cavallaro, Mohammad Malekzadeh, Ali Shahin Shamsabadi

:
Deep Learning for Privacy in Multimedia. 4777-4778 - Gerald Friedland:

Reproducibility and Experimental Design for Machine Learning on Audio and Multimedia Data. 4779-4781 - Shuqiang Jiang, Weiqing Min:

Food Computing for Multimedia. 4782-4784 - Shayok Chakraborty:

Active Learning for Multimedia Computing: Survey, Recent Trends and Applications. 4785-4786 - Martin Alain, Emin Zerman

, Cagri Ozcinar:
Immersive Imaging Technologies: From Capture to Display. 4787-4788 - Zheng Wang, Wu Liu, Yusuke Matsui, Shin'ichi Satoh:

Effective and Efficient: Toward Open-world Instance Re-identification. 4789-4790 - Jen-Tzung Chien

:
Deep Bayesian Multimedia Learning. 4791-4793
Panels
- Jiaying Liu

, Wen-Huang Cheng, Klara Nahrstedt, Ramesh C. Jain, Elisa Ricci
, Hyeran Byun:
Coping with Pandemics: Opportunities and Challenges for AI Multimedia in the "New Normal". 4794-4795 - Susanne Boll, Hari Sundaram, Svetha Venkatesh, Martha A. Larson, Mohan S. Kankanhalli

:
The World has Changed - The World Needs to Change. What Multimedia has to Offer for Our Common Digital Future. 4796-4798
Keynote Talks
- Klara Nahrstedt:

360-Video Navigation for 360-Multimedia Delivery Systems: Research Challenges and Opportunities. 4799 - Itamar Friedman:

Cloud Drive Apps - Closing the Gap Between AI Research to Practice. 4800 - Dong Yu:

Building Digital Human. 4801 - Shuicheng Yan:

Neural Network Design for Multimedia: Bio-inspired and Hardware-friendly. 4802















