11-777 Lecture 1.1: Introduction

background

Recently, I found a good course on multimodal machine learning. In this blog, I will study it and note down my understanding.

O

Master the basics of multimodal machine learning.

KR

  1. What is a modality?
  2. History of multimodal research
  3. Main areas in multimodal research

1. What is a modality?

Modality:

  • the way in which something happens or is experienced.
  • it includes a sensory form (touch, feel) or a certain type of information (image, speech).

Medium:

  • a means for storing or communicating information.

Here are some examples of modalities:
[Figure: examples of modalities]

2. History of multimodal research

  1. The “behavioral” era (1970s until late 1980s)
    The McGurk Effect (1976)

  2. The “computational” era (late 1980s until 2000)
    Audio-Visual Speech Recognition (AVSR)
    Affective Computing

  3. The “interaction” era (2000 - 2010)
    Modeling human multimodal interaction

  4. The “deep learning” era (2010s until …)

3. Main areas in multimodal research

Multimodal research has 5 core theories, 37 applications, and 235 related works.
Here are the five areas.

1. Representation

Definition: Learning how to represent and summarize multimodal data in a way
that exploits the complementarity and redundancy of the modalities.

Demo:
[Figure: representation demo]

Main framework:
[Figure: representation framework]
Coordinated representations aim to maximize the similarity of correlated vectors across modalities while keeping uncorrelated vectors distinct.
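
To make the idea of coordinated representations more concrete, here is a minimal sketch in Python. It assumes each modality already has its own embedding vector, and it uses a margin-based ranking loss on cosine similarity. The loss form, the function names, and the `margin` value are my own assumptions for illustration; they are not taken from the lecture.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def coordination_loss(img, txt_match, txt_other, margin=0.2):
    """Hinge-style ranking loss (an assumed choice, not the lecture's).

    The loss is zero once the matched image/text pair is more similar
    than the mismatched pair by at least `margin`.
    """
    return max(0.0, margin - cosine(img, txt_match) + cosine(img, txt_other))

# Hypothetical toy embeddings: the image matches the first text only.
img = np.array([1.0, 0.0])
txt_match = np.array([1.0, 0.0])
txt_other = np.array([0.0, 1.0])

good = coordination_loss(img, txt_match, txt_other)  # pairs already well separated
bad = coordination_loss(img, txt_other, txt_match)   # roles swapped, so penalized
```

Minimizing a loss of this kind pulls correlated vectors together and pushes uncorrelated ones apart, which is one way to read the sentence above.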

2. Alignment

Definition: Find correspondences between elements of different modalities.

Demo:
[Figure: alignment demo]
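
As a small illustration of finding correspondences, the sketch below aligns two 1-D sequences with dynamic time warping (DTW). DTW is one classic alignment technique that I am swapping in as an example; the notes above do not name a specific method, and the toy sequences are hypothetical.

```python
import numpy as np

def dtw_align(x, y):
    """Align two 1-D sequences with dynamic time warping.

    Returns the total alignment cost and the list of matched
    index pairs (i, j), meaning x[i] corresponds to y[j].
    """
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end to recover the matched pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),  # move diagonally
            (cost[i - 1, j], i - 1, j),          # move up in x only
            (cost[i, j - 1], i, j - 1),          # move left in y only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return float(cost[n, m]), path

# Hypothetical example: y repeats the middle element of x.
total, pairs = dtw_align([0, 1, 2], [0, 1, 1, 2])
```

Here the middle element of the first sequence is matched to both repeated elements of the second, which is the kind of many-to-one correspondence alignment has to handle.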

3. Fusion

Definition: To join information from two or more modalities to perform a
prediction task.

  1. This part is not about specific model names; it focuses on when, how, and what to fuse.
  2. Model-Based (Intermediate) Approaches
    1. Deep neural networks
    2. Kernel-based methods
    3. Graphical models
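
One common reading of "when to fuse" is the contrast between early fusion (join the features, then predict) and late fusion (predict per modality, then join the predictions). These two names are not in my notes above, and the linear scorer, the weights, and the feature values in this sketch are all hypothetical.

```python
import numpy as np

def early_fusion(audio_feat, visual_feat, weights):
    """Concatenate the features first, then apply one linear scorer."""
    fused = np.concatenate([audio_feat, visual_feat])
    return float(fused @ weights)

def late_fusion(audio_score, visual_score, audio_weight=0.5):
    """Score each modality separately, then mix the two scores."""
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score

# Hypothetical toy inputs.
audio = np.array([1.0, 2.0])
visual = np.array([3.0])
w = np.array([0.5, 0.5, 1.0])  # one weight per concatenated feature

early = early_fusion(audio, visual, w)
late = late_fusion(audio_score=1.0, visual_score=3.0)
```

The model-based approaches listed above fuse inside the model itself instead of at the input or the output.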

4. Translation

Definition: Process of changing data from one modality to another, where the
translation relationship can often be open-ended or subjective.
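
A very simple way to picture translation is retrieval: given a feature vector from one modality, return the paired item from the other modality whose stored feature is closest. This is only a sketch under my own assumptions; the function name, the feature values, and the captions are hypothetical.

```python
import numpy as np

def translate_by_retrieval(query, source_feats, target_items):
    """Return the target item paired with the nearest source feature."""
    dists = np.linalg.norm(source_feats - query, axis=1)
    return target_items[int(np.argmin(dists))]

# Hypothetical paired data: image features and their captions.
image_feats = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
captions = ["a cat", "a dog", "a car"]

caption = translate_by_retrieval(np.array([0.9, 1.2]), image_feats, captions)
```

Because several captions can describe the same image equally well, a single nearest neighbor is only one of many acceptable outputs, which matches the open-ended nature mentioned in the definition.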

5. Co-Learning

Definition: Transfer knowledge between modalities, including their
representations and predictive models.

I will skip the details here because I have not studied this area.

4. Summary

[Figure: summary]
