Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems

The document discusses Mem2Seq, a model that integrates knowledge bases into task-oriented dialog systems by combining pointer networks with multi-hop attention. It addresses limitations of previous models by encoding long sequences quickly and generating responses that read and copy directly from memory. The authors report experimental results to validate its performance.

2022. 06. 10
Mem2Seq: Effectively Incorporating Knowledge Bases into
End-to-End Task-Oriented Dialog Systems
Andrea Madotto, Chien-Sheng Wu, Pascale Fung
ACL 2018
Hongkyu Lim

Contents
• Overview
• Introduction
• Model Description
  • Memory Encoder
  • Memory Decoder
  • Sentinel
  • Memory Content
• Experimental Results
• Analysis and Discussion
• Conclusion

Overview
• In task-oriented dialog systems, it is hard to incorporate a knowledge base (KB).
  • RNN hidden states struggle to integrate KB information.
  • Attention over long sequences is time-consuming.
• Mem2Seq is proposed to address these issues.
• Mem2Seq combines pointer networks with multi-hop attention over memory.

Introduction
• A task-oriented dialog system helps users accomplish specific goals.
• Querying the KB is essential for generating responses.
• As of 2018, RNN-based models that rely on hidden states have performed well.
• However, problems remain:
  • It is hard to incorporate KB information into RNN hidden states.
  • Processing long sequences with attention takes too long.

Introduction
• MemNN (end-to-end memory network)
  • A recurrent attention model over a large external memory
  • Writes embeddings into the external memory
  • Reads the memory repeatedly with query vectors
• This approach enables…
  • Memorizing external KB information
  • Fast encoding of long dialog histories
• However…
  • MemNN only selects a response from a predefined candidate pool.
  • It does not generate answers word by word.

Model Description
• Mem2Seq
  • Addresses the limitations of MemNN
  • Mem2Seq combines the idea of pointer networks with the multi-hop attention mechanism.
  • Mem2Seq copies words directly from the dialog history or the KB.
  • Mem2Seq learns to generate dynamic queries that control memory access.

Model Description
• Mem2Seq (architecture)
  • Composed of a MemNN encoder and a memory decoder
  • The MemNN encoder builds a vector representation of the dialog history.
  • The memory decoder generates the response by reading and copying from memory.

Model Description
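• Terms & Equations
  • Dialog history as a sequence of tokens: X = {x_1, ..., x_n, $}
  • $ is a special sentinel token. Pointing to it signals that the next word should be generated from the vocabulary rather than copied from memory.
  • KB tuples: B = {b_1, ..., b_l}
  • Memory input: U = [B; X], the concatenation of B and X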

Model Description
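• Memory Encoder
  • The encoder input is the word-level content of U: the KB tuples, the dialog history, and the sentinel token.
  • The MemNN memory is represented by a set of trainable embedding matrices C = {C^1, ..., C^(K+1)}.
  • Each C^k maps tokens to vectors; a query vector q^k is used to read the memory.
  • Reading is repeated for K hops.
  • For each memory position i, the model computes an attention weight at hop k: p_i^k = Softmax((q^k)^T C_i^k).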

Model Description
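• Memory Encoder
  • p^k acts as a soft memory selector: it measures how relevant each memory entry is to the query vector q^k.
  • The model reads the memory as o^k, the sum of the C^(k+1) embeddings weighted by p^k: o^k = Σ_i p_i^k C_i^(k+1).
  • The query is then updated for the next hop: q^(k+1) = q^k + o^k.
  • The encoder output is o^K, which is the input to the Mem2Seq decoder.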
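The multi-hop read described on the encoder slides can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the helper make_tables, the random embedding values, the toy memory, and the zero initial query are all assumptions made for the example. Table k plays the role of C^k (used to score the memory) and table k + 1 plays the role of C^(k+1) (used to read it).

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_tables(vocab, hops, dim, seed=0):
    """Hypothetical helper: hops + 1 random embedding tables, C^1 .. C^(K+1)."""
    rng = random.Random(seed)
    return [
        {token: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for token in vocab}
        for _ in range(hops + 1)
    ]


def encode(memory_tokens, tables):
    """Read the memory for K = len(tables) - 1 hops.

    Returns the last readout o^K and the attention weights of every hop.
    """
    dim = len(tables[0][memory_tokens[0]])
    query = [0.0] * dim  # assumption of this sketch: the first query is all zeros
    readout = query
    attention_per_hop = []
    for k in range(len(tables) - 1):
        keys = [tables[k][tok] for tok in memory_tokens]        # C^k
        values = [tables[k + 1][tok] for tok in memory_tokens]  # C^(k+1)
        # p^k: softmax over the dot product of the query with each memory key
        weights = softmax([sum(q * c for q, c in zip(query, key)) for key in keys])
        # o^k: attention-weighted sum of the value embeddings
        readout = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        # q^(k+1) = q^k + o^k
        query = [q + o for q, o in zip(query, readout)]
        attention_per_hop.append(weights)
    return readout, attention_per_hop


# Toy memory: two KB-like tokens, one dialog word, and the sentinel "$".
memory = ["pizza_hut", "address", "hello", "$"]
tables = make_tables(sorted(set(memory)), hops=3, dim=4)
o_K, attention = encode(memory, tables)
```

Because the first query is a zero vector in this sketch, every score at the first hop is 0 and the first-hop attention is uniform; later hops sharpen as the query accumulates readouts.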

Model Description
• Memory Decoder
  • Uses both the dialog history and the KB stored in memory
  • At every time step t, a GRU takes the previously generated word and the previous query, and produces a new query h_t.
  • The initial query h_0 is the encoder output o^K.
  • At every step, the decoder computes two distributions: one over the vocabulary (P_vocab) and one over the memory contents (P_ptr).
  • The decoder can produce a token by pointing to an input word stored in memory and copying it.
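A single decoding step can be sketched as follows, under stated assumptions. The GRU and the memory attention are omitted: the query h_t, the first-hop readout o1, the weight matrix w1, and the pointer distribution p_ptr are passed in as toy values. The vocabulary distribution is taken as a softmax over w1 applied to the concatenation of h_t and o1, which is my reading of the paper and should be checked against it. The function names are hypothetical.

```python
import math

SENTINEL = "$"


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def vocab_distribution(w1, h_t, o1):
    """P_vocab: softmax of w1 applied to the concatenation [h_t; o1]."""
    features = h_t + o1  # list concatenation
    return softmax([sum(w * f for w, f in zip(row, features)) for row in w1])


def decode_step(p_vocab, p_ptr, vocab, memory_words):
    """Pick the next word from the two distributions.

    If the pointer lands on the sentinel, fall back to the vocabulary;
    otherwise copy the memory word that the pointer selects.
    """
    pointed = memory_words[argmax(p_ptr)]
    if pointed == SENTINEL:
        return vocab[argmax(p_vocab)], "generated"
    return pointed, "copied"


vocab = ["the", "is", "at"]
memory_words = ["pizza_hut", "address", SENTINEL]

# Toy weights: one row per vocabulary word, one column per feature.
w1 = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
p_vocab = vocab_distribution(w1, h_t=[1.0], o1=[0.0])

copied = decode_step(p_vocab, [0.8, 0.1, 0.1], vocab, memory_words)
generated = decode_step(p_vocab, [0.1, 0.1, 0.8], vocab, memory_words)
```

In the first call the pointer selects a memory word, so it is copied; in the second the pointer selects the sentinel, so the word comes from the vocabulary distribution.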

Model Description
• Sentinel
  • If the required word is not in memory, the memory-content distribution points to the sentinel token $, and the word is generated from the vocabulary distribution instead.
• Memory Content
  • The dialog history is stored in memory at the word level.
  • Speaker and temporal information are attached to each token.
  • KB entries are stored as (subject, relation, object) triples.
  • Only the part of the KB relevant to a given dialog is loaded into memory.
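The memory layout described on this slide can be sketched as a small data structure. This is an illustration under assumptions: the MemoryEntry class, the tag formats ("t1" for the turn, "$u" for the user), and the example KB triple are hypothetical, and the choice to emit the object of a KB triple when it is pointed to follows my reading of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    tokens: tuple   # tokens embedded together for this memory slot
    copy_word: str  # word emitted if the decoder points at this slot


def kb_entry(subject, relation, obj):
    """A KB triple; pointing at it yields the object (assumption, see lead-in)."""
    return MemoryEntry((subject, relation, obj), obj)


def dialog_entries(turns):
    """Word-level dialog entries tagged with turn index and speaker."""
    entries = []
    for turn_index, (speaker, utterance) in enumerate(turns, start=1):
        for word in utterance.split():
            entries.append(MemoryEntry((word, f"t{turn_index}", f"${speaker}"), word))
    return entries


def build_memory(kb_triples, turns):
    """KB triples first, then the dialog history, then the sentinel."""
    memory = [kb_entry(*triple) for triple in kb_triples]
    memory.extend(dialog_entries(turns))
    memory.append(MemoryEntry(("$",), "$"))
    return memory


memory = build_memory(
    kb_triples=[("the_westin", "distance", "5_miles")],
    turns=[("u", "hello there")],
)
```

Here memory holds four slots: one KB triple, two dialog words tagged with t1 and $u, and the sentinel. Only the triples passed in for this dialog are stored, which mirrors the point that only a dialog-specific part of the KB is used.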

Analysis and Discussion
• Memory Attention
  • As the attention visualization on the original slide shows, the weights are clearly concentrated on specific memory entries.

Conclusion
• Mem2Seq is a memory-to-sequence model for end-to-end task-oriented dialog systems.
• Mem2Seq combines the multi-hop attention mechanism of end-to-end memory networks with pointer networks.
• The authors validated the performance of Mem2Seq experimentally.
