2022. 06. 10
Mem2Seq: Effectively Incorporating Knowledge Bases into
End-to-End Task-Oriented Dialog Systems
Andrea Madotto, Chien-Sheng Wu, Pascale Fung
ACL 2018
Hongkyu Lim
Contents
• Overview
• Introduction
• Model Description
• Memory Encoder
• Memory Decoder
• Sentinel
• Memory Content
• Experimental Results
• Analysis and Discussion
• Conclusion
Overview
• In task-oriented dialog systems, it is hard to incorporate a knowledge base (KB).
• RNNs struggle to incorporate KB information into their hidden states
• Processing long sequences is time-consuming, especially with attention mechanisms
• Mem2Seq is proposed to address these issues.
• Mem2Seq combines the pointer network with multi-hop attention over memories.
Introduction
• Task-oriented dialog systems help users accomplish specific goals.
• The ability to query a KB is essential for them.
• As of 2018, RNN-based approaches that encode the dialog into hidden states have yielded good performance.
• But there are still problems
• It is hard to incorporate KB information into RNN hidden states
• Processing long sequences with attention takes too long
Introduction
• MemNN
• A recurrent attention model over a possibly large external memory
• Writes the external memory into embedding matrices
• Reads the memory repeatedly with query vectors
• This approach enables…
• Memorizing external KB information
• Encoding a long dialog history quickly
• However…
• MemNN only chooses its response from a predefined candidate pool.
• It does not generate responses word by word.
Model Description
• Mem2Seq
• Solves the limitations of MemNN
• Mem2Seq combines the pointer network idea with the multi-hop attention mechanism.
• Mem2Seq can copy words directly from the dialog history or the KB
• Mem2Seq learns to generate dynamic queries that control memory access.
Model Description
• Mem2Seq(architecture)
• Composed of a MemNN encoder and a memory decoder
• The MemNN encoder creates a vector representation of the dialog history
• The memory decoder generates the response by reading and copying from memory
Model Description
• Terms & Equations
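• Dialog history as a sequence of tokens: X = {x_1, ..., x_n, $}
• $ is a special sentinel token that decides whether a word is copied from the memory content or generated from the vocabulary
• KB tuples: B = {b_1, ..., b_l}
• U = [B; X] is the concatenation of X and B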
Model Description
• Memory Encoder
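• The encoder input is the word-level information in 𝑈 (KB tuples plus the dialog history, which ends with the sentinel token).
• The memories of MemNN are represented by a set of trainable embedding matrices 𝐶 = {𝐶^1, ..., 𝐶^(K+1)}.
• Each 𝐶^k maps tokens to vectors; a query vector q^k is used as the reading head.
• The read is repeated for K hops.
• For each memory position i, the model computes an attention weight at hop k: p_i^k = Softmax((q^k)^T 𝐶_i^k).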
Model Description
• Memory Encoder
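• p^k is a soft memory selector that decides the relevance of each memory with respect to the query vector q^k.
• The model reads out the memory o^k as the weighted sum over 𝐶^(k+1): o^k = Σ_i p_i^k 𝐶_i^(k+1)
• The query is then updated for the next hop: q^(k+1) = q^k + o^k
• The encoder output is o^K, which becomes the input to the Mem2Seq decoder.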
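The multi-hop read described on the two encoder slides can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `multi_hop_read`, the nested-list layout of `memories`, and the toy numbers are all hypothetical, and the embeddings are passed in already looked up rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_hop_read(memories, query):
    """Run K hops of attention over an embedded memory.

    memories: list of K+1 embedded copies of the memory (hypothetical layout);
              memories[j][i] is the vector for memory position i under the
              (j+1)-th embedding matrix.
    query:    initial query vector.
    Returns (o, p): the read-out and attention weights of the last hop.
    """
    hops = len(memories) - 1
    q = list(query)
    o, p = None, None
    for j in range(hops):
        # Attention weights: softmax of query-memory inner products.
        p = softmax([dot(q, c) for c in memories[j]])
        # Read-out: weighted sum over the next embedding of the same memory.
        o = [sum(p[i] * memories[j + 1][i][d] for i in range(len(p)))
             for d in range(len(q))]
        # Query update for the next hop: q <- q + o.
        q = [qi + oi for qi, oi in zip(q, o)]
    return o, p


# Toy example: 3 memory positions, 2-dimensional vectors, K = 2 hops.
toy_memories = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]],
]
o_K, p_K = multi_hop_read(toy_memories, [1.0, 0.0])
```

Here `o_K` plays the role of the encoder output passed to the decoder, and `p_K` is the attention distribution of the final hop.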
Model Description
• Memory Decoder
• The decoder memory holds both the dialog history and the KB
• A GRU takes the previously generated word and the previous query as input and generates a new query vector at every time step t.
• The initial query h_0 is the encoder output o^K
• At every step, the decoder computes two distributions: one over the vocabulary and one over the memory contents
• The decoder can generate a token by pointing to an input word in the memory.
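To make the interplay of the two distributions concrete, here is a minimal sketch of how one output word might be chosen at a single decoding step. It assumes the hard-gate behaviour described in the Mem2Seq paper (if the pointer selects the sentinel $, the word comes from the vocabulary distribution; otherwise it is copied from memory). The function `choose_token` and all probabilities and words below are made up for illustration; the GRU and the attention that would produce the distributions are left out.

```python
SENTINEL = "$"


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def choose_token(p_vocab, vocab, p_ptr, memory_words):
    """Pick the output word for one decoding step.

    p_vocab:      distribution over `vocab`
    p_ptr:        distribution over memory positions
    memory_words: word emitted for each memory position; one of them
                  is the sentinel "$"
    Returns (word, source) where source is "vocab" or "memory".
    """
    ptr = argmax(p_ptr)
    if memory_words[ptr] == SENTINEL:
        # Pointer selected the sentinel: generate from the vocabulary.
        return vocab[argmax(p_vocab)], "vocab"
    # Otherwise copy the word the pointer selected from memory.
    return memory_words[ptr], "memory"


# Hypothetical toy inputs.
vocab = ["the", "address", "is", "<eos>"]
memory_words = ["200_alester_ave", "where", "is", "valero", "$"]
p_vocab = [0.1, 0.2, 0.6, 0.1]

copied = choose_token(p_vocab, vocab, [0.7, 0.05, 0.05, 0.1, 0.1], memory_words)
generated = choose_token(p_vocab, vocab, [0.1, 0.05, 0.05, 0.1, 0.7], memory_words)
```

In the first call the pointer peaks on a memory entry, so that word is copied; in the second it peaks on the sentinel, so the most likely vocabulary word is emitted instead.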
Model Description
• Sentinel
• If the expected word does not appear in the memory, the memory-content distribution is trained to produce the sentinel token $, and the word is generated from the vocabulary distribution instead.
• Memory Content
• The dialog history is stored in the memory at the word level.
• Speaker and temporal information are added to each token.
• KB entries are stored as (subject, relation, object) triples.
• Only the part of the KB relevant to a specific dialog is loaded into the memory.
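The memory layout described above could be built roughly as follows. This is a sketch under assumptions: the dictionary fields, the speaker tags `"u"`/`"s"`, the use of the turn index as the time stamp, and the toy dialog and KB triple are all hypothetical. The rule that a KB entry emits its object when pointed to follows the Mem2Seq paper rather than the slides.

```python
SENTINEL = "$"


def dialog_to_memory(turns):
    """Turn (speaker, utterance) pairs into word-level memory entries.

    Every word is tagged with its speaker and the index of its turn.
    """
    entries = []
    for t, (speaker, utterance) in enumerate(turns, start=1):
        for word in utterance.split():
            entries.append(
                {"kind": "dialog", "word": word, "speaker": speaker, "time": t}
            )
    return entries


def build_memory(kb_triples, turns):
    """Build the memory as KB triples, then dialog words, then the sentinel."""
    memory = [
        {"kind": "kb", "subject": s, "relation": r, "object": o}
        for (s, r, o) in kb_triples
    ]
    memory.extend(dialog_to_memory(turns))
    memory.append({"kind": "sentinel", "word": SENTINEL})
    return memory


def pointer_word(entry):
    """Word emitted when the pointer selects this memory entry."""
    if entry["kind"] == "kb":
        return entry["object"]  # a KB triple emits its object
    return entry["word"]


# Hypothetical toy data.
kb = [("valero", "address", "200_alester_ave")]
turns = [("u", "where is the nearest gas station")]
memory = build_memory(kb, turns)
memory_words = [pointer_word(e) for e in memory]
```

Keeping the KB triples and the dialog words in one flat list means a single pointer distribution can address both sources, with the sentinel as the final position.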
Experimental Results
Analysis and Discussion
• Memory Attention
• As shown in the figure, the attention weights form a sharp distribution over the memory.
Conclusion
• Mem2Seq is an end-to-end memory-to-sequence model for task-oriented dialog systems.
• Mem2Seq combines the multi-hop attention mechanism of end-to-end memory networks with the pointer network.
• The authors validated the performance of Mem2Seq experimentally.
Thank you


