1. CASER: Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding
Ju-Hee SHIM
Network Science Lab
Dept. of AI
The Catholic University of Korea
E-mail: [email protected]
Jiaxi Tang, Ke Wang
WSDM 2018
3.
INTRODUCTION
Motivation
Problems of Existing Top-N Recommendation Models
• The user's general preferences are used as the basis for recommendations.
• General preferences reflect only the user's static behavioral information (e.g., a person who likes Samsung products is only recommended Samsung products, and a person who likes Apple products is only recommended Apple products).
• However, these models have a limitation: they simply recommend items related to the iPhone, losing the opportunity to recommend phone accessories that would naturally follow the purchase.
4.
INTRODUCTION
Motivation
Limitations of Traditional Markov Chain-Based Models
a) Point-Level: The probability of purchasing a specific item often increases when multiple past items are combined, but point-level models fail to capture this effect (e.g., a user who buys milk and butter is likely to purchase flour, but this is not reflected).
b) Skip Behaviors: Unable to account for skipped behaviors. Traditional models assume continuous influence, but in real-world data, "skips" frequently occur.
5.
Architecture
CASER
Transforming User Sequences into a Matrix "IMAGE":
• Applying CNN:
• Convert the traditional 1D item sequence into an L × d matrix.
• L: the L most recent items
• d: embedding dimension
• Horizontal Filters: learn union-level sequential patterns, capturing patterns where combinations of multiple items influence behavior.
• Vertical Filters: learn point-level sequential patterns, similar to traditional Markov chain approaches.
• Adding User Embedding:
• Incorporate a user embedding to model long-term user preferences effectively (see the shape sketch below).
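A minimal shape walk-through of this "image" view, assuming PyTorch; the sizes L=5, d=50, and h=3 are illustrative values, not taken from the slides:

```python
import torch

L, d = 5, 50                       # sequence length, embedding dimension
E = torch.randn(L, d)              # the L x d "image" built from item embeddings

# A horizontal filter spans the full width d and covers h consecutive items,
# so one filter reads several items at once -> union-level patterns.
h = 3
horizontal_filter = torch.randn(h, d)

# A vertical filter is L x 1: it computes a weighted sum over the L past
# items, separately for every latent dimension -> point-level patterns.
vertical_filter = torch.randn(L, 1)

print(E.shape, horizontal_filter.shape, vertical_filter.shape)
```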
6.
Architecture
Method
Transformer Layer:
• Consists of L bidirectional Transformer layers.
• Each layer refines the user behavior sequence received from the previous layer to enhance representation power.
• In each layer, all item representations influence and update each other.
• Unlike RNN-based models, which pass information only from past to future, Self-Attention enables global interaction across all items in the sequence (see the sketch below).
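A minimal sketch of one bidirectional self-attention update over an item sequence, assuming PyTorch; the sizes (d_model=64, n_heads=4, seq_len=10) are illustrative assumptions:

```python
import torch
import torch.nn as nn

d_model, n_heads, seq_len = 64, 4, 10
attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

items = torch.randn(1, seq_len, d_model)   # one user's item representations

# No causal mask is applied: every position attends to every other position,
# so all item representations influence and update each other in one layer.
updated, weights = attn(items, items, items)
print(updated.shape)   # (1, 10, 64) - same shape, globally mixed information
```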
7.
Architecture
Method
Embedding Look-up:
• Retrieving past item embeddings:
• Look up the L most recent item embeddings of user u in the latent space.
• Stack these embeddings to construct the final embedding matrix E for training.
• Create an embedding table using d-dimensional latent factors: Q (items), P (users). (A minimal sketch follows.)
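A minimal sketch of this look-up step, assuming PyTorch; the vocabulary sizes, dimensions, and example IDs are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

n_items, n_users, d, L = 1000, 100, 50, 5
Q = nn.Embedding(n_items, d)   # item embedding table Q
P = nn.Embedding(n_users, d)   # user embedding table P

user_id = torch.tensor([7])
last_L_items = torch.tensor([[3, 14, 159, 26, 535]])  # user's L most recent items

E = Q(last_L_items)            # stack L item embeddings -> matrix E of shape (1, L, d)
p_u = P(user_id)               # long-term preference vector for user u
print(E.shape, p_u.shape)      # torch.Size([1, 5, 50]) torch.Size([1, 50])
```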
8.
Architecture
Method
Convolutional Layers:
• Treat the embedding matrix E as an "image" and apply convolutional layers to capture sequential patterns in user behavior.
• Consider sequential patterns as local features within the image.
• Utilize two types of convolutional filters:
• 1) Vertical Convolutional Layer:
• Captures point-level sequential patterns.
• Computes a weighted sum over the latent representations of the past L items (see the sketch below).
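A minimal sketch of the vertical convolution, assuming PyTorch; the number of vertical filters (n_v=4) is an illustrative assumption. Each L × 1 filter learns one weight per time step and computes a weighted sum over the L past items:

```python
import torch
import torch.nn as nn

L, d, n_v = 5, 50, 4
E = torch.randn(1, 1, L, d)      # embedding matrix as a 1-channel image

vertical = nn.Conv2d(in_channels=1, out_channels=n_v, kernel_size=(L, 1))
out_v = vertical(E)              # (1, n_v, 1, d): one weighted row sum per filter
o_v = out_v.view(1, -1)          # flatten to (1, n_v * d) point-level features
print(o_v.shape)                 # torch.Size([1, 200])
```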
9.
Architecture
Method
Convolutional Layers:
• 2) Horizontal Convolutional Layer:
• Captures union-level patterns.
• Varies the filter height h to extract diverse sequential features.
• Uses max-pooling to extract the most significant feature from each filter (see the sketch below).
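A minimal sketch of the horizontal convolutions, assuming PyTorch; the filter count per height (n_h=16) is an illustrative assumption. A filter of height h spans h consecutive items across the full width d, and max-pooling keeps each filter's most significant response:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

L, d, n_h = 5, 50, 16
E = torch.randn(1, 1, L, d)                  # embedding matrix as a 1-channel image

outputs = []
for h in range(1, L + 1):                    # vary the filter height h = 1..L
    conv = nn.Conv2d(1, n_h, kernel_size=(h, d))
    c = F.relu(conv(E)).squeeze(3)           # (1, n_h, L - h + 1) sliding responses
    pooled = F.max_pool1d(c, c.size(2))      # max over positions -> (1, n_h, 1)
    outputs.append(pooled.squeeze(2))        # (1, n_h) per height

o_h = torch.cat(outputs, dim=1)              # (1, n_h * L) union-level features
print(o_h.shape)                             # torch.Size([1, 80])
```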
10.
Architecture
Method
Fully-Connected Layers:
• Concatenate the outputs from the horizontal and vertical filters.
• Feed the concatenated features into a fully-connected layer to extract high-level abstract features.
• Concatenate the user embedding with the extracted features to capture general user preferences, then pass the final representation to the output layer for prediction (see the sketch below).
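A minimal sketch of this fully-connected stage, assuming PyTorch and reusing the illustrative shapes from the sketches above (o_v of size n_v·d, o_h of size n_h·L, user embedding p_u of size d):

```python
import torch
import torch.nn as nn

n_v, n_h, L, d, n_items = 4, 16, 5, 50, 1000
o_v = torch.randn(1, n_v * d)            # vertical-filter features
o_h = torch.randn(1, n_h * L)            # horizontal-filter features
p_u = torch.randn(1, d)                  # user embedding (general preferences)

fc1 = nn.Linear(n_v * d + n_h * L, d)    # extracts high-level abstract features z
out = nn.Linear(2 * d, n_items)          # output layer over all candidate items

z = torch.relu(fc1(torch.cat([o_v, o_h], dim=1)))
y = out(torch.cat([z, p_u], dim=1))      # one score per candidate item
print(y.shape)                           # torch.Size([1, 1000])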
11.
Architecture
Method
Network Training & Recommendation:
• Apply the sigmoid activation function to the output layer to transform each output value y into a probability.
• Compute the likelihood across all sequences in the dataset for training.
• Use the user's last L item embeddings to compute y-values for all items.
• Select the top-N items with the highest y-values for recommendation (see the sketch below).
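A minimal sketch of the training objective and top-N selection, assuming PyTorch; the target and negative item indices are illustrative, and the binary cross-entropy style likelihood with sampled negatives is an assumption about the objective's form:

```python
import torch

n_items, N = 1000, 10
y = torch.randn(1, n_items)                  # scores from the output layer
prob = torch.sigmoid(y)                      # probability per item

# Likelihood term for one observed target item against a few sampled
# negative items, as in a binary cross-entropy objective.
target = 42
negatives = torch.tensor([7, 99, 500])
loss = -torch.log(prob[0, target]) - torch.log(1 - prob[0, negatives]).sum()

# Recommendation: rank all items by their y-values and keep the top-N.
top_scores, top_items = torch.topk(y, N, dim=1)
print(loss.item(), top_items)
```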
16.
Evaluation
Results
Ablation Study Results
• The Caser model outperforms Fossil and GRU4Rec in terms of MAP, with the best performance observed at T = 2, 3.
• As the Markov order L increases, performance improves and then plateaus; in sparse datasets, an excessively large L can lead to performance degradation.
• The Markov targets T contribute to performance improvement: predicting multiple future items simultaneously is more effective than predicting just one.
17.
Evaluation
Results
Ablation Study Results
• Performance results based on the usage of each component:
• p: personalization (user embedding), h: horizontal convolutional layer, v: vertical convolutional layer
• The best performance is achieved when all three components are used together.
18.
Conclusion
The authors propose CASER, a novel approach to top-N sequential recommendation. CASER captures point-level and union-level sequential patterns, skip behaviors, and long-term user preferences.
A unique aspect of CASER is its attempt to interpret a user's 1D item sequence as a 2D "image" representation. This approach could be particularly meaningful in industries where the sequential dependency of user behavior is weak.