“I've known Lakshya since his tenure at Myntra and have been consistently impressed with his capabilities. He is an incredibly talented and hardworking data scientist who possesses a strong commitment to delivering measurable impact. At Microsoft, I saw him excel as a manager, where he provided patient, supportive, and highly insightful guidance to his team. Lakshya combines his deep knowledge and technical skills with an unwavering determination to achieve excellence, which I believe would make him a great asset to any team he joins.”
About
I’m passionate about solving large-scale, real-world problems in the web search domain…
Activity
-
Most interviews are decided in the first 10 minutes. It's the Bayesian truth about ML interviews. Interviewers don't start with a blank slate. They…
Liked by Lakshya Kumar
-
It was an honour today to welcome Hon'ble Prime Minister Shri Narendra Modi ji at Bharat Mandapam, New Delhi, to inaugurate the first Emerging…
Liked by Lakshya Kumar
-
All 3 co-founders of Mercor became the youngest self-made billionaires at 22, beating Mark Zuckerberg's record from age 23 🤯 Brendan Foody, Adarsh…
Liked by Lakshya Kumar
Experience
Education
Publications
-
ListBERT: Learning to Rank E-commerce products with Listwise BERT
SIGIR eCom'22
Efficient search is a critical component of an e-commerce platform with a vast number of products. Every day, millions of users search for products pertaining to their needs, so showing the relevant products at the top enhances the user experience. In this work, we propose a novel approach that fuses a transformer-based model with various listwise loss functions for ranking e-commerce products given a user query. We pre-train a RoBERTa model over a fashion e-commerce corpus and fine-tune it using different listwise loss functions. Our experiments indicate that the RoBERTa model fine-tuned with an NDCG-based surrogate loss function (approxNDCG) achieves an NDCG improvement of 13.9% over other popular listwise loss functions like ListNet and ListMLE, and an improvement of 20.6% over the pairwise RankNet-based RoBERTa model. We call this methodology of directly optimizing the RoBERTa model end-to-end with a listwise surrogate loss function ListBERT. Since real-time search has a low-latency requirement, we also show how these models can be adapted via knowledge distillation to learn a representation-focused student model that is easy to deploy and yields ~10 times lower ranking latency.
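The approxNDCG idea above replaces each document's hard rank with a differentiable estimate built from sigmoids of pairwise score differences, so NDCG itself can be optimized by gradient descent. Below is a minimal PyTorch sketch of such a surrogate loss; the function name, tensor shapes, and temperature are illustrative assumptions, not the paper's code.

```python
# A minimal sketch of an approxNDCG surrogate loss, assuming PyTorch.
import torch

def approx_ndcg_loss(scores: torch.Tensor, labels: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    """scores, labels: (batch, list_size). Higher label = more relevant."""
    # Smooth rank of item i: 1 + sum_j sigmoid((s_j - s_i) / T).
    diff = scores.unsqueeze(-1) - scores.unsqueeze(-2)   # (B, L, L): s_i - s_j
    approx_rank = 1.0 + torch.sigmoid(-diff / temperature).sum(dim=-1) - 0.5
    # (the -0.5 removes the self term, since sigmoid(0) = 0.5)
    gains = (2.0 ** labels) - 1.0
    dcg = (gains / torch.log2(1.0 + approx_rank)).sum(dim=-1)
    # Ideal DCG from the true label ordering.
    ideal_gains, _ = gains.sort(dim=-1, descending=True)
    positions = torch.arange(1, labels.size(-1) + 1,
                             device=labels.device, dtype=scores.dtype)
    idcg = (ideal_gains / torch.log2(1.0 + positions)).sum(dim=-1).clamp(min=1e-10)
    return -(dcg / idcg).mean()   # maximizing approxNDCG = minimizing its negative
```

The distillation step the abstract mentions would then fit the lighter representation-focused student to this teacher's scores, which is what makes the ~10x lower ranking latency achievable at serve time.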
-
Neural Search: Learning Query and Product Representations in Fashion E-commerce
SIGIR eCom'21
Typical e-commerce platforms contain millions of products in the catalog. Users visit these platforms and enter search queries to retrieve their desired products, so showing the relevant products at the top is essential for the success of e-commerce platforms. We approach this problem by learning low-dimensional representations for queries and product descriptions, leveraging user click-stream data as our main source of signal for product relevance. Starting from GRU-based architectures as our baseline, we move to a more advanced transformer-based architecture, which helps the model learn contextual representations of queries and products to serve better search results and understand user intent efficiently. We pre-train a transformer-based RoBERTa model using a fashion corpus and fine-tune it with a triplet loss. Our experiments on the product ranking task show that the RoBERTa model gives an improvement of 7.8% in Mean Reciprocal Rank (MRR), 15.8% in Mean Average Precision (MAP), and 8.8% in Normalized Discounted Cumulative Gain (NDCG), thus outperforming our GRU-based baselines. For the product retrieval task, the RoBERTa model outperforms the other two models with an improvement of 164.7% in Precision@50 and 145.3% in Recall@50. To highlight the importance of pre-training RoBERTa for the fashion domain, we qualitatively compare a RoBERTa model pre-trained on standard datasets with our custom RoBERTa pre-trained over a fashion corpus on the query-token prediction task. Finally, we also show a qualitative comparison between GRU and RoBERTa results on the product retrieval task for some test queries. The RoBERTa model can be utilized to improve product search and acts as a good baseline that can be fine-tuned for various information retrieval tasks like query recommendation and query re-formulation.
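The training signal described above (clicked products pulled toward the query, non-clicked products pushed away) is the standard triplet setup. A minimal sketch, assuming a Hugging Face RoBERTa encoder with mean pooling; the checkpoint name, pooling choice, and example strings are illustrative assumptions, not the paper's code.

```python
# A minimal sketch of triplet-loss fine-tuning for query/product embeddings.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)    # mean-pool over real tokens
    return (out * mask).sum(1) / mask.sum(1)

# Anchor = query, positive = clicked product, negative = non-clicked product.
q = embed(["red running shoes"])
pos = embed(["Nike red running shoe, mesh upper"])       # hypothetical click
neg = embed(["blue denim jacket"])                       # hypothetical non-click
loss = F.triplet_margin_loss(q, pos, neg, margin=1.0)    # pull clicks closer
loss.backward()
```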
-
Deep Contextual Embeddings for Address Classification in E-commerce
KDD AI For Fashion
E-commerce customers in developing nations like India tend to follow no fixed format when entering shipping addresses. Parsing such addresses is challenging because of the lack of inherent structure or hierarchy, and it is imperative to understand the language of addresses so that shipments can be routed without delays. In this paper, we propose a novel approach to understanding customer addresses, drawing motivation from recent advances in Natural Language Processing (NLP). We formulate pre-processing steps for addresses using a combination of edit-distance and phonetic algorithms, then create vector representations for addresses using Word2Vec with TF-IDF, Bi-LSTM, and BERT-based approaches. We compare these approaches on a sub-region classification task for North and South Indian cities. Through experiments, we demonstrate the effectiveness of a generalized RoBERTa model pre-trained over a large address corpus for the language modelling task. Our proposed RoBERTa model achieves a classification accuracy of around 90% with minimal text preprocessing on the sub-region classification task, outperforming all other approaches. Once pre-trained, the RoBERTa model can be fine-tuned for various downstream tasks in the supply chain, like pincode suggestion and geo-coding, and generalizes well to such tasks even with limited labelled data. To the best of our knowledge, this is the first research of its kind to propose understanding customer addresses in the e-commerce domain by pre-training language models and fine-tuning them for different purposes.
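The edit-distance normalization step mentioned above can be illustrated with a tiny, stdlib-only sketch: snap noisy locality spellings to a small canonical list before any modelling. The gazetteer, threshold, and helper name here are illustrative assumptions, not from the paper.

```python
# A minimal sketch of edit-distance-based address token normalization.
from difflib import SequenceMatcher

GAZETTEER = ["koramangala", "indiranagar", "whitefield", "jayanagar"]

def canonical_locality(token: str, threshold: float = 0.8) -> str:
    token = token.lower().strip()
    # Pick the gazetteer entry with the highest edit similarity.
    best = max(GAZETTEER, key=lambda g: SequenceMatcher(None, token, g).ratio())
    if SequenceMatcher(None, token, best).ratio() >= threshold:
        return best
    return token  # below threshold: keep the original spelling

print(canonical_locality("Kormangla"))   # -> "koramangala"
print(canonical_locality("MG Road"))     # below threshold, kept as typed
```

A phonetic algorithm such as Soundex or Metaphone could be layered on top so that spellings that sound alike (common in transliterated Indian addresses) collapse to the same key before the edit-distance check.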
-
When Numbers Matter!!! Detecting Sarcasm in Numerical Portions of Text
NAACL WASSA'19
Research in sarcasm detection spans almost a decade. However, a particular form of sarcasm remains unexplored: sarcasm expressed through numbers, which, we estimate, forms about 11% of the sarcastic tweets in our dataset. The sentence ‘Love waking up at 3 am’ is sarcastic because of the number. In this paper, we focus on detecting sarcasm in tweets arising out of numbers. Initially, to get an insight into the problem, we implement a rule-based and a statistical machine learning-based (ML) classifier. The rule-based classifier conveys the crux of the numerical-sarcasm problem, namely incongruity arising out of numbers, while the statistical ML classifier uncovers the indicators, i.e., features, of such sarcasm. The actual systems in place, however, are two deep learning (DL) models, a CNN and an attention network, which obtain F-scores of 0.93 and 0.91 on our dataset of tweets containing numbers. To the best of our knowledge, this is the first line of research investigating the phenomenon of sarcasm arising out of numbers, culminating in a detector thereof.
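The rule-based intuition above (positive sentiment colliding with an incongruous number context, as in ‘Love waking up at 3 am’) can be sketched in a few lines. The word lists and patterns below are illustrative assumptions, not the paper's rules.

```python
# A minimal sketch of a rule-based numerical-sarcasm check.
import re

POSITIVE = {"love", "great", "awesome", "enjoy"}
# Numeric contexts that clash with positive sentiment (tiny illustrative set).
INCONGRUOUS = [
    re.compile(r"\b[0-3]\s*am\b"),                            # "at 3 am"
    re.compile(r"\b\d+\s*hours?\s+(of\s+)?(delay|wait(ing)?)\b"),
]

def numerical_sarcasm_rule(tweet: str) -> bool:
    t = tweet.lower()
    has_positive = any(w in t.split() for w in POSITIVE)
    has_incongruous_number = any(p.search(t) for p in INCONGRUOUS)
    return has_positive and has_incongruous_number

print(numerical_sarcasm_rule("Love waking up at 3 am"))        # True
print(numerical_sarcasm_rule("Meeting at 3 pm works for me"))  # False
```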
-
Detecting Sarcasm in Numerical Portions of Text
arXiv
Sarcasm occurring due to the presence of numerical portions in text has been cited as an error made by automatic sarcasm detection approaches in the past. We present a first study of detecting sarcasm in numbers, as in the sentence ‘Love waking up at 4 am’. We analyze the challenges of the problem and present rule-based, machine learning, and deep learning approaches to detect sarcasm in numerical portions of text. Our deep learning approach outperforms four past works for sarcasm detection, as well as our rule-based and machine learning approaches, on a dataset of tweets, obtaining an F1-score of 0.93. This shows that special attention to text containing numbers may be useful to improve the state of the art in sarcasm detection.
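The deep learning approach reported above is CNN-style text classification. A minimal sketch of a 1-D CNN over token embeddings follows; the architecture sizes and class name are illustrative assumptions, not the paper's model.

```python
# A minimal sketch of a 1-D CNN sarcasm classifier, assuming PyTorch.
import torch
import torch.nn as nn

class CnnSarcasmClassifier(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=128, n_filters=100, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=kernel)
        self.fc = nn.Linear(n_filters, 1)                 # sarcastic vs. not

    def forward(self, token_ids):                         # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)           # (B, emb_dim, T)
        x = torch.relu(self.conv(x)).max(dim=-1).values   # global max-pool
        return self.fc(x).squeeze(-1)                     # logits

model = CnnSarcasmClassifier()
logits = model(torch.randint(0, 20000, (4, 30)))  # 4 toy tweets, 30 tokens each
print(torch.sigmoid(logits))                      # sarcasm probabilities
```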
-
Sentiment Intensity Ranking among Adjectives Using Sentiment Bearing Word Embeddings
EMNLP'17
We produce a continuous intensity ranking of adjectives belonging to FrameNet using sentiment-bearing word embeddings. These embeddings capture both context and sentiment information; the sentiment information helps separate words like ‘good’ and ‘bad’ from each other. The paper shows that one can successfully obtain a continuous intensity scale for adjectives in both the positive and negative categories. The adjectives are drawn from different semantic categories in FrameNet.
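One common way to turn such embeddings into a continuous intensity scale is to project each adjective's vector onto a positive-negative sentiment axis. The toy 3-d vectors below are illustrative assumptions standing in for the learned sentiment-bearing embeddings.

```python
# A minimal sketch of intensity ranking via projection onto a sentiment axis.
import numpy as np

# Hypothetical sentiment-bearing embeddings (would come from a trained model).
emb = {
    "good":     np.array([0.9, 0.1, 0.2]),
    "great":    np.array([1.2, 0.0, 0.3]),
    "superb":   np.array([1.5, -0.1, 0.2]),
    "bad":      np.array([-0.8, 0.2, 0.1]),
    "terrible": np.array([-1.4, 0.1, 0.2]),
}

# Sentiment axis: direction from a negative anchor to a positive anchor.
axis = emb["good"] - emb["bad"]
axis /= np.linalg.norm(axis)

# Intensity score = signed projection onto the axis.
scores = {w: float(v @ axis) for w, v in emb.items()}
for w, s in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{w:10s} {s:+.2f}")   # terrible < bad < good < great < superb
```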
-
Approaches for Computational Sarcasm Detection: A Survey
CFILT: IIT Bombay
Sentiment analysis deals not only with positive and negative sentiment detection in text but also considers the prevalence and challenges of sarcasm in sentiment-bearing text; automatic sarcasm detection deals with detecting that sarcasm. In recent years, work in sarcasm detection has gained popularity and has wide applicability in sentiment analysis. This paper compiles the various approaches developed to tackle the problem of sarcasm detection. We describe rule-based, machine learning, and deep learning approaches for detecting sarcasm, describe various datasets, and detail the different features used by sarcasm detection approaches from the past up to the present.
Test Scores
-
Google Round D APAC Test 2016
Score: Rank 760
Languages
-
English
Full professional proficiency
-
Spanish
Elementary proficiency
-
Hindi
Native or bilingual proficiency
Recommendations received
7 people have recommended Lakshya
More activity by Lakshya
-
Being a founder sometimes means talking a genius employee out of blowing up the entire team. Hiring smart people comes with a cost. They see things…
Liked by Lakshya Kumar
-
My goal is now to do 28 pre-seed investments of 1.5 Cr this year, deploying about 45 Cr in Indian startups. 4 months ago I was at 12 investments and…
Liked by Lakshya Kumar
-
Weekend learning: 👉 What made DeepSeek OCR so popular? 👉 Did you know that annotators were paid 500K USD to…
Liked by Lakshya Kumar
-
This moment calls for a post indeed ❣️ June 23, 2017: Full and Final Settlement. October 31, 2025: Salary credited. This journey of receiving my…
Liked by Lakshya Kumar
-
99% of CUDA resources are noise. Here's the 1% that actually matters 👇 After burning 3 weeks on a memory-bound kernel (that I was trying to…
Liked by Lakshya Kumar
-
TL;DR: The era of agentic organization – we equip LLMs/Agents with general organization capability through learning to organize with RL. The Era of…
Liked by Lakshya Kumar
-
We’re finally reaching the era of everyone training their own models based on open-source (versus relying on black box generalist APIs) and it is…
Liked by Lakshya Kumar
-
#Hiring We’re building the future of digital advertising at Microsoft AI (Ads) and looking for exceptional talent to be a part of it. If you're…
Liked by Lakshya Kumar
-
📢 AACL 2025 Acceptance Announcement 📢 We are delighted to share that three of our recent papers have been accepted to IJCNLP & AACL 2025, hosted…
Liked by Lakshya Kumar
-
Big milestone for Ola Electric! Our vehicles with 4680 Bharat Cell battery packs are now ARAI certified, delivering more range, better performance…
Liked by Lakshya Kumar
-
Creative School needs your help! As many of you know, we are currently in the annual Giving season. If you are looking for a cause to support this…
Liked by Lakshya Kumar
Others named Lakshya Kumar in India
-
lakshya kumar
-
Lakshya Kumar
Passionate Coder | Class 12 Student | IIT-JEE Candidate | Aspiring Software Engineer
-
Lakshya Kumar
CK Birla | PwC | EM Normandie | IIM Calcutta '23 | OYO Rooms | IIT Kanpur'20
-
Lakshya Kumar
United Breweries Ltd. | Sales & Marketing Intern'24 - Havells India Limited(PPI) | MBA | IIM Visakhapatnam 2023-25 | Ex-Virtusa | Ex-Policy Bazaar | Ex-Infotech Solutions | Amity University(Gold Medalist)
-
Lakshya Kumar
Security Operations Analyst | Specializing in SOC Operations and AppSec | Focused on Blue Teaming, Troubleshooting and Network Security
615 others named Lakshya Kumar in India are on LinkedIn