Active Product Sales Analysis using Matplotlib in Python
Last Updated :
10 Sep, 2024
Companies that sell online or run a dedicated e-commerce website want to understand what their customers need, and when they buy, in order to increase their chances of making a sale. One practical question a large transaction dataset can answer is which time of day has the highest user activity in terms of transactions.
In this post, we will use pandas and Matplotlib to analyze the dataset. The Transaction Date column lets us glean useful insights on the busiest time (hour) of the day. You can access the entire dataset here.
Stepwise Implementation
Step 1:
First, we import the required libraries, and then we load the dataset into a DataFrame.
Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Order_Details = pd.read_csv('Order_details(masked).csv')
Step 2:
Convert the Transaction Date column to datetime values and store the result in a new column called Time. Datetime values are displayed in the pattern YYYY-MM-DD HH:MM:SS by default, although the display format can be customized. Here we are mainly interested in the hour, so we derive an Hour column using the built-in dt.hour accessor:
Python
# Parse the Transaction Date column into datetime values
Order_Details['Time'] = pd.to_datetime(Order_Details['Transaction Date'])
# Extract the hour of day (0-23) from each timestamp
Order_Details['Hour'] = Order_Details['Time'].dt.hour
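To see what these two steps do in isolation, here is a minimal sketch on a few made-up timestamps. The `sample` DataFrame and its values are assumptions for illustration only and are not rows from the real dataset.

```python
import pandas as pd

# Hypothetical timestamps standing in for the 'Transaction Date' column
sample = pd.DataFrame({
    'Transaction Date': [
        '2020-03-01 09:15:00',
        '2020-03-01 21:40:10',
        '2020-03-02 21:05:30',
    ]
})

# Parse the strings into datetime values
sample['Time'] = pd.to_datetime(sample['Transaction Date'])

# The .dt accessor exposes datetime components; .hour is an integer from 0 to 23
sample['Hour'] = sample['Time'].dt.hour

print(sample['Hour'].tolist())  # [9, 21, 21]
```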
Step 3:
We then need the "n" busiest hours. value_counts() returns the number of transactions recorded for each hour, sorted from most to least frequent, so the first "n" entries are the busiest hours. We use tolist() to convert the counts to a list, and we also collect the corresponding index values (the hours) in a second list.
Python
# n = 24 in this case; change it as needed
# to see the top 'n' busiest hours
timemost1 = Order_Details['Hour'].value_counts().index.tolist()[:24]
timemost2 = Order_Details['Hour'].value_counts().values.tolist()[:24]
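The slicing idea can be tried on a tiny example first. The sketch below uses a made-up `hours` Series (an assumption, not the real Hour column) and calls value_counts() once, reusing the result for both lists.

```python
import pandas as pd

# Hypothetical hour values: 21 occurs three times, 9 twice, 14 once
hours = pd.Series([21, 9, 21, 14, 21, 9], name='Hour')

# value_counts() sorts by frequency, most frequent first
counts = hours.value_counts()

n = 2  # how many of the busiest hours to keep
top_hours = counts.index.tolist()[:n]
top_counts = counts.values.tolist()[:n]

print(top_hours)   # [21, 9]
print(top_counts)  # [3, 2]
```

Computing the counts once and slicing both the index and the values keeps the two lists aligned and avoids repeating the same work.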
Step 4:
Finally, we stack the indices (hour) and frequencies together to yield the final result.
Python
tmost = np.column_stack((timemost1,timemost2))
print(" Hour Of Day" + "\t" + "Number of Purchases \n")
print('\n'.join('\t\t'.join(map(str, row)) for row in tmost))
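As a small illustration of how the stacking and printing work, the sketch below runs the same two operations on made-up lists. The `hours` and `counts` values are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical hours and their purchase counts, already sorted by count
hours = [21, 9, 14]
counts = [3, 2, 1]

# column_stack treats each list as one column of a 2-D array
table = np.column_stack((hours, counts))
print(table.shape)  # (3, 2): one row per hour, two columns

# Each row becomes one tab-separated line of text
lines = ['\t\t'.join(map(str, row)) for row in table]
print('\n'.join(lines))
```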
Step 5:
Before we can create an appropriate data visualization, we need the counts ordered by hour of day rather than by frequency. To do so, we gather the hourly frequencies and order them by hour:
Python
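timemost = Order_Details['Hour'].value_counts()
# Order the counts by hour and make sure all 24 hours (0-23) are present;
# hours with no transactions get a count of 0
timemost2 = timemost.sort_index().reindex(range(24), fill_value=0)
# x-axis values: the hours themselves, same length as timemost2
timemost1 = timemost2.index.tolist()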
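One detail worth checking at this step is that value_counts() only reports hours that actually occur in the data, so a quiet hour is missing rather than zero, and the x and y values can end up with different lengths. A minimal sketch of padding the counts to all 24 hours with reindex, using made-up hour values (an assumption for illustration):

```python
import pandas as pd

# Hypothetical hour values; most hours of the day never appear
hours = pd.Series([21, 9, 21, 14, 21, 9])

# Count per hour, then force the index to cover 0-23, filling gaps with 0
per_hour = hours.value_counts().reindex(range(24), fill_value=0)

x = per_hour.index.tolist()  # hours 0..23
y = per_hour.tolist()        # purchases in each hour

print(len(x), len(y))        # 24 24
print(y[9], y[14], y[21])    # 2 1 3
```

Because x and y come from the same Series, they always have the same length, which is what plt.plot() requires.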
Step 6:
For data visualization, we will use Matplotlib, as it is one of the most convenient and commonly used libraries. However, you are free to choose another plotting library, such as ggplot or Seaborn, to plot the data graphically.
The commands below put the hours on the X-axis and the number of transactions on the Y-axis, and set other aspects of the line chart, such as color and font.
Python
plt.figure(figsize=(20, 10))
plt.title('Sales Happening Per Hour (Spread Throughout The Week)',
fontdict={'fontname': 'monospace', 'fontsize': 30}, y=1.05)
plt.ylabel("Number Of Purchases Made", fontsize=18, labelpad=20)
plt.xlabel("Hour", fontsize=18, labelpad=20)
plt.plot(timemost1, timemost2, color='m')
plt.grid()
plt.show()
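For readers who want to try the chart without the dataset, here is a self-contained sketch. It uses Matplotlib's object-oriented interface instead of the plt.* calls above, and the `purchases` values are invented for illustration; they are not results from the real data.

```python
import io

import matplotlib
matplotlib.use('Agg')  # non-interactive backend: renders without a display
import matplotlib.pyplot as plt

# Hypothetical purchase counts for hours 0-23 (not taken from the dataset)
hours = list(range(24))
purchases = [2, 1, 0, 0, 1, 3, 5, 8, 12, 15, 14, 16,
             18, 17, 15, 16, 19, 22, 26, 30, 34, 36, 28, 12]

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(hours, purchases, color='m')
ax.set_xticks(hours)  # one tick per hour so every hour is labeled
ax.set_xlabel('Hour')
ax.set_ylabel('Number Of Purchases Made')
ax.grid(True)

# Write the chart to an in-memory PNG instead of opening a window
buffer = io.BytesIO()
fig.savefig(buffer, format='png')
plt.close(fig)
```

Setting one tick per hour makes it easier to read the busiest hour directly off the X-axis.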
The results indicate that sales typically peak in the late evening hours. This insight can be incorporated into business decisions, for example by promoting a product specifically during that time.