How to Calculate AUC (Area Under Curve) in R?
Last Updated: 06 Aug, 2024
In this article, we will discuss how to calculate the AUC (Area under Curve) of the ROC (Receiver Operating Characteristic) curve in the R Programming Language.
What is the AUC-ROC curve?
The ROC (Receiver Operating Characteristic) curve plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) of a binary classifier as the decision threshold varies. This helps us assess how well a classification model separates the two classes. The AUC (Area under Curve) summarizes the whole curve in one number: an AUC of 0.5 corresponds to random guessing, and the closer the AUC is to 1, the better the model distinguishes the positive class from the negative class.
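One way to make the AUC concrete is its ranking interpretation: it equals the share of positive-negative pairs in which the positive example receives the higher score, with ties counted as half. The base R sketch below illustrates this on made-up labels and scores; the names `pos`, `neg`, `wins`, `ties`, and `auc_manual` are chosen here for illustration and do not come from any package.

```r
# Made-up labels (1 = positive, 0 = negative) and model scores
actual    <- c(0, 0, 1, 1, 1, 0, 0)
predicted <- c(0.1, 0.3, 0.4, 0.9, 0.76, 0.55, 0.2)

pos <- predicted[actual == 1]  # scores of the positive examples
neg <- predicted[actual == 0]  # scores of the negative examples

# outer() compares every positive score with every negative score
wins <- outer(pos, neg, ">")
ties <- outer(pos, neg, "==")

# Share of correctly ordered pairs, counting ties as half
auc_manual <- mean(wins + 0.5 * ties)
auc_manual  # 11 of the 12 pairs are ordered correctly, about 0.917
```

This pairwise version is only practical for small vectors, since it builds a matrix with one cell per positive-negative pair.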
pROC Package in R to Calculate AUC-ROC
To create a ROC (Receiver Operating Characteristic) curve object in R, we use the roc() function of the pROC package, which provides tools to display and analyze ROC curves. The roc() function takes the actual class labels and the predicted values as arguments and returns a ROC curve object. To find the AUC (Area under Curve) of that curve, we pass the object to the auc() function, which returns the area under that ROC curve.
Syntax:
roc_object <- roc(response, predictor)
Parameters:
- response: the vector that contains the actual class labels.
- predictor: the vector that contains the values predicted by our model, for example predicted probabilities.
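As a minimal sketch of this syntax, the snippet below calls roc() and auc() directly on two made-up vectors. It assumes pROC is already installed. The `levels`, `direction`, and `quiet` arguments are optional; they are set here only to make explicit which class is the positive one and to silence the informational messages.

```r
library(pROC)  # assumes the package is already installed

# Made-up labels and scores, for illustration only
response  <- c(0, 0, 1, 1, 1, 0, 0)
predictor <- c(0.1, 0.3, 0.4, 0.9, 0.76, 0.55, 0.2)

# levels = c(control, case); direction = "<" states that controls
# are expected to have lower scores than cases
roc_object <- roc(response, predictor,
                  levels = c(0, 1), direction = "<", quiet = TRUE)

# auc() returns an object of class "auc"; as.numeric() keeps just the value
auc_value <- as.numeric(auc(roc_object))
auc_value  # about 0.917
```

Without `direction`, roc() chooses the direction automatically from the data, which can hide a model whose scores are inverted.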
Installing and Loading the Libraries
Let's first install the package using the install.packages() function and then we will load the same using the library() function in R Programming Language.
R
install.packages("pROC")
library(pROC)
The area under the ROC curve of a logistic regression model.
Initializing the Dataset
Now let's create a small sample dataset, which we will use to train a model and then calculate the AUC-ROC of the model using pROC.
R
# sample data frames for training and testing
df_train <- data.frame(x = c(1, 2, 3, 4, 5),
                       y = c(1, 5, 8, 15, 26),
                       z = c(0, 1, 1, 0, 0))
df_test <- data.frame(x = c(6, 7, 8),
                      y = c(38, 45, 72),
                      z = c(0, 1, 0))
Training Model and Calculating ROC
Now let's train a glm() model on the dataset created above and make predictions with it. Once we have the predictions, we calculate the AUC-ROC value from them.
R
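# fit the model; note that glm() defaults to family = gaussian,
# so add family = binomial for a true logistic regression
model <- glm(z ~ x + y, data = df_train)
# predict on the test set
prediction <- predict(model, df_test, type = "response")
# create the ROC curve from the actual labels and the predictions
roc_object <- roc(df_test$z, prediction)
# calculate the area under the curve
auc(roc_object)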
Output:
Area under the curve: 0.5
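To see why the AUC comes out at 0.5 here, it can help to look at how the test rows are ranked by the predictions. The sketch below refits the same model on the same data so that it runs on its own; the name `ranked` is introduced only for illustration.

```r
# Same data and model as in the example above
df_train <- data.frame(x = c(1, 2, 3, 4, 5),
                       y = c(1, 5, 8, 15, 26),
                       z = c(0, 1, 1, 0, 0))
df_test <- data.frame(x = c(6, 7, 8),
                      y = c(38, 45, 72),
                      z = c(0, 1, 0))
model <- glm(z ~ x + y, data = df_train)
prediction <- predict(model, df_test, type = "response")

# Order the test rows from the highest to the lowest prediction
ranked <- df_test[order(prediction, decreasing = TRUE), ]
ranked$z  # 0 1 0: the positive row sits between the two negative rows
```

The test set has one positive and two negative rows, so there are only two positive-negative pairs. The positive row outranks one negative row and is outranked by the other, which gives 1/2 regardless of the direction in which the scores are read.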
Metrics Package in R to Calculate AUC-ROC
The Metrics package implements many of the evaluation metrics commonly used in supervised machine learning, covering regression, time-series, and classification tasks. The code below shows how to calculate the AUC with its auc() function.
Installing and Loading the Libraries
Let's first install the package using the install.packages() function and then we will load the same using the library() function in R.
R
install.packages("Metrics")
library(Metrics)
Initializing the Vectors
Now let's create two imaginary vectors with the actual target values and the predicted probabilities for the respective classes.
R
# Actual Target variable
actual <- c(0, 0, 1, 1, 1, 0, 0)
# Predicted probabilities for
# the respective classes
predicted <- c(.1, .3, .4, .9,
0.76, 0.55, 0.2)
Calculating AUC-ROC
Syntax:
auc(actual, predicted)
where,
- actual - The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class.
- predicted - A vector containing probabilities predicted by the model of each example being 1.
R
auc(actual, predicted)
Output:
0.916666666666667
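The same value can be cross-checked in base R with the rank-sum (Mann-Whitney) formulation of the AUC. This is a sketch for verification only, not a statement about how the Metrics package computes its result; the names `n_pos`, `n_neg`, `r`, and `auc_rank` are illustrative.

```r
actual    <- c(0, 0, 1, 1, 1, 0, 0)
predicted <- c(0.1, 0.3, 0.4, 0.9, 0.76, 0.55, 0.2)

n_pos <- sum(actual == 1)  # number of positive examples
n_neg <- sum(actual == 0)  # number of negative examples

# rank() assigns average ranks to tied scores
r <- rank(predicted)

# Rank sum of the positives, minus its smallest possible value,
# divided by the number of positive-negative pairs
auc_rank <- (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
auc_rank  # (17 - 6) / 12, about 0.917
```

If pROC is attached in the same session, note that both packages export a function named auc(). Writing Metrics::auc(actual, predicted) makes it explicit which one is meant.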
mltools Package in R to Calculate AUC-ROC
The mltools package contains helper functions that mainly assist with exploratory data analysis. The package aims to provide functions that are optimized for speed and memory. The code below shows how to calculate the AUC with its auc_roc() function.
Installing and Loading the Libraries
Let's first install the package using the install.packages() function and then we will load the same using the library() function in R.
R
install.packages("mltools")
library(mltools)
Initializing the Vectors
Now let's create two imaginary vectors with the actual target values and the predicted probabilities for the respective classes.
R
# Actual Target variable
actual <- c(0, 0, 1, 1, 1, 0, 0)
# Predicted probabilities for
# the respective classes
predicted <- c(.1, .3, .4, .9,
0.76, 0.55, 0.2)
Calculating AUC-ROC
Syntax:
auc_roc(preds, actuals, returnDT = FALSE)
where,
- preds - A vector containing probabilities predicted by the model of each example being 1.
- actuals - The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class.
- returnDT - If TRUE, returns a data.table object with the False Positive Rate and True Positive Rate for plotting the ROC curve instead of the single AUC value.
R
auc_roc(predicted, actual)
Output:
0.916666666666667
R
auc_roc(predicted, actual, returnDT=TRUE)
Output:
A data.table with the TPR and FPR for the actual and predicted values.
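To illustrate what such a table of rates represents, the base R sketch below builds the ROC points by lowering the threshold past one score at a time and then applies the trapezoidal rule. It is an independent illustration that assumes there are no tied scores, not a reproduction of the columns mltools returns; `roc_points` and `auc_trap` are names chosen here.

```r
actual    <- c(0, 0, 1, 1, 1, 0, 0)
predicted <- c(0.1, 0.3, 0.4, 0.9, 0.76, 0.55, 0.2)

# Sort the labels from the highest to the lowest score
ord <- order(predicted, decreasing = TRUE)
labels_sorted <- actual[ord]

# Cumulative rates as the threshold drops below each score in turn;
# the leading 0 is the point where nothing is predicted positive
tpr <- c(0, cumsum(labels_sorted == 1) / sum(actual == 1))
fpr <- c(0, cumsum(labels_sorted == 0) / sum(actual == 0))
roc_points <- data.frame(fpr = fpr, tpr = tpr)

# Trapezoidal rule over consecutive ROC points
auc_trap <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auc_trap  # about 0.917
```

Passing `roc_points$fpr` and `roc_points$tpr` to plot() with `type = "s"` would draw the corresponding step-shaped ROC curve.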